mixstam1821/README.md

👨‍💻 About Me

I'm a Data Engineer based in Greece, working remotely at NIKI Digital Engineering on automotive data pipelines for BMW and AUDI, processing millions of sensor records daily with AWS and PySpark.

My background combines production data engineering with 5 years of scientific research, where I built ETL pipelines processing 20+ TB of satellite climate data during my PhD at the University of Ioannina.

  • 🏢 Currently: Data Engineer @ NIKI Digital Engineering (BMW and AUDI external partner) — remote
  • ☁️ Stack: AWS Glue · Athena · S3 · PySpark · Python · SQL
  • 🔬 Background: PhD researcher — large-scale satellite data pipelines (ERA5, NASA, EUMETSAT)
  • 📍 Location: Greece

💼 Experience

Data Engineer & Test Automation Developer

NIKI Digital Engineering · Remote · Jun 2024 – Present

Data Engineering:

  • Build production ETL pipelines for automotive data using AWS Glue and PySpark, processing millions of sensor records daily
  • Design and maintain data tables for analytics used in production environments
  • Develop SQL transformations and data quality validation frameworks for time-series vehicle data
  • Collaborate with stakeholders to optimize pipeline performance and define data schemas
  • Tech: Python, PySpark, SQL, AWS (Glue, S3, Athena), Docker, Git

Test Automation:

  • Develop automated test frameworks for ECU validation using EXAM and Python
  • Perform hardware-in-the-loop (HIL) testing

Scientific PhD Researcher

University of Ioannina · Ioannina, Greece · Oct 2020 – Sep 2025

  • Built CLARISC — a cloud database integrating EUMETSAT and NASA satellite datasets, processing 20+ TB of climate data
  • Engineered ETL pipelines for large-scale satellite and reanalysis datasets (ERA5, CERES, MERRA-2) using Python, xarray, Dask, and SQL
  • Developed EarthSense — an interactive web application to visualize research results
  • Performed statistical analysis, data validation, and quality control on multi-dimensional time-series datasets
  • Published 4 peer-reviewed papers in high-impact journals (Atmospheric Research, Climatic Change)
  • Tech: Python (pandas, xarray, Dask, NumPy, SciPy), SQL, Flask, FastAPI, JavaScript, HPC, Linux, Docker, Git, Fortran

Key Projects:

  • 🌍 EarthSense — Interactive web app for PhD climate research results
  • 🛰️ NATEX — Satellite data viewer built with Python, Satpy, and Bokeh
  • 🌐 Aether — Quick netCDF explorer
  • 📊 ERMES — ERA5 and CAMS reanalysis data explorer

🛠️ Tech Stack

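| Category           | Tools                  |
|--------------------|------------------------|
| Languages          | Python, SQL, Java      |
| Big Data           | PySpark, Dask          |
| Cloud (AWS)        | AWS Glue, S3, Athena   |
| Data Formats       | Parquet, NetCDF, JSON  |
| DevOps             | Docker, Git, GitHub    |
| Currently Learning | Airflow, dbt           |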

📂 Featured Projects

PySpark tool that automatically profiles and validates any dataset before it enters a pipeline.

  • Detects nulls, duplicates, outliers, schema mismatches
  • Uses Spark SQL for column statistics and aggregations
  • Outputs a structured JSON quality report
  • Maps directly to AWS Glue → S3 → Athena production workflows
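The checks listed above can be illustrated with a minimal sketch. It is not the tool's actual code: to stay dependency-free it swaps PySpark for plain Python over a list of dicts, and the function name `profile_rows`, the sample columns, and the 1.5 × IQR outlier rule are all assumptions made for illustration.

```python
import json
import statistics
from collections import Counter


def profile_rows(rows, expected_schema):
    """Profile a list of dict rows against an expected {column: type} schema.

    Hypothetical helper for illustration; a Spark version would compute the
    same figures with DataFrame aggregations instead of Python loops.
    """
    report = {"row_count": len(rows), "columns": {}}

    # Duplicates: count rows whose full content repeats an earlier row.
    seen = Counter(json.dumps(r, sort_keys=True, default=str) for r in rows)
    report["duplicate_rows"] = sum(n - 1 for n in seen.values())

    for col, expected_type in expected_schema.items():
        values = [r.get(col) for r in rows]
        present = [v for v in values if v is not None]
        col_report = {
            "nulls": len(values) - len(present),
            # Schema check: non-null values that are not of the expected type.
            "type_mismatches": sum(
                1 for v in present if not isinstance(v, expected_type)
            ),
        }
        # Outliers (numeric columns only): assumed rule of 1.5 * IQR beyond
        # the first and third quartiles.
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        if expected_type in (int, float) and len(numeric) >= 4:
            q1, _, q3 = statistics.quantiles(numeric, n=4)
            spread = 1.5 * (q3 - q1)
            col_report["outliers"] = sum(
                1 for v in numeric if v < q1 - spread or v > q3 + spread
            )
        report["columns"][col] = col_report
    return report


# Made-up sample data: one duplicate row, one null, one wrong type, one outlier.
sample_rows = [
    {"vehicle_id": "A1", "speed_kmh": 50.0},
    {"vehicle_id": "A1", "speed_kmh": 50.0},
    {"vehicle_id": "B2", "speed_kmh": 52.0},
    {"vehicle_id": "C3", "speed_kmh": None},
    {"vehicle_id": "D4", "speed_kmh": 49.0},
    {"vehicle_id": "E5", "speed_kmh": 51.0},
    {"vehicle_id": "F6", "speed_kmh": 900.0},
    {"vehicle_id": 7, "speed_kmh": 48.0},
]
quality_report = profile_rows(sample_rows, {"vehicle_id": str, "speed_kmh": float})
print(json.dumps(quality_report, indent=2))
```

The report is a plain nested dict, so it serializes straight to JSON and could be stored alongside the dataset it describes.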

A clean, production-style ETL pipeline: Extract → Transform → Load using PySpark + SQL.

  • Cleans messy raw data (nulls, casing, invalid values)
  • Enriches with calculated columns and business logic
  • Spark SQL aggregations for analytics-ready output
  • Writes partitioned Parquet — same pattern as AWS S3/Athena
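The Extract → Transform → Load flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: it uses only the Python standard library instead of PySpark, writes JSON Lines instead of Parquet (which needs a third-party library), and the column names, sample data, and function names are hypothetical. The output directories follow the `column=value` layout commonly used for partitioned datasets.

```python
import csv
import io
import json
import tempfile
from collections import defaultdict
from pathlib import Path

# Made-up raw input with messy casing, a missing amount, and a negative amount.
RAW_CSV = """order_id,country,amount,quantity
1, gr ,10.50,2
2,DE,-4.00,1
3,gr,,3
4,de,20.00,1
5,Gr,7.25,4
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop null/invalid amounts, normalize casing, add a total."""
    cleaned = []
    for row in rows:
        amount_text = (row["amount"] or "").strip()
        if not amount_text:
            continue  # null amount
        amount = float(amount_text)
        if amount <= 0:
            continue  # invalid value
        quantity = int(row["quantity"])
        cleaned.append({
            "order_id": int(row["order_id"]),
            "country": row["country"].strip().upper(),
            "amount": amount,
            "quantity": quantity,
            "total": round(amount * quantity, 2),  # calculated column
        })
    return cleaned


def aggregate(rows):
    """Aggregate: total revenue per country."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["total"]
    return {country: round(value, 2) for country, value in totals.items()}


def load(rows, out_dir, partition_col="country"):
    """Load: write one JSON Lines file per partition value."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_col]].append(row)
    written = []
    for key, part_rows in partitions.items():
        part_dir = Path(out_dir) / f"{partition_col}={key}"
        part_dir.mkdir(parents=True, exist_ok=True)
        path = part_dir / "part-00000.jsonl"
        path.write_text("".join(json.dumps(r) + "\n" for r in part_rows))
        written.append(path)
    return sorted(written)


clean_rows = transform(extract(RAW_CSV))
revenue_by_country = aggregate(clean_rows)
with tempfile.TemporaryDirectory() as tmp:
    partition_dirs = sorted(p.parent.name for p in load(clean_rows, tmp))
print(revenue_by_country, partition_dirs)
```

Keeping each stage as a separate function means the cleaning rules can be tested without touching storage, which is the main reason for splitting the steps this way.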

Pinned

  1. ERMES (Public · Python)
     ERA5 Meteorology Explorer System

  2. Cyclops (Public · Python)
     Live weather conditions via OpenWeatherMap.

  3. Aether (Public · Python · 14 stars, 2 forks)
     Aether: A fast netCDF Explorer

  4. NATEX (Public · Python · 1 star)
     .nat files explorer

  5. EarthSense (Public · Python)
     My PhD was made interactive through this web app! Thesis: Detailed assessment of the global dimming and brightening of the Earth under all-sky and clear sky conditions using modern tools and long-t…

  6. Hyperion (Public · Python)
     A Random Forest SSR Emulator trained with the EarthSenseData (Stamatis et al., 2025).