akshaysatyam2/README.md

Akshay Kumar



About Me

Computer Vision Engineer | Data Engineer | ML Researcher
Real-time CV on edge devices • Scalable Data Pipelines • Deep Learning • MLOps

Passionate about building production-grade solutions using Data Engineering, Computer Vision, and Deep Learning. Expertise in architecting high-performance pipelines and deploying AI at scale.


GitHub Activity & Stats

GitHub Stats




Skills

| Category | Technologies |
| --- | --- |
| Core ML/AI | Computer Vision • NLP • Deep Learning • LLMs • RAG |
| Data Engineering | PySpark • dbt (data build tool) • Airflow • Kafka • ETL/ELT • Lakehouse |
| Frameworks | TensorFlow • PyTorch • OpenCV • YOLO • Hugging Face • LangChain |
| Languages | Python • SQL |
| Tools & Platforms | Docker • Kubernetes • Terraform • Kafka • MQTT • FastAPI • Git • Linux |
| Cloud & Edge | AWS (S3, Glue, Redshift, Bedrock, Lambda) • Azure Edge • MLOps • CI/CD |

Featured Projects

🐾 PupsN Vision System - AI-Powered Pet Analytics

Tech Stack: Python, YOLO26 Nano, OSNet, ONNX Runtime, OpenCV, Flask-SocketIO, HTML5 Canvas

The Objective: Traditional pet monitoring systems are often limited by high latency and static analysis. My goal was to build a high-performance, edge-optimized system capable of real-time detection, persistent tracking, and behavior classification, ensuring a smooth 30 FPS visual experience even on resource-constrained hardware.

Technical Architecture & Implementation:

  • Multi-Threaded Decoupling: I architected a dual-thread system to prevent frame drops. The stream_worker serves the 30 FPS visual layer over WebSockets, while the ai_worker runs the heavy inference pipeline asynchronously, as fast as the CPU allows, so slow inference never stalls the video stream.
  • 4-Stage ONNX Pipeline: I implemented an optimized pipeline using onnxruntime featuring YOLO26 Nano for detection, a stateful IoU Tracker for persistent ID assignment, OSNet for pet re-identification, and MobileNetV2 for real-time behavior classification (Sit, Stand, Lie).
  • Edge-First Optimizations: To keep the system stable over long runs on devices like the Raspberry Pi, I added explicit garbage collection (gc.collect()), RAM-cached SQLite lookups for low-latency weight serialization, and HTML5 Canvas rendering to avoid DOM-based memory leaks.
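The stateful IoU tracker stage can be illustrated with a minimal sketch. This is not the project's actual code: the class name `IoUTracker`, the greedy best-overlap matching, and the `iou_threshold` and `max_missed` defaults are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from itertools import count

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class IoUTracker:
    """Hypothetical tracker: keeps IDs stable by matching boxes across frames."""
    iou_threshold: float = 0.3   # assumed minimum overlap to keep an ID
    max_missed: int = 5          # assumed frames a track may go unseen
    tracks: dict[int, Box] = field(default_factory=dict)
    missed: dict[int, int] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def update(self, detections: list[Box]) -> dict[int, Box]:
        # Score every (track, detection) pair and visit best overlaps first.
        pairs = sorted(
            ((iou(box, det), tid, j)
             for tid, box in self.tracks.items()
             for j, det in enumerate(detections)),
            reverse=True,
        )
        assigned: dict[int, Box] = {}
        used: set[int] = set()
        for score, tid, j in pairs:
            if score < self.iou_threshold:
                break  # remaining pairs overlap too little to match
            if tid in assigned or j in used:
                continue
            assigned[tid] = detections[j]
            used.add(j)
        # Refresh matched tracks; age out tracks unseen for too long.
        for tid in list(self.tracks):
            if tid in assigned:
                self.tracks[tid], self.missed[tid] = assigned[tid], 0
            else:
                self.missed[tid] += 1
                if self.missed[tid] > self.max_missed:
                    del self.tracks[tid], self.missed[tid]
        # Unmatched detections start new tracks with fresh IDs.
        for j, det in enumerate(detections):
            if j not in used:
                tid = next(self._ids)
                self.tracks[tid], self.missed[tid] = det, 0
                assigned[tid] = det
        return assigned
```

Calling `update` once per frame with the detector's boxes returns a mapping from persistent ID to box. Greedy matching is the simplest option; a real deployment might prefer Hungarian assignment or add re-identification features for occlusions.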

⚙️ Scalable Data Engineering Pipeline (Lakehouse Architecture)

Tech Stack: PySpark, dbt (data build tool), Terraform, AWS (S3, Glue, Redshift), Airflow, Docker

Designed and implemented a robust, end-to-end data engineering pipeline to handle large-scale data processing and analytics. This project demonstrates expertise in building modern data stacks using the Lakehouse architecture.

Key Technical Features:

  • Infrastructure as Code (IaC): Leveraged Terraform to provision and manage AWS resources (S3 buckets, Redshift clusters, Glue crawlers), ensuring reproducible and scalable environments.
  • Data Orchestration: Utilized Apache Airflow to schedule and monitor complex ETL workflows, ensuring data consistency and timely availability.
  • Big Data Processing: Implemented high-performance data transformations using PySpark, handling multi-terabyte datasets with optimized partition strategies.
  • Data Modeling & Transformation: Employed dbt for modular, version-controlled SQL transformations within the data warehouse, implementing rigorous testing and documentation.
  • Lakehouse Architecture: Built a unified platform for both BI and AI by combining the low-cost storage of S3 with the high-performance querying of Redshift, with table metadata managed in the AWS Glue Data Catalog.
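The dependency ordering that an orchestrator enforces across these stages can be sketched with the standard library alone. This deliberately does not use the Airflow API: the task names below are hypothetical, and `graphlib` stands in only to show how upstream tasks must finish before downstream ones start.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it depends on.
PIPELINE = {
    "extract_to_s3": set(),
    "crawl_glue_catalog": {"extract_to_s3"},
    "spark_transform": {"crawl_glue_catalog"},
    "load_redshift": {"spark_transform"},
    "dbt_run": {"load_redshift"},
    "dbt_test": {"dbt_run"},
    "dbt_docs": {"dbt_run"},
}


def execution_order(dag: dict[str, set[str]]) -> list[str]:
    """Return one valid run order in which every dependency comes first."""
    return list(TopologicalSorter(dag).static_order())


def run(dag: dict[str, set[str]], tasks: dict) -> dict:
    """Run each task callable in dependency order and collect the results."""
    return {name: tasks[name]() for name in execution_order(dag)}
```

For example, `execution_order(PIPELINE)` always starts with `extract_to_s3` and places `dbt_run` before both `dbt_test` and `dbt_docs`. In a real DAG the same edges would be declared between operators, and the scheduler would add retries, scheduling, and monitoring on top.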

🧠 Building a Generative Diffusion Engine from Scratch

Tech Stack: Python, PyTorch, OpenCV, Generative AI, Deep Learning

The Objective: While modern APIs make it easy to generate images with a single line of code, I wanted to deeply understand the fundamental mathematics powering state-of-the-art models like DALL-E and Sora. My goal was to build, debug, and train a Denoising Diffusion Probabilistic Model (DDPM) entirely from scratch using raw PyTorch.

Technical Architecture & Implementation:

  • The U-Net Bottleneck: I built a custom YOLO-style U-Net equipped with skip connections to preserve spatial integrity. To allow the network to track its exact position in the 1000-step denoising process, I engineered and injected Sinusoidal Position Embeddings directly into the bottleneck.
  • The Math Schedule: I implemented the forward diffusion process (adding noise) and the reverse diffusion process (removing noise) using a linear beta schedule over the 1000 timesteps.
  • Scaling to Production: I scaled the model to train on the full 60,000-image MNIST training set, resizing inputs to 32x32 and adapting the tensor operations accordingly, and increased the batch size for more stable gradients.
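The schedule and timestep embedding described above can be sketched in plain Python, without PyTorch, to make the arithmetic visible. The `beta_start` and `beta_end` values are the commonly used DDPM defaults and are an assumption here, as are the function names and the exact frequency formula of the embedding (one common variant); the repository may differ.

```python
import math


def linear_beta_schedule(timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Evenly spaced noise variances beta_1..beta_T (requires timesteps > 1)."""
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, noise, alpha_bars):
    """Forward process in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [signal * x + sigma * n for x, n in zip(x0, noise)]


def sinusoidal_embedding(t, dim):
    """Encode timestep t as dim values: sines first, then cosines (dim even)."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]
```

With these defaults `alpha_bar` decays from nearly 1 at the first step to nearly 0 at the last, so `q_sample` returns an almost clean image early on and almost pure noise at the end. The embedding gives the network a distinct vector for each timestep.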

View Code


📷 YOLO Object Detection Suite

Tech Stack: YOLOv11, Object Detection, Computer Vision

This project focuses on object detection using YOLOv11 and other YOLO models. It automates the process of identifying objects in images, making it useful for various real-world applications like surveillance, quality control, and smart monitoring.

View Code


🐾 Multi-Class Object & Text Recognition Pipeline

Tech Stack: Object Detection, OCR, Computer Vision, Python

This project detects and classifies humans and animals in images and videos, and can optionally run Optical Character Recognition (OCR) on the same media.

View Code


🚁 Real-Time Drone Tracking System

Tech Stack: YOLO, SSD, Object Tracking, Computer Vision

A system that detects and tracks drones in real time using YOLO and SSD (Single Shot MultiBox Detector) object detection models.


View Code


🔳 Robust QR Code Detection

Tech Stack: OpenCV, QR Detection, Computer Vision, Python

A robust QR code detection and decoding pipeline using OpenCV, designed to handle challenging real-world conditions including noisy, rotated, or low-contrast images.


View Code


🔬 Drone Image Classification Research (CNN vs ResNet50)

Tech Stack: CNN, ResNet50, Deep Learning, Image Classification

This research provides a detailed comparative analysis of two deep learning approaches for drone image classification: standard CNNs and the ResNet50 architecture.


View Code


Portfolio

Check out my live projects →
https://akshaysatyam2.github.io/akshaysatyam2/


Let's Connect

Always open to collaborations, research discussions, or just a quick chat!

"Code. Deploy. Repeat."

Pinned

  1. diffusion-model (Public)

    This repository contains a modular PyTorch implementation of a Denoising Diffusion Probabilistic Model (DDPM) trained on the MNIST dataset. The model generates 32x32 images of hand-drawn digits con…

    Python

  2. Test-Cheating-Detection (Public)

    This project uses AI and computer vision to detect academic dishonesty in exams. With YOLO for object detection and MediaPipe's pose estimation, it tracks head orientation, identifies people, and f…

    Python 1

  3. Traffic-Monitoring (Public)

    This repository contains a project for monitoring road traffic. It should be considered a proof of concept (POC); many optimizations are still needed.

    Jupyter Notebook 8

  4. yolo-for-detection (Public)

    This project focuses on object detection using YOLOv11 or other YOLO models. It automates the process of identifying objects in images, making it useful for various real-world applications like sur…

    Python 1

  5. QRReader (Public)

    A robust QR code detection and decoding pipeline using OpenCV, designed to handle challenging real-world conditions including noisy, rotated, or low-contrast images.

    Python

  6. Deep-ML (Public)

    Documenting my journey solving Deep-ML problems. Contains Python solutions, mathematical reasoning, and algorithm explanations for Machine Learning Engineers.

    Python