leonematt/README.md

Matthew Leone

Website LinkedIn

👨‍💻 About Me

I am an ML Infrastructure Engineer focused on the systems that make AI/ML workloads fast, reliable, and scalable in production, from accelerator kernels and distributed inference platforms to the CI/CD, networking, and cloud infrastructure they run on.

I currently lead infrastructure at Kernelize as a Founding AI/ML Infrastructure and Performance Engineer, where I build heterogeneous hardware tooling, Triton backend infrastructure, and custom PyTorch runtimes. My core strength is ML infrastructure and operations (MLOps), backed by a broad and deep systems background: high-performance networking (DPDK, RDMA, InfiniBand, sub-microsecond latency), Linux/FreeBSD kernel development, hypervisors and virtualization (KVM, Open vSwitch), cloud architecture (AWS), and security engineering. I draw on that background to design ML platforms that hold up and deliver under heavy production demands.

🎖️ Certifications

  • NVIDIA Certified Professional: AI Infrastructure
  • NVIDIA Certified Professional: AI Operations
  • NVIDIA Certified Associate: Generative AI Multimodal
  • NVIDIA Certified Associate: Generative AI LLMs
  • NVIDIA Certified Associate: AI Infrastructure and Operations
  • AWS Certified Solutions Architect Associate
  • AWS Certified SysOps Administrator Associate
  • AWS Certified Cloud Practitioner
  • WorldQuant University Data Science Lab

🎯 Recent Focus (2025–2026)

  • ML Infrastructure: Building the Nexus runtime, Triton-Bench benchmarking platform, and Triton/PyTorch plugin inference stack at Kernelize
  • MLOps Tooling: Accelerator kernel capture/replay pipelines, hardware profiling with Nsight Compute, and roofline/occupancy analysis for Triton kernels
  • Distributed Inference: vLLM, Ray, and custom PyTorch PrivateUse1 backends for heterogeneous hardware
  • Compiler Internals: Triton, PTX, MLIR, and torch.compile/Inductor for cross-accelerator kernel portability
  • CI/CD for ML: Automated build, benchmarking, and artifact pipelines spanning multiple accelerator targets

🚀 Core Expertise

Primary: MLOps & ML Infrastructure

  • Accelerator kernel tooling, Triton backends, and PyTorch custom runtimes
  • Distributed inference platforms (vLLM, Ray) and model serving infrastructure
  • Accelerator profiling, roofline modeling, and kernel-level performance engineering
  • ML-focused CI/CD, artifact management, and benchmarking pipelines

Supporting Depth: Systems, Cloud, and Networking

  • High-Performance Networking: DPDK, VPP, RDMA, InfiniBand, kernel bypass
  • Cloud & Virtualization: AWS, KVM, Open vSwitch, Ceph, Docker, Kubernetes, Terraform
  • Low-Level Systems: Linux/FreeBSD kernel development, device drivers, hypervisors
  • Data Engineering: Pipelines for benchmark data, financial time-series, and fine-tuning datasets
  • Security Engineering: Reverse engineering, vulnerability research, secure virtualization

🔧 Tech Stack

ML Infrastructure     : vLLM, Ray, PyTorch, torch.compile/Inductor, Triton, CUDA, PTX, MLIR
MLOps / CI-CD         : GitHub Actions, Jenkins, Docker, Kubernetes, artifact pipelines
Accelerator Profiling : NVIDIA Nsight, Nsight Compute (ncu), roofline analysis, Flamegraph
Languages             : C/C++, Python, Assembly (x86/ARM), Java, Bash
Cloud & IaC           : AWS, Terraform, Docker, Kubernetes
Virtualization        : KVM, Open vSwitch, Ceph, hypervisor internals
Systems               : Linux, FreeBSD, NVIDIA, DPDK, eBPF
Networking            : RDMA, InfiniBand, TCP/IP, DPDK, VPP
Data Engineering      : Apache Airflow, Spark, Pandas
Security              : IDA Pro, Ghidra, reverse engineering, secure ML pipelines

🏫 Education

  • M.S., Financial Engineering, WorldQuant University
  • B.S., Computer Science, Northern Illinois University

📊 GitHub Stats

Matthew's GitHub Stats Top Languages
