ShreyPatel4/README.md

Hi, I’m Shrey Patel 👋

I build systems that are production-oriented, performance-driven, and reproducible: from low-latency C++ engines and paravirtual drivers to Python ML/automation tooling, tenant-fair LLM inference, and distributed control planes.

  • 🔭 Current work: founding Coconut Labs, covering tenant-fair LLM inference orchestration (kvwarden), observability, and Generative AI engineering
  • 🌱 Interested in: systems, distributed control planes, low-latency trading infra, tooling, and ML for system observability
  • ⚡ Fun: game QA automation, synthetic data & logs, and hardware-accelerated I/O

Recent Highlights: Last 2 Months (Mar–May 2026)

  • Founded Coconut Labs: independent inference research lab at coconutlabs.org. Shipped org profile, landing page (ccocnutlabs-LP), 4-dot brand mark, and contact / newsletter rebrand off the kvwarden domain.
  • Launched kvwarden v0.1.1 → v0.1.3: tenant-fair LLM inference orchestration on a single GPU (no Kubernetes). Renamed from InferGrid and ran a multi-gate fairness ladder on H100 / A100 with vLLM and SGLang covering Llama-3.1-70B (TP=4), Mixtral-8x7B MoE, and mixed prompt-length distributions. Highlights:
    • DRR-priority admission with per-tenant token-bucket rate limiting (closed a 523× starvation gap to baseline).
    • Per-tenant TTFT histograms, fairness Grafana dashboard, opt-in anonymous telemetry receiver on Cloudflare Workers.
    • Streaming router fixes (admission-slot lifetime, real TTFT, max-stream-duration fence), interactive CLI + doctor + man pages.
    • Show HN launch with one-pager, FAQ, pitch, and architecture overview.
  • kvwarden, Gate 3 / T2 (cache-pressure admission): RFC + KV-eviction config + runbook, T2 admission test skeleton, TenantPolicy + tenant_id surface stub, M4 bench harness flags (--prefix-overlap, --bias-flooder-cost), py3.13 CI, and a one-command Docker Compose eval bundle.
  • Minierva-SEPA (private): algorithmic SEPA (Specific Entry Point Analysis) for Indian markets: NIFTY-500 screener with VCP detection, IBD-weighted RS percentile, 7-level exit hierarchy, walk-forward backtesting, TradingView Pine export.
  • mlxd (private): research + strategy docs for tenant-fair LLM inference on Apple Silicon (pre-product).
  • solution_SnowConvertAI_final: SQL Server → Snowflake migration take-home with a verification harness and head-to-head comparison vs. SnowConvert AI.
  • SP-K8s-Control-Panel-Using-Streamlit: Streamlit-based Kubernetes control panel for deployment scaling and pod operations.
  • dream_team: a 31-agent engineering organization running as a daemon, forking OpenClaw as the gateway/UI layer (Slack / Discord / Telegram bridges, web dashboard) with per-agent identity files (SOUL.md, AGENTS.md).
  • risk-hotpath-hft: overlay analytics + reporting with audit/compliance, local-first with optional cloud switches.
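
For readers new to the scheduling idea named in the kvwarden highlights, here is a minimal sketch of deficit round robin (DRR) admission gated by a per-tenant token bucket. It is an illustration only and not kvwarden's code: the names `TokenBucket`, `Tenant`, and `drr_round`, the use of token counts as request cost, and all numeric values are assumptions made for this example.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Hypothetical per-tenant limiter: refills `rate` tokens/sec up to `capacity`."""
    rate: float
    capacity: float
    tokens: float = 0.0
    last: float = 0.0

    def try_consume(self, cost: float, now: float) -> bool:
        # Refill for the elapsed time, capped at capacity, then try to spend.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


@dataclass
class Tenant:
    bucket: TokenBucket
    quantum: int                                  # DRR credit granted per round
    deficit: int = 0                              # unspent credit carried over
    queue: deque = field(default_factory=deque)   # pending request costs


def drr_round(tenants: dict[str, Tenant], now: float) -> list[tuple[str, int]]:
    """Run one DRR round; return the (tenant, cost) pairs admitted."""
    admitted = []
    for name, t in tenants.items():
        if not t.queue:
            t.deficit = 0          # idle tenants do not bank credit
            continue
        t.deficit += t.quantum
        while t.queue and t.queue[0] <= t.deficit:
            cost = t.queue[0]
            if not t.bucket.try_consume(cost, now):
                break              # rate-limited: the request stays queued
            t.queue.popleft()
            t.deficit -= cost
            admitted.append((name, cost))
        if not t.queue:
            t.deficit = 0
    return admitted


# Usage: a tenant with ten queued requests cannot starve a tenant with one.
tenants = {
    "flooder": Tenant(TokenBucket(rate=0, capacity=1000, tokens=1000), quantum=100,
                      queue=deque([100] * 10)),
    "quiet": Tenant(TokenBucket(rate=0, capacity=1000, tokens=1000), quantum=100,
                    queue=deque([100])),
}
print(drr_round(tenants, now=0.0))  # [('flooder', 100), ('quiet', 100)]
```

The per-round quantum bounds how much any one tenant can admit before the others get a turn, while the bucket caps each tenant's sustained rate independently of queue depth. In this sketch a rate-limited tenant keeps accumulating deficit across rounds; a real scheduler would likely cap it.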

Spotlight Projects

(Additional repos: Data-Kitchen, a brewing hot product idea)

Repo of the week

What I built this week

Nothing added to the journal yet.

Latest blog / RSS

RSS not configured.

Now coding

5dd0ed1 - chore: update README (repo-of-week & journal) (by github-actions[bot])

Languages & Tech Highlights

Based on public repositories:

  • Python, Rust, Go, C/C++, GDScript
  • LLM inference & serving: vLLM, SGLang, tenant-fair scheduling (DRR + token-bucket), per-tenant TTFT/Grafana telemetry
  • Systems engineering: low-latency C++ pipelines, Rust drivers, paravirtual devices
  • Automation & tooling: Python agents, synthetic data, QA automation
  • Cloud & infra: Kubernetes, observability, CI/CD, Cloudflare Workers

How to collaborate

Open to collaborations, contract work, and speaking about systems & infra. Best contact: patelshrey77@gmail.com

License

This profile README is available under CC0; reuse as you like.

Pinned

  1. react/react (Public)

    The library for web and native user interfaces.

    JavaScript · 246k stars · 51.1k forks

  2. facebook/infer (Public)

    A static analyzer for Java, C, C++, and Objective-C

    OCaml · 15.7k stars · 2.1k forks

  3. Advanced-Data-Predictive-Analytics (Public)

    Advanced analytics which is used to make predictions about unknown Test-Cases From Test-Data. Predictive analytics uses many techniques from data mining, statistics, modeling, machine learning, and…

    Jupyter Notebook

  4. facebookarchive/react-360 (Public archive)

    Create amazing 360 and VR content using React

    JavaScript · 8.7k stars · 1.2k forks