GPUsion
The GPU Illusion: AI Acceleration for Every Indian
Your laptop thinks it has a GPU now. It doesn't. That's the point.
GPUsion is an open-source Windows driver that makes the operating system and your applications believe a GPU is present, then routes all AI inference workloads to a highly optimized CPU engine underneath.
Every AI app you install (Ollama, Whisper, LM Studio) sees the GPU that Device Manager lists. They run. You pay nothing extra.
No GPU required. No cloud subscription. No configuration hell.
India has 300 million+ laptops in the ₹30,000–₹60,000 range.
None of them have a discrete GPU.
Every serious local AI workload requires one.
| What you want to run | What it needs | What it costs |
|---|---|---|
| Llama 3 8B locally | 8GB VRAM | ₹25,000+ GPU |
| Stable Diffusion | RTX 3060 min | ₹22,000+ GPU |
| Whisper (speech → text) | GPU preferred | 5× slower on CPU |
| Local coding assistant | GPU for real-time | ₹800–2,000/mo cloud |
The barrier isn't intelligence. It's access.
GPUsion removes the barrier.
    ┌─────────────────────────────────────────────────────┐
    │  YOUR AI APP (Ollama / Whisper / LM Studio / etc.)  │
    │  calls standard GPU APIs as normal                  │
    └──────────────────────┬──────────────────────────────┘
                           │ DirectML / Vulkan / OpenCL calls
                           ▼
    ┌─────────────────────────────────────────────────────┐
    │  GPUSION VIRTUAL DRIVER                             │
    │  Windows believes this is a real GPU adapter        │
    │  "GPUsion Virtual Adapter - 8GB VRAM" in Device     │
    │  Manager. DXGI enumeration. WDDM 2.x compliant.     │
    └──────────────────────┬──────────────────────────────┘
                           │ routed inference workloads
                           ▼
    ┌─────────────────────────────────────────────────────┐
    │  INFERENCE ENGINE                                   │
    │  llama.cpp · ONNX Runtime · AVX2/AVX-512 SIMD       │
    │  INT4/INT8 quantization · CPU RAM as VRAM proxy     │
    └─────────────────────────────────────────────────────┘
The illusion is at the driver level. The performance is real.
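As a rough mental model of the three layers in the diagram, the Python sketch below shows an object that reports itself as a GPU adapter while handing every submitted operation to a CPU routine. It is only a conceptual illustration: a real WDDM driver is kernel-mode native code, and `VirtualAdapter`, `submit`, and `cpu_matmul` are hypothetical names invented for this example, not part of GPUsion's code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class VirtualAdapter:
    """Hypothetical stand-in for the adapter the OS would enumerate."""
    name: str = "GPUsion Virtual Adapter"
    vram_gb: int = 8
    # Maps an operation name to the CPU routine that actually executes it.
    cpu_kernels: Dict[str, Callable] = field(default_factory=dict)

    def describe(self) -> str:
        # What an app would read when it asks "which GPU is present?"
        return f"{self.name} ({self.vram_gb}GB VRAM)"

    def submit(self, op: str, *args):
        # Route the "GPU" call to whatever CPU kernel is registered for it.
        if op not in self.cpu_kernels:
            raise NotImplementedError(f"no CPU kernel registered for {op!r}")
        return self.cpu_kernels[op](*args)


def cpu_matmul(a, b):
    """Plain-Python matrix multiply; a real engine would use SIMD kernels."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


adapter = VirtualAdapter(cpu_kernels={"matmul": cpu_matmul})
print(adapter.describe())
print(adapter.submit("matmul", [[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The point of the sketch is the separation of concerns: the application only talks to the adapter's interface, so the work behind `submit` can run on the CPU without the caller changing.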
⚠️ GPUsion is in early development. This is not yet ready for production use.
Star the repo and watch for our first release.
    # Coming in Phase 1, Month 4
    # One-click installer. No command line required.
    # gpusion-setup.exe

For developers who want to build from source:

    git clone https://github.com/gpusion/gpusion-driver
    cd gpusion-driver
    # Enable test signing mode (development only; takes effect after a reboot)
    # Run as Administrator:
    bcdedit /set testsigning on
    # Build (requires Windows Driver Kit)
    ./build.ps1
    # Install driver
    ./install.ps1

| Repo | Description | Status |
|---|---|---|
| gpusion-driver | WDDM virtual GPU kernel driver: the core illusion | 🔨 Active |
| gpusion-inference | CPU inference translation layer (llama.cpp / ONNX) | 🔨 Active |
| gpusion-installer | One-click Windows installer for non-technical users | 📋 Planned |
| gpusion-firmware | FPGA/ASIC firmware for Phase 2 hardware dongle | 📋 Phase 2 |
| gpusion-benchmark | Standardized benchmarks vs. real GPU baselines | 📋 Planned |

| Framework | Priority | Status |
|---|---|---|
| Ollama | Critical | 🔨 In progress |
| llama.cpp | Critical | 🔨 In progress |
| ONNX Runtime DirectML | High | 📋 Planned |
| Whisper.cpp | High | 📋 Planned |
| AUTOMATIC1111 | Phase 2 | 🔮 Future |

- Month 1–2: Driver skeleton (WDDM adapter in Device Manager)
- Month 2–3: Ollama integration (first LLM runs end-to-end)
- Month 3–4: One-click installer (zero config for non-technical users)
- Month 4–5: PUBLIC LAUNCH (GitHub + Hacker News + Reddit)
- Month 5–6: Benchmarks published (honest numbers vs. real GPU)
- Month 6–9: PHASE 2 (FPGA hardware dongle POC)
- Month 9–12: Hardware v1 (GPUsion Stick @ ₹2,499)
- Month 12–24: PHASE 3 (custom ASIC, Made in India)

GPUsion Phase 1 (software) is not a GPU replacement. It is a GPU emulator that makes local AI possible on hardware that previously couldn't run it at all.
| Model | Real RTX 3060 | GPUsion Phase 1 | GPUsion Phase 2 (FPGA) |
|---|---|---|---|
| Llama 3 8B (INT4) | ~50 tok/s | ~6–10 tok/s | ~20–30 tok/s (est.) |
| Whisper Base | Real-time | ~0.5× real-time | ~0.9× real-time (est.) |
| Stable Diffusion | ~15 s/img | Not in Phase 1 | ~60 s/img (est.) |
Slow is better than impossible.
Local is better than cloud.
Free is better than ₹2,000/month.
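
To make the throughput figures in the table concrete, the small sketch below converts tokens per second into wall-clock time for one reply. The 200-token reply length is an assumption chosen for illustration; the throughput ranges are copied from the Llama 3 8B (INT4) row, and the Phase 2 numbers remain estimates.

```python
def seconds_for(tokens: int, tok_per_s: float) -> float:
    """Time to generate `tokens` tokens at a steady `tok_per_s` rate."""
    return tokens / tok_per_s


RESPONSE_TOKENS = 200  # assumed reply length, for illustration only

# (low, high) tokens/second, taken from the Llama 3 8B (INT4) table row.
throughput = {
    "Real RTX 3060": (50, 50),
    "GPUsion Phase 1": (6, 10),
    "GPUsion Phase 2 (FPGA, est.)": (20, 30),
}

for name, (low, high) in throughput.items():
    fastest = seconds_for(RESPONSE_TOKENS, high)
    slowest = seconds_for(RESPONSE_TOKENS, low)
    print(f"{name}: {fastest:.1f}-{slowest:.1f} s per reply")
```

Under these assumptions a 200-token reply takes about 4 seconds on the RTX 3060, roughly 20 to 33 seconds on Phase 1, and roughly 7 to 10 seconds on the estimated Phase 2 hardware.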
GPUsion's driver and inference layer are MIT licensed: completely free, forever.
The software is the proof. The hardware dongle (Phase 2) is the product.
We believe:
- Indian developers deserve infrastructure they can trust, audit, and improve
- Open source creates the community moat that makes this impossible to kill
- A working virtual GPU driver with real traction is the best pitch to any hardware partner
GPUsion is built by one person and one AI.
It needs you to become something bigger.
Where help is most needed right now:
- 🔧 Windows Driver Kit experience: WDDM 2.x kernel driver development
- 🔧 llama.cpp / ONNX Runtime: inference optimization on CPU
- 🧪 Testing: Intel / AMD laptop compatibility across generations
- 📝 Documentation: Hindi + English setup guides for non-technical users
- 🌐 Translations: Hindi, Tamil, Telugu, Bengali READMEs
Bounties available:
- ₹5,000: First working DirectML → CPU passthrough
- ₹5,000: Successful Ollama model run via GPUsion driver
- ₹2,500: Compatibility report for any new CPU/laptop model
- ₹500: Documentation improvements
See CONTRIBUTING.md for details.
For the full architecture, phase specifications, risk analysis, and commercial strategy, see the Product Requirements Document.
Key technical decisions:
Why WDDM and not a user-space shim?
Kernel-mode registration is required for apps that query the DirectX device list at startup. User-space intercepts miss these queries, and the apps refuse to run. WDDM is the only path to a true OS-level illusion.
Why not just optimize CPU inference directly?
We do that too, but without the virtual GPU layer, apps that check for GPU presence at startup simply refuse to launch. The illusion enables the inference, not the other way around.
Why INT4 quantization by default?
A 7B parameter model in FP16 needs ~14GB RAM. In INT4 it needs ~4GB. Most budget Indian laptops have 8GB RAM. INT4 is not a compromise; it is the product.
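
The memory figures above follow from simple arithmetic: parameters times bits per weight, divided by eight. The sketch below works it through for weights only, ignoring the KV cache and runtime overhead. The 4.5 bits-per-weight entry is an assumption that models a block format storing a scale alongside each group of 4-bit values (llama.cpp's Q4_0 is one such format); `weight_memory_gb` is a helper name made up for this example.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the model weights alone, in decimal gigabytes."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


formats = [
    ("FP16", 16),
    ("INT8", 8),
    ("INT4 with per-block scales (assumed 4.5 bits)", 4.5),
    ("INT4, raw", 4),
]

for label, bits in formats:
    print(f"7B in {label}: {weight_memory_gb(7, bits):.1f} GB")
```

Raw INT4 comes to 3.5 GB, and the 4.5-bit variant to about 3.9 GB, which is in line with the ~4GB quoted above and leaves headroom on an 8GB laptop where FP16's 14 GB would not fit at all.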
GPU + Illusion = GPUsion
Inspired by Mohini, the only female avatar of Vishnu, who created a perfect illusion to trick the Asuras out of drinking the Amrit.
GPUsion creates a perfect GPU illusion.
The Asuras are the hardware bottleneck.
The Amrit is local AI.
MIT License. See LICENSE.
The driver is free. The story is Indian. The mission is access.
Built in India 🇮🇳 for India, and everyone else who deserves local AI
If this resonates, star the repo. That's how this story grows.
⭐ Star · 🍴 Fork · 🐛 Issue · 💬 Discuss