TokenSpeed is a speed-of-light LLM inference engine.
Updated May 17, 2026 - Python
Ollama-based benchmark that reports detailed input and output tokens per second. Written in Python, with a DeepSeek R1 example.
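As a rough illustration of what an input/output tokens-per-second figure means, here is a minimal sketch. It is not the listed repository's code. It assumes the timing fields that Ollama's generate endpoint returns in its final response (`prompt_eval_count`, `prompt_eval_duration`, `eval_count`, `eval_duration`, with durations in nanoseconds). The function name `tokens_per_second` and the sample values are hypothetical.

```python
NS_PER_SECOND = 1_000_000_000  # Ollama reports durations in nanoseconds


def tokens_per_second(response: dict) -> dict:
    """Derive input and output token rates from an Ollama-style response dict.

    Hypothetical helper: only the field names follow Ollama's response format.
    """

    def rate(count_key: str, duration_key: str) -> float:
        count = response.get(count_key, 0)
        duration_ns = response.get(duration_key, 0)
        if not duration_ns:
            # Field missing or zero (e.g. a cached prompt): report 0 instead of dividing by zero.
            return 0.0
        return count / (duration_ns / NS_PER_SECOND)

    return {
        "input_tps": rate("prompt_eval_count", "prompt_eval_duration"),
        "output_tps": rate("eval_count", "eval_duration"),
    }


# Made-up sample values, for illustration only (not a measured result).
sample = {
    "prompt_eval_count": 50,
    "prompt_eval_duration": 500_000_000,   # 0.5 s spent processing the prompt
    "eval_count": 120,
    "eval_duration": 4_000_000_000,        # 4 s spent generating output
}
rates = tokens_per_second(sample)
print(f"input: {rates['input_tps']:.1f} tok/s, output: {rates['output_tps']:.1f} tok/s")
```

Keeping the rate calculation separate from the HTTP call means the same function could be applied to saved responses from any model, for example a DeepSeek R1 run.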