Zero-setup ML inference for Flutter using Rust engines (Candle, Linfa)
Updated Jul 12, 2025 - Dart
C++23 inference engine for Qwen3-4B on Apple Silicon. 125.77 tok/s (1.62x llama-cli). OpenAI-compatible streaming server. No Python, no llama.cpp, no Ollama.