A Premium AI Dictation "Dynamic Island" for Windows
Speak naturally, stutter, pause, or mumble. Flow catches your voice, polishes it into perfect text using state-of-the-art LLMs, and types it seamlessly into any application.
- 🏝️ True Dynamic Island UI: A beautiful, top-anchored notch built with PySide6 that expands elastically when you speak, mimicking the premium Apple aesthetic on Windows.
- 🎵 Reactive Neon Waveform: Features a math-driven, glassmorphism audio visualizer. The stems idle like musical notes and pulse with vibrant neon gradients (Cyan, Magenta, Purple) based on your real-time voice volume.
- ⚡ Blazing Fast Transcription: Uses heavily optimized
faster-whisper(CTranslate2) running locally to transcribe audio with near-zero latency. - 🧠 AI Text Polishing: Streams the raw transcription through Groq's
llama-3.3-70b-versatileengine to fix grammar, remove conversational filler, and structure the text professionally. - ⌨️ Universal Auto-Paste: Press the global hotkey (
F9) anywhere, dictate your thought, and Flow will automatically type the polished text directly into your active window.
- Idle State: A tiny, discreet notch at the top of your screen.
- Listening: Press
F9. The island expands smoothly. As you speak, the neon stems react to your voice. - Transcribing: You release
F9. The audio is instantly transcribed locally. - Polishing: The text is sent to Groq for split-second intelligent cleaning.
- Done: The island flashes green, and the polished text is typed wherever your cursor is!
git clone https://github.com/YOUR_USERNAME/flow.git
cd flowEnsure you have Python 3.10+ installed.
python -m venv .venv
.venv\Scripts\activatepip install -r requirements.txtCreate a .env file in the root directory and add your Groq API key:
GROQ_API_KEY=gsk_your_api_key_hereNote: The
.envfile is safely included in.gitignoreto prevent leaking your keys.
python main.py- UI Architecture: Python, PySide6 (Qt for Python)
- Audio Processing: PyAudio, NumPy
- Speech-to-Text: faster-whisper (CTranslate2)
- LLM Engine: Groq API (
llama-3.3-70b-versatile) - Automation: PyAutoGUI, Pyperclip, pynput
Building Flow required solving advanced OS-level rendering issues:
- Preventing GUI Thread Blocking: The entire audio capturing pipeline runs on separate background threads to ensure the UI continues animating at a buttery smooth 33fps without stuttering.
- Bypassing Qt Compositor Bugs: Animating heavy
QGraphicsDropShadowEffectwith translucent backgrounds caused severeQPaintercrashes on Windows NVIDIA drivers. I engineered a strict Z-order architecture, separating the static drop-shadow into an unmoving background layer (ShadowWidget) and calculating all foreground alpha opacities via manual math, achieving 100% stability.
This project is open-source and available under the MIT License.