A local desktop speech-to-text app. Hold a key, speak, release — transcribed on-device and pasted into the active app. No API calls, no cloud.
- Python 3.12+
- uv
- ffmpeg (required by faster-whisper)
- Windows builds: Inno Setup (required to produce the installer)
# macOS
brew install ffmpeg
# Windows
winget install ffmpeg

uv run go.py setup
uv run go.py run

uv run go.py setup # Install dependencies
uv run go.py run # Run the app
uv run go.py dev # Run with auto-restart on file changes
uv run go.py build # Build .app (macOS) or .exe (Windows)
uv run go.py build:onefile # Build single-file executable
uv run go.py lint # Run ruff linter
uv run go.py clean # Clean build artifacts

All configuration is via environment variables:
| Variable | Default | Description |
|---|---|---|
| `WHISPER_MODEL` | `base` | Model size: `tiny`, `base`, `small`, `medium`, `large` |
| `WHISPER_DEVICE` | `auto` | `cpu`, `cuda`, or `auto` |
| `WHISPER_COMPUTE_TYPE` | `int8` | `int8`, `float16`, etc. |
| `WHISPER_LANGUAGE` | auto-detect | Force a language code (e.g. `en`, `hi`) |
| `WHISPER_BEAM_SIZE` | `5` | Decoding beam size |
| `WHISPER_INITIAL_PROMPT` | none | Bias vocabulary or style |
| `WHISPER_AUTO_TRANSLATE_EN` | `1` | Translate all speech to English |
| `WHISPER_AUTO_TYPE` | `1` | Auto-paste transcription into active app |
| `WHISPER_PASTE_MODE` | `keys` | `keys`, `osascript`, or `clipboard` |
| `WHISPER_SEND_ENTER` | `0` | Press Enter after pasting |
| `WHISPER_INPUT_DEVICE` | system default | Device index or name substring |
| `WHISPER_TRIGGER_KEY` | `fn` | `fn`, `left_ctrl` |
| `WHISPER_HIDE_UI` | `0` | Hide the floating bar |
| `WHISPER_VAD_FILTER` | `0` | Enable VAD filtering |
| `WHISPER_MIN_AUDIO_RMS` | `0.003` | Minimum RMS to transcribe |
| `WHISPER_MIN_AUDIO_PEAK` | `0.01` | Minimum peak to transcribe |
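
The sketch below illustrates how settings like these could be read and how the two audio thresholds might be applied. It is not the app's actual code: `Settings`, `load_settings`, and `loud_enough` are hypothetical names, only a subset of the variables is covered, and it assumes that `1` means enabled, that samples are floats in the range -1.0 to 1.0, and that a recording must pass both thresholds.

```python
import math
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the WHISPER_* variables."""
    model: str
    device: str
    language: Optional[str]
    beam_size: int
    auto_type: bool
    min_audio_rms: float
    min_audio_peak: float


def _flag(env: Mapping[str, str], name: str, default: str) -> bool:
    # Assumption: "1" enables a flag; any other value disables it.
    return env.get(name, default).strip() == "1"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    return Settings(
        model=env.get("WHISPER_MODEL", "base"),
        device=env.get("WHISPER_DEVICE", "auto"),
        # Unset or empty is treated as "auto-detect" and stored as None.
        language=env.get("WHISPER_LANGUAGE") or None,
        beam_size=int(env.get("WHISPER_BEAM_SIZE", "5")),
        auto_type=_flag(env, "WHISPER_AUTO_TYPE", "1"),
        min_audio_rms=float(env.get("WHISPER_MIN_AUDIO_RMS", "0.003")),
        min_audio_peak=float(env.get("WHISPER_MIN_AUDIO_PEAK", "0.01")),
    )


def loud_enough(samples: Sequence[float], settings: Settings) -> bool:
    """Return True if the recording passes both the RMS and peak thresholds."""
    # Assumption: samples are floats in [-1.0, 1.0]; both thresholds must be met.
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    return rms >= settings.min_audio_rms and peak >= settings.min_audio_peak
```

Passing a plain dict instead of `os.environ` makes the loader easy to try out: `load_settings({"WHISPER_MODEL": "small"})` overrides only the model and keeps every other default from the table.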
Requested automatically on first use:
- Microphone: to record speech
- Accessibility: to paste via Cmd+V
- Input Monitoring: to detect the Fn key globally