A speech-to-text service that runs on my home server.
- create a .env file (OpenAI-compatible API required):

      LLM_API_KEY=abc123
      LLM_BASE_URL=https://api.example.com/v1/chat/completions
      LLM_MODEL=llama-3.1-8b-instant

- create a dictionary.txt file (words to recognize, one per line):

      Rony
      Parakeet
      Doraemon

- install dependencies:

      uv sync

- start the server:

      uv run uvicorn main:app --host 0.0.0.0 --port 8000 --env-file .env
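
Before starting the server, it can help to check that both files are in place and well formed. The following is a minimal sketch and not part of the project: the helper names parse_env, missing_keys and load_dictionary are hypothetical, the parser only handles plain KEY=VALUE lines, and it assumes all three variables from the .env example are required.

```python
# Assumption: these are the variables the server expects, taken from the .env example.
REQUIRED_KEYS = ("LLM_API_KEY", "LLM_BASE_URL", "LLM_MODEL")


def parse_env(text: str) -> dict[str, str]:
    """Parse plain KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def load_dictionary(text: str) -> list[str]:
    """Return one word per non-empty line, with surrounding whitespace removed."""
    return [line.strip() for line in text.splitlines() if line.strip()]
```

To use it, read the contents of .env and dictionary.txt and pass them to parse_env and load_dictionary; an empty list from missing_keys means every expected variable is set.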

Once the server is running, send an audio file to the /transcribe endpoint:

    curl -X POST "http://localhost:8000/transcribe" \
      -H "Content-Type: multipart/form-data" \
      -F "file=@/path/to/audio.wav"
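
The same request can also be made from Python. This is a sketch using only the standard library. The function names build_multipart and transcribe are hypothetical, and it assumes the endpoint accepts a multipart form field named file, as in the curl command. The response format is not specified here, so the body is returned as raw text.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, url: str = "http://localhost:8000/transcribe") -> str:
    """POST the audio file to the server and return the raw response body as text."""
    audio = Path(path)
    body, content_type = build_multipart("file", audio.name, audio.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling transcribe("/path/to/audio.wav") against a running server returns whatever the endpoint sends back, as a string.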