AI engineer working across speech systems, LLM applications, and the backend infrastructure that makes them usable in production.
I have spent years building ASR and TTS services, callbot and chatbot systems, LangChain and LangGraph agents, RAG pipelines, realtime audio, and practical automation. I care about systems that are fast enough, observable enough, and simple enough to keep running after the demo.
Ho Chi Minh City, Vietnam
GitHub: https://github.com/thanhdat77
- Speech AI: ASR, TTS, VAD pipelines, realtime audio over WebSocket, callbot flows
- LLM applications: LangChain, LangGraph, tool calling, RAG, stateful agents
- Backend systems: FastAPI, async Python, Kafka, Redis, MongoDB, Qdrant, MinIO
- Production workflow: Docker, uv, pytest, ruff, observability, API contracts
- Developer environment: Neovim, tmux, dotfiles, Pi agent workflows, custom keyboards
Event-driven callbot services that connect ASR, NLU/LLM logic, TTS, session state, and business actions.
- WebSocket gateway for realtime audio sessions
- Kafka-based ASR/TTS/event routing
- Redis-backed hot state, cache, and coordination
- Vietnamese text normalization for speech output
- Graph-style call flows with interrupt, fallback, and action handling
Tech: Python, FastAPI, WebSocket, Kafka, Redis, Docker
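To illustrate the graph-style call flow idea (interrupt, fallback, action handling), here is a minimal sketch in plain Python. It is not the production code: the node names, intents, and event strings such as `tts.stop` are hypothetical, and a real service would publish these events to Kafka rather than return them in a list.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    prompt: str                                   # text handed to TTS on entry
    transitions: dict[str, str] = field(default_factory=dict)  # intent -> next node
    action: str | None = None                     # business action fired on entry
    fallback: str | None = None                   # node used for unknown intents


# Hypothetical flow, for illustration only.
FLOW: dict[str, Node] = {
    "greet": Node("Hello, how can I help?",
                  {"check_balance": "balance", "goodbye": "end"},
                  fallback="clarify"),
    "clarify": Node("Sorry, could you repeat that?",
                    {"check_balance": "balance", "goodbye": "end"},
                    fallback="end"),
    "balance": Node("Your balance is being looked up.",
                    {"goodbye": "end"},
                    action="lookup_balance",
                    fallback="clarify"),
    "end": Node("Goodbye."),
}


def step(current: str, intent: str, barge_in: bool = False) -> tuple[str, list[str]]:
    """Advance the call by one user turn and return (next_node, events)."""
    events: list[str] = []
    if barge_in:
        # Interrupt: the caller spoke over the bot, so stop playback first.
        events.append("tts.stop")
    node = FLOW[current]
    # Known intent -> its transition; otherwise the fallback; otherwise stay put.
    next_name = node.transitions.get(intent) or node.fallback or current
    target = FLOW[next_name]
    if target.action:
        events.append(f"action:{target.action}")
    events.append(f"tts.say:{target.prompt}")
    return next_name, events
```

Keeping the flow as data makes each turn a pure function of `(current node, intent, barge_in)`, which is easy to unit test without audio or a broker.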
LangGraph-based chatbot architecture for multi-turn customer workflows, retrieval, state, and backend-facing actions.
- LangGraph agent runtime with conversation state and routing
- RAG over Qdrant with document ingestion and reranking paths
- MongoDB/Beanie for durable data, Redis for hot state and checkpointing
- SSE/streaming API surface for frontend and backend integration
- Provider-wrapped LLMs and embeddings
Tech: Python 3.12, FastAPI, LangGraph, LangChain, Qdrant, MongoDB, Redis
ASR API service for uploaded files, local media, MinIO objects, and raw bytes.
- VAD segmentation before ASR
- Remote ASR worker pool over WebSocket
- Sync and async callback modes
- Redis caching and MinIO integration
- Public and legacy API contracts
Tech: FastAPI, WebSocket, Redis, MinIO, Docker, pytest
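As a rough picture of what "VAD segmentation before ASR" means, the sketch below splits 16-bit mono PCM into speech segments with a simple energy threshold. This is a stand-in technique chosen for brevity, not the VAD model the service uses; the function names, the threshold, and the hangover setting are all assumptions.

```python
import math
import struct


def frame_rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a little-endian 16-bit PCM frame."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def segment_speech(
    pcm: bytes,
    sample_rate: int = 16000,
    frame_ms: int = 30,
    threshold: float = 500.0,   # assumed energy threshold, tune per microphone
    hangover_frames: int = 3,   # silent frames tolerated inside a segment
) -> list[tuple[float, float]]:
    """Return (start_sec, end_sec) spans whose frame energy exceeds the threshold."""
    frame_bytes = sample_rate * frame_ms // 1000 * 2
    n_frames = len(pcm) // frame_bytes
    segments: list[tuple[float, float]] = []
    start: float | None = None
    silence = 0
    for i in range(n_frames):
        frame = pcm[i * frame_bytes : (i + 1) * frame_bytes]
        if frame_rms(frame) >= threshold:
            if start is None:
                start = i * frame_ms / 1000
            silence = 0
        elif start is not None:
            silence += 1
            if silence > hangover_frames:
                # Close the segment at the first silent frame of this run.
                segments.append((start, (i - silence + 1) * frame_ms / 1000))
                start, silence = None, 0
    if start is not None:
        # Audio ended mid-segment: close it, dropping any trailing silence.
        segments.append((start, (n_frames - silence) * frame_ms / 1000))
    return segments
```

Each returned span can then be cut out and sent to an ASR worker on its own, which avoids spending recognition time on silence.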
TTS API and test harnesses for comparing providers, saving generated WAV files, and measuring runtime behavior.
- Queue-based TTS API with callback and polling support
- Provider-agnostic TTS testkit
- Local and remote provider adapters
- Experiments with F5-TTS, VITS, Kokoro, StyleTTS2, OmniVoice/Triton
- GPU-aware benchmarking and Dockerized model runs
Tech: Python, FastAPI, Redis, ONNX Runtime, Triton, CUDA-aware workflows
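The queue-based pattern with callback and polling support can be sketched in-process with `asyncio`. This is an illustration only: `TTSQueue`, its method names, and the job dictionary shape are hypothetical, and the real API presumably keeps jobs in Redis and delivers callbacks over HTTP rather than as Python coroutines.

```python
from __future__ import annotations

import asyncio
import uuid
from collections.abc import Awaitable, Callable

Synth = Callable[[str], Awaitable[bytes]]            # text -> audio bytes
Callback = Callable[[str, bytes], Awaitable[None]]   # (job_id, audio) -> None


class TTSQueue:
    """In-memory job queue: submit text, then poll or receive a callback."""

    def __init__(self, synth: Synth) -> None:
        self._synth = synth
        self._queue: asyncio.Queue = asyncio.Queue()
        self._jobs: dict[str, dict] = {}

    def submit(self, text: str, callback: Callback | None = None) -> str:
        """Enqueue a synthesis job and return its id immediately."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = {"status": "queued", "audio": None}
        self._queue.put_nowait((job_id, text, callback))
        return job_id

    def poll(self, job_id: str) -> dict:
        """Polling mode: return the current job record."""
        return self._jobs[job_id]

    async def worker(self) -> None:
        """Consume jobs forever; run one or more of these as background tasks."""
        while True:
            job_id, text, callback = await self._queue.get()
            job = self._jobs[job_id]
            job["status"] = "running"
            try:
                job["audio"] = await self._synth(text)
                job["status"] = "done"
                if callback is not None:
                    # Callback mode: push the result to the caller.
                    await callback(job_id, job["audio"])
            except Exception as exc:
                job["status"] = "failed"
                job["error"] = str(exc)
            finally:
                self._queue.task_done()

    async def join(self) -> None:
        """Wait until every submitted job has been processed."""
        await self._queue.join()
```

Because `synth` is injected, the same queue can sit in front of a local model or a remote provider adapter, which is the point of keeping the API provider-agnostic.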
Languages
AI and LLM
Backend and Data
Tools
- I tune my workflow around Neovim, terminal tools, and dotfiles.
- I like experimenting with Pi-agent-style automation for day-to-day coding.
- I enjoy custom keyboards, firmware/configs, and efficient layouts.
