A prototype chatbot designed to answer questions by referencing videos from Gary's Economics YouTube channel.
chatbot/
├── src/ # Application source code
│   ├── interfaces/ # Channel integrations
│   │   ├── chatbot.py # CLI chatbot interface
│   │   ├── telegram_bot.py # Telegram bot
│   │   └── discord_bot.py # Discord bot
│   ├── llm/ # LLM client management
│   │   ├── llm_manager.py # Multi-provider Ollama wrapper (priority-based fallback)
│   │   ├── prompt_template.py # RAG prompt builder (reads from prompt_versions)
│   │   └── prompt_versions.py # Versioned prompt texts (v1 to v4+)
│   ├── rag/ # RAG pipeline
│   │   ├── rag_manager.py # LangGraph retrieve → generate graph
│   │   ├── vector_database.py # Chroma vector store factory for query-time access
│   │   ├── video_links.py # YouTube video link generation
│   │   └── langfuse_helpers.py # Langfuse tracing utilities
│   └── config.py # Central configuration (pydantic-settings)
├── content_database/ # Content database (vector DB management, separate from runtime)
│   ├── scripts/ # Import and inspection scripts
│   │   ├── import_documents.py # Import SRT subtitles into Chroma
│   │   ├── srt_splitter.py # SRT chunking with overlap
│   │   ├── vector_database_manager.py # DB init, search, add documents
│   │   ├── collections_viewer.py # DB inspection utility
│   │   └── tests/ # Tests for content database scripts
│   ├── docs/ # Source documents for the knowledge base
│   │   ├── channel_topics.md # Topics covered per video
│   │   ├── gary_bio.md # Gary's biography
│   │   └── video_transcripts/ # Video transcript files (SRT)
│   ├── data/ # Chroma vector database files (content DB)
│   └── config.py # Content database configuration
├── tests/ # pytest test suite (main chatbot)
├── analytics/ # Analytics scripts and data
│   ├── config.py # Analytics configuration
│   ├── db/ # Database setup + import/export pipeline
│   │   ├── export.py # Langfuse trace export
│   │   ├── setup_database.py # SQLite analytics DB setup
│   │   └── trace_importer.py # Import traces from Langfuse export
│   ├── scripts/ # Manual tools and one-off tasks
│   │   ├── ask_questions.py # Batch question runner
│   │   ├── questions_for_testing.py # Test question sets
│   │   ├── test_cloud_limits.py # Cloud provider limit testing
│   │   ├── trace_viewer.py # Trace inspection utility
│   │   └── trace_viewer_old.py # Viewer for pre-2026-03-28 data (analytics_old.db)
│   └── raw_data/ # Exported trace data (JSON/CSV)
├── plan/ # Project plans and reports
│   ├── phase_1/ # Testing Phase 1 reports
│   └── phase_2/ # Testing Phase 2 plans and evaluation
├── pyproject.toml # Project metadata and dependencies
├── learning.md # Developer learning tracker (checked by Claude)
├── TODO.md # Pending tasks and investigations
├── docker-compose.yml # Docker Compose (Telegram + Discord bot services)
├── Dockerfile # Docker image (Python 3.11-slim)
└── .github/workflows/ # CI/CD (Docker build + push to GHCR)
Channels (CLI / Telegram / Discord)
         │
         ▼
┌─────────────────┐
│  RAG Pipeline   │ ← LangGraph (retrieve → generate)
│  (RAG_manager)  │
└────────┬────────┘
         │
    ┌────┴────┐
    │         │
    ▼         ▼
Vector DB   LLM Manager
(Chroma)    (Ollama: priority-based provider fallback)
- Channels: CLI, Telegram bot, Discord bot. Each receives a question and calls the RAG pipeline.
- RAG pipeline: a LangGraph graph that retrieves relevant documents from the vector DB, builds a versioned prompt with context, and calls the LLM.
- Vector database: Chroma with Ollama embeddings. Stores chunked SRT subtitles with video metadata. Content import scripts live in content_database/scripts/; runtime query access is via src/rag/vector_database.py.
- LLM manager: Wraps Ollama clients with priority-based provider fallback. Chat: cloud (qwen3-next:80b) → self-hosted (qwen3:32b) → local (qwen3:4b). Embeddings: self-hosted → local (qwen3-embedding:8b). Provider priority is configured in src/config.py.
- Analytics: SQLite database for traces (questions, answers, latency, models, vector search results). Imported from Langfuse. Scripts in analytics/. See analytics/ANALYTICS_GUIDE.md for details.
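The priority-based provider fallback described above can be illustrated with a minimal, standard-library-only sketch. It is not the code in src/llm/llm_manager.py: the function and class names (ask_with_fallback, AllProvidersFailed, PROVIDERS) and the stub providers are hypothetical, and the real manager wraps Ollama clients rather than plain functions.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the priority list has failed."""


def ask_with_fallback(
    question: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in priority order.

    Returns (provider_name, answer) from the first provider that succeeds.
    """
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(question)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stub providers standing in for real Ollama clients (assumption).
def cloud(question: str) -> str:
    raise ConnectionError("cloud provider unreachable")  # simulated outage


def self_hosted(question: str) -> str:
    return f"[qwen3:32b] answer to: {question}"


def local(question: str) -> str:
    return f"[qwen3:4b] answer to: {question}"


# Highest priority first, mirroring cloud → self-hosted → local.
PROVIDERS = [("cloud", cloud), ("self-hosted", self_hosted), ("local", local)]

if __name__ == "__main__":
    print(ask_with_fallback("What is inflation?", PROVIDERS))
```

With the simulated cloud outage, the call falls through to the self-hosted stub. Keeping the order in a plain list makes the priority easy to change, which matches the idea of configuring it in one place.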
- Python 3.11+, <3.14.1
- pytest for testing
- ruff for linting and formatting
- LangChain + LangGraph for RAG pipeline and LLM integration
- Chroma (chromadb, langchain-chroma) as vector database
- Ollama (langchain-ollama) for LLM and embeddings
- pydantic-settings for typed configuration with automatic .env loading
- Langfuse for LLM observability and tracing
- pysrt for SRT subtitle parsing
- python-dotenv for environment variable loading
- python-telegram-bot and discord.py for bot integrations
- Docker + GitHub Actions for CI/CD
We use Large Language Models (LLMs) for two things: embedding documents into the vector database, and answering user questions. All models run through Ollama.
Install Ollama:
sudo apt install curl
curl -fsSL https://ollama.com/install.sh | sh
Embedding model: used to process subtitles before they are imported into the vector database. This process, called embedding, converts text into numerical vectors that represent concepts; those vectors are later used to find related content when searching.
ollama pull qwen3-embedding:8b
You can use a different embedding model: check the Ollama library for options and change embeddings_model in src/config.py.
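To make the embedding idea concrete, here is a small sketch of how vectors can be compared to find related content. The three-dimensional vectors and chunk texts are invented for illustration (real embeddings have far more dimensions and come from the embedding model), and cosine similarity is used here as one common measure; the distance metric actually used by the project's Chroma collection is not stated in this document.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for embedded subtitle chunks (assumption).
CHUNKS = {
    "wealth inequality and house prices": [0.9, 0.1, 0.0],
    "how interest rates affect inflation": [0.1, 0.9, 0.1],
    "a day in the life of a trader": [0.0, 0.2, 0.9],
}


def most_similar(query_vector: list[float], chunks: dict[str, list[float]]) -> str:
    """Return the chunk text whose vector is closest to the query vector."""
    return max(chunks, key=lambda text: cosine_similarity(query_vector, chunks[text]))


if __name__ == "__main__":
    # A made-up query vector that points mostly along the first dimension.
    print(most_similar([0.8, 0.2, 0.1], CHUNKS))
```

At query time the question is embedded with the same model as the documents, which is why changing the embedding model generally means re-importing the subtitles.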
Chat model: used to answer user questions. The chatbot tries providers in priority order (configured in src/config.py): cloud (qwen3-next:80b) → self-hosted (qwen3:32b) → local (qwen3:4b). To use the local fallback:
ollama pull qwen3:4b
IMPORTANT: The .env file contains secret keys and is not in the repository. Create your own by copying .env.sample:
cp .env.sample .env
Then fill in the values. The available variables are:
Ollama servers
- OLLAMA_LOCAL_HOST_URL: URL of the local Ollama server. Defaults to http://localhost:11434.
- OLLAMA_SELF_HOSTED_URL: URL of the self-hosted Ollama server.
- OLLAMA_CLOUD_URL: URL of the cloud Ollama provider.
- OLLAMA_CLOUD_API_KEY: API key for the cloud Ollama provider.
Backup LLM Providers
- OPENROUTER_API_KEY: API key for OpenRouter, which has a decent free tier and can be used as a backup.
Bot tokens
- TELEGRAM_TOKEN: Token for the Telegram bot. You get this when you create a bot with BotFather.
- DISCORD_TOKEN: Token for the Discord bot. You need a bot installed on a server with permission to read and send messages.
Observability
- LANGFUSE_PUBLIC_KEY: Public key for the Langfuse observability platform.
- LANGFUSE_SECRET_KEY: Secret key for Langfuse.
- LANGFUSE_HOST: Langfuse server URL. Defaults to https://cloud.langfuse.com.
Vector database
- DATABASE_PATH: Path to the Chroma database directory. Defaults to ./content_database/data/chroma_langchain_db.
To run the application you need the vector database with the processed subtitles. You can use a pre-built copy (recommended) or generate your own.
Using a pre-built copy:
- Download chroma.sqlite3 from this repository.
- Place it at content_database/data/chroma_langchain_db/chroma.sqlite3.
Alternatively, you can generate your own database; see Import documents to the database.
All configuration is managed through pydantic-settings classes. Environment variables from .env are loaded automatically: each field maps to an env var of the same name. Fields not set in .env use the defaults defined in the class.
There are three config files:
- src/config.py: Main application settings (LLM providers, provider priority, bot tokens, Langfuse).
- content_database/config.py: Content database settings (chunk size, overlap, batch size, import directory).
- analytics/config.py: Analytics settings (database path, Langfuse keys).
Some variables in .env are shared across config files: the database path, Ollama URLs, API keys, and Langfuse keys are read by more than one config. Other settings like embeddings_model, collection_name, and video_ids_separator are defined in the main config and imported by content_database/config.py so they stay in sync.
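The "env var of the same name, otherwise the class default" behaviour can be mimicked with the standard library alone. The sketch below is only an illustration of that rule: the project itself uses pydantic-settings, this version does not parse a .env file, and the names SettingsSketch, DEFAULTS and load_settings are hypothetical. The three defaults are the ones listed in the variable reference above.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults taken from the variable list in this README.
DEFAULTS = {
    "OLLAMA_LOCAL_HOST_URL": "http://localhost:11434",
    "LANGFUSE_HOST": "https://cloud.langfuse.com",
    "DATABASE_PATH": "./content_database/data/chroma_langchain_db",
}


@dataclass(frozen=True)
class SettingsSketch:
    """Hypothetical stand-in for a pydantic-settings class."""

    ollama_local_host_url: str
    langfuse_host: str
    database_path: str


def load_settings(environ: Mapping[str, str] = os.environ) -> SettingsSketch:
    """Use the environment value when present, otherwise fall back to the default."""
    values = {name: environ.get(name, default) for name, default in DEFAULTS.items()}
    return SettingsSketch(
        ollama_local_host_url=values["OLLAMA_LOCAL_HOST_URL"],
        langfuse_host=values["LANGFUSE_HOST"],
        database_path=values["DATABASE_PATH"],
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment as a parameter keeps the function easy to test, for example load_settings({"DATABASE_PATH": "/tmp/db"}) overrides only the database path.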
docker compose build
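docker compose up
Or, to run in the background with automatic restart:
docker compose up -d
To install locally instead of using Docker, create a virtual environment and install the project:
python3 -m venv .venv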
source .venv/bin/activate
pip install --upgrade pip
pip install .
This reads pyproject.toml and installs all dependencies. To also install development tools (ruff, pytest):
pip install ".[dev]"
Activate the virtual environment first (not needed with Docker):
source .venv/bin/activate
The chatbot uses video subtitles (SRT format) as its knowledge base. Before the chatbot can use them, they must be imported into the vector database.
Place the SRT files you want to import in the content_database/docs/video_transcripts/ folder, then run:
python -m content_database.scripts.import_documents
If documents have already been imported, the script will ask if you want to delete the existing collection first. Answer "yes" only if you want to start from scratch.
Note: All documents in the folder will be imported one after the other. This can take a while, so try with just one subtitle first to get an idea of how long it takes.
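During import the subtitles are split into chunks with overlap (chunk size and overlap are settings in content_database/config.py). The sketch below shows the general idea on a list of subtitle lines. It is not the code in srt_splitter.py: the function name chunk_with_overlap is hypothetical, and measuring chunk_size and overlap in number of lines is an assumption made to keep the example short.

```python
def chunk_with_overlap(items: list[str], chunk_size: int, overlap: int) -> list[list[str]]:
    """Split items into consecutive chunks that share `overlap` items with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window moves each time
    chunks: list[list[str]] = []
    for start in range(0, len(items), step):
        chunks.append(items[start:start + chunk_size])
        if start + chunk_size >= len(items):
            break  # the last window already reached the end of the input
    return chunks


if __name__ == "__main__":
    subtitle_lines = ["line 1", "line 2", "line 3", "line 4", "line 5"]
    for chunk in chunk_with_overlap(subtitle_lines, chunk_size=3, overlap=1):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.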
python -m interfaces.chatbot
The chatbot will greet you and ask for a question. It searches the vector database for relevant context and uses it to generate an answer. Answers are better when the uploaded subtitles cover the topic asked about.
- Open the Telegram app (or web version).
- Start a conversation with @BotFather (look for the blue checkmark).
- Send /newbot and follow the prompts to choose a name and username.
- BotFather will give you a URL to access your bot and a token.
- Copy the token to TELEGRAM_TOKEN in your .env file.
More information: Telegram Bot Tutorial.
python -m interfaces.telegram_bot
- Add the Discord token to your .env file as DISCORD_TOKEN.
- Set the channel name as DISCORD_CHANNEL in .env (or change the default in src/config.py).
- Launch the bot:
python -m interfaces.discord_bot
The bot supports two modes of interaction:
- In channels: mention the bot with @botname followed by your question.
- In DMs: send a message directly to the bot; no mention is needed.
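The two interaction modes amount to a small decision rule, sketched here without the Discord library. The IncomingMessage class, its fields and the extract_question function are hypothetical; the actual bot in discord_bot.py works with discord.py message objects and may handle mentions differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncomingMessage:
    """Hypothetical, simplified view of a chat message."""

    text: str
    is_direct_message: bool
    mentions_bot: bool
    bot_mention: str = "@botname"


def extract_question(message: IncomingMessage) -> Optional[str]:
    """Return the question to answer, or None if the bot should stay silent."""
    if message.is_direct_message:
        # In DMs every non-empty message is treated as a question.
        return message.text.strip() or None
    if message.mentions_bot:
        # In channels, answer only when mentioned, and drop the mention itself.
        question = message.text.replace(message.bot_mention, "").strip()
        return question or None
    return None


if __name__ == "__main__":
    print(extract_question(IncomingMessage("@botname What is QE?", False, True)))
```

Returning None for unaddressed channel messages keeps the bot from replying to every message in a busy channel.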
pytest # main chatbot tests
pytest content_database/scripts/tests/ # content database tests
We are rolling out the chatbot in phases. See the plan/ folder for details.