A fully client-side Retrieval-Augmented Generation (RAG) chatbot built on the Anthropic Claude API. Upload a document, watch it get chunked and embedded in real time, then ask questions — the app retrieves the most relevant sections and uses them to ground every answer.
| Step | What happens |
|---|---|
| 1. Ingest | User uploads a .txt or .md file (or loads the sample doc) |
| 2. Chunk | Document split into paragraph-aware segments |
| 3. Embed | Each chunk encoded as a TF-IDF vector (production: swap for OpenAI/Voyage embeddings) |
| 4. Retrieve | Query embedded → cosine similarity search → top-k chunks returned |
| 5. Augment | Retrieved chunks injected into Claude's system prompt as grounding context |
| 6. Generate | Claude generates a factually grounded answer, citing source chunks |
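The chunk and embed steps above can be sketched in plain JavaScript. This is an illustrative sketch, not the app's actual code — function names like `chunkDocument` and `embedChunks` are hypothetical:

```javascript
// 1. Chunk: split on blank lines so paragraphs stay intact,
//    merging short paragraphs up to a size budget.
function chunkDocument(text, maxChars = 500) {
  const paragraphs = text.split(/\n\s*\n/).map(p => p.trim()).filter(Boolean);
  const chunks = [];
  let current = "";
  for (const p of paragraphs) {
    if (current && current.length + p.length > maxChars) {
      chunks.push(current);
      current = p;
    } else {
      current = current ? current + "\n\n" + p : p;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// 2. Embed: each chunk becomes a sparse TF-IDF vector (a Map of term → weight).
function tokenize(s) {
  return s.toLowerCase().match(/[a-z0-9]+/g) || [];
}

function embedChunks(chunks) {
  const tokenLists = chunks.map(tokenize);
  const docFreq = new Map();
  for (const tokens of tokenLists) {
    for (const t of new Set(tokens)) docFreq.set(t, (docFreq.get(t) || 0) + 1);
  }
  const N = chunks.length;
  return tokenLists.map(tokens => {
    const tf = new Map();
    for (const t of tokens) tf.set(t, (tf.get(t) || 0) + 1);
    const vec = new Map();
    for (const [t, f] of tf) {
      // term frequency × inverse document frequency
      vec.set(t, (f / tokens.length) * Math.log(N / docFreq.get(t)));
    }
    return vec;
  });
}
```

A term appearing in every chunk gets weight 0 (log of 1), which is why TF-IDF naturally discounts filler words.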
- Full RAG pipeline visualization — watch each step light up in real time
- Semantic chunk retrieval — cosine similarity over TF-IDF vectors
- Source attribution — every answer cites which chunks grounded it
- Chunk browser — sidebar shows all indexed chunks; active ones highlight on query
- Vercel serverless proxy — API key stays secure on the server, never exposed to the client
- Drag & drop file upload — supports `.txt`, `.md`, `.csv`
- Dark mode — respects system preference
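The "cosine similarity over TF-IDF vectors" retrieval works on sparse term → weight maps. A minimal sketch (identifiers here are illustrative, not the app's actual names):

```javascript
// Cosine similarity between two sparse vectors stored as Map(term → weight).
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (const [t, w] of a) {
    na += w * w;
    if (b.has(t)) dot += w * b.get(t);
  }
  for (const w of b.values()) nb += w * w;
  return na && nb ? dot / (Math.sqrt(na) * Math.sqrt(nb)) : 0;
}

// Linear scan over all chunk vectors, returning the k best matches
// with their indices so answers can cite source chunks.
function topK(queryVec, chunkVecs, k = 3) {
  return chunkVecs
    .map((vec, i) => ({ i, score: cosine(queryVec, vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The linear scan is O(chunks × terms), which is fine for a single in-browser document; the production table below lists ANN indexes for larger corpora.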
Visit the live demo — no setup needed.
To run locally:

    git clone https://github.com/Shaonlib/rag-chatbot.git
    cd rag-chatbot
    python3 -m http.server 8080

Then open http://localhost:8080.
| Component | This project | Production |
|---|---|---|
| Embeddings | TF-IDF vectors | OpenAI text-embedding-3-small or Voyage AI |
| Vector store | In-memory JS array | Pinecone, Weaviate, or ChromaDB |
| Search | Linear cosine scan | HNSW approximate nearest-neighbor |
| Backend | Vercel serverless | FastAPI or Node.js |
- Claude (`claude-sonnet-4-20250514`) via the Anthropic Messages API
- Vanilla HTML / CSS / JavaScript — zero frontend dependencies
- Vercel serverless function as API proxy
MIT