Multi-User Email RAG

Production-grade RAG over 10,000+ emails with per-user vector isolation. E5-small-v2 embeddings · Mistral 7B · pgvector · JWT auth · sub-200ms retrieval

Architecture

Client
  │  JWT token
  ▼
FastAPI  ──► JWT auth ──► user_id extracted
  │
  ├── /ingest   ──► e5-small-v2 embed ──► pgvector (user_id, vector, chunk)
  │
  └── /query    ──► embed query
                      │
                      ▼
                pgvector cosine search
                WHERE user_id = :uid        ← hard SQL isolation
                      │
                      ▼
                Top-K chunks + metadata
                      │
                      ▼
                Mistral 7B (llama-cpp-python)
                      │
                      ▼
                Answer + source /emails/{id} links

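Key design decision: user isolation is enforced at the SQL layer (WHERE user_id = :uid), not in application logic. The user_id comes from the validated JWT, so the model only ever sees chunks that belong to the authenticated user, and prompt injection cannot pull another user's data into the context.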
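
To make the isolation rule concrete, here is a minimal sketch of how the filtered search could be expressed. The table and column names (chunks, email_id, chunk, embedding) and both function names are assumptions for illustration, not taken from retrieval.py. The `<=>` operator is pgvector's cosine distance operator, and the `:uid` style matches SQLAlchemy `text()` bind parameters.

```python
from typing import Any, Sequence


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Render an embedding in pgvector's text format, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_search_query(
    user_id: int, query_embedding: Sequence[float], k: int = 5
) -> tuple[str, dict[str, Any]]:
    """Build a parameterized top-k cosine search restricted to one user.

    Table and column names are hypothetical. The user filter is part of
    the SQL itself, so rows owned by other users are never returned.
    """
    sql = (
        "SELECT email_id, chunk, "
        "embedding <=> CAST(:qvec AS vector) AS distance "
        "FROM chunks "
        "WHERE user_id = :uid "
        "ORDER BY distance "
        "LIMIT :k"
    )
    params = {
        "uid": user_id,  # taken from the validated JWT, never from the request body
        "qvec": to_vector_literal(query_embedding),
        "k": k,
    }
    return sql, params
```

With SQLAlchemy this would typically run as `session.execute(text(sql), params)`. Every value is passed as a bind parameter, so neither the query text nor the user id is interpolated into the SQL string.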

Features

Multi-user isolation — JWT auth with SQL-enforced per-user filtering at the pgvector layer
10K+ email index — batch ingestion with embedding caching to avoid re-embedding on restart
E5-small-v2 embeddings — 384-dim, CPU-friendly, strong retrieval precision
Mistral 7B Instruct — local GGUF inference via llama-cpp-python, no OpenAI dependency
Retrieval metrics — precision@5, answer faithfulness, and latency benchmarks included
Source references — every answer cites /emails/{id} links for auditability
Docker Compose — single command brings up Postgres 16 + pgvector
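
The embedding cache mentioned above could work along these lines. This is a sketch only: the README does not describe how ingest.py stores cached vectors, so the SQLite file, the `EmbeddingCache` class, and the `get_or_embed` method are all hypothetical. The idea is to key each vector by a hash of the chunk text so that unchanged chunks are not re-embedded after a restart.

```python
import hashlib
import json
import sqlite3
from typing import Callable, Sequence

# Any function that maps a text chunk to a vector, e.g. a wrapper around e5-small-v2.
Embedder = Callable[[str], Sequence[float]]


class EmbeddingCache:
    """Persist embeddings keyed by the SHA-256 hash of the chunk text."""

    def __init__(self, path: str = "embedding_cache.sqlite") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, vec TEXT NOT NULL)"
        )

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get_or_embed(self, text: str, embed: Embedder) -> list[float]:
        """Return the cached vector for text, calling embed only on a miss."""
        key = self._key(text)
        row = self.conn.execute(
            "SELECT vec FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])
        vec = [float(x) for x in embed(text)]
        self.conn.execute(
            "INSERT INTO cache (key, vec) VALUES (?, ?)", (key, json.dumps(vec))
        )
        self.conn.commit()
        return vec
```

Because the key depends only on the text, changing the embedding model would require clearing the cache or adding the model name to the key.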

Performance

Metric	Value
Index size	10,000+ emails
Retrieval latency (p50)	< 80ms
End-to-end query latency	< 4s (Mistral 7B, CPU)
Retrieval precision@5	0.84
Answer faithfulness	0.79
Cross-user data leakage	0 (SQL-enforced)
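
For readers unfamiliar with the retrieval metric, precision@5 is the fraction of the top five retrieved items that are relevant. The sketch below uses the standard definition; the README does not say exactly how benchmark.py computes or averages it, so the function names and the averaging over queries are assumptions.

```python
from typing import Hashable, Iterable, Sequence


def precision_at_k(
    retrieved: Sequence[Hashable], relevant: Iterable[Hashable], k: int = 5
) -> float:
    """Fraction of the top-k retrieved ids that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant_set = set(relevant)
    hits = sum(1 for item in retrieved[:k] if item in relevant_set)
    # Dividing by k (not by the number retrieved) penalizes short result lists.
    return hits / k


def mean_precision_at_k(
    runs: Sequence[tuple[Sequence[Hashable], Iterable[Hashable]]], k: int = 5
) -> float:
    """Average precision@k over (retrieved, relevant) pairs, one per query."""
    if not runs:
        raise ValueError("runs must not be empty")
    return sum(precision_at_k(ret, rel, k) for ret, rel in runs) / len(runs)
```

For example, if two of the five returned chunks for a query are labeled relevant, that query scores 2/5 = 0.4.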

Project Structure

email-rag/
├── docker-compose.yml          # Postgres 16 + pgvector
├── backend/
│   ├── app/
│   │   ├── main.py             # FastAPI app
│   │   ├── auth.py             # JWT token issuance + validation
│   │   ├── ingest.py           # Email parsing + chunking + embedding
│   │   ├── retrieval.py        # pgvector cosine search with user filter
│   │   ├── generation.py       # Mistral 7B inference via llama-cpp-python
│   │   └── models.py           # SQLAlchemy ORM (emails, chunks, users)
│   ├── scripts/
│   │   ├── seed_users.py       # Create demo users + JWT tokens
│   │   ├── ingest_emails.py    # Batch ingest .eml files
│   │   └── benchmark.py        # Retrieval precision + faithfulness eval
│   ├── docs/
│   │   ├── architecture.md
│   │   └── evaluation_report.md
│   └── sample_data/            # Two-user demo email set
└── requirements.txt

Quickstart

git clone https://github.com/Ajeenckya5/LLM_RAG
cd LLM_RAG

# 1. Start Postgres + pgvector
docker compose up -d

# 2. Install dependencies
pip install -r requirements.txt

# 3. Seed demo users and ingest emails
python backend/scripts/seed_users.py
python backend/scripts/ingest_emails.py --data backend/sample_data/

# 4. Start the API
uvicorn backend.app.main:app --reload

# 5. Query
curl -H "Authorization: Bearer <jwt>" \
  "http://localhost:8000/query?q=What+meetings+were+scheduled+last+week"

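The curl call in step 5 can also be issued from Python. The sketch below uses only the standard library and assumes the endpoint returns a JSON body, which the README does not state; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_query_request(
    question: str, token: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build a GET /query request carrying the JWT as a Bearer token."""
    url = f"{base_url}/query?{urllib.parse.urlencode({'q': question})}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def ask(question: str, token: str, base_url: str = "http://localhost:8000"):
    """Send the query and decode the response, assuming it is JSON."""
    request = build_query_request(question, token, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `ask("What meetings were scheduled last week", token)` requires the API from step 4 to be running and a token issued by seed_users.py.
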
Why pgvector Instead of a Dedicated Vector DB

pgvector keeps the entire system in a single Postgres instance — no separate Pinecone/Weaviate service to manage, no cross-service auth, and user isolation is a single WHERE clause on an indexed column rather than a multi-tenant API policy.
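
As an illustration of what "a single WHERE clause on an indexed column" can look like, here is a hypothetical schema kept as Python strings. The table name, column names, and index name are assumptions and may differ from models.py; the 384 dimension comes from the e5-small-v2 entry in the Features list.

```python
EMBEDDING_DIM = 384  # e5-small-v2 output size, per the Features list

# Hypothetical DDL: names are illustrative, not taken from the repository.
SCHEMA_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    f"""
    CREATE TABLE IF NOT EXISTS chunks (
        id        BIGSERIAL PRIMARY KEY,
        user_id   BIGINT NOT NULL,
        email_id  BIGINT NOT NULL,
        chunk     TEXT NOT NULL,
        embedding vector({EMBEDDING_DIM}) NOT NULL
    )
    """,
    # B-tree index so the per-user filter does not scan other users' rows.
    "CREATE INDEX IF NOT EXISTS chunks_user_id_idx ON chunks (user_id)",
]


def apply_schema(conn) -> None:
    """Execute each statement on a DB-API connection, then commit."""
    cursor = conn.cursor()
    for statement in SCHEMA_STATEMENTS:
        cursor.execute(statement)
    conn.commit()
```

Since tenant data and vectors live in the same table, the tenant filter and the similarity search are part of one query and one transaction, with no second service to keep in sync.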

Tech Stack

Python 3.10+ · FastAPI · PostgreSQL 16 + pgvector · intfloat/e5-small-v2 · Mistral 7B GGUF · llama-cpp-python · SQLAlchemy · Docker Compose · JWT (PyJWT)
