Build a mental model of any codebase in minutes.
Go from zero to productive on any codebase in minutes - CodeAtlas autonomously explores, maps, and explains the full architecture.
- Autonomous Exploration - LLM-driven agent generates hypotheses, selects tools, executes them in parallel, and iterates until it understands the codebase.
- BFS + Beam Search Hybrid - Broad exploration at each depth level with beam pruning that keeps only the top-K scoring branches, focusing compute on the most promising paths.
- Code Graph Backend - Every symbol, call, import, and inheritance edge mapped via graphify.
- Traced Citations - Every claim links back to specific files and lines (file.py:42) for verification.
- Architecture Diagrams - Auto-generated Mermaid call graphs and data flow diagrams.
- Real-Time Streaming - WebSocket-based so you see progress as it happens.
CodeAtlas first builds a code graph of the repository with graphify. The graph is derived from the AST, with a node for every symbol and edges for calls, imports, and inheritance. It then drives a LangGraph-powered Tree-of-Thought workflow that explores the codebase, reasons about its architecture, and synthesizes the findings into a coherent mental model with call graphs, data flow diagrams, and cited evidence.
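To make the idea of an AST-derived code graph concrete, here is a minimal sketch that extracts import, inheritance, and call edges from Python source using only the standard-library `ast` module. It is not graphify's API or output format; the function name `extract_edges` and the triple layout are assumptions made for illustration, and it only resolves calls to plain names.

```python
import ast


def extract_edges(source: str, module: str) -> set[tuple[str, str, str]]:
    """Return (source_symbol, edge_kind, target) triples for one module.

    Illustrative only: the edge format is hypothetical, not graphify's.
    """
    edges: set[tuple[str, str, str]] = set()
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                edges.add((module, "imports", alias.name))
        elif isinstance(node, ast.ImportFrom) and node.module:
            edges.add((module, "imports", node.module))
        elif isinstance(node, ast.ClassDef):
            # Only simple base names (class A(B)) are handled here.
            for base in node.bases:
                if isinstance(base, ast.Name):
                    edges.add((node.name, "inherits", base.id))
        elif isinstance(node, ast.FunctionDef):
            # Attribute calls such as os.getcwd() are skipped in this sketch.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.add((node.name, "calls", inner.func.id))
    return edges


SAMPLE = '''
import os
from pathlib import Path

class Base:
    pass

class Loader(Base):
    pass

def helper():
    return Path(os.getcwd())

def main():
    return helper()
'''

edges = extract_edges(SAMPLE, "sample")
for edge in sorted(edges):
    print(edge)
```

A real code graph would also need cross-file symbol resolution and support for other languages, which this sketch leaves out.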
The workflow is a BFS + Beam Search hybrid - BFS controls depth while beam search prunes low-value branches at each level, keeping only the top-K scoring thoughts.
generate_thoughts → execute_batch → evaluate_batch → beam_prune_expand ──→ synthesize
                                                            │
                                                            ├──→ execute_batch (normal loop)
                                                            └──→ generate_thoughts (re-generate when beam empty)
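The three outgoing edges of beam_prune_expand can be read as a routing decision. The sketch below is one plausible way to express it as a plain function, without depending on LangGraph. The `BeamState` fields, the meaning of "ready", and the rule that hitting the depth limit goes to synthesis are assumptions for illustration; only the node names, the empty-beam re-generation, and the 70% early exit come from the description.

```python
from dataclasses import dataclass

READY_FRACTION = 0.7  # early-exit threshold from the description


@dataclass
class BeamState:
    """Hypothetical summary of the workflow state after pruning."""
    depth: int        # current BFS depth
    max_depth: int    # BFS depth limit
    beam_size: int    # thoughts that survived pruning
    ready_count: int  # survivors judged ready for synthesis (assumed field)


def route_after_beam(state: BeamState) -> str:
    """Pick the next node name after beam_prune_expand."""
    if state.beam_size == 0:
        # Beam emptied: ask for fresh angles unless the depth budget is spent.
        if state.depth < state.max_depth:
            return "generate_thoughts"
        return "synthesize"
    if state.ready_count / state.beam_size >= READY_FRACTION:
        return "synthesize"  # early exit
    if state.depth >= state.max_depth:
        return "synthesize"  # assumed: depth limit ends exploration
    return "execute_batch"   # normal loop


print(route_after_beam(BeamState(depth=1, max_depth=3, beam_size=5, ready_count=1)))
```

In a LangGraph state machine, a function of this shape would typically be attached as a conditional edge, with each returned string mapped to the matching node.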
| Node | Description |
|---|---|
| generate_thoughts | LLM proposes 2–3 hypotheses with tool selections. On re-generation, avoids previously explored angles. |
| execute_batch | Runs each pending thought's tool against the repo in parallel (ThreadPoolExecutor). Collects outcomes and file paths. |
| evaluate_batch | Hybrid scorer: LLM evaluates relevance + evidence strength, computes source diversity from unique files touched. All evaluations run in parallel. |
| beam_prune_expand | Beam step - drops scores < 0.4, keeps top-K (keep_top_k), generates child thoughts (up to max_children). If beam empties before max_depth, routes back for fresh angles. Early-exits to synthesis when ≥70% of beam candidates are ready. |
| synthesize | Collects evidence from best branches, generates a Mermaid architecture diagram, produces a final answer with numbered citations (file:line), rejected-branch summary, and uncertainties. |
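The scoring and pruning steps in the table can be sketched as follows. The 0.4 cutoff and the top-K selection come from the table; the `Thought` class, the weights in `hybrid_score`, and the `saturation` parameter for source diversity are assumptions, since the actual formula is not documented here.

```python
from dataclasses import dataclass, field

MIN_SCORE = 0.4  # thoughts scoring below this are dropped


@dataclass
class Thought:
    """Hypothetical container for one evaluated hypothesis."""
    hypothesis: str
    relevance: float  # LLM-assigned, 0..1
    evidence: float   # LLM-assigned, 0..1
    files: list[str] = field(default_factory=list)


def source_diversity(files: list[str], saturation: int = 5) -> float:
    """Fraction of unique files touched, capped at 1.0 (assumed formula)."""
    return min(len(set(files)) / saturation, 1.0)


def hybrid_score(t: Thought, weights: tuple[float, float, float] = (0.4, 0.4, 0.2)) -> float:
    """Weighted blend of LLM scores and source diversity (weights are assumed)."""
    w_rel, w_ev, w_div = weights
    return w_rel * t.relevance + w_ev * t.evidence + w_div * source_diversity(t.files)


def beam_prune(thoughts: list[Thought], keep_top_k: int = 5) -> list[Thought]:
    """Drop low scorers, then keep the keep_top_k best, highest score first."""
    scored = [(hybrid_score(t), t) for t in thoughts]
    survivors = [pair for pair in scored if pair[0] >= MIN_SCORE]
    survivors.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in survivors[:keep_top_k]]


thoughts = [
    Thought("entry point is main.py", 0.9, 0.8, ["main.py", "app.py"]),
    Thought("uses plugin registry", 0.6, 0.5, ["plugins.py"]),
    Thought("has a GUI layer", 0.2, 0.1, []),
]
beam = beam_prune(thoughts, keep_top_k=2)
print([t.hypothesis for t in beam])
```

Here the third thought falls below the cutoff and is pruned regardless of the beam width, while the other two are kept in score order.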
The state machine is built with LangGraph. The stack: FastAPI (backend), Next.js (frontend), GitHub Models (LLM, free tier).
# 1. Install dependencies
make setup # macOS / Linux
.\dev.ps1 setup # Windows
# 2. Configure your API key
cp apps/api/.env.example apps/api/.env
# Edit .env → set GITHUB_TOKEN=ghp_your_token_here
# 3. Run (API + Web in parallel)
make dev # macOS / Linux
.\dev.ps1 dev # Windows

- API: http://localhost:8000
- Web: http://localhost:3000
# Run tests
make test # macOS / Linux
.\dev.ps1 test # Windows

| Variable | Default | Description |
|---|---|---|
| GITHUB_TOKEN | - | GitHub PAT for API access |
| GENERATION_LLM_MODEL | gpt-4o-mini | Model for hypothesis generation |
| EVALUATION_LLM_MODEL | gpt-4o-mini | Model for scoring |
| SYNTHESIS_LLM_MODEL | gpt-4o-mini | Model for final synthesis |
| MAX_DEPTH | 3 | BFS depth limit |
| MAX_CHILDREN | 2 | Max child thoughts per parent |
| KEEP_TOP_K | 5 | Beam width |
| EXECUTION_WORKERS | 4 | Parallel tool call workers |
| EVALUATION_WORKERS | 2 | Parallel LLM evaluation workers |
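As an illustration of how these variables and defaults fit together, the sketch below reads them from a mapping such as `os.environ`. The `Settings` class and `load_settings` function are hypothetical names, and how the project actually loads `apps/api/.env` is not shown here; only the variable names and default values are taken from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed view of the configuration table."""
    github_token: Optional[str]
    generation_llm_model: str
    evaluation_llm_model: str
    synthesis_llm_model: str
    max_depth: int
    max_children: int
    keep_top_k: int
    execution_workers: int
    evaluation_workers: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        github_token=env.get("GITHUB_TOKEN"),  # no default: must be supplied
        generation_llm_model=env.get("GENERATION_LLM_MODEL", "gpt-4o-mini"),
        evaluation_llm_model=env.get("EVALUATION_LLM_MODEL", "gpt-4o-mini"),
        synthesis_llm_model=env.get("SYNTHESIS_LLM_MODEL", "gpt-4o-mini"),
        max_depth=int(env.get("MAX_DEPTH", "3")),
        max_children=int(env.get("MAX_CHILDREN", "2")),
        keep_top_k=int(env.get("KEEP_TOP_K", "5")),
        execution_workers=int(env.get("EXECUTION_WORKERS", "4")),
        evaluation_workers=int(env.get("EVALUATION_WORKERS", "2")),
    )


defaults = load_settings({})
wide_beam = load_settings({"KEEP_TOP_K": "8", "MAX_DEPTH": "4"})
print(defaults.keep_top_k, wide_beam.keep_top_k)
```

Raising KEEP_TOP_K or MAX_DEPTH widens or deepens the search, so more branches are executed and evaluated per run.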
See ROADMAP.md for planned work including CLI, BYOK (bring your own LLM key), Slack/Discord bots, VS Code extension, GitHub Action, and more.