CodeAtlas

Build a mental model of any codebase in minutes.

Go from zero to productive on an unfamiliar codebase: CodeAtlas autonomously explores, maps, and explains its full architecture.

Features

Autonomous Exploration - LLM-driven agent generates hypotheses, selects tools, executes them in parallel, and iterates until it understands the codebase.
BFS + Beam Search Hybrid - Broad exploration at each depth level with beam pruning that keeps only the top-K scoring branches, focusing compute on the most promising paths.
Code Graph Backend - Every symbol, call, import, and inheritance edge mapped via graphify.
Traced Citations - Every claim links back to specific files and lines (file.py:42) for verification.
Architecture Diagrams - Auto-generated Mermaid call graphs and data flow diagrams.
Real-Time Streaming - WebSocket-based so you see progress as it happens.
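
To illustrate the `file.py:42` citation format, here is a minimal sketch of how such citations could be pulled out of an answer and checked against a checkout. The names `extract_citations` and `citation_resolves` and the regular expression are assumptions made for this example, not CodeAtlas internals.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches "path/to/file.ext:LINE". Illustrative only: it also matches
# strings such as "127.0.0.1:8000", so real input may need stricter rules.
CITATION = re.compile(r"(?P<path>[\w./-]+\.\w+):(?P<line>\d+)")


def extract_citations(text: str) -> List[Tuple[str, int]]:
    """Return (path, line) pairs for every citation found in the text."""
    return [(m["path"], int(m["line"])) for m in CITATION.finditer(text)]


def citation_resolves(repo_root: Path, path: str, line: int) -> bool:
    """True if the cited file exists under repo_root and contains that line."""
    target = repo_root / path
    if not target.is_file():
        return False
    with target.open(encoding="utf-8", errors="replace") as handle:
        return 1 <= line <= sum(1 for _ in handle)
```

A check like this only confirms that the cited location exists; whether the line actually supports the claim still needs a human or a second model pass.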

How It Works

CodeAtlas first builds a code graph of the repository with graphify. The graph is derived from the AST, with a node for every symbol and edges for calls, imports, and inheritance, so it represents the full structure of the repository. This graph then drives a LangGraph-powered Tree-of-Thought workflow that explores the codebase, reasons about its architecture, and synthesizes the findings into a coherent mental model with call graphs, data flow diagrams, and cited evidence.
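
To make the idea of an AST-derived code graph concrete, the sketch below builds a tiny graph for a single Python module using only the standard-library `ast` module. It does not use graphify or reproduce its schema. The function name `build_code_graph` and the `(source, kind, target)` edge tuples are assumptions for illustration, and callee names are left unresolved (a call to `os.getcwd()` is recorded simply as `getcwd`).

```python
import ast
from typing import Optional


def build_code_graph(source: str, module: str = "example"):
    """Return (nodes, edges) for one module's source text.

    Nodes are dotted symbol names; edges are (source, kind, target) tuples
    with kind in {"calls", "imports", "inherits"}.
    """
    tree = ast.parse(source)
    nodes = {module}
    edges = set()

    def simple_name(expr: ast.expr) -> Optional[str]:
        # Only plain names (f) and attribute accesses (obj.f) are handled.
        if isinstance(expr, ast.Name):
            return expr.id
        if isinstance(expr, ast.Attribute):
            return expr.attr
        return None

    def visit(node: ast.AST, scope: str) -> None:
        for child in ast.iter_child_nodes(node):
            child_scope = scope
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                # Each function or class becomes a node and a new scope.
                child_scope = f"{scope}.{child.name}"
                nodes.add(child_scope)
                if isinstance(child, ast.ClassDef):
                    for base in child.bases:
                        name = simple_name(base)
                        if name:
                            edges.add((child_scope, "inherits", name))
            elif isinstance(child, ast.Import):
                for alias in child.names:
                    edges.add((scope, "imports", alias.name))
            elif isinstance(child, ast.ImportFrom):
                edges.add((scope, "imports", child.module or "."))
            elif isinstance(child, ast.Call):
                name = simple_name(child.func)
                if name:
                    edges.add((scope, "calls", name))
            visit(child, child_scope)

    visit(tree, module)
    return nodes, edges
```

A whole-repository graph would additionally need to resolve names across modules, which is the part a dedicated tool handles.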

The workflow is a BFS + Beam Search hybrid - BFS controls depth while beam search prunes low-value branches at each level, keeping only the top-K scoring thoughts.

generate_thoughts → execute_batch → evaluate_batch → beam_prune_expand ──→ synthesize
                                                                  │
                                                                  ├──→ execute_batch (normal loop)
                                                                  └──→ generate_thoughts (re-generate when beam empty)
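
The loop in the diagram can be sketched in plain Python. This is a simplified stand-in, not the LangGraph implementation: `Thought`, `explore`, and the four callables are hypothetical names, scoring runs sequentially for brevity, and the score threshold and early exit are left out.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Thought:
    text: str
    outcome: str = ""
    score: float = 0.0


def explore(
    propose: Callable[[Set[str]], List[Thought]],  # stand-in for generate_thoughts
    run_tool: Callable[[Thought], str],            # one tool call per thought
    score: Callable[[Thought], float],             # stand-in for the evaluator
    expand: Callable[[Thought], List[Thought]],    # children of a surviving thought
    max_depth: int = 3,
    keep_top_k: int = 5,
    workers: int = 4,
) -> List[Thought]:
    """Run the loop from the diagram and return the top thoughts for synthesis."""
    tried: Set[str] = set()
    best: List[Thought] = []
    frontier = propose(tried)
    for _ in range(max_depth):  # BFS: one pass per depth level
        if not frontier:
            break
        tried.update(t.text for t in frontier)
        # execute_batch: run every pending thought's tool in parallel.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            outcomes = list(pool.map(run_tool, frontier))
        # evaluate_batch: attach a score to each executed thought.
        for thought, outcome in zip(frontier, outcomes):
            thought.outcome = outcome
            thought.score = score(thought)
        # Beam step: keep only the top-K thoughts at this level.
        beam = sorted(frontier, key=lambda t: t.score, reverse=True)[:keep_top_k]
        best = sorted(best + beam, key=lambda t: t.score, reverse=True)[:keep_top_k]
        frontier = [child for t in beam for child in expand(t)]
        if not frontier:
            # Nothing left to expand: go back to generation for fresh angles.
            frontier = propose(tried)
    return best
```

Passing the already-tried angles to `propose` mirrors the re-generation edge, where the generator is expected to avoid repeating itself.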

Node	Description
generate_thoughts	LLM proposes 2–3 hypotheses with tool selections. On re-generation, avoids previously explored angles.
execute_batch	Runs each pending thought's tool against the repo in parallel (`ThreadPoolExecutor`). Collects outcomes and file paths.
evaluate_batch	Hybrid scorer: the LLM rates relevance and evidence strength, and source diversity is computed from the number of unique files touched. All evaluations run in parallel.
beam_prune_expand	Beam step - drops thoughts scoring below 0.4, keeps the top-K (`keep_top_k`), and generates child thoughts (up to `max_children` per parent). If the beam empties before `max_depth`, routes back to `generate_thoughts` for fresh angles. Exits early to synthesis when ≥70% of beam candidates are ready.
synthesize	Collects evidence from best branches, generates a Mermaid architecture diagram, produces a final answer with numbered citations (file:line), rejected-branch summary, and uncertainties.
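
The pruning and routing rules in the `beam_prune_expand` row can be read as a small pure function. The sketch below is an interpretation of that row, not the project's code: `Scored`, `beam_step`, `min_score`, `ready_fraction`, and the `ready` flag are hypothetical names, what makes a candidate "ready" is not specified here, and child generation is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Scored:
    text: str
    score: float         # 0.0 to 1.0, as produced by the evaluator
    ready: bool = False  # hypothetical flag: enough evidence to synthesize


def beam_step(
    candidates: List[Scored],
    depth: int,
    max_depth: int = 3,
    keep_top_k: int = 5,
    min_score: float = 0.4,
    ready_fraction: float = 0.7,
) -> Tuple[List[Scored], str]:
    """Prune the candidates and name the next node to run."""
    # Drop low scorers, then keep the K best of what remains.
    survivors = [c for c in candidates if c.score >= min_score]
    beam = sorted(survivors, key=lambda c: c.score, reverse=True)[:keep_top_k]
    if depth >= max_depth:
        return beam, "synthesize"         # depth limit reached
    if not beam:
        return beam, "generate_thoughts"  # beam emptied early: fresh angles
    if sum(c.ready for c in beam) / len(beam) >= ready_fraction:
        return beam, "synthesize"         # early exit
    return beam, "execute_batch"          # normal loop
```

Returning the next node's name as a string keeps the decision easy to unit test on its own, separate from whatever graph framework wires the nodes together.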

The state machine is built with LangGraph. The stack: FastAPI (backend), Next.js (frontend), GitHub Models (LLM, free tier).

Quick Start

# 1. Install dependencies
make setup                          # macOS / Linux
.\dev.ps1 setup                     # Windows

# 2. Configure your API key
cp apps/api/.env.example apps/api/.env
# Edit apps/api/.env → set GITHUB_TOKEN=ghp_your_token_here

# 3. Run (API + Web in parallel)
make dev                            # macOS / Linux
.\dev.ps1 dev                       # Windows

API: http://localhost:8000
Web: http://localhost:3000

# Run tests
make test                           # macOS / Linux
.\dev.ps1 test                      # Windows

Configuration

Variable	Default	Description
`GITHUB_TOKEN`	-	GitHub personal access token (PAT) for API access
`GENERATION_LLM_MODEL`	`gpt-4o-mini`	Model for hypothesis generation
`EVALUATION_LLM_MODEL`	`gpt-4o-mini`	Model for scoring
`SYNTHESIS_LLM_MODEL`	`gpt-4o-mini`	Model for final synthesis
`MAX_DEPTH`	`3`	BFS depth limit
`MAX_CHILDREN`	`2`	Max child thoughts per parent
`KEEP_TOP_K`	`5`	Beam width
`EXECUTION_WORKERS`	`4`	Parallel tool call workers
`EVALUATION_WORKERS`	`2`	Parallel LLM evaluation workers
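
As an illustration of how these variables and defaults fit together, the sketch below reads them from the environment into a typed object. It is an assumption about one reasonable way to load them, not the loader the API actually uses: `Settings` and `load_settings` are hypothetical names, and the "must be at least 1" check is an added guard, not a documented rule.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    github_token: Optional[str]
    generation_llm_model: str
    evaluation_llm_model: str
    synthesis_llm_model: str
    max_depth: int
    max_children: int
    keep_top_k: int
    execution_workers: int
    evaluation_workers: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables from the table, falling back to the listed defaults."""

    def integer(name: str, default: int) -> int:
        value = int(env.get(name, default))
        if value < 1:  # assumed guard, not a documented constraint
            raise ValueError(f"{name} must be at least 1, got {value}")
        return value

    return Settings(
        github_token=env.get("GITHUB_TOKEN"),  # no default in the table
        generation_llm_model=env.get("GENERATION_LLM_MODEL", "gpt-4o-mini"),
        evaluation_llm_model=env.get("EVALUATION_LLM_MODEL", "gpt-4o-mini"),
        synthesis_llm_model=env.get("SYNTHESIS_LLM_MODEL", "gpt-4o-mini"),
        max_depth=integer("MAX_DEPTH", 3),
        max_children=integer("MAX_CHILDREN", 2),
        keep_top_k=integer("KEEP_TOP_K", 5),
        execution_workers=integer("EXECUTION_WORKERS", 4),
        evaluation_workers=integer("EVALUATION_WORKERS", 2),
    )
```

For example, `load_settings({"MAX_DEPTH": "4"})` yields a depth limit of 4 with every other value at its default, which makes it easy to try a deeper search without touching the `.env` file.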

Roadmap

See ROADMAP.md for planned work including CLI, BYOK (bring your own LLM key), Slack/Discord bots, VS Code extension, GitHub Action, and more.
