🚀 AutoReviewer: Production-Grade Multi-Agent Code Review System

Elevate your pull requests with an autonomous squad of expert AI engineers.

Features • Architecture • The Agents • Quick Start • Evaluation

💡 The Vision

Code reviews are critical but time-consuming. AutoReviewer goes beyond simple "wrapper" LLM scripts by introducing a highly asynchronous, multi-agent orchestration layer. It evaluates Git diffs and full repositories through the lens of specialized AI agents—filtering out noise, checking for security vulnerabilities, and ensuring architectural alignment—all before a human reviewer even looks at the code.

Constructed with enterprise-grade engineering practices, this system leverages Celery & Redis for scalable task queuing, FastAPI for high-throughput webhook ingestion, LangGraph for multi-agent workflows, and MLflow for rigorous LLM evaluation.

✨ Key Features

🤖 Multi-Agent Orchestration (LangGraph): A routed workflow of specialized AI agents (Planner, Bug Detector, Security, Style).
⚡ Completely Asynchronous: Built on FastAPI, Redis, and Celery. The fire-and-forget API hands heavy AI tasks to background queues and returns immediately.
🔍 AST-Powered Context Engine: Intelligently extracts localized AST (Abstract Syntax Tree) contexts from raw diffs to prevent LLM hallucination and manage token limits.
🛡️ Enterprise Observability: Fully integrated with LangSmith for real-time execution tracing and debugging.
📈 Automated Prompt Evaluation: Uses MLflow LLM-as-a-judge evaluation to quantitatively score new prompt versions against a "golden dataset" of historical pull requests.
📦 Universal Ingestion: Supports bare code .zip uploads, diff extraction, and full-repository fallback analysis.
🐳 Fully Dockerized: Spin up the API, Worker, and Redis broker with a single docker-compose up.
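The AST-powered context idea in the feature list can be illustrated with a minimal sketch built on Python's standard `ast` module. The function name `context_for_changes` and the sample source are hypothetical and are not taken from this repository. The sketch assumes the changed line numbers have already been extracted from the diff, and it keys results by definition name, so two definitions with the same name would collide.

```python
import ast


def context_for_changes(source: str, changed_lines: set[int]) -> dict[str, str]:
    """Map each changed line to the smallest enclosing def/class and return its source.

    Hypothetical helper: module-level changes (outside any def/class) are skipped.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    definitions = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    contexts: dict[str, str] = {}
    for line_no in sorted(changed_lines):
        enclosing = [n for n in definitions if n.lineno <= line_no <= n.end_lineno]
        if not enclosing:
            continue  # the change is not inside any function or class
        # Prefer the innermost definition, i.e. the one spanning the fewest lines.
        innermost = min(enclosing, key=lambda n: n.end_lineno - n.lineno)
        contexts[innermost.name] = "\n".join(
            lines[innermost.lineno - 1 : innermost.end_lineno]
        )
    return contexts


# Illustrative input only: line 5 (inside `divide`) is treated as the changed line.
SAMPLE = (
    "def add(a, b):\n"
    "    return a + b\n"
    "\n"
    "def divide(a, b):\n"
    "    return a / b\n"
)

print(list(context_for_changes(SAMPLE, {5})))  # ['divide']
```

Sending only the enclosing definition, rather than the whole file, is one way to keep prompts within token limits while still giving the model enough surrounding code to reason about the change.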

🏗 System Architecture

The architecture separates the high-throughput web layer from blocking LLM inference calls, so the API can keep accepting webhooks while many pull requests are reviewed concurrently in the background.

```mermaid
flowchart LR
    classDef primary fill:#2563eb,stroke:#1e40af,stroke-width:2px,color:#fff
    classDef secondary fill:#059669,stroke:#047857,stroke-width:2px,color:#fff
    classDef agent fill:#7c3aed,stroke:#5b21b6,stroke-width:2px,color:#fff
    classDef db fill:#dc2626,stroke:#b91c1c,stroke-width:2px,color:#fff

    Client((Client / CI))

    subgraph Infrastructure [Async Event Infrastructure]
        API[FastAPI Gateway]:::primary
        Redis[(Redis Broker)]:::db
        Worker[Celery Task Worker]:::primary
    end

    subgraph Engine [LangGraph Agentic Engine]
        AST[AST Diff Parser]:::secondary
        Planner{Planner Strategy}:::agent
        Bug[Bug Detector]:::agent
        Sec[Security Scanner]:::agent
        Style[Style Checker]:::agent
        Synth{Synthesizer}:::agent
    end

    Client -->|1. POST .zip| API
    API -.->|2. Enqueue| Redis
    Redis -.->|3. Consume| Worker

    Worker ==>|4. Execute Graph| AST
    AST --> Planner
    Planner -->|async dispatch| Bug
    Planner -->|async dispatch| Sec
    Planner -->|async dispatch| Style

    Bug --> Synth
    Sec --> Synth
    Style --> Synth

    Synth ==>|5. Final Markdown| Worker
    Worker -.->|6. Store Result| Redis
    Client -.->|7. GET /results/id| API
```
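The numbered steps in the diagram can be sketched with the standard library alone. This is not the project's implementation: an in-process `queue.Queue` stands in for the Redis broker, a thread stands in for the Celery worker, and a dict stands in for the result store. All function names and the `status` / `report` fields are assumptions made for illustration.

```python
import queue
import threading
import uuid

task_queue: queue.Queue = queue.Queue()  # stand-in for the Redis broker
results: dict[str, dict] = {}            # stand-in for the result backend


def submit_review(archive: bytes) -> str:
    """Steps 1-2: accept the upload, enqueue it, and return a task id right away."""
    task_id = str(uuid.uuid4())
    results[task_id] = {"status": "PENDING"}
    task_queue.put((task_id, archive))
    return task_id


def run_review_graph(archive: bytes) -> str:
    """Placeholder for steps 4-5 (the agent graph); returns a Markdown report."""
    return f"# Review\n\nReceived {len(archive)} bytes."


def worker_loop() -> None:
    """Steps 3 and 6: consume tasks until a None sentinel arrives, storing each result."""
    while True:
        item = task_queue.get()
        if item is None:
            break
        task_id, archive = item
        results[task_id] = {"status": "SUCCESS", "report": run_review_graph(archive)}


def get_result(task_id: str) -> dict:
    """Step 7: look up the stored result for a task id."""
    return results.get(task_id, {"status": "UNKNOWN"})


def demo() -> dict:
    """Run one task through the pipeline and return its stored result."""
    worker = threading.Thread(target=worker_loop)
    worker.start()
    task_id = submit_review(b"fake zip bytes")
    task_queue.put(None)  # tell the worker to stop after the queued task
    worker.join()
    return get_result(task_id)


print(demo()["status"])  # SUCCESS
```

The point of the split is that `submit_review` never waits on `run_review_graph`: the caller gets a task id immediately and retrieves the report later.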

🧬 Meet the Agents

The system utilizes a directed acyclic graph (DAG) to coordinate responsibilities rather than relying on a single, easily confused LLM prompt:

The Planner: 🧭 Reads the initial diff and maps out a strategic review plan. Only activates the necessary downstream agents to save tokens and time.
Bug Detector: 🐛 Deep-dives into logic to find edge cases, off-by-one errors, and runtime exceptions.
Security Scanner: 🔐 Acts as an AppSec engineer, aggressively hunting for OWASP top 10 vulnerabilities and hardcoded secrets.
Style Enforcer: 👔 Checks PEP8 (or language-specific) guidelines, variable naming conventions, and linting rules.
The Synthesizer (Lead Dev): 🎯 Consolidates all agent reports, filters out contradictory or unsupported findings, and formats an actionable Markdown report as the final output.
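The plan, fan-out, and synthesize flow described above can be sketched with plain `asyncio`. This deliberately does not use LangGraph, and the keyword rules in `plan` are a stand-in for the LLM-driven Planner. Every function name and report string here is hypothetical.

```python
import asyncio


# Hypothetical agents: real ones would call an LLM instead of returning fixed text.
async def bug_detector(diff: str) -> str:
    return "bug: no issues found"


async def security_scanner(diff: str) -> str:
    if "API_KEY" in diff:
        return "security: possible hardcoded secret"
    return "security: no issues found"


async def style_enforcer(diff: str) -> str:
    return "style: no issues found"


AGENTS = {"bug": bug_detector, "security": security_scanner, "style": style_enforcer}


def plan(diff: str) -> list[str]:
    """Planner: pick only the agents this diff needs (keyword rules as a stand-in)."""
    selected = ["bug"]
    if "API_KEY" in diff or "password" in diff:
        selected.append("security")
    if ".py" in diff:
        selected.append("style")
    return selected


async def review(diff: str) -> str:
    """Run the selected agents concurrently, then merge their reports."""
    selected = plan(diff)
    # gather() returns results in the same order as the awaitables passed in.
    reports = await asyncio.gather(*(AGENTS[name](diff) for name in selected))
    # Synthesizer: here just a Markdown bullet list of the individual reports.
    return "\n".join(f"- {report}" for report in reports)


print(asyncio.run(review('+API_KEY = "abc123"\n')))
# - bug: no issues found
# - security: possible hardcoded secret
```

Because the Planner returns a subset of agent names, the Style agent is never invoked for this diff, which is the token-saving behavior the Planner description refers to.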

🚀 Quick Start

Prerequisites

Docker & Docker Compose
OpenAI / Anthropic API Key (configurable via environment variables)

Installation

Clone the repository

```
git clone https://github.com/yourusername/autoreviewer.git
cd autoreviewer
```

Configure Environment Variables

```
cp .env.example .env
# Add your LLM keys (e.g., OPENAI_API_KEY) and LangSmith API keys to .env
```

Spin up the stack
```
docker-compose up --build -d
```
This starts the FastAPI server on port 8000, the Redis broker, and the Celery background workers.

Kubernetes Deployment (Production)

For larger deployments, run AutoReviewer on Kubernetes. The manifests configure API instances and Celery workers that can be scaled horizontally.

Configure Secrets

```
cp k8s/secret.yaml.example k8s/secret.yaml
# Edit secret.yaml with your OPENAI_API_KEY
kubectl apply -f k8s/secret.yaml
```

Apply Manifests
```
kubectl apply -f k8s/manifests.yaml
```
This deploys Redis, the FastAPI LoadBalancer, and the Celery Worker deployment.

Usage

1. Submit a repository for review:

```
curl -X POST "http://localhost:8000/review" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/path/to/your/repo.zip"
```

Response:

```
{
  "task_id": "b1a2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "Task submitted successfully."
}
```

2. Poll for the final AI synthesis:

```
curl -X GET "http://localhost:8000/results/b1a2c3d4-e5f6-7890-abcd-ef1234567890"
```
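For scripted use, the polling step might look like the following sketch. The shape of the `/results` response is not documented above, so this assumes a JSON body with a `status` field that stays `"PENDING"` or `"STARTED"` until the review finishes. Adjust the `pending` tuple to match what the API actually returns. The helper names `fetch_result` and `poll_result` are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_result(task_id: str, base_url: str = "http://localhost:8000") -> dict:
    """GET /results/{task_id} and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/results/{task_id}") as response:
        return json.load(response)


def poll_result(
    task_id: str,
    fetch: Callable[[str], dict] = fetch_result,
    interval: float = 2.0,
    max_attempts: int = 30,
    pending: tuple[str, ...] = ("PENDING", "STARTED"),  # assumed status values
) -> dict:
    """Call `fetch` until the status leaves the pending set or attempts run out."""
    for _ in range(max_attempts):
        body = fetch(task_id)
        if body.get("status") not in pending:
            return body
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} still pending after {max_attempts} attempts")
```

Passing `fetch` as a parameter keeps the retry loop testable without a running server, for example by supplying a function that returns canned responses.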

🧪 LLM Evaluation & Testing

We treat our AI prompts as actual software. This repository features a CI/CD pipeline and quantitative LLM evaluations.

Test Coverage: Maintained at >85% with pytest and pytest-mock, heavily utilizing Dependency Injection to mock LLM calls.
CI/CD: GitHub Actions automatically tests every PR.
Prompt Evals: Run python eval/mlflow_evaluate.py to trigger local MLflow deployments. It uses LLM-as-a-judge against a golden dataset of past pull requests to ensure the Bug Detector and Security Scanner do not lose accuracy when we tweak the system prompts.
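The regression check behind prompt evals can be sketched as follows. This swaps in exact-match precision and recall over finding labels in place of an LLM judge, and it does not use MLflow. The `GoldenCase` structure, the finding labels, and the sample predictions are all hypothetical illustration data, not results from this project.

```python
from dataclasses import dataclass


@dataclass
class GoldenCase:
    pr_id: str
    expected: set[str]  # finding labels confirmed by a human reviewer


def score(cases: list[GoldenCase], predictions: dict[str, set[str]]) -> dict[str, float]:
    """Compute precision and recall of predicted findings across all golden cases."""
    tp = fp = fn = 0
    for case in cases:
        predicted = predictions.get(case.pr_id, set())
        tp += len(predicted & case.expected)   # correct findings
        fp += len(predicted - case.expected)   # findings that were not expected
        fn += len(case.expected - predicted)   # expected findings that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall}


def regressed(baseline: dict[str, float], candidate: dict[str, float]) -> bool:
    """Return True if the candidate prompt scores lower than the baseline on any metric."""
    return any(candidate[name] < baseline[name] for name in baseline)


# Made-up golden dataset and agent outputs for two prompt versions.
golden = [
    GoldenCase("pr-1", {"sql-injection", "off-by-one"}),
    GoldenCase("pr-2", {"hardcoded-secret"}),
]
old_prompt = {"pr-1": {"sql-injection", "off-by-one"}, "pr-2": {"hardcoded-secret"}}
new_prompt = {"pr-1": {"sql-injection"}, "pr-2": {"hardcoded-secret", "unused-import"}}

print(regressed(score(golden, old_prompt), score(golden, new_prompt)))  # True
```

A check like this could gate a prompt change in CI: if the candidate prompt scores lower than the baseline on the golden dataset, the change is flagged before it is merged.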

🤝 Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

Built with ❤️ by an engineer passionate about Agentic AI.
