HTML Presentation → Open in browser
| Scene | Description |
|---|---|
| Intro | Project overview, features, graph visualization |
| Architecture (Interactive) | Clickable component diagram |
| Tech Stack | Dependencies |
| Demo | Demo video |
| OpenMetadata | Governance plane |
| Status | Capabilities + gaps |
| Thank You | Credits |
- Complete Demo Walkthrough: 2-3 min, circuit discovery → quarantine → re-run
- Product Demo: Live run showing active heads, lineage, quarantine
Synapse-Graph started from a simple frustration: when an LLM hallucinates, most teams are still left choosing between two bad fixes.
| 1. Prompt Engineering | 2. Retraining / Fine-tuning |
|---|---|
| Ask the system prompt to behave better and hope the next response cooperates. | Spend compute and engineering time to train the bad behavior out of the model. |
Most observability tools only inspect the surface area of the model: prompts, tokens, latency, and logs. Synapse-Graph treats the AI like an engine, not a black box. If a spark plug misfires, you do not replace the entire car; you identify the exact component and fix it.
Synapse-Graph performs live neural surgery by operating directly on model tensors instead of wrapping the API surface.
The tracer uses `register_forward_hook` on attention modules to observe generation as it happens.
- Captures attention matrices at the layer, head, and position level
- Records projection outputs during inference
- Keeps tracing lightweight enough to run alongside generation
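A minimal sketch of hook-based capture, assuming PyTorch is installed. It uses a toy `nn.MultiheadAttention` layer as a stand-in for a real model's attention module; `capture_attention` and the `captured` dict are illustrative names, not the project's tracer API.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-in for one attention module (assumption: not the project's model).
attn = nn.MultiheadAttention(embed_dim=16, num_heads=4, batch_first=True)

captured = {}  # hypothetical store: layer index -> per-head attention weights


def capture_attention(layer_idx):
    """Return a forward hook that records the module's attention weights."""
    def hook(module, args, output):
        # nn.MultiheadAttention returns (attn_output, attn_weights).
        _, weights = output
        captured[layer_idx] = weights.detach()
    return hook


handle = attn.register_forward_hook(capture_attention(layer_idx=0))

x = torch.randn(1, 5, 16)  # (batch, sequence, embedding)
with torch.no_grad():
    # average_attn_weights=False keeps one attention matrix per head.
    attn(x, x, x, need_weights=True, average_attn_weights=False)

handle.remove()  # remove the hook once tracing is done

# Shape is (batch, heads, query positions, key positions).
print(captured[0].shape)  # torch.Size([1, 4, 5, 5])
```

The hook only reads the output, so generation itself is unchanged; removing the handle afterwards keeps later forward passes free of tracing overhead.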
We map neural structure into familiar catalog primitives so the model can be governed like enterprise data.
| Neural Concept | OpenMetadata Representation |
|---|---|
| Model | Database |
| Transformer layers | Tables |
| Attention heads | Columns |
That mapping turns thought paths into lineage edges, making it possible to tag specific heads as DEFECTIVE and feed that signal back into runtime control.
Correlation is not causation, so the backend runs an automated causal autopsy:
- Identify a suspicious head
- Zero its projection output and re-run the generation
- Measure whether the hallucination rate drops
- Promote the confirmed culprit to `DEFECTIVE` in OpenMetadata and mask it in the FastAPI proxy on future requests
The result is a debugging loop that is:
- Targeted rather than speculative
- Deterministic rather than anecdotal
- Cheap to operate because it avoids retraining
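The ablate-and-measure loop described above can be sketched in plain Python. This is an illustration under stated assumptions, not the backend's implementation: `hallucination_rate`, `confirm_culprit`, the `min_drop` threshold, and the stub `fake_generate` are all hypothetical, and "hallucination" is reduced to a simple substring check on a target token.

```python
from typing import Callable, FrozenSet, Tuple

Head = Tuple[int, int]  # (layer index, head index)


def hallucination_rate(
    generate: Callable[[str, FrozenSet[Head]], str],
    prompts: list[str],
    target: str,
    masked: FrozenSet[Head],
) -> float:
    """Fraction of prompts whose output still contains the target token."""
    hits = sum(target in generate(p, masked) for p in prompts)
    return hits / len(prompts)


def confirm_culprit(generate, prompts, target, suspect: Head, min_drop: float = 0.5) -> bool:
    """Ablate one head and report whether the hallucination rate drops enough."""
    baseline = hallucination_rate(generate, prompts, target, frozenset())
    ablated = hallucination_rate(generate, prompts, target, frozenset({suspect}))
    return (baseline - ablated) >= min_drop


# Stub model for illustration: it "hallucinates" unless head (3, 7) is masked.
def fake_generate(prompt: str, masked: FrozenSet[Head]) -> str:
    return "Paris" if (3, 7) in masked else "Atlantis"


prompts = ["Capital of France?", "Where is the Louvre?"]
print(confirm_culprit(fake_generate, prompts, "Atlantis", suspect=(3, 7)))  # True
print(confirm_culprit(fake_generate, prompts, "Atlantis", suspect=(0, 0)))  # False
```

Only a head whose removal actually lowers the rate is confirmed, which is what separates a causal culprit from a head that merely happened to be active.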
LLMs are powerful but opaque. Current observability tools stop at prompts, tokens, latency, and logs. They don't answer:
- Which layers and heads were most active for this response?
- Can we trace a "thought path" through the network?
- Can governance tools intervene on specific neural components?
Synapse-Graph repurposes OpenMetadata as a governance and lineage system for transformer internals:
- Model → Database
- Transformer layers → Tables
- Attention heads → Columns
- High-activation paths → Lineage edges
- `DEFECTIVE` tag → Runtime control signal that masks a head during the next generation
This turns model internals into inspectable infrastructure built on familiar data-platform primitives.
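To make the tag-to-mask step concrete, here is a hedged sketch of a mask store. The name `HeadMaskStore` appears in the architecture diagram, but its interface here (`sync_from_tags`, `is_masked`, `snapshot`) is assumed, as is the `Layer_N.Head_N` column naming used to recover layer and head indices.

```python
import re
import threading

# Assumed naming scheme from the catalog mapping: tables "Layer_N", columns "Head_N".
_COLUMN_RE = re.compile(r"Layer_(\d+)\.Head_(\d+)$")


class HeadMaskStore:
    """Thread-safe set of (layer, head) pairs to mask at generation time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._masked = set()

    def sync_from_tags(self, defective_columns):
        """Replace the mask set with heads parsed from DEFECTIVE-tagged column names."""
        parsed = set()
        for name in defective_columns:
            match = _COLUMN_RE.search(name)
            if match:  # skip names that do not look like attention heads
                parsed.add((int(match.group(1)), int(match.group(2))))
        with self._lock:
            self._masked = parsed

    def is_masked(self, layer, head):
        with self._lock:
            return (layer, head) in self._masked

    def snapshot(self):
        with self._lock:
            return sorted(self._masked)


store = HeadMaskStore()
store.sync_from_tags(["Transformer_Graph.Layer_3.Head_7", "Transformer_Graph.Prompt_Ingress"])
print(store.snapshot())       # [(3, 7)]
print(store.is_masked(3, 7))  # True
```

Replacing the whole set on each sync means that removing a tag in the catalog also lifts the mask, so the catalog stays the single source of truth.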
```mermaid
flowchart LR
    subgraph Dashboard["Operator Dashboard"]
        D["Next.js<br/>React<br/>@xyflow/react"]
    end
    subgraph Proxy["Neural Proxy (FastAPI)"]
        P["Generation + Tracing<br/>Governance + SSE<br/>HeadMaskStore"]
    end
    subgraph Generation["Generation"]
        O["Ollama<br/>(Preferred)"]
    end
    subgraph Tracing["Tracing"]
        T["HF Tracer<br/>PyTorch hooks"]
    end
    subgraph Governance["Governance"]
        OM["OpenMetadata<br/>Topology + Lineage<br/>Tags → Masks"]
        DEF["DEFECTIVE<br/>→ Runtime Mask"]
    end
    D -->|"REST + SSE"| P
    P -->|"Generation"| O
    P -->|"Tracing"| T
    P -->|"Topology<br/>Lineage<br/>Tags"| OM
    OM -->|"tag"| DEF
```
Interactive diagram: Click here for full interactive architecture
REST Endpoints:
| Endpoint | Method | Purpose |
|---|---|---|
| `/api/v1/state` | GET | Current runtime state |
| `/api/v1/generate` | POST | Full generation response |
| `/api/v1/generate/stream` | POST | SSE with trace steps |
| `/api/v1/autopsy/discover_circuit` | POST | Circuit discovery |
| `/api/v1/autopsy/discover_circuit/stream` | POST | SSE discovery progress |
| `/api/v1/autopsy/causal` | POST | Causal autopsy |
| `/api/v1/openmetadata/bootstrap` | POST | Bootstrap catalog |
| `/api/v1/openmetadata/sync-defects` | POST | Sync tags to masks |
| `/api/v1/openmetadata/quarantine` | POST | Quarantine heads |
| `/api/v1/webhooks/openmetadata` | POST | Webhook handler |
| `/api/v1/governance/local-mask` | POST | Set head mask |
| `/api/v1/governance/clear-local-masks` | POST | Clear masks |
| `/api/v1/hf/preload` | POST | Load HF tracer |
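The two `/stream` endpoints return Server-Sent Events. The sketch below shows one way a client might parse such a stream using only the standard library. The event name `trace` and the JSON fields in the sample are hypothetical, since the payload schema is not documented in this README.

```python
import json


def _field_value(line, name):
    """Return the text after 'name:', dropping one optional leading space."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                  # a blank line ends one event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):      # comment or keep-alive line
            continue
        elif line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))


# Hypothetical stream; the real payload fields may differ.
sample = [
    'event: trace\n',
    'data: {"layer": 3, "head": 7}\n',
    '\n',
    ': keep-alive\n',
    'data: {"token": "Paris"}\n',
    '\n',
]

events = [(name, json.loads(body)) for name, body in parse_sse(sample)]
print(events)
# [('trace', {'layer': 3, 'head': 7}), ('message', {'token': 'Paris'})]
```

In a real client the lines could come from a streaming HTTP response, for example `httpx`'s `Response.iter_lines()`, instead of a list.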
Execution Modes:
- `AUTO`: Prefer Ollama if available
- `FAST`: Ollama + parallel HF tracing
- `FAITHFUL`: HF only, with inline tracing
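A small sketch of how the three modes might resolve to a backend choice. The mode names come from the list above; `plan_run`, the returned dictionary keys, the error raised for `FAST` without Ollama, and the `AUTO` fallback to HF inline tracing are assumptions made for illustration.

```python
from enum import Enum


class ExecutionMode(str, Enum):
    AUTO = "AUTO"
    FAST = "FAST"
    FAITHFUL = "FAITHFUL"


def plan_run(mode: ExecutionMode, ollama_available: bool) -> dict:
    """Pick a generation backend and tracing style for one request."""
    if mode is ExecutionMode.FAITHFUL:
        return {"generator": "hf", "tracing": "inline"}
    if mode is ExecutionMode.FAST:
        if not ollama_available:
            # Assumption: FAST has no fallback and fails loudly.
            raise RuntimeError("FAST mode needs a reachable Ollama server")
        return {"generator": "ollama", "tracing": "parallel-hf"}
    # AUTO: prefer Ollama; falling back to HF inline tracing is an assumption.
    if ollama_available:
        return {"generator": "ollama", "tracing": "parallel-hf"}
    return {"generator": "hf", "tracing": "inline"}


print(plan_run(ExecutionMode.AUTO, ollama_available=True))
# {'generator': 'ollama', 'tracing': 'parallel-hf'}
print(plan_run(ExecutionMode.AUTO, ollama_available=False))
# {'generator': 'hf', 'tracing': 'inline'}
```

The trade-off is speed against fidelity: tracing a separate HF model in parallel is faster but only approximates what the Ollama model did, while `FAITHFUL` traces the same forward pass that produced the text.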
Hook-Based Attention Capture:
```python
# Outline only: function bodies are elided.
def _register_attention_hooks(model, layer_idx, hook_handles):
    # Registers register_forward_hook on attention modules
    # Captures: attention_weights, projection output
    ...


def _make_projection_mask_hook(layer_idx, head_idx):
    # Applies masking to the output projection
    # Zeroes the masked head's hidden states
    ...
```
Two-Level Masking:
- Attention tensor masking
- Projection masking (hidden states)
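The arithmetic behind projection-level masking can be shown with plain lists. This sketch assumes the common layout in which per-head outputs are concatenated before the output projection, so zeroing one head's slice removes exactly that head's contribution. The helpers `project` and `mask_head` are illustrative; in PyTorch the same effect would typically be applied inside a hook on the projection module.

```python
def project(weights, vector):
    """Matrix-vector product for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def mask_head(concat, head_idx, head_dim):
    """Zero one head's slice of the concatenated per-head outputs."""
    start = head_idx * head_dim
    return [0.0 if start <= i < start + head_dim else v for i, v in enumerate(concat)]


head_dim = 2
concat = [1.0, 2.0, 3.0, 4.0]   # two heads: [1, 2] and [3, 4]
w_out = [                        # toy output projection (2 x 4)
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]

full = project(w_out, concat)
masked = project(w_out, mask_head(concat, head_idx=1, head_dim=head_dim))

print(full)    # [4.0, 6.0]
print(masked)  # [1.0, 2.0]
```

The difference between `full` and `masked` is head 1's contribution to the layer output, which is the quantity an ablation experiment removes.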
Default Models:
- Ollama: `qwen2.5:3b-instruct`
- HuggingFace: `Qwen/Qwen2.5-1.5B-Instruct`
- Dashboard default: `gpt2` (12 layers × 12 heads = 144 heads)
```mermaid
erDiagram
    Service ||--o| Database : "Synapse_Neural_Service"
    Database ||--o| Schema : "Transformer_Graph"
    Schema ||--o| Table : "Layer_N (per layer)"
    Schema ||--o| Table : "Prompt_Ingress"
    Schema ||--o| Table : "Response_Egress"
    Table ||--o| Column : "Head_N (per head)"
    Service {
        string name "Synapse_Neural_Service"
        string type "mysql"
    }
    Column {
        string name "Head_N"
        string type "FLOAT"
        string tag "DEFECTIVE/QUARANTINED"
    }
```
Classification & Tags:
- Classification: `SynapseQuarantine`
- Tag: `DEFECTIVE` (color: #39FF14)

Lineage: Prompt_Ingress → Layer_1 → ... → Layer_N → Response_Egress
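The topology and lineage chain above can be generated programmatically. The sketch below builds plain Python structures only and makes no OpenMetadata API calls. Table and column names follow the diagram; 1-based head numbering and the helper names `build_topology` and `build_lineage` are assumptions.

```python
def build_topology(num_layers, num_heads):
    """Return table -> column names, following the Layer_N / Head_N scheme."""
    tables = {"Prompt_Ingress": [], "Response_Egress": []}
    for layer in range(1, num_layers + 1):  # 1-based, as in the lineage line
        # Assumption: heads are also numbered from 1.
        tables[f"Layer_{layer}"] = [f"Head_{h}" for h in range(1, num_heads + 1)]
    return tables


def build_lineage(num_layers):
    """Return (source, target) edges from ingress through every layer to egress."""
    layers = [f"Layer_{i}" for i in range(1, num_layers + 1)]
    nodes = ["Prompt_Ingress", *layers, "Response_Egress"]
    return list(zip(nodes, nodes[1:]))


topology = build_topology(num_layers=12, num_heads=12)  # gpt2-sized
edges = build_lineage(12)

print(sum(len(cols) for cols in topology.values()))  # 144
print(len(edges))                                    # 13
print(edges[0], edges[-1])
# ('Prompt_Ingress', 'Layer_1') ('Layer_12', 'Response_Egress')
```

For the `gpt2` dashboard default this yields 12 layer tables with 144 head columns in total, matching the head count given under Default Models.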
- `frontend/components/synapse-dashboard.tsx`: Main dashboard with metrics, discovery panel, governance controls
- `frontend/components/synapse-graph.tsx`: @xyflow/react graph visualization
- `frontend/components/activation-chart.tsx`: Per-layer, per-head activation charts
- `frontend/components/console-log.tsx`: Real-time log stream display
Metric Cards:
- Generation Backend (Ollama live / HF inline)
- Trace Fidelity (Exact / Proxy evidence)
- Lineage Depth (active hops)
- Masked Heads count
Causal Discovery Panel:
- Target token input (hallucination to remove)
- `top_k_heads` slider (1-20)
- `max_pair_sweeps` slider (0-190)
- Run Discovery → View Overlay → Quarantine buttons
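The slider bounds are consistent with pairwise ablation: 190 is the number of unordered pairs among 20 heads. The README does not state that this is how the limit was chosen, so the sketch below is an interpretation. The helper `pair_sweeps` and the placeholder ranking are hypothetical.

```python
from itertools import combinations, islice
from math import comb


def pair_sweeps(top_heads, max_pair_sweeps):
    """Return at most max_pair_sweeps unordered head pairs, in ranking order."""
    return list(islice(combinations(top_heads, 2), max_pair_sweeps))


top_heads = [(layer, 0) for layer in range(20)]  # placeholder ranking of 20 heads

print(comb(20, 2))                       # 190
print(len(pair_sweeps(top_heads, 190)))  # 190
print(pair_sweeps(top_heads, 3))
# [((0, 0), (1, 0)), ((0, 0), (2, 0)), ((0, 0), (3, 0))]
```

Setting the slider to 0 skips pair sweeps entirely, and lowering `top_k_heads` shrinks the pair space quadratically, which is the main lever for keeping discovery runs short.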
Governance Panel:
- Quarantine Top Head
- Clear Local Masks
- Sync Defects button
```toml
[project]
requires-python = ">=3.11,<3.13"
dependencies = [
    "fastapi>=0.115.0",
    "torch>=2.4.0",
    "transformers>=4.46.0",
    "openmetadata-ingestion>=1.12.0",
    "httpx>=0.28.0",
    "pydantic-settings>=2.7.0",
    "uvicorn[standard]>=0.32.0",
    "accelerate>=1.1.0",
    "cachetools>=5.3.0",
]
```
```json
{
  "dependencies": {
    "next": "^15.2.0",
    "react": "^19.0.0",
    "@xyflow/react": "^12.4.4",
    "recharts": "^2.15.0",
    "lucide-react": "^0.468.0"
  },
  "devDependencies": {
    "tailwindcss": "^3.4.16",
    "typescript": "^5.7.2"
  }
}
```
```bash
# Backend
python3.11 -m venv .venv && source .venv/bin/activate
pip install -e ./backend
cp backend/.env.example backend/.env
cd backend && python -m uvicorn app.main:app --reload --port 8000
# Frontend (new terminal)
cd frontend && npm install && npm run dev
```
Dashboard: http://localhost:3000
```mermaid
flowchart TD
    A["1. Start<br/>Boot dashboard"] --> B["2. Trace<br/>Submit prompt"]
    B --> C["3. Discover<br/>Run circuit discovery"]
    C --> D["4. Quarantine<br/>Push DEFECTIVE tags"]
    D --> E["5. Verify<br/>Re-run prompt<br/>Show masked heads"]
    E --> A
```
- Start: Boot the dashboard, verify "Ollama live" or "HF fallback"
- Trace: Submit a prompt → watch the synapse graph light up
- Discover: Enter the hallucination token → run circuit discovery
- Quarantine: Click "Quarantine" → push DEFECTIVE tags to OpenMetadata
- Verify: Re-run the prompt → show the masked heads count increasing
```
Synapse-Graph/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI + endpoints
│   │   ├── inference.py         # Generation + tracing
│   │   └── om_client.py         # OpenMetadata client
│   └── tests/
│       ├── test_quarantine.py
│       └── test_discover_quarantine_integration.py
├── frontend/
│   ├── app/                     # Next.js app router
│   ├── components/              # Dashboard, graph, charts
│   └── lib/                     # API client
├── architecture.html            # Interactive architecture diagram
└── first_frame.html             # GitHub Pages presentation
```
License: MIT
Author: algsoch
Contact: npdimagine@gmail.com · +91 8383848219
Project: GitHub Repo · Live Demo · YouTube Demo
Built for AI Interpretability

