FiscalMindset/Synapse-Graph

Synapse-Graph (AI Autopsy Engine)

Turn LLM internals into observable, governable infrastructure


Live Presentation

HTML Presentation: Open in browser

| Scene | Description |
| --- | --- |
| Intro | Project overview, features, graph visualization |
| Architecture (Interactive) | Clickable component diagram |
| Tech Stack | Dependencies |
| Demo | Demo video |
| OpenMetadata | Governance plane |
| Status | Capabilities + gaps |
| Thank You | Credits |

💡 The Motivation: Breaking the Black Box

Synapse-Graph started from a simple frustration: when an LLM hallucinates, most teams are still left choosing between two bad fixes.

1. Prompt Engineering
Ask the system prompt to behave better and hope the next response cooperates.
2. Retraining / Fine-tuning
Spend compute and engineering time to train the bad behavior out of the model.

Most observability tools only inspect the surface area of the model: prompts, tokens, latency, and logs. Synapse-Graph treats the AI like an engine, not a black box. If a spark plug misfires, you do not replace the entire car; you identify the exact component and fix it.

Core idea: combine Mechanistic Interpretability with Enterprise Data Governance so hallucinations can be traced, explained, and corrected in real time without retraining.

βš™οΈ Engineering Philosophy: How It Actually Works

Synapse-Graph performs live neural surgery by operating directly on model tensors instead of wrapping the API surface.

1. πŸ” The PyTorch Shadow Tracer

The tracer uses register_forward_hook on attention modules to observe generation as it happens.

  • Captures attention matrices at the layer, head, and position level
  • Records projection outputs during inference
  • Keeps tracing lightweight enough to run alongside generation
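The hook mechanism described above can be sketched as follows. This is a minimal illustration on a toy attention layer, not the project's tracer: `ToyAttention` and `attach_tracer` are hypothetical names, and the toy layer always returns its attention weights, whereas Hugging Face models typically only expose them when attention outputs are requested.

```python
import torch
from torch import nn


class ToyAttention(nn.Module):
    """Hypothetical multi-head self-attention returning (output, weights)."""

    def __init__(self, hidden: int = 8, heads: int = 2) -> None:
        super().__init__()
        self.heads, self.head_dim = heads, hidden // heads
        self.qkv = nn.Linear(hidden, 3 * hidden)
        self.out_proj = nn.Linear(hidden, hidden)

    def forward(self, x: torch.Tensor):
        b, t, _ = x.shape
        q, k, v = (
            z.view(b, t, self.heads, self.head_dim).transpose(1, 2)
            for z in self.qkv(x).chunk(3, dim=-1)
        )
        weights = torch.softmax(q @ k.transpose(-1, -2) / self.head_dim**0.5, dim=-1)
        ctx = (weights @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out_proj(ctx), weights


def attach_tracer(layers, trace):
    """Register one forward hook per layer; return handles for later removal."""
    handles = []
    for layer_idx, layer in enumerate(layers):
        def hook(module, inputs, output, layer_idx=layer_idx):
            projected, weights = output
            trace[layer_idx] = {
                "attention": weights.detach().cpu(),    # (batch, heads, query, key)
                "projection": projected.detach().cpu(),  # (batch, query, hidden)
            }
        handles.append(layer.register_forward_hook(hook))
    return handles


torch.manual_seed(0)
layers = nn.ModuleList([ToyAttention() for _ in range(2)])
trace = {}
handles = attach_tracer(layers, trace)

x = torch.randn(1, 5, 8)
with torch.no_grad():
    for layer in layers:
        x, _ = layer(x)

for handle in handles:
    handle.remove()  # detach the hooks once tracing is done
```

Detaching the captured tensors and moving them off the compute device keeps the hook from holding on to the autograd graph, which is one way to keep tracing cheap next to generation.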

2. πŸ›‘οΈ The OpenMetadata Governance Layer

We map neural structure into familiar catalog primitives so the model can be governed like enterprise data.

| Neural Concept | OpenMetadata Representation |
| --- | --- |
| Model | Database |
| Transformer layers | Tables |
| Attention heads | Columns |

That mapping turns thought paths into lineage edges, making it possible to tag specific heads as DEFECTIVE and feed that signal back into runtime control.
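To make the mapping concrete, here is a small sketch of how a head could be addressed as a catalog column and how a tagged column could be turned back into a (layer, head) pair. It assumes the service, schema, table, and column names shown in the OpenMetadata Topology section and uses the model name as the database name; `HeadRef` and `parse_head` are hypothetical helpers, not part of the project or of the OpenMetadata SDK.

```python
from dataclasses import dataclass

SERVICE = "Synapse_Neural_Service"  # service name from the topology section
SCHEMA = "Transformer_Graph"        # schema name from the topology section


@dataclass(frozen=True)
class HeadRef:
    model: str  # assumption: the database is named after the model
    layer: int
    head: int

    @property
    def table_fqn(self) -> str:
        # service.database.schema.table
        return f"{SERVICE}.{self.model}.{SCHEMA}.Layer_{self.layer}"

    @property
    def column_fqn(self) -> str:
        # service.database.schema.table.column
        return f"{self.table_fqn}.Head_{self.head}"


def parse_head(column_fqn: str) -> tuple[int, int]:
    """Recover (layer, head) from a column name built by HeadRef."""
    *_, table, column = column_fqn.split(".")
    return int(table.removeprefix("Layer_")), int(column.removeprefix("Head_"))


ref = HeadRef(model="gpt2", layer=3, head=5)
print(ref.column_fqn)
print(parse_head(ref.column_fqn))
```

The simple split on dots only works for names without dots in them; a model name such as `Qwen/Qwen2.5-1.5B-Instruct` would need quoting or sanitizing first.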

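3. ⚡ Causal Ablation

A head that is highly active during a hallucination is only correlated with it, so the backend runs an automated ablation sweep to isolate the circuit that actually causes the behavior. The sweep is $O(n^2)$ in the number of candidate heads $n$ when it tests pairs of heads as well as single heads.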

  1. Identify a suspicious head
  2. Zero its projection output and re-run the generation
  3. Measure whether the hallucination rate drops
  4. Promote the confirmed culprit to DEFECTIVE in OpenMetadata and mask it in the FastAPI proxy on future requests

The result is a debugging loop that is:

  • Targeted rather than speculative
  • Deterministic rather than anecdotal
  • Cheap to operate because it avoids retraining
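The loop above can be sketched in a few lines. This is an illustration under stated assumptions, not the backend's implementation: `score` is a hypothetical callback that re-runs generation with a set of heads masked and returns a hallucination count, and `toy_score` is a made-up stand-in for a real model. The parameter names `top_k_heads` and `max_pair_sweeps` are borrowed from the dashboard sliders.

```python
from itertools import combinations


def discover_circuit(score, heads, top_k_heads=5, max_pair_sweeps=10):
    """Rank heads by how much masking them lowers `score`, then try pairs."""
    baseline = score(frozenset())
    # Single-head sweep: one re-run per head.
    drops = {h: baseline - score(frozenset({h})) for h in heads}
    top = sorted(drops, key=drops.get, reverse=True)[:top_k_heads]
    if not top:
        return frozenset(), 0
    best_set, best_drop = frozenset({top[0]}), drops[top[0]]
    # Pairwise sweep over the short list; this is the quadratic part.
    for swept, pair in enumerate(combinations(top, 2)):
        if swept >= max_pair_sweeps:
            break
        drop = baseline - score(frozenset(pair))
        if drop > best_drop:
            best_set, best_drop = frozenset(pair), drop
    return best_set, best_drop


def toy_score(masked):
    """Made-up metric: hallucinations per 100 generations."""
    hallucinations = 90
    if (1, 2) in masked:
        hallucinations -= 30
    if {(0, 1), (1, 2)} <= masked:  # the two heads only matter together
        hallucinations -= 40
    return hallucinations


heads = [(layer, head) for layer in range(2) for head in range(3)]
culprits, drop = discover_circuit(toy_score, heads)
print(sorted(culprits), drop)
```

In the toy setup, head (0, 1) has no effect on its own, so only the pair sweep reveals that it is part of the circuit. With 20 short-listed heads there are 190 distinct pairs, which lines up with the upper bound of the `max_pair_sweeps` slider.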


The Problem

LLMs are powerful but opaque. Current observability tools stop at prompts, tokens, latency, and logs. They don't answer:

  • Which layers and heads were most active for this response?
  • Can we trace a "thought path" through the network?
  • Can governance tools intervene on specific neural components?

The Solution

Synapse-Graph repurposes OpenMetadata as a governance and lineage system for transformer internals:

  • Model → Database
  • Transformer layers → Tables
  • Attention heads → Columns
  • High-activation paths → Lineage edges
  • DEFECTIVE tag → Runtime control signal that masks a head during next generation
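One way to picture the runtime side of that last mapping is a small in-memory store that the proxy consults before each generation. The architecture diagram names a `HeadMaskStore`, but the class below is only a guess at its shape: the split into local and synced masks and every method name are assumptions made for illustration.

```python
import threading

Head = tuple[int, int]  # (layer_idx, head_idx)


class HeadMaskStore:
    """Hypothetical thread-safe record of heads to mask on the next generation."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._local: set[Head] = set()   # set directly by the operator
        self._synced: set[Head] = set()  # mirrored from DEFECTIVE tags

    def set_local(self, layer: int, head: int, masked: bool = True) -> None:
        with self._lock:
            if masked:
                self._local.add((layer, head))
            else:
                self._local.discard((layer, head))

    def clear_local(self) -> None:
        with self._lock:
            self._local.clear()

    def replace_synced(self, heads) -> None:
        """Swap in the full set of heads currently tagged DEFECTIVE."""
        with self._lock:
            self._synced = set(heads)

    def active(self) -> frozenset[Head]:
        """Heads to mask right now: local and synced masks combined."""
        with self._lock:
            return frozenset(self._local | self._synced)


store = HeadMaskStore()
store.set_local(3, 5)
store.replace_synced({(7, 1)})
store.clear_local()
print(sorted(store.active()))
```

Keeping the two sets separate means clearing local masks cannot silently undo a quarantine that was decided in the catalog.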

The Impact

Turns model internals into inspectable infrastructure with familiar data-platform primitives.


Architecture

flowchart LR
    subgraph Dashboard["🎨 Operator Dashboard"]
        D["Next.js<br/>React<br/>@xyflow/react"]
    end

    subgraph Proxy["⚡ Neural Proxy (FastAPI)"]
        P["Generation + Tracing<br/>Governance + SSE<br/>HeadMaskStore"]
    end

    subgraph Generation["🔥 Generation"]
        O["Ollama<br/>(Preferred)"]
    end

    subgraph Tracing["🔍 Tracing"]
        T["HF Tracer<br/>PyTorch hooks"]
    end

    subgraph Governance["🛡️ Governance"]
        OM["OpenMetadata<br/>Topology + Lineage<br/>Tags → Masks"]
        DEF["⛔ DEFECTIVE<br/>→ Runtime Mask"]
    end

    D -->|"REST + SSE"| P
    P -->|"Generation"| O
    P -->|"Tracing"| T
    P -->|"Topology<br/>Lineage<br/>Tags"| OM
    OM -->|"tag"| DEF

Interactive diagram: Click here for full interactive architecture


Backend Details

backend/app/main.py: FastAPI Application

REST Endpoints:

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /api/v1/state | GET | Current runtime state |
| /api/v1/generate | POST | Full generation response |
| /api/v1/generate/stream | POST | SSE with trace steps |
| /api/v1/autopsy/discover_circuit | POST | Circuit discovery |
| /api/v1/autopsy/discover_circuit/stream | POST | SSE discovery progress |
| /api/v1/autopsy/causal | POST | Causal autopsy |
| /api/v1/openmetadata/bootstrap | POST | Bootstrap catalog |
| /api/v1/openmetadata/sync-defects | POST | Sync tags to masks |
| /api/v1/openmetadata/quarantine | POST | Quarantine heads |
| /api/v1/webhooks/openmetadata | POST | Webhook handler |
| /api/v1/governance/local-mask | POST | Set head mask |
| /api/v1/governance/clear-local-masks | POST | Clear masks |
| /api/v1/hf/preload | POST | Load HF tracer |
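The two streaming endpoints return server-sent events. A client-side sketch for decoding such a stream is shown below; it follows the standard `data:` line framing and assumes each event carries a JSON object, which the README does not confirm. The `step` and `layer` fields in the sample are invented for the demonstration.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per server-sent event."""
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # A single leading space after the colon is not part of the value.
            data.append(line[len("data:"):].removeprefix(" "))
        elif line == "" and data:
            # A blank line terminates the event.
            yield json.loads("\n".join(data))
            data = []
        # Comment lines (starting with ":") and other fields are ignored.


sample = [
    'data: {"step": 1, "layer": 0}',
    "",
    ": keep-alive",
    'data: {"step": 2, "layer": 1}',
    "",
]
events = list(iter_sse_events(sample))
print(events)
```

With `httpx`, which is already in the backend dependencies, the same function could consume `response.iter_lines()` from a streamed POST request; the request body expected by the endpoint is not documented here and would need to be checked against the backend code.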

Execution Modes:

  • AUTO: Prefer Ollama if available
  • FAST: Ollama + parallel HF tracing
  • FAITHFUL: HF only, with inline tracing
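The three modes could resolve to a concrete plan roughly as follows. This sketch fills two gaps with assumptions: that AUTO behaves like FAST when Ollama is reachable and like FAITHFUL otherwise, and that FAST fails rather than falling back when Ollama is missing. `ExecutionMode`, `Plan`, and `resolve` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionMode(str, Enum):
    AUTO = "AUTO"
    FAST = "FAST"
    FAITHFUL = "FAITHFUL"


@dataclass(frozen=True)
class Plan:
    generator: str  # "ollama" or "hf"
    tracing: str    # "parallel" or "inline"


def resolve(mode: ExecutionMode, ollama_available: bool) -> Plan:
    if mode is ExecutionMode.AUTO:
        # Assumption: AUTO picks FAST when Ollama is reachable, else FAITHFUL.
        mode = ExecutionMode.FAST if ollama_available else ExecutionMode.FAITHFUL
    if mode is ExecutionMode.FAST:
        if not ollama_available:
            # Assumption: FAST does not silently fall back.
            raise RuntimeError("FAST mode needs a reachable Ollama server")
        return Plan(generator="ollama", tracing="parallel")
    return Plan(generator="hf", tracing="inline")


print(resolve(ExecutionMode.AUTO, ollama_available=True))
print(resolve(ExecutionMode.AUTO, ollama_available=False))
```

The trade-off the modes express is speed against fidelity: in FAST the trace comes from a separate Hugging Face model running next to Ollama, while in FAITHFUL the traced model is the one that produced the text.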

Hook-Based Attention Capture:

def _register_attention_hooks(model, layer_idx, hook_handles):
    """Register a forward hook (register_forward_hook) on attention modules.

    Captures: attention_weights, projection output.
    """
    ...  # body elided in this README


def _make_projection_mask_hook(layer_idx, head_idx):
    """Build a hook that applies masking to the output projection.

    Zeroes the masked head's hidden states.
    """
    ...  # body elided in this README

Two-Level Masking:

  1. Attention tensor masking
  2. Projection masking (hidden states)
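The two levels can be illustrated without any framework. The sketch below reflects one reading of the list above, namely that level 1 zeroes a head's attention weights and level 2 zeroes that head's slice of the concatenated hidden state before the output projection; the function names are hypothetical and plain lists stand in for tensors.

```python
def head_slice(head_idx: int, hidden_size: int, num_heads: int) -> slice:
    """Columns of the concatenated head outputs that belong to one head."""
    head_dim = hidden_size // num_heads
    return slice(head_idx * head_dim, (head_idx + 1) * head_dim)


def mask_attention(weights, masked_heads):
    """Level 1: zero a head's attention weights. Layout: weights[head][query][key]."""
    return [
        [[0.0] * len(row) for row in head] if h in masked_heads else head
        for h, head in enumerate(weights)
    ]


def mask_projection_input(hidden, masked_heads, num_heads):
    """Level 2: zero a head's slice of one token's hidden state."""
    out = list(hidden)
    for h in masked_heads:
        s = head_slice(h, len(hidden), num_heads)
        out[s] = [0.0] * (s.stop - s.start)
    return out


print(head_slice(1, hidden_size=768, num_heads=12))
print(mask_projection_input([1.0] * 8, masked_heads={2}, num_heads=4))
```

For gpt2, with 12 heads and a hidden size of 768, each head owns 64 columns, so head 1 covers columns 64 to 127.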

Default Models:

  • Ollama: qwen2.5:3b-instruct
  • HuggingFace: Qwen/Qwen2.5-1.5B-Instruct
  • Dashboard default: gpt2 (12 layers × 12 heads = 144 heads)

OpenMetadata Topology

erDiagram
    Service ||--o| Database : "Synapse_Neural_Service"
    Database ||--o| Schema : "Transformer_Graph"
    Schema ||--o| Table : "Layer_N (per layer)"
    Schema ||--o| Table : "Prompt_Ingress"
    Schema ||--o| Table : "Response_Egress"
    Table ||--o| Column : "Head_N (per head)"

    Service {
        string name "Synapse_Neural_Service"
        string type "mysql"
    }

    Column {
        string name "Head_N"
        string type "FLOAT"
        string tag "DEFECTIVE/QUARANTINED"
    }

Classification & Tags:

  • Classification: SynapseQuarantine
  • Tag: DEFECTIVE (color: #39FF14)

Lineage: Prompt_Ingress → Layer_1 → ... → Layer_N → Response_Egress
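The lineage chain is a sequence of table-to-table edges. A short sketch of generating those edges for a model with a given number of layers follows; `lineage_edges` is a hypothetical helper, and it only produces name pairs rather than calling the OpenMetadata API.

```python
def lineage_edges(num_layers: int) -> list[tuple[str, str]]:
    """Return (upstream, downstream) table names for the full chain."""
    nodes = [
        "Prompt_Ingress",
        *(f"Layer_{i}" for i in range(1, num_layers + 1)),
        "Response_Egress",
    ]
    # Pair each node with its successor.
    return list(zip(nodes, nodes[1:]))


for upstream, downstream in lineage_edges(2):
    print(f"{upstream} -> {downstream}")
```

A model with N layers yields N + 1 edges, so the 12-layer gpt2 default would produce 13.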


Frontend Details

Dashboard Components

  • frontend/components/synapse-dashboard.tsx: Main dashboard with metrics, discovery panel, governance controls
  • frontend/components/synapse-graph.tsx: @xyflow/react graph visualization
  • frontend/components/activation-chart.tsx: Per-layer, per-head activation charts
  • frontend/components/console-log.tsx: Real-time log stream display

Dashboard Features

Metric Cards:

  • Generation Backend (Ollama live / HF inline)
  • Trace Fidelity (Exact / Proxy evidence)
  • Lineage Depth (active hops)
  • Masked Heads count

Causal Discovery Panel:

  • Target token input (hallucination to remove)
  • top_k_heads slider (1-20)
  • max_pair_sweeps slider (0-190)
  • Run Discovery → View Overlay → Quarantine buttons

Governance Panel:

  • Quarantine Top Head
  • Clear Local Masks
  • Sync Defects button

Tech Stack

Backend (backend/pyproject.toml)

[project]
requires-python = ">=3.11,<3.13"

dependencies = [
    "fastapi>=0.115.0",
    "torch>=2.4.0",
    "transformers>=4.46.0",
    "openmetadata-ingestion>=1.12.0",
    "httpx>=0.28.0",
    "pydantic-settings>=2.7.0",
    "uvicorn[standard]>=0.32.0",
    "accelerate>=1.1.0",
    "cachetools>=5.3.0",
]

Frontend (frontend/package.json)

{
  "dependencies": {
    "next": "^15.2.0",
    "react": "^19.0.0",
    "@xyflow/react": "^12.4.4",
    "recharts": "^2.15.0",
    "lucide-react": "^0.468.0"
  },
  "devDependencies": {
    "tailwindcss": "^3.4.16",
    "typescript": "^5.7.2"
  }
}

Quickstart

# Backend
python3.11 -m venv .venv && source .venv/bin/activate
pip install -e ./backend
cp backend/.env.example backend/.env
cd backend && python -m uvicorn app.main:app --reload --port 8000

# Frontend (new terminal)
cd frontend && npm install && npm run dev

Dashboard: http://localhost:3000


Demo Workflow

flowchart TD
    A["1. Start<br/>Boot dashboard"] --> B["2. Trace<br/>Submit prompt"]
    B --> C["3. Discover<br/>Run circuit discovery"]
    C --> D["4. Quarantine<br/>Push DEFECTIVE tags"]
    D --> E["5. Verify<br/>Re-run prompt<br/>Show masked heads"]
    E --> A

  1. Start: Boot the dashboard and verify "Ollama live" or "HF fallback"
  2. Trace: Submit a prompt → watch the synapse graph light up
  3. Discover: Enter the hallucination token → run circuit discovery
  4. Quarantine: Click "Quarantine" → push DEFECTIVE tags to OpenMetadata
  5. Verify: Re-run the prompt → confirm the masked heads count increases

Repository Layout

Synapse-Graph/
├── backend/
│   ├── app/
│   │   ├── main.py          # FastAPI + endpoints
│   │   ├── inference.py     # Generation + tracing
│   │   └── om_client.py     # OpenMetadata client
│   └── tests/
│       ├── test_quarantine.py
│       └── test_discover_quarantine_integration.py
├── frontend/
│   ├── app/                 # Next.js app router
│   ├── components/          # Dashboard, graph, charts
│   └── lib/                 # API client
├── architecture.html        # Interactive architecture diagram
└── first_frame.html         # GitHub Pages presentation

License

MIT


Connect

Vicky Kumar

algsoch

LinkedIn GitHub Primary GitHub Secondary

npdimagine@gmail.com · +91 8383848219

Project: GitHub Repo · Live Demo · YouTube Demo

Built for AI Interpretability

About

An AI autopsy engine that repurposes OpenMetadata for live neural governance, allowing operators to trace, discover, and surgically quarantine hallucinating attention heads at runtime.
