airblackbox/air-trust

AIR Trust

The flight recorder for autonomous AI agents. Record, replay, enforce, audit.

One proxy swap. Complete coverage. Runs locally.

from openai import OpenAI

# Before
client = OpenAI(base_url="https://api.openai.com/v1")

# After — everything else in your code stays identical
client = OpenAI(
    base_url="http://localhost:8080/v1",
    default_headers={"X-Gateway-Key": "your-key"}
)

Every LLM call now generates a signed, tamper-evident, replayable audit record. No SDK changes. No refactoring. Audit records are written asynchronously, so record-keeping stays off the request path.

What You Get

Audit chain — every call produces an HMAC-SHA256 chained .air.json record, written asynchronously. Tamper with one record and every record after it breaks.

Quantum-safe signing — the chain is signed with ML-DSA-65 (FIPS 204 / Dilithium3). Keys are generated locally and never leave your machine. Post-quantum secure today.

Evidence bundle — one command packages the audit chain, scan results, and ML-DSA-65 signatures into a self-verifying .air-evidence ZIP. An auditor runs python verify.py and gets PASS/FAIL in two seconds. No pip install needed on their end.

PII and injection scanning — 20 weighted patterns across 5 attack categories detected before the prompt reaches the model. Configurable sensitivity. Auto-blocking.

EU AI Act gap analysis — 48 checks across Articles 9, 10, 11, 12, 14, 15. Maps to ISO 42001, NIST AI RMF, and Colorado SB 24-205. One scan, four frameworks, one report.

Replay — load any past episode from the audit chain, verify the HMAC signature, and replay every step with timestamps. Incident reconstruction without guesswork.

Framework trust layers — drop-in wrappers for LangChain, CrewAI, OpenAI Agents SDK, Anthropic, AutoGen, Google ADK, and Haystack. Same audit chain, native integration.
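
The audit-chain idea described above can be sketched with the Python standard library. This is a minimal illustration, not AIR's actual record format: the field names (`payload`, `prev`, `mac`), the all-zero genesis value, and the helper names are assumptions made for the example.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder "previous MAC" for the first record


def _mac(key: bytes, prev_mac: str, payload: dict) -> str:
    """HMAC-SHA256 over the previous MAC plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, (prev_mac + body).encode("utf-8"), hashlib.sha256).hexdigest()


def append_record(chain: list, key: bytes, payload: dict) -> dict:
    """Append a record whose MAC also covers the previous record's MAC."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    record = {"payload": payload, "prev": prev_mac, "mac": _mac(key, prev_mac, payload)}
    chain.append(record)
    return record


def first_broken_index(chain: list, key: bytes):
    """Return the index of the first record that fails verification, or None."""
    prev_mac = GENESIS
    for i, record in enumerate(chain):
        expected = _mac(key, prev_mac, record["payload"])
        if record["prev"] != prev_mac or not hmac.compare_digest(record["mac"], expected):
            return i
        prev_mac = record["mac"]
    return None
```

Because each MAC covers the one before it, editing any record changes what every later record should link to. Verification therefore reports the first failing index, and nothing from that point on can be trusted.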

Quickstart

pip install air-blackbox

# Run your first gap analysis — works on any Python AI project
air-blackbox comply --scan . -v

# Find undeclared model calls hiding in helpers and utilities
air-blackbox discover

# Replay any recorded episode
air-blackbox replay

# Generate a signed evidence package for audit or regulator review
air-blackbox export
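
To give a feel for what a self-verifying bundle involves, here is a hedged sketch that packs files into a ZIP together with a manifest of SHA-256 digests and then re-checks them. The file names, the `manifest.json` layout, and both function names are hypothetical. The real `.air-evidence` format is not documented here, and this sketch leaves out the ML-DSA-65 signature that would be needed to protect the manifest itself.

```python
import hashlib
import io
import json
import zipfile


def build_bundle(files: dict) -> bytes:
    """Pack files (name -> bytes) plus a manifest of SHA-256 digests into a ZIP."""
    manifest = {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
        zf.writestr("manifest.json", json.dumps(manifest, sort_keys=True))
    return buf.getvalue()


def verify_bundle(bundle: bytes) -> bool:
    """Return True only if the file set and every digest match the manifest."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        names = set(zf.namelist()) - {"manifest.json"}
        if names != set(manifest):
            return False  # a file was added or removed
        return all(
            hashlib.sha256(zf.read(name)).hexdigest() == digest
            for name, digest in manifest.items()
        )
```

Using only the standard library mirrors the "no pip install" property described earlier: a verifier script in this style runs on any stock Python installation.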

Full stack (Gateway + Episode Store + Policy Engine + observability):

git clone https://github.com/airblackbox/air-platform.git
cd air-platform
cp .env.example .env      # add OPENAI_API_KEY
make up                   # running in ~8 seconds
  • Traces → localhost:16686 (Jaeger)
  • Metrics → localhost:9091 (Prometheus)
  • Episodes → localhost:8081 (Episode Store API)

How It Fits Your Stack

Your Agent
    │
    ▼
AIR Gateway          ← swap base_url here
    │
    ├── PII + injection scan      (before prompt reaches model)
    ├── HMAC audit record         (async, zero latency impact)
    ├── ML-DSA-65 signing         (keys never leave your machine)
    │
    ▼
LLM Provider         ← OpenAI / Anthropic / Azure / local
    │
    ▼
AIR Record           ← tamper-evident .air.json
    │
    ▼
Evidence Bundle      ← self-verifying .air-evidence ZIP

Works with any OpenAI-compatible API. Same format, same integration, regardless of provider.
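
The scan step in the diagram can be pictured as weighted pattern matching with a blocking threshold. The sketch below is illustrative only: the four rules, their weights, the `Rule` and `scan` names, and the max-weight scoring are assumptions, not the project's actual 20-pattern rule set.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    category: str
    regex: re.Pattern
    weight: float


# Hypothetical rules for illustration; a real rule set would be far broader.
RULES = [
    Rule("pii", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 0.9),  # US SSN-like number
    Rule("pii", re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), 0.4),  # email address
    Rule("injection", re.compile(r"ignore (all )?previous instructions", re.I), 0.8),
    Rule("injection", re.compile(r"reveal (your )?system prompt", re.I), 0.7),
]


def scan(prompt: str, threshold: float = 0.75) -> dict:
    """Score a prompt and decide whether to block it before forwarding."""
    hits = [rule for rule in RULES if rule.regex.search(prompt)]
    # Assumed scoring: the strongest matching rule decides the score.
    score = max((rule.weight for rule in hits), default=0.0)
    return {
        "score": score,
        "categories": sorted({rule.category for rule in hits}),
        "blocked": score >= threshold,
    }
```

Lowering `threshold` corresponds to a stricter sensitivity setting: with `threshold=0.3`, even a lone email address would be blocked.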

Why Not Just Log Everything?

You probably already have logging. The problems logging doesn't solve:

Tamper-evidence — anyone with write access to your log store can alter a record. HMAC chains make alteration detectable. ML-DSA-65 signatures prove who signed and when.

Prompt reconstruction — most logging captures responses but not the full prompt context, tool calls, and intermediate reasoning. AIR records the complete episode.

Compliance structure — EU AI Act Article 12 requires tamper-evident logs with specific retention and audit access guarantees. Raw logs don't satisfy that. Evidence bundles do.

Secrets leaking into traces — every team that builds their own logging eventually discovers credentials in their observability backend. AIR strips and vault-encrypts API keys before writing any record.
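
As a rough picture of the stripping step, the following sketch redacts key-like strings and sensitive headers from a record before it is written. The regexes, the header list, and the `redact` helper are assumptions for illustration. Real key formats vary by provider, and the vault-encryption half of the feature is not shown.

```python
import re

# Hypothetical patterns; real credential formats differ across providers.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),  # OpenAI-style API key
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/-]+=*"),  # bearer token
]
SENSITIVE_KEYS = {"authorization", "x-gateway-key", "api-key"}


def redact(value):
    """Recursively return a copy of value with secrets replaced."""
    if isinstance(value, str):
        for pattern in SECRET_PATTERNS:
            value = pattern.sub("[REDACTED]", value)
        return value
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    return value  # numbers, booleans, None pass through unchanged
```

The function builds new containers rather than mutating its input, so the live request can still carry the real credentials upstream while only the redacted copy is persisted.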

Framework Trust Layers

# LangChain
pip install air-langchain-trust

# CrewAI
pip install air-crewai-trust

# OpenAI Agents SDK (included in gateway)
from air_blackbox.trust import OpenAITrustLayer

Each trust layer produces the same HMAC audit chain as the gateway, natively, inside the framework's execution loop. Pick the integration that fits your stack.

Validated By

  • Julian Risch (deepset) — public validation on LinkedIn and GitHub issue #10810
  • Piero Molino (Ludwig maintainer) — merged EU AI Act compliance changes driven by AIR scan results
  • arXiv AEGIS (March 2026) — independent researchers published the identical interception-layer architecture for AI agent governance
  • McKinsey State of AI Trust 2026 — trust infrastructure named as the critical agentic AI category

Contributing

See CONTRIBUTING.md.

False positive on a compliance check? Correct it — your correction flows into training data for the fine-tuned scanner model. The scanner gets smarter with every fix your team submits.

Good first issues: labeled good first issue — mostly new compliance checks and framework integrations.

License

Apache-2.0 — airblackbox.ai

This is not a certified compliance test. It is a starting point to identify potential gaps.


If this helps you prepare for EU AI Act enforcement, star the repo — it helps other teams find it.