Deterministic, fail-closed runtime authorization for AI agents.
ShadowAudit sits between AI agents and their tools to enforce deterministic authorization before execution happens.
Documentation: https://shadowauditlabs.github.io/shadowaudit-python/
It is infrastructure for governing agent tool use at runtime: closer to IAM, Open Policy Agent, admission controllers, and API gateways than prompt guardrails or moderation.
```
Agent → ShadowAudit → Tool
              │
              ├─ Allow
              ├─ Deny
              └─ Require approval
```
Example: a finance agent can read invoices freely, but a payments.transfer call over $1,000 is paused for approval, and a shell command like rm -rf /var/lib/postgresql is denied before it runs.
Prompts and moderation can shape model behavior, but they are not deterministic authorization systems.
They depend on the model, are probabilistic by nature, and usually run before the final tool execution point.
ShadowAudit enforces explicit runtime policy at the point where execution actually happens.
ShadowAudit wraps agent tools with a runtime gate.
When an agent attempts to use a tool, ShadowAudit evaluates the request against explicit policy-as-code. The gate returns one of three outcomes:
- allow: execute the tool
- deny: block execution before the tool runs
- require approval: send the request to a human approval workflow
```mermaid
graph LR
    A[Agent] --> B[ShadowAudit Gate]
    B --> C{Policy Decision}
    C -->|Allow| D[Tool Executes]
    C -->|Deny| E[Blocked]
    C -->|Approval| F[Approval Queue]
    C -.-> G[(Audit Log)]
```
Enforcement is deterministic and fail-closed. If a tool call is not authorized, it does not execute.
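The decision model above can be sketched in a few lines. This is a hypothetical illustration, not the actual ShadowAudit engine: rules are checked deny-first, then approval, then allow, and any request that matches no rule falls through to deny (fail-closed). Treating a below-threshold match on an approval rule as an allow is one possible semantics, assumed here for illustration.

```python
# Hypothetical sketch of a deterministic, fail-closed gate decision.
# Rule shapes mirror the policy examples in this README, but this is
# not ShadowAudit's real evaluation code.

def decide(capability: str, payload: dict, policy: dict) -> str:
    """Return 'deny', 'require_approval', or 'allow' for a tool call."""
    for rule in policy.get("deny", []):
        if rule["capability"] == capability:
            return "deny"
    for rule in policy.get("require_approval", []):
        if rule["capability"] == capability:
            threshold = rule.get("amount_gt")
            if threshold is None or payload.get("amount", 0) > threshold:
                return "require_approval"
            return "allow"  # listed capability, condition not triggered
    for rule in policy.get("allow", []):
        if rule["capability"] == capability:
            return "allow"
    return "deny"  # fail-closed: unmatched requests never execute

policy = {
    "deny": [{"capability": "filesystem.delete"}],
    "require_approval": [{"capability": "payments.transfer", "amount_gt": 1000}],
    "allow": [{"capability": "filesystem.read"}],
}

print(decide("filesystem.read", {}, policy))                  # allow
print(decide("payments.transfer", {"amount": 5000}, policy))  # require_approval
print(decide("shell.execute", {}, policy))                    # deny (no rule matched)
```

Because the decision depends only on the request and the policy, replaying the same inputs always reproduces the same outcome.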
```
pip install shadowaudit
```

```python
from shadowaudit import ShadowAuditTool
from langchain.tools import ShellTool

safe_tool = ShadowAuditTool(
    tool=ShellTool(),
    agent_id="ops-agent",
    capability="shell.execute",
    policy_path="policies/production_shell_policy.yaml",
)
```

Example policy:
```yaml
deny:
  - capability: filesystem.delete
  - capability: shell.root_access

require_approval:
  - capability: payments.transfer
    amount_gt: 1000

allow:
  - capability: filesystem.read
```

Supported integrations:
- LangChain
- LangGraph
- CrewAI
- OpenAI Agents SDK
- MCP
- Direct Python APIs
Use the core gate directly when you want ShadowAudit inside your own runtime, framework adapter, MCP gateway, or infrastructure workflow.
```python
from shadowaudit.core.gate import Gate

gate = Gate()

result = gate.evaluate(
    agent_id="ops-agent-1",
    task_context="shell",
    risk_category="shell_execution",
    capability="shell.execute",
    policy_path="policies/production_shell_policy.yaml",
    payload={"command": "rm -rf /var/lib/postgresql"},
)

if not result.passed:
    print("BLOCKED")
    print("Capability: shell.execute")
    print("Decision: denied")
    print(f"Reason: {result.reason}")
```

Expected output:
```
BLOCKED
Capability: shell.execute
Decision: denied
Reason: destructive_command_detected
```
ShadowAudit automatically extracts numeric fields such as amount, total, and value from tool arguments so policies can evaluate conditions like amount_gt.
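One way this extraction could work is a recursive walk over the tool arguments, collecting known numeric field names. This is a hypothetical sketch, not ShadowAudit's actual extractor; only the field names (`amount`, `total`, `value`) come from the description above.

```python
# Hypothetical sketch of numeric-field extraction from tool arguments,
# so policy conditions like amount_gt can match nested payloads.
# ShadowAudit's real extractor may behave differently.

NUMERIC_FIELDS = {"amount", "total", "value"}

def extract_numeric_fields(payload: dict) -> dict:
    """Recursively collect known numeric fields from a payload."""
    found = {}
    for key, val in payload.items():
        if key in NUMERIC_FIELDS and isinstance(val, (int, float)):
            found.setdefault(key, val)
        elif isinstance(val, dict):
            for k, v in extract_numeric_fields(val).items():
                found.setdefault(k, v)  # outer occurrences win
    return found

args = {"invoice": {"total": 2500.0}, "memo": "Q3 vendor payment"}
print(extract_numeric_fields(args))  # {'total': 2500.0}
```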
ShadowAudit is not a prompt wrapper, moderation layer, or generic observability SDK. It is runtime authorization infrastructure for agent tool execution.
| Capability | What it means |
|---|---|
| Deterministic runtime enforcement | The same request and policy produce the same decision. |
| Fail-closed execution | Unauthorized tool calls are blocked before execution. |
| No LLM dependency in the enforcement path | Policy evaluation does not depend on a model call. |
| Policy-as-code | Authorization rules live in explicit, reviewable policy files. |
| Offline-first enforcement | The gate can run without cloud services or network access. |
| Approval workflows | Sensitive actions can pause for human approval instead of being blindly allowed or denied. |
| Replayable execution trails | Decisions can be replayed for debugging, incident response, and audit review. |
| Tamper-evident audit chain | Runtime decisions are stored in a hash-chained audit log. |
| Cryptographic verification | Audit logs can be verified and optionally signed with Ed25519. |
The goal is simple: an agent should not be able to execute a sensitive action unless a deterministic runtime policy allows it.
Every runtime decision can be recorded in an append-only SQLite audit log.
Audit entries are:
- SHA-256 hash chained
- replayable
- tamper-evident
- optionally signed with Ed25519
Modify any row and the verification chain breaks.
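The chaining idea can be sketched as follows: each stored hash commits to both the entry and the previous hash, so editing any row invalidates everything after it. This is an illustrative sketch; ShadowAudit's actual on-disk schema and hashing rules may differ.

```python
import hashlib
import json

# Hypothetical sketch of a SHA-256 hash chain over audit entries.

def entry_hash(entry: dict, previous_hash: str) -> str:
    """Hash commits to the entry's content and its predecessor."""
    material = json.dumps(entry, sort_keys=True) + previous_hash
    return hashlib.sha256(material.encode()).hexdigest()

def verify_chain(entries) -> bool:
    """entries is a list of (stored_hash, entry) pairs, oldest first."""
    prev = "genesis"
    for stored_hash, entry in entries:
        if entry_hash(entry, prev) != stored_hash:
            return False  # this row, or one before it, was modified
        prev = stored_hash
    return True

log, prev = [], "genesis"
for decision in ("allow", "deny"):
    entry = {"capability": "shell.execute", "decision": decision}
    h = entry_hash(entry, prev)
    log.append((h, entry))
    prev = h

print(verify_chain(log))            # True
log[1][1]["decision"] = "allow"     # tamper with a stored row
print(verify_chain(log))            # False: the chain no longer verifies
```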
```
shadowaudit verify --audit-log audit.db
```

Example audit entry:
```json
{
  "timestamp": 1715492534.123,
  "agent_id": "finance-agent",
  "capability": "payments.transfer",
  "decision": "require_approval",
  "payload_hash": "a8f5f167f44f...",
  "previous_hash": "9ab12de...",
  "signature": "ed25519:..."
}
```

Forensic workflows:

```
shadowaudit replay trace.jsonl
shadowaudit trace <entry_hash>
shadowaudit logs --audit-log audit.db
```

Replay output can show triggered rules, capability mapping, enforcement decisions, risk deltas, and the final decision path.
Policies can require approval for sensitive capabilities.
```yaml
require_approval:
  - capability: production.database.write
  - capability: payments.transfer
    amount_gt: 1000
```

```
shadowaudit pending-approvals
shadowaudit approve req-1234
shadowaudit reject req-1234
```

Every approval or rejection is recorded as part of the audit trail.
Use observe mode to roll out policies before enforcing them.
```python
from shadowaudit.core.gate import Gate

gate = Gate(mode="observe")
```

Observe mode logs decisions without blocking execution. This lets teams see what would have been denied before switching to fail-closed enforcement.
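The difference between the two modes can be sketched as a dispatch on the gate's mode. This is a hypothetical illustration of the behavior described above, not the real `Gate` internals.

```python
# Hypothetical sketch contrasting observe and enforce modes.
# Not ShadowAudit's actual implementation.

def run_gated(decision: str, execute, mode: str = "enforce"):
    """Observe mode records the decision but never blocks;
    enforce mode executes only on 'allow' (fail-closed)."""
    if mode == "observe":
        print(f"[observe] would have returned: {decision}")
        return execute()
    if decision == "allow":
        return execute()
    return None  # blocked before the tool runs

print(run_gated("deny", lambda: "tool ran", mode="observe"))  # tool ran
print(run_gated("deny", lambda: "tool ran", mode="enforce"))  # None
```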
ShadowAudit can scan codebases for ungated agent tools and fail CI when high-risk tools are not protected.
```
shadowaudit check ./src --fail-on-ungated
```

Replay traces against policy changes before rollout:
```
shadowaudit simulate --trace-file session.jsonl --taxonomy alternative.yaml --compare
```

This supports:
- governance regression testing
- enforcement simulation
- policy diff analysis
- safer rollout workflows
The core primitive is runtime authorization. ShadowAudit also includes deeper infrastructure for complex agent systems.
Put ShadowAudit in front of MCP tools.
```python
from shadowaudit.mcp.gateway import MCPGatewayServer

gateway = MCPGatewayServer(
    upstream_command=[
        "python",
        "-m",
        "mcp_server_filesystem",
        "/tmp",
    ],
    policy_path="policies/mcp_restrictions.yaml",
)

gateway.run()
```

FlowTracer tracks how data moves across agents and preserves trust boundaries through chained workflows.
```python
from shadowaudit import FlowTracer, TrustLevel

# scraped_data and parsed_data stand in for values produced
# elsewhere in the workflow.
tracer = FlowTracer()

tracer.record_output(
    "web-scraper",
    scraped_data,
    trust=TrustLevel.UNTRUSTED,
)

tracer.record_flow(
    "web-scraper",
    "payment-agent",
    parsed_data,
)

annotation = tracer.annotate(
    receiving_agent="payment-agent",
    source_agents=["web-scraper"],
    declared_trust=TrustLevel.SYSTEM,
)

print(annotation.effective_trust)
```

This is useful for:
- multi-agent systems
- autonomous workflows
- MCP ecosystems
- chained execution graphs
FlowTracer is an observability primitive designed to integrate with dynamic risk threshold plugins.
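One natural rule for the `effective_trust` shown above is that data inherits the weakest trust level along its flow path. Both the `TrustLevel` values and the min-rule below are assumptions made for illustration, not ShadowAudit's documented semantics.

```python
from enum import IntEnum

# Hypothetical sketch of effective-trust computation. The enum values
# and the min-rule are illustrative assumptions.

class TrustLevel(IntEnum):
    UNTRUSTED = 0
    USER = 1
    SYSTEM = 2

def effective_trust(declared, sources):
    """Take the minimum across the declared level and all sources:
    an agent cannot launder untrusted input into system-level trust
    simply by declaring a higher level."""
    return min([declared, *sources])

print(effective_trust(TrustLevel.SYSTEM, [TrustLevel.UNTRUSTED]).name)  # UNTRUSTED
```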
ShadowAudit includes reporting helpers for governance and assurance teams.
```
shadowaudit owasp
shadowaudit eu-ai-act ./src
```

Included mappings and reports:
- OWASP Agentic Top 10 coverage
- EU AI Act Annex IV evidence packs
- HTML governance reports
- structured audit exports
```mermaid
graph LR
    A[Agent] --> B[Capability Mapper]
    B --> C[Policy Engine]
    C --> D[Risk Evaluation]
    D --> E{Enforcement Decision}
    E -->|Allow| F[Tool Execution]
    E -->|Require Approval| G[Approval Queue]
    E -->|Deny| H[Blocked Response]
    F -.-> I[(Audit Trace)]
    G -.-> I
    H -.-> I
```
Every decision is:
- deterministic
- replayable
- explainable
- cryptographically auditable
```
┌─────────────────────────────────────────────────────┐
│                     ShadowAudit                     │
├─────────────┬─────────────┬─────────────┬──────────┤
│  LangChain  │   CrewAI    │  LangGraph  │   MCP    │
│ OpenAI SDK  │ Direct Gate │ FlowTracer  │ Gateway  │
├─────────────────────────────────────────────────────┤
│                  Core Gate Engine                   │
│                                                     │
│ Capability Mapper → Policy Engine → Enforcement FSM │
│                                                     │
│ Risk Engine  │ Replay Engine │ Audit Chain          │
│ Thresholds   │ Simulator     │ SHA-256 + Ed25519    │
│                                                     │
├─────────────────────────────────────────────────────┤
│           SQLite State + Audit Storage              │
└─────────────────────────────────────────────────────┘
```
See examples/ for runnable demos including:
- LangChain agents
- MCP governance
- tamper-evident audit verification
- fintech payment agents
- FlowTracer demos
- observe mode rollouts
- replay and simulation workflows
```
python examples/core_concepts/run_all_examples.py
```

ShadowAudit is production-ready for:
- runtime tool gating
- deterministic authorization
- fail-closed execution
- audit-time replay
- policy-as-code enforcement
- approval workflows
- compliance evidence generation
Current capabilities:
- LangChain, CrewAI, LangGraph, OpenAI Agents SDK, and MCP adapters
- hash-chained audit logs
- Ed25519 signing
- replay and simulation engine
- FlowTracer trust propagation
- vertical taxonomies
- OWASP and EU AI Act reporting
- offline-first operation
- zero LLM calls in the enforcement path
Designed for:
- regulated workloads
- fintech
- healthcare
- air-gapped environments
- enterprise governance teams
- production agent infrastructure
```
git clone https://github.com/AnshumanKumar14/shadowaudit-python.git
cd shadowaudit-python
pip install -e ".[dev]"
pytest tests/ -q
```

Bug reports, governance plugins, framework adapters, and policy contributions are welcome.
MIT License
Built by Anshuman Kumar