Turn AI Agent / service operations into verifiable, auditable, and reviewable evidence objects.
A typical AI workflow usually leaves behind only chat transcripts, trace pages, or scattered logs. These help developers debug, but they are hard to hand directly to reviewers, customers, governance teams, or downstream systems for re-verification.
agent-evidence focuses on what structured evidence remains after an Agent / service operation completes: operation, policy, provenance, evidence, validation, hashes, verification results, and an evidence object that a validator can check.
The goal of this repository is not to build yet another general-purpose Agent platform, but to provide a minimal, runnable, verifiable operation evidence path.
Execution Evidence and Operation Accountability Profile v0.1 provides:

- JSON Schema
- profile-aware validator
- valid / invalid examples
- runnable single-path demo
- LangChain-first evidence exporter and offline bundle verification
- FDO-style mapping material for discussion, not a claim of official standard adoption
- release and archive surface anchored by GitHub Release v0.2.0 and DOI 10.5281/zenodo.19334062
Install from source:
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Validate a minimal valid evidence profile:
agent-evidence validate-profile examples/minimal-valid-evidence.json

Check that an intentionally invalid example fails:
agent-evidence validate-profile examples/invalid-missing-required.json

Run the single-path demo:
python3 demo/run_operation_accountability_demo.py

Expected result:
- the valid example returns JSON with "ok": true
- the invalid example returns JSON with "ok": false and one primary error code
- the demo writes artifacts under demo/artifacts/ and ends with one PASS summary line
For the current LangChain-first path:
pip install -e ".[dev,langchain,sql]"
python integrations/langchain/export_evidence.py
agent-evidence verify-bundle --bundle-dir integrations/langchain/langchain-evidence-bundle

This runs the documented LangChain exporter and verifies the emitted bundle offline.
For a smaller callback/export recipe aimed at external readers, see LangChain minimal evidence cookbook.
The minimal profile binds these parts into one reviewable object:
- statement_id — one accountable operation statement
- operation — what operation happened, on which subject, with which inputs and outputs
- policy — which rule or constraint set was referenced
- provenance — which actor and references connect the operation to inputs and outputs
- evidence — artifacts, references, and integrity digests
- validation — method, validator, status, and linked evidence/provenance/policy
- subject and actor — object and runtime identity used by the statement
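As an illustration, the parts above can be sketched as one Python dict. The field names follow the list above, but the nesting, identifiers, and required fields here are assumptions for illustration, not the canonical v0.1 schema:

```python
# Hypothetical sketch of one reviewable evidence statement.
# Field names follow the profile part list; exact schema nesting may differ.
statement = {
    "statement_id": "stmt-2024-0001",
    "subject": {"id": "doc-42", "type": "document"},
    "actor": {"id": "agent-planner", "runtime": "langchain"},
    "operation": {
        "type": "summarize",
        "inputs": ["doc-42"],
        "outputs": ["summary-42"],
    },
    "policy": {"id": "policy-review-v1"},
    "provenance": {"actor": "agent-planner", "references": ["doc-42", "summary-42"]},
    "evidence": {"artifacts": ["summary-42.json"]},
    "validation": {"method": "schema", "validator": "agent-evidence", "status": "passed"},
}

# A reviewable object should carry every part the profile names.
required = {"statement_id", "operation", "policy", "provenance",
            "evidence", "validation", "subject", "actor"}
assert required <= statement.keys()
```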
Validator proof:
FDO Testbed registration proof:
agent-evidence is an experimental, minimal, discussion-oriented operation evidence profile. It explores how AI / Agent operation evidence can be expressed with an FDO-style object shape: identity, metadata, references, provenance, integrity, and validation.
It is not an official FDO standard. The current public claim is narrower: this repository provides a working profile, schema, examples, validator, demo, and FDO-facing mapping material for discussion.
FDO-facing reading path:
- Execution Evidence to FDO
- Minimal FDO-style Object Example
- FDO Relevance
- Public positioning
- Lineage map
This repository shows that I can:
- turn LangChain / Agent workflow thinking into a concrete evidence boundary
- design JSON Schema and validator logic for high-responsibility AI workflows
- model audit trail, provenance, hashes, and verification results as deliverable artifacts
- connect trustworthy AI governance ideas to runnable examples
- package technical work as open-source documentation, examples, release artifacts, and CLI validation
The current canonical package is Execution Evidence and Operation Accountability Profile v0.1.
Core entry points:
- Spec: spec/execution-evidence-operation-accountability-profile-v0.1.md
- Schema: schema/execution-evidence-operation-accountability-profile-v0.1.schema.json
- Validator CLI: agent-evidence validate-profile <file>
- Examples: examples/README.md
- Demo: demo/README.md
- Reviewer-facing high-risk entry: docs/high-risk-scenario-entry.md
- Status and acceptance: docs/STATUS.md, docs/ACCEPTANCE-CHECKLIST.md
- Submission handoff: submission/package-manifest.md, submission/final-handoff.md
After one run, the primary outputs are intentionally narrow:
- bundle — exported evidence package that can be handed off, verified, and retained outside the original runtime
- receipt — machine-readable verification result returned by agent-evidence validate-profile, agent-evidence verify-bundle, or agent-evidence verify-export
- summary — reviewer-facing summary output produced by the current demo and example surfaces
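A receipt can be sketched as a small Python dict. The shape is inferred from the "ok": true / "ok": false plus one primary error code behavior described earlier; the exact field and code names are assumptions:

```python
# Hypothetical receipt shapes, inferred from the documented
# "ok": true / "ok": false plus primary-error-code behavior.
passing_receipt = {"ok": True, "profile": "aep-v0.1", "errors": []}
failing_receipt = {
    "ok": False,
    "profile": "aep-v0.1",
    # Error code and path are illustrative, not the validator's real vocabulary.
    "errors": [{"code": "MISSING_REQUIRED", "path": "/operation"}],
}

assert passing_receipt["ok"] and not passing_receipt["errors"]
assert not failing_receipt["ok"] and failing_receipt["errors"][0]["code"]
```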
Tracing and logs help operators inspect a run. Agent Evidence packages runtime events into portable artifacts that another party can verify later, including offline.
Evidence path:
runtime events -> evidence bundle -> signed manifest -> detached anchor (when present) -> offline verify
External anchoring is out of scope for AEP v0.1 and is not enabled by default.
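The final offline-verify step in the path above can be sketched as digest recomputation against a manifest. The manifest.json layout here is hypothetical; the real bundle format is whatever agent-evidence verify-bundle defines:

```python
import hashlib
import json
from pathlib import Path

def sha256_file(path: Path) -> str:
    """Hex SHA-256 digest of a file, streamed in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_bundle(bundle_dir: str) -> bool:
    """Recompute every artifact digest and compare it to the manifest.

    Assumes a hypothetical manifest.json mapping relative file paths to
    sha256 hex digests; the real bundle layout may differ.
    """
    bundle = Path(bundle_dir)
    manifest = json.loads((bundle / "manifest.json").read_text())
    return all(
        sha256_file(bundle / rel) == digest
        for rel, digest in manifest["files"].items()
    )
```

Because the check only needs the files and the manifest, another party can run it later with no access to the original runtime.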
The toolkit currently supports two storage modes:
- append-only local JSONL files
- SQLAlchemy-backed SQLite/PostgreSQL databases
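The append-only JSONL mode can be sketched in a few lines; the function names here are illustrative, not the toolkit's actual API:

```python
import json
from pathlib import Path

def append_event(store: str, event: dict) -> None:
    """Append one event as a single JSON line; prior lines are never rewritten."""
    with Path(store).open("a", encoding="utf-8") as f:
        f.write(json.dumps(event, sort_keys=True) + "\n")

def read_events(store: str) -> list[dict]:
    """Read all events back in insertion order."""
    path = Path(store)
    if not path.exists():
        return []
    return [json.loads(line)
            for line in path.read_text(encoding="utf-8").splitlines()
            if line]
```

One JSON object per line keeps the file greppable and makes each write an append, which is what an evidence log wants.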
The current model treats each record as a semantic event envelope:
- event.event_type is framework-neutral, such as chain.start or tool.end
- event.context.source_event_type preserves the raw framework event name
- hashes.previous_event_hash links to the prior event
- hashes.chain_hash provides a cumulative chain tip for integrity checks
Current priority order:
- LangChain / LangGraph
- OpenAI-compatible runtimes
- Automaton sidecar export, marked experimental
The goal is one narrow evidence handoff surface, not many adapters at once.
The read-only Automaton sidecar/exporter reads state.db, git history, and persisted on-chain references, then emits an AEP bundle plus fdo-stub.json and erc8004-validation-stub.json.
agent-evidence export automaton \
--state-db /path/to/state.db \
--repo /path/to/state/repo \
--runtime-root /path/to/automaton-checkout \
--out ./automaton-aep-bundle

agent-evidence export automaton has been validated against a live isolated-home Automaton run and remains marked experimental while the live data contract is still settling.
agent-evidence record \
--store ./data/evidence.jsonl \
--actor planner \
--event-type tool.call \
--input '{"task":"summarize"}' \
--output '{"status":"ok"}' \
--context '{"source":"cli","component":"tool"}'
agent-evidence list --store ./data/evidence.jsonl
agent-evidence show --store ./data/evidence.jsonl --index 0
agent-evidence verify --store ./data/evidence.jsonl

SQL stores use a SQLAlchemy URL instead of a file path:
agent-evidence record \
--store sqlite+pysqlite:///./data/evidence.db \
--actor planner \
--event-type tool.call \
--context '{"source":"cli","component":"tool"}'
agent-evidence query \
--store sqlite+pysqlite:///./data/evidence.db \
--event-type tool.call \
--source cli

make install
make test
make lint
make hooks

For PostgreSQL support, install the extra driver dependencies:
pip install -e ".[dev,postgres]"

For a repeatable real-database validation path, use the bundled Docker-backed integration script:
make install-postgres
make test-postgres

agent-evidence is the active code surface here for bundle, receipt, and summary.
It is not:
- the full Digital Biosphere Architecture stack
- the audit control plane
- the walkthrough demo
- the execution-integrity kernel
- a generic agent governance platform
- an official FDO standard
- Architecture: digital-biosphere-architecture
- Demo: verifiable-agent-demo
- Audit: aro-audit
- Historical map: docs/lineage.md
agent-evidence turns AI agent operations into structured evidence objects that can be validated, reviewed, and retained outside the original runtime. The project focuses on a minimal operation evidence profile, JSON Schema, validator, examples, LangChain export, offline bundle verification, and FDO-facing discussion material. It is experimental and discussion-oriented, not an official FDO standard.

