One pipeline that takes a written claim, decides which check applies, runs the check, seals a tamper-evident receipt, and replays it to prove the result reproduces. It ties together four small, standalone, public tools:
| Stage | Tool | What it does |
|---|---|---|
| route | (this demo) | decide which gate a claim belongs to |
| dimensional check | UnitGate | is LHS = RHS dimensionally consistent? |
| receipt | EvidencePack | seal the verdict with a deterministic certificate hash |
| replay | ReplayGate | re-run the sealed pack, detect drift |
| wording | ClaimLint | (companion) lint README/docs for over-claims |
The verdict on every claim comes from a deterministic gate, never from a language model. The model does not get a vote.
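To make the "deterministic gate" idea concrete, here is a minimal sketch of a dimensional check. It is not UnitGate's code or API: the exponent-tuple representation and every name below (`mul`, `power`, `check`) are assumptions chosen for illustration. Only the two verdict strings come from the demo output.

```python
# Hypothetical sketch, not UnitGate's actual implementation.
# A dimension is a tuple of exponents over (mass, length, time).
MASS = (1, 0, 0)
LENGTH = (0, 1, 0)
TIME = (0, 0, 1)


def mul(a, b):
    """Multiplying two quantities adds their dimension exponents."""
    return tuple(x + y for x, y in zip(a, b))


def power(a, n):
    """Raising a quantity to the n-th power scales its exponents by n."""
    return tuple(x * n for x in a)


SPEED = mul(LENGTH, power(TIME, -1))         # L T^-1
ACCELERATION = mul(LENGTH, power(TIME, -2))  # L T^-2
ENERGY = (1, 2, -2)                          # M L^2 T^-2


def check(lhs, rhs):
    """Compare both sides exponent by exponent; no model is consulted."""
    return "DIMENSIONALLY_VALID" if lhs == rhs else "DIMENSIONALLY_INVALID"


# Energy = mass * acceleration  (M L T^-2 on the right, so the sides differ)
verdict_ma = check(ENERGY, mul(MASS, ACCELERATION))
# E = m * c**2  (M L^2 T^-2 on both sides)
verdict_mc2 = check(ENERGY, mul(MASS, power(SPEED, 2)))
print(verdict_ma, verdict_mc2)
```

The check is a pure function of its inputs, which is what allows the same verdict to be sealed once and reproduced on replay.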
git clone https://github.com/kyal102/claimstack-demo
cd claimstack-demo
python run_demo.py   # no dependencies, standard library only

Output (from the bundled demo_claim.txt):

| Claim | Gate | Verdict | Sealed | Replay |
|------------------------------------------|-------------|-------------------------------|-----------|--------------|
| Energy = mass * acceleration | unitgate | ❌ DIMENSIONALLY_INVALID | SEALED_OK | REPLAY_MATCH |
| E = m * c**2 | unitgate | ✅ DIMENSIONALLY_VALID | SEALED_OK | REPLAY_MATCH |
| This method proves a new law of physics | unsupported | ⛔ UNSUPPORTED_CLAIM | SEALED_OK | REPLAY_MATCH |
| Our optimizer saves 90% of compute | needs_data | 📊 NEEDS_DATA | SEALED_OK | REPLAY_MATCH |
| The new scheduler is 4x faster | needs_data | 📊 NEEDS_DATA | SEALED_OK | REPLAY_MATCH |
| We refactored the storage layer | ambiguous | ❓ AMBIGUOUS_NEEDS_CLARIFICATION| SEALED_OK | REPLAY_MATCH |
Use your own claims (one per line):
python run_demo.py myclaims.txt

Each run writes:

- demo_report.json: machine-readable result for every claim
- demo_output.md: the table above
- packs/<pack_id>.json: one sealed, replayable EvidencePack per claim

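A sealed, replayable pack can be pictured roughly as follows. This is a sketch under stated assumptions, not the EvidencePack or ReplayGate API: the pack layout, the use of SHA-256 over canonical JSON, the function names, and the `REPLAY_DRIFT` label are all hypothetical. Only `certificate_hash` and `REPLAY_MATCH` appear in the demo itself.

```python
import hashlib
import json


def certificate_hash(payload: dict) -> str:
    """Hash a canonical JSON encoding so equal payloads give equal hashes."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def seal(claim: str, gate: str, verdict: str) -> dict:
    """Bundle the verdict with a hash of everything it depends on."""
    payload = {"claim": claim, "gate": gate, "verdict": verdict}
    return {"payload": payload, "certificate_hash": certificate_hash(payload)}


def replay(pack: dict, rerun) -> str:
    """Re-run the gate and compare the recomputed hash with the sealed one."""
    payload = dict(pack["payload"])
    payload["verdict"] = rerun(payload["claim"])
    fresh = certificate_hash(payload)
    # "REPLAY_DRIFT" is a made-up label for the mismatch case.
    return "REPLAY_MATCH" if fresh == pack["certificate_hash"] else "REPLAY_DRIFT"


pack = seal("E = m * c**2", "unitgate", "DIMENSIONALLY_VALID")
same = replay(pack, lambda claim: "DIMENSIONALLY_VALID")
drift = replay(pack, lambda claim: "DIMENSIONALLY_INVALID")
print(same, drift)
```

Sorting keys and fixing separators before hashing removes incidental formatting differences, so the hash changes only when the claim, gate, or verdict changes.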
claim ─▶ route ─▶ gate ─▶ EvidencePack.seal() ─▶ ReplayGate.replay()
│ │ │ │
│ │ │ └─ re-runs the pack's
│ │ │ replay_command and
│ │ │ recomputes the hash
│ │ └─ deterministic certificate_hash
│ └─ DIMENSIONALLY_VALID / UNSUPPORTED_CLAIM / NEEDS_DATA / ...
└─ unitgate · unsupported · needs_data · ambiguous
Routing is transparent regex heuristics (claimstack/classify.py) — it only
decides which gate applies. The gate is the authority on the verdict.
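As an illustration of what transparent regex routing might look like, the sketch below uses made-up patterns. The real rules live in claimstack/classify.py and may differ; `ROUTES` and `route` are hypothetical names. With these patterns the six sample claims land on the same gates as in the table above.

```python
import re

# Hypothetical patterns, checked in order; first match wins.
ROUTES = [
    ("unitgate", re.compile(r"=")),                                  # looks like an equation
    ("unsupported", re.compile(r"\b(proves?|guarantees?)\b", re.I)),  # sweeping claim words
    ("needs_data", re.compile(r"\d+(\.\d+)?\s*(%|x\b)", re.I)),       # "90%", "4x"
]


def route(claim: str) -> str:
    """Pick a gate name only; the verdict is left to the gate itself."""
    for gate, pattern in ROUTES:
        if pattern.search(claim):
            return gate
    return "ambiguous"


claims = [
    "Energy = mass * acceleration",
    "E = m * c**2",
    "This method proves a new law of physics",
    "Our optimizer saves 90% of compute",
    "The new scheduler is 4x faster",
    "We refactored the storage layer",
]
routes = [route(c) for c in claims]
print(routes)
```

Because the router returns only a gate name, a misrouted claim can end up at the wrong gate, but the router never produces a verdict of its own.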
- It is a demonstration of composing verifiable checks into a reproducible, receipt-backed pipeline.
- It does not prove scientific truth, guarantee correctness, or replace
  experiment, simulation or peer review. UnitGate checks dimensions only;
  NEEDS_DATA/UNSUPPORTED_CLAIM mean "a tool can't establish this," not "false."
- Quantitative results (e.g. compute savings) are early prototype framing and require independent validation before being treated as a commercial claim.

See docs/LIMITATIONS.md.
claimstack/
  classify.py      demo glue: extract + route claims
  pipeline.py      demo glue: route -> gate -> seal -> replay
  dimensions.py    vendored from UnitGate
  unitgate.py      vendored from UnitGate
  evidencepack.py  vendored from EvidencePack
  replaygate.py    vendored from ReplayGate
  _echo.py         tiny replay target (so demos are self-contained)
run_demo.py        CLI runner
demo_claim.txt     sample input
tests/             unit tests (python -m unittest discover -s tests)
The tool modules are vendored copies of their standalone repos so the demo runs with zero third-party dependencies. Nothing here imports any private code.
MIT — see LICENSE.
