cladding

To trust AI with coding, an organization needs three things: code that can be trusted, a traceable record of how it was produced, and a process that holds up at scale. cladding provides all three.
True to its name (cladding = the outer layer), it wraps the host LLM: it supplies the project's intent before the LLM works and verifies the result afterward.

The official reference implementation of the Ironclad standard.
Before your host LLM (Claude Code · Codex · Gemini · Cursor) starts work, cladding feeds it the project's intent;
after it finishes, cladding verifies the result with 40 detectors and a 15-stage gate.

Only verified code ships as "done": even when the AI says "it's done," the work still has to clear the checks, so code that couldn't be verified is never recorded as complete.
Who · what · why are all on the record: the evidence you need for audits, regulatory response, and handoff can be traced back at any time.
It holds up as the team grows and you add more AIs: because the spec is the shared baseline, conflicts and drift are blocked automatically.

Host LLM before (intent injection) · after (verification) · record (feedback loop) — how cladding wraps the LLM in a collaborative structure

This loop is after one thing — turning the AI's "it's done" from a claim into a proof.

So you can ship code an AI wrote with the same trust as code a human wrote.

cladding is itself built with cladding: 196 of its 200 features have cleared the same gate, making it the first L4 implementation of the Ironclad standard (self-declared; see Status).

How it works with your host LLM

Before — inject the intent

So the LLM starts with the right context.

Project map injected: every time a conversation starts, a summary of "how many features, what's in progress, the last verification result" is handed to the LLM automatically (you can now see the same map yourself; see the Project map section).
Only the intent that matters: just the why of the feature at hand, its related features, and its acceptance criteria are pulled out (the whole spec is not dumped).
Project rules applied: the forbidden and preferred patterns the team agreed on go in as standing instructions every time.

After — verify: the 15-stage gate, 40 drift detectors, and an implementation-blind grader (below).

Real-time intervention (map injection · instant block · stop-block) works fully on Claude Code. On Codex · Gemini · Cursor, the same verification runs through in-conversation tool calls plus the git · CI gate.

"done" is earned, not declared

The chronic disease of AI coding is "it's done" declared with no verification behind it. In cladding, a feature's status: done is not a value you write — it's a value you earn.

One scene — a hook blocks the LLM's 'done' declaration, the gate's RED feeds back as a repair card, and 'done' is earned only when the gate is GREEN

When the AI tries to write the completion mark itself → it's blocked on the spot ("earn completion by verifying it").
When the AI requests completion → the 9 deterministic stages run, and the feature is recorded as done only if every one passes; a single failure auto-reverts the status. The E2E · evidence stages run only in CI's full 15-stage gate.
The moment it passes, a verification signature is left behind — committable proof that "this code was verified at this point."
Try to end a conversation while a failure is outstanding → the exit is blocked once (if you end again on the same failure, that fact is recorded rather than silently passed over), and the repair card is carried into the next conversation.

The limits are disclosed plainly too: bypass paths exist that the instant block can't see, and those are caught by after-the-fact verification (the gate · drift checks). The instant block is the first line of defense, after-the-fact verification the second — and neither is a standalone guarantee.
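
The completion flow above can be pictured as a small state transition. The following is a minimal sketch, not cladding's implementation: the `Feature` class, the `request_completion` function, and the stage names are hypothetical, and it models only the "run every stage, record done on all-pass, revert otherwise" rule.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stage signature: a callable that returns True when the check passes.
Stage = Callable[[], bool]


@dataclass
class Feature:
    feature_id: str
    status: str = "in_progress"
    repair_card: List[str] = field(default_factory=list)  # names of failed stages


def request_completion(feature: Feature, stages: Dict[str, Stage]) -> bool:
    """Run every stage; record 'done' only if all pass, otherwise revert."""
    failed = [name for name, check in stages.items() if not check()]
    if failed:
        feature.status = "in_progress"  # auto-revert: completion is not recorded
        feature.repair_card = failed    # what has to be repaired next
        return False
    feature.status = "done"
    feature.repair_card = []
    return True


feature = Feature("F-d86375d8")

# First request: one stage is RED, so the status stays where it was.
red = request_completion(feature, {"type": lambda: True, "unit": lambda: False})
status_after_red = feature.status
card_after_red = list(feature.repair_card)

# Second request: every stage is GREEN, so "done" is recorded.
green = request_completion(feature, {"type": lambda: True, "unit": lambda: True})
```

The point of the design is that no code path sets `status = "done"` without first running the checks, so the value is always a consequence of verification.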

What changes

How a vanilla AI coding environment and a cladding environment behave in the same situation.

Situation	Vanilla AI coding	cladding
Code drifts from the spec	fixed if a reviewer notices	auto-detected right after the edit (alert) · "done" can't pass while it's drifting
The AI says "it's done"	you can only take its word	`done` earned only when the gate is GREEN
Ending a session in a failing state	exits as-is, forgotten next time	the exit is blocked once, the repair card handed off
Two devs add a feature at the same time	merge conflict	hash-8 IDs · separate files → 0 conflicts
Who verifies the AI-written code?	the AI that wrote it self-certifies (risky)	an implementation-blind grader + the mechanical gate
Switching AI tools	reconfigure per tool	one spec → 4 hosts wired automatically

Project map — now you can see it and ask it

cladding maintains an internal map that connects spec · code · tests · docs. You can now view that map directly.

Why this matters — the docs and the code don't drift apart. Docs lie as time passes — the code changes but the description stays put. cladding re-checks that connection every time the code is read, and blocks "done" while the two are out of sync.

Blue = spec (center), orange = code, green = tests, pink = docs; more-connected nodes grow larger and pull to the center.

cladding knowledge graph — spec · code · tests · docs colour-coded and linked (animated)

See: the whole project on one canvas. Run clad graph serve, open the printed localhost address in your browser, and you can see at a glance what connects to what.
Ask: "what breaks if I change this?" Ask the map and it tells you what's affected and which tests to run; it doesn't guess.
Measure: the benefit grows with project size. The amount you have to read when fixing something drops sharply, on average to about a quarter of reading everything (clad measure).
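
One way to answer "what breaks if I change this?" from such a map is a reverse-reachability walk over dependency edges. This is a minimal sketch under stated assumptions: the edge table, the file names, and the `affected_by` function are hypothetical, and cladding's actual graph model and query logic may differ.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical edges: each node lists the nodes it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "src/pay.ts": ["spec/F-d86375d8"],
    "tests/pay.test.ts": ["src/pay.ts"],
    "docs/payments.md": ["spec/F-d86375d8"],
    "src/report.ts": [],
}


def affected_by(changed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every node that depends on `changed`, directly or transitively."""
    # Invert the edges once so we can walk from a node to its dependents.
    dependents: Dict[str, List[str]] = {}
    for node, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, []).append(node)

    seen: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for node in dependents.get(current, []):
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return seen


impacted = affected_by("spec/F-d86375d8", DEPENDS_ON)
tests_to_run = sorted(n for n in impacted if n.startswith("tests/"))
```

Because the answer is computed from recorded edges, it is only as complete as the graph itself; an unrecorded dependency would not show up.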

To launch it yourself — from your project folder:

clad graph serve                                  # live graph — localhost:3000, auto-reloads on save
clad graph export --format html --out graph.html  # or export to a single offline file (.html)

Both require cladding 0.7.0+.

How it works

Spec → Code → Tests runs as a single cycle — the spec records the why, the gate verifies, and the detectors block drift.

Spec → Code → Tests cycle — the 15-stage verification and 40 drift detectors guard the cycle

1. Spec — the single source of intent (SSoT)

The spec records the why (what we're building and why). A 4-tier single source of truth — intent on top, the implementation below, code follows the spec.

Tier	Role	Defined & written by	Authority
A — Spec	intent (what · why)	humans define the intent → the LLM writes it in EARS form	sealed · doesn't change without human approval · outranks all
B — Design	design (how)	humans steer → the LLM writes	checked against A
C — Derived	implementation (code · tests) + attestation (verification signature)	the LLM writes	auto-regenerated by reading the code
D — Audit	audit record (what actually happened)	auto-recorded (append-only)	immutable

A outranks every tier below it — if spec (A) and code (C) disagree, the code is the one that's wrong.

Sharded · multi-dev safe: each feature gets its own file, named like spec/features/<slug>-<hash8>.yaml, plus an 8-char hash ID (e.g. F-d86375d8). Two devs creating new features at the same time land in different files with different IDs, so the two additions produce no merge conflict. Details: Hash-based feature IDs.
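
To illustrate why per-feature files with hash IDs avoid conflicts, here is a minimal sketch of generating such an ID and shard path. The input to the hash is an assumption (title plus a random nonce); the source does not say what cladding actually hashes, and `slugify` and `new_feature` are hypothetical names.

```python
import hashlib
import re
import uuid
from typing import Tuple


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def new_feature(title: str) -> Tuple[str, str]:
    """Return (feature_id, shard_path) for a newly created feature."""
    # Assumption: hash the title plus a random nonce, so two developers who
    # pick the same title at the same time still get different IDs.
    nonce = uuid.uuid4().hex
    hash8 = hashlib.sha256(f"{title}:{nonce}".encode("utf-8")).hexdigest()[:8]
    return f"F-{hash8}", f"spec/features/{slugify(title)}-{hash8}.yaml"


id_a, path_a = new_feature("Refund API")
id_b, path_b = new_feature("Refund API")
```

Neither developer needs to coordinate or read a shared counter, which is what a sequential numbering scheme would require. Eight hex characters carry 32 bits, so a collision is unlikely but not impossible; the detector catalog lists an `ID_COLLISION` check for the spec's own integrity.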

4-tier SSoT — A(Spec) → B(Design) → C(Derived + attestation) → D(Audit), A outranks B

2. Gate — the 15-stage Iron Law

One check engine, bundled by cost: 3 at commit, 9 at push/completion, all 15 in CI. Only the depth differs.

15-stage Iron Law gate — static(6) · test & conformance(4) · E2E(3) · evidence(2), attestation signature when GREEN

Stage	What it checks
1.1 Type · 1.2 Lint	type errors · code style
1.3 Drift	spec ↔ code mismatches across 40 detectors
1.4 Commit · 1.5 Arch · 1.6 Secret	clean working tree · architecture invariants · leaked API keys
2.1 Unit · 2.2 Coverage	unit tests pass · coverage drop blocked
2.3 Spec conformance · 2.4 Deliverable smoke	the implementation-blind grader's tests pass · the declared deliverable actually runs (blocks the empty-green "tests pass but the deliverable doesn't run")
3.1 Smoke · 3.2 Perf · 3.3 Visual	e2e critical paths · performance budgets · UI visual regression
4.1 Audit · 4.2 UAT	every AC (acceptance criterion) has at least one piece of evidence · every done feature has at least one piece of evidence
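
The "one engine, different depth" idea can be sketched as a single runner that selects how many stages to execute per trigger. This is an illustration only: the source gives the bundle sizes (3, 9, 15) but not which stages belong to each bundle, so treating each bundle as a prefix of the table order is an assumption, and `run_gate` is a hypothetical name.

```python
from typing import Callable, Dict, List, Tuple

# Stage IDs in the order the table lists them.
STAGES: List[str] = [
    "1.1", "1.2", "1.3", "1.4", "1.5", "1.6",
    "2.1", "2.2", "2.3", "2.4",
    "3.1", "3.2", "3.3",
    "4.1", "4.2",
]

# Assumption: each bundle is a prefix of the ordered list. The real
# assignment of stages to bundles may differ.
DEPTH: Dict[str, int] = {"commit": 3, "push": 9, "ci": 15}


def run_gate(trigger: str, checks: Dict[str, Callable[[], bool]]) -> Tuple[str, List[str]]:
    """Run the stages bundled for `trigger`; return (verdict, failed stage IDs)."""
    selected = STAGES[: DEPTH[trigger]]
    # A stage with no registered check counts as failed, never as passed.
    failed = [s for s in selected if not checks.get(s, lambda: False)()]
    return ("GREEN" if not failed else "RED", failed)


all_pass = {stage: (lambda: True) for stage in STAGES}
commit_verdict, _ = run_gate("commit", all_pass)

# A performance regression (stage 3.2) is invisible at push depth
# and caught only by the full CI run.
perf_regressed = dict(all_pass)
perf_regressed["3.2"] = lambda: False
push_verdict, _ = run_gate("push", perf_regressed)
ci_verdict, ci_failed = run_gate("ci", perf_regressed)
```

The example shows the trade-off the bundling makes: cheaper triggers give faster feedback, and only the CI depth sees every stage.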

3. Detector — 40 drift detectors

Drift in every direction across spec · code · test is detected automatically. Full catalog: detector catalog.

Direction	What it catches	Count	Representative detectors
spec ↔ code	in the spec but missing from code, or code that strays from the spec	10	`MISSING_IMPLEMENTATION`, `AC_DRIFT`, `DELIVERABLE_INTEGRITY`
code ↔ test	code present but no tests · coverage drop · secrets	6	`MISSING_TESTS`, `COVERAGE_DROP`, `HARDCODED_SECRET`
spec ↔ test	an AC in the spec not verified by a test · false status	6	`UNTESTED_AC`, `STATUS_DRIFT`, `SPEC_CONFORMANCE`
spec hygiene	the spec's own integrity (ID collisions · dependency cycles)	8	`ID_COLLISION`, `SLUG_CONFLICT`, `DEPENDENCY_CYCLE`
environment integrity	build environment · meta files	3	`HARNESS_INTEGRITY`, `META_INTEGRITY`
verification freshness	whether code changed since the verification signature	1	`STALE_ATTESTATION` (new)
governance · docs	policy violations · doc drift	3	`ABSENCE_OF_GOVERNANCE`, `PROJECT_CONTEXT_DRIFT`
graph · doc links	broken doc ↔ spec links · missing dependency edges	3	`DOC_LINK_INTEGRITY`, `REFERENCE_INTEGRITY`, `INFERABLE_DEPENDS_ON` (new)

The knowledge graph these power is a traceability / retrieval capability, not a correctness one — cladding's own A/B record shows correctness is orthogonal to governance. It tells you what connects to what and what to re-check; it does not claim the code is correct.
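
To make the table above concrete, here is a minimal sketch of what a spec ↔ test detector such as `UNTESTED_AC` might compute. The data shapes, the AC ID format, and the `untested_acs` function are hypothetical; only the detector name comes from the catalog.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Finding:
    detector: str
    feature_id: str
    ac_id: str


def untested_acs(spec: Dict[str, List[str]], covered: Set[str]) -> List[Finding]:
    """Report every acceptance criterion that no test claims to verify.

    `spec` maps a feature ID to its AC IDs; `covered` is the set of AC IDs
    that tests reference. Both shapes are assumptions for this sketch.
    """
    return [
        Finding("UNTESTED_AC", feature_id, ac_id)
        for feature_id, ac_ids in spec.items()
        for ac_id in ac_ids
        if ac_id not in covered
    ]


spec = {"F-d86375d8": ["AC-1", "AC-2"], "F-0a1b2c3d": ["AC-3"]}
findings = untested_acs(spec, covered={"AC-1", "AC-3"})
```

Consistent with the note above, a check like this establishes that a link is missing; it says nothing about whether the tests that do exist are correct.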

4. Cycle — one feature's lifecycle

Define → Sync → Implement → Earn. You earn "done" only by passing every check.

One feature's lifecycle — Define → Sync → Implement → Earn, completion earned when all checks pass / auto-revert on failure

Multi-Agent — separating the builder from the verifier

The agents that build are kept separate from the agents that verify, so no agent can sign off on its own work. blind-author goes one step further — the agent that writes the tests has no tool to read the implementation at all (no Read/Grep granted). "Wrote it without looking at the implementation" becomes a structural fact, not a promise. This separation aligns with the segregation-of-duties principle that regulatory · audit regimes (EU AI Act · SOX) call for — it maps onto the spirit of those regimes, not a certification.
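
The structural nature of that separation can be sketched as a tool allowlist enforced at dispatch time. This is an assumption-laden illustration: only the absence of Read/Grep for blind-author comes from the text, while the other grants, the `ReadSpec` and `WriteTest` tool names, and both functions are hypothetical.

```python
from typing import Dict, FrozenSet

# Hypothetical tool grants per agent role. Only the absence of Read/Grep for
# blind-author comes from the text; every other entry is illustrative.
TOOL_GRANTS: Dict[str, FrozenSet[str]] = {
    "developer": frozenset({"Read", "Grep", "Edit"}),
    "reviewer": frozenset({"Read", "Grep"}),
    "blind-author": frozenset({"ReadSpec", "WriteTest"}),
}


class ToolNotGranted(PermissionError):
    """Raised when an agent calls a tool outside its grant."""


def call_tool(role: str, tool: str) -> str:
    """Dispatch a tool call only if the role was granted that tool."""
    if tool not in TOOL_GRANTS.get(role, frozenset()):
        raise ToolNotGranted(f"{role} has no {tool} tool")
    return f"{role} ran {tool}"


def can_sign_off(author_role: str, verifier_role: str) -> bool:
    """Segregation of duties: a role never verifies its own work."""
    return author_role != verifier_role
```

Because the denial happens in the dispatcher rather than in the agent's instructions, the test writer cannot read the implementation even if it is prompted to.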

Agent separation of duties — orchestrator dispatches, planner/developer/reviewer act, blind-author is the test writer who can't see the implementation, observability watches

Ecosystem

cladding sits at the junction of three existing categories.

How it differs from the neighbors

Spec Kit · OpenSpec · Tessl · Kiro — tools that help you write a good spec. On top of that, cladding keeps continuously cross-checking, inside the dev loop, that the spec and the actual code don't drift.
BMAD · ChatDev · Claude Code Agent Teams — systems for splitting roles across multiple AI agents. cladding's agent division of labor runs with spec · gate · audit record combined on top.
tdd-guard — a tool that forces the AI to write tests first. The Unit · Coverage · oracle stages among cladding's 15 do the same job, more structurally.
OpenHands · Cline · Aider · Goose — runners that make the AI write code (pure executors). cladding is the upper layer that verifies and governs the code those runners produce.

cladding's distinction is the combination — binding the core of the categories above into one verification loop.

Install

Two steps — install the infrastructure → create the project spec.

Step 1 — Install the infrastructure (npm)

npm install -g cladding   # install the cladding CLI
cd <project>              # move into the project
clad setup                # auto-wire your AI tools (Claude / Codex / Gemini / Cursor)

Where clad setup connects (4 hosts · 5 wire points)

Host (when detected)	Wired location	Auto-activation
Claude Code (`~/.claude/`)	`~/.claude/plugins/cladding`	`claude plugin marketplace add` + `install`
Codex CLI skills (`~/.agents/`)	`~/.agents/skills/cladding-*`	(auto on Codex restart)
Codex CLI MCP server (`~/.codex/`)	`[mcp_servers.cladding]` in `~/.codex/config.toml`	(TOML entry itself)
Gemini CLI (`~/.gemini/`)	`~/.gemini/extensions/cladding`	`gemini extensions link`
Cursor (`~/.cursor/`)	`mcpServers.cladding` in `~/.cursor/mcp.json`	(JSON entry itself)

clad setup invokes each host's activation command automatically when the claude / gemini binaries are on PATH. Safe to re-run after an upgrade or after installing a new AI tool.
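
If you want to confirm what a run of clad setup left behind, the wire points in the table can be inspected directly. The following read-only sketch is not part of cladding; `wired_hosts` is a hypothetical helper, the Codex check is a crude text match on the table header rather than a TOML parse, and it assumes the locations are exactly those listed above.

```python
import json
from pathlib import Path
from typing import Dict


def wired_hosts(home: Path) -> Dict[str, bool]:
    """Report which of the five wire points from the table exist under `home`."""

    def read(path: Path) -> str:
        # A missing or unreadable file is treated as "not wired".
        try:
            return path.read_text(encoding="utf-8")
        except (OSError, ValueError):
            return ""

    try:
        cursor = json.loads(read(home / ".cursor" / "mcp.json") or "{}")
    except ValueError:
        cursor = {}
    servers = cursor.get("mcpServers") if isinstance(cursor, dict) else None

    return {
        "claude-code": (home / ".claude" / "plugins" / "cladding").exists(),
        "codex-skills": any((home / ".agents" / "skills").glob("cladding-*")),
        # Crude check: look for the table header text instead of parsing TOML.
        "codex-mcp": "[mcp_servers.cladding]" in read(home / ".codex" / "config.toml"),
        "gemini": (home / ".gemini" / "extensions" / "cladding").exists(),
        "cursor": isinstance(servers, dict) and "cladding" in servers,
    }
```

Calling `wired_hosts(Path.home())` returns one boolean per wire point; a `False` for a host you have installed suggests re-running clad setup.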

Verification level (honesty note): Claude Code is fully verified through real-usage campaigns (including real-time intervention). Codex · Gemini CLI have automated wiring + basic behavior confirmed. Cursor wires automatically, but real-usage verification is still pending — to be updated as it lands.

About the MCP server. All 4 hosts wire cladding as an MCP server; only the wire location differs. MCP is not something you invoke directly: there is no /mcp slash command and no manual connect step. The AI in each host calls cladding's tools on its own in response to natural-language requests; you type /cladding:init once and then chat normally.

Step 2 — Init (create the project spec)

From the project directory, call it once inside your AI tool:

[inside your AI tool] /cladding:init "B2B payment SaaS"

The project's spec.yaml and supporting docs are created — once per project.

To raise enforcement: clad init --with-hook (install pre-commit + pre-push git hooks) · clad init --with-ci (scaffold the CI gate — true enforcement lives in CI).

Three init scenarios

Starting point	Command	What happens
An idea, nothing else	`/cladding:init "I'm going to build a B2B payment SaaS"`	LLM analyzes the domain → spec · docs · policies generated + 2–3 follow-up questions
A planning doc	`/cladding:init docs/plan.md`	recognizes the file path → loads the contents automatically and uses them as intent
Adopting into an existing project	`/cladding:init "apply cladding to this project"`	auto-scans the existing code → observed patterns merged with the intent

Init once, that's it

After init, just develop as usual. cladding runs the before/after loop in the background, so there are no commands to memorize.

Upgrading

npm update -g cladding     # 1. install the new version
cd <your project>          # 2. once per project
clad update                # 3. bring it in line with the new version

Your code · spec.yaml · docs are left untouched, so it's safe — and if the newer version is stricter and has something to flag, it just points it out (it won't block or fix anything).

Status

Version	Conformance	Tests	Gate	Features
v0.7.1 (2026-07)	L4 · self-declared	1691 / 1691	15 stages · 40 detectors	200 (196 done)

170 test files · 6 capabilities · coverage drop blocked by the COVERAGE_DROP detector

Road to Ironclad 1.0 — 1.0 locks only when two independent implementations pass the L4 conformance fixtures (GOVERNANCE § 1). cladding is the first.

Docs

License

MIT. LICENSE · Related: Ironclad (the standard cladding implements) · harness-boot (the seed).
