Merged
2 changes: 1 addition & 1 deletion .claude-plugin/marketplace.json
@@ -9,7 +9,7 @@
"name": "claude-code",
"source": "./plugins/claude-code",
"description": "Reference implementation of the Ironclad standard — multi-agent dev harness for Claude Code.",
-"version": "0.6.3",
+"version": "0.7.0",
"author": {
"name": "qwerfunch"
},
3 changes: 3 additions & 0 deletions .gitignore
@@ -68,3 +68,6 @@ playwright-report/
# but should not bloat the repository — keep them local-only.
docs/ab-evaluation-extended/scenarios/*/cladding/
docs/ab-evaluation-extended/scenarios/*/vanilla/

# Local screen recordings — source for README GIFs, regeneratable; keep out of the repo.
*.mov
2 changes: 1 addition & 1 deletion AGENTS.md
@@ -4,7 +4,7 @@ This file is the cross-tool entry point for any AI coding agent working on cladd

## 1. Project

-cladding is the reference implementation of the [Ironclad](https://github.com/qwerfunch/ironclad) standard. Multi-agent dev harness; 15 Iron Law stages; 37 drift detectors; polyglot toolchain (9 languages). Successor to harness-boot.
+cladding is the reference implementation of the [Ironclad](https://github.com/qwerfunch/ironclad) standard. Multi-agent dev harness; 15 Iron Law stages; 40 drift detectors; polyglot toolchain (9 languages). Successor to harness-boot.

## 2. Setup

115 changes: 115 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,121 @@ Versioning: [Semantic Versioning 2.0](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.7.0] — 2026-07-01 — Knowledge Graph

### Knowledge graph (spec↔code↔doc)

**In one line:** the links between spec, code, tests, and docs — which until now
only flowed one way and were scattered across shards — become a single,
always-current graph you can query for impact and *see* in a graph viewer.

> **Heads-up:** this is a **traceability and retrieval** capability, not a
> correctness one. It does not make generated code more correct (cladding's own
> A/B record shows that is orthogonal). What it does: pull the exact relevant
> neighborhood in one call instead of grepping, and stop doc/spec links from
> silently rotting.

**Added**

- **Reverse-edge index (backlinks).** Every forward edge the spec already carries
(`depends_on`, `modules`, `test_refs`) is now also queryable in reverse —
derived in memory, 0 bytes on disk. Module ownership is many-to-many on purpose
(a file records every feature that touches it).
- **`clad impact <feature|file>` + the `clad_get_impact` agent tool.** The blast
radius for a change: everything that transitively depends on it, the scenarios
at risk, and the exact set of tests to re-run. It is the backward complement of
`clad context`: context answers what this needs, impact answers what depends on
it. A module path fans out to every feature that touches it.
- **Doc graph + link integrity.** `clad sync` indexes which docs reference which
features and which docs link to which docs (`spec/_doc-links.yaml`). A new check
fails on a dead doc-to-doc link and warns on a doc citing a feature that no
longer exists. Scoped to skip fixture dirs, code examples, and docs marked
`clad-doc-links: ignore`, so it stays quiet on illustrative ids.
- **`clad graph export` + `clad graph stats`.** See the whole spec↔code↔doc graph
in a viewer you already have: `--format mermaid` for a PR, `--format obsidian`
for a navigable vault (one note per node with backlinks), `dot`/`json` for any
graph tool. `--focus <id> --depth N` exports just one neighborhood. `stats`
ranks the load-bearing hubs by degree.
- **Our own graph viewer, colored by SSoT layer.** `clad graph export --format
html` writes one self-contained file you can double-click — a dependency-free
interactive graph (no internet, no install). Each spec layer gets its own color
(sealed spec / design / derived / audit), code sits in a neutral tone, and
features show their readable slug instead of an opaque id. Search, filter by
layer or kind, hover to light up a neighborhood, drag to pin, a "Live/Calm"
toggle, light/dark — all in one offline page.
- **A live graph that follows your work.** `clad graph serve` opens the same
viewer at a local address and **updates itself as you edit** — change the spec
or a doc and the open page reflects it, no re-export. Agents can read the same
always-current graph through the new `clad_get_graph` tool.
- **An Obsidian-grade viewer.** The layout is now a continuously-running force
simulation: drag a node and the web stretches and recoils with real tension;
hovering pauses the motion so you can read; four force sliders (center / repel /
link / link distance) retune it live. Each node class has its own color — the
four spec layers, and code/test/doc each distinct — so the structure reads at a
glance.
- **The killer feature: live conformance, healing as you watch.** Every node carries its
real spec↔code health, computed from cladding's own drift detectors — a feature
whose test went missing, a file no feature claims, a doc pointing at a deleted
feature. Problem nodes glow; **fix the drift and the glow clears in real time**
(`clad graph serve`), with a top "in-sync %" pill. The graph IS the gate, made
visible — something only a tool that keeps spec and code connected-and-current
can show. (Static exports embed a point-in-time snapshot.)
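
The reverse-edge index and the impact query described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not cladding's implementation: the `Feature` shape, `buildBacklinks`, `impact`, and the sample ids are hypothetical, and returning the starting feature's own tests alongside those of its dependents is an assumption.

```typescript
// Hypothetical feature shape; the real spec schema may carry more fields.
interface Feature {
  id: string;
  depends_on: string[];
  modules: string[];
  test_refs: string[];
}

type Backlinks = Map<string, Set<string>>;

function addBacklink(index: Backlinks, target: string, source: string): void {
  const sources = index.get(target) ?? new Set<string>();
  sources.add(source);
  index.set(target, sources);
}

// Derive reverse edges in memory from the forward edges; nothing is written to disk.
function buildBacklinks(features: Feature[]): { dependents: Backlinks; owners: Backlinks } {
  const dependents: Backlinks = new Map();
  const owners: Backlinks = new Map();
  for (const f of features) {
    for (const dep of f.depends_on) addBacklink(dependents, dep, f.id);
    // Many-to-many on purpose: a module records every feature that touches it.
    for (const mod of f.modules) addBacklink(owners, mod, f.id);
  }
  return { dependents, owners };
}

// Blast radius: every feature that transitively depends on `start`, plus tests to re-run.
function impact(features: Feature[], start: string): { features: string[]; tests: string[] } {
  const { dependents, owners } = buildBacklinks(features);
  const byId = new Map<string, Feature>();
  for (const f of features) byId.set(f.id, f);

  // A feature id seeds itself; a module path fans out to every feature touching it.
  const seeds = byId.has(start) ? [start] : Array.from(owners.get(start) ?? []);
  const seen = new Set<string>(seeds);
  const queue = seeds.slice();
  while (queue.length > 0) {
    const current = queue.shift() as string;
    (dependents.get(current) ?? new Set<string>()).forEach((next) => {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    });
  }

  const tests = new Set<string>();
  seen.forEach((id) => {
    for (const ref of byId.get(id)?.test_refs ?? []) tests.add(ref);
  });
  return { features: Array.from(seen).sort(), tests: Array.from(tests).sort() };
}

// Illustrative data only.
const sample: Feature[] = [
  { id: "F-a", depends_on: [], modules: ["src/a.ts"], test_refs: ["tests/a.test.ts"] },
  { id: "F-b", depends_on: ["F-a"], modules: ["src/a.ts", "src/b.ts"], test_refs: ["tests/b.test.ts"] },
  { id: "F-c", depends_on: ["F-b"], modules: ["src/c.ts"], test_refs: ["tests/c.test.ts"] },
  { id: "F-d", depends_on: [], modules: ["src/d.ts"], test_refs: ["tests/d.test.ts"] },
];
console.log(impact(sample, "F-a").features); // F-a, F-b, F-c
```

Because the index is rebuilt from the forward edges on each call, it cannot drift from the spec. A real implementation would presumably cache it per spec load.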

**Notes**

- Drift detectors: 37 → 40 — this work adds `DOC_LINK_INTEGRITY` and `INFERABLE_DEPENDS_ON`; develop's `UNVERIFIED_AC` (below) is the third.
- The viewer is hand-rolled (no bundled third-party graph library) to stay
dependency-free and fully offline; the layout draws itself and settles, then
stays calm. It is a way to *see and navigate* the spec↔code↔doc structure, not
a correctness check — run `clad check` for that.
- Design + measured cost/benefit model: `docs/knowledge-graph/design.md`.

### Added

- **EARS `complex` pattern — the 6th canonical shape** (`F-9d168287`) — `src/spec/ears.ts`
implemented 5 of the 6 EARS patterns; the 6th, `complex` (a precondition combined
with a trigger, e.g. *"While the aircraft is on the ground, when reverse thrust is
commanded, the system shall …"*), was missing, and the validator keyed on the first
trigger word only — so a multi-clause `While … when …` requirement either failed
validation or was forced into a single-keyword bucket, silently losing the trigger
clause. `complex` is now a first-class `EarsPattern`: `checkEarsShape` validates BOTH
clauses (a leading `while` precondition AND a `when` trigger) and names the missing
one, preserving the precondition→trigger relationship EARS exists to capture. Purely
additive — the existing five patterns validate exactly as before. The new value is
mirrored across every enum site (`types.ts`, `spec/schema.json` ac.ears + always_ears,
`new.ts`, the MCP server enum) to keep authoring/validation/schema in lockstep.

- **`UNVERIFIED_AC` drift detector — AC → test → *observed pass*** (`F-96700032`)
— closes the one soft spot in the otherwise execution-based gate. `UNTESTED_AC`
only checks that a done AC's `test_refs` *exist on disk*, so an empty file, a
`test.skip`, or a failing test still satisfied it. When a JUnit XML report is
available — `gate.test_report` in `.cladding/config.yaml`, or a conventional
path (`test-report.junit.xml`, `coverage/junit.xml`, `.cladding/test-report.junit.xml`)
— `UNVERIFIED_AC` confirms each done AC's referenced tests actually **ran and
passed**: failing/errored or only-skipped tests are an `error`, and a test_ref
absent from a present report is a `warn` (a scoped/partial run is legitimate;
`--strict` promotes it). **Graceful by default:** with no report present the
detector emits nothing, leaving `UNTESTED_AC`'s existence check as the baseline,
so projects that don't emit JUnit XML are unaffected. Parsing is pure and
regex-based (no XML dependency), mirroring the coverage-XML approach.

- **`UNVERIFIED_AC` multi-framework `test_ref`↔testcase matching** (`F-d980359c`)
— the matcher was effectively vitest-only: it keyed every testcase by its
`classname` and assumed that was a file path. pytest (`tests.test_foo`),
Java/Kotlin (`com.example.FooTest`), and `file=`-attribute emitters therefore
never matched, so a *passing* test read as **`absent`** (a false positive under
`--strict`) and a *real* fail/skip was mis-reported as "did not run". The
parser now indexes each testcase under every path-shaped key it can derive
(the `file=` attribute, the `classname` as-is, and a dot→slash conversion of a
dotted classname) and matches `test_refs` **extension-agnostically**
(`FooTest` ↔ `FooTest.kt`). **Confident-or-degrade:** a report in which no key
is path-like (e.g. jest describe-title `classname`s that cannot be mapped to
files) is treated as unmappable and the detector emits nothing rather than
flooding the output with false `absent` findings, which preserves the
low-false-positive contract.
Measured A/B (OLD = `classname`-only): correct verdicts across a
vitest/pytest/Kotlin/jest matrix went **2/8 → 8/8**; parse cost on a 10k-case
report grew by ~1.6 ms (vitest-shaped) to ~7.8 ms (pytest-shaped) per gate run
— noise against the ~50 s gate.
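
The multi-framework matching described in the last entry can be sketched as below. It is only an approximation under stated assumptions: `TestCase`, `deriveKeys`, `lookup`, the `SOURCE_EXT` list, and the suffix-based comparison are hypothetical choices made for illustration, and the detector's real heuristics for what counts as path-shaped may differ.

```typescript
// Hypothetical testcase shape: only the two JUnit attributes discussed above.
interface TestCase {
  classname: string;
  file?: string;
}

type Verdict = "matched" | "absent" | "unmappable";

// Assumed extension list; a real matcher may be broader.
const SOURCE_EXT = /\.(ts|tsx|js|jsx|py|kt|java)$/;
// Dotted names such as `tests.test_foo` or `com.example.FooTest`.
const DOTTED_NAME = /^[A-Za-z_]\w*(\.[A-Za-z_]\w*)+$/;

// Extension-agnostic form, so `FooTest.kt` and `FooTest` compare equal.
function normalize(path: string): string {
  return path.replace(/\\/g, "/").replace(SOURCE_EXT, "");
}

// Assumed notion of path-shaped: no whitespace, and a slash or a source extension.
function isPathLike(key: string): boolean {
  return !/\s/.test(key) && (key.indexOf("/") !== -1 || SOURCE_EXT.test(key));
}

// Index a testcase under every path-shaped key that can be derived from it.
function deriveKeys(tc: TestCase): string[] {
  const candidates: string[] = [];
  if (tc.file) candidates.push(tc.file); // the `file=` attribute
  candidates.push(tc.classname); // the classname as-is
  if (DOTTED_NAME.test(tc.classname)) {
    candidates.push(tc.classname.replace(/\./g, "/")); // dot-to-slash conversion
  }
  return candidates.filter(isPathLike).map(normalize);
}

// Compare on path-segment boundaries so a package path can match a longer file path.
function sameFile(testRef: string, key: string): boolean {
  const ref = normalize(testRef);
  return ref === key || ref.endsWith("/" + key) || key.endsWith("/" + ref);
}

function lookup(testRef: string, report: TestCase[]): Verdict {
  let keys: string[] = [];
  for (const tc of report) keys = keys.concat(deriveKeys(tc));
  // Confident-or-degrade: with no path-like key at all, report nothing rather than `absent`.
  if (keys.length === 0) return "unmappable";
  return keys.some((key) => sameFile(testRef, key)) ? "matched" : "absent";
}

console.log(lookup("tests/test_foo.py", [{ classname: "tests.test_foo" }])); // matched
console.log(lookup("src/cart.test.ts", [{ classname: "Cart adds items" }])); // unmappable
```

Returning a distinct `unmappable` verdict, instead of folding it into `absent`, is what lets a caller stay silent on reports it cannot interpret.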

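The two-clause validation that the EARS `complex` entry describes might look like the sketch below. `checkComplexShape` and `ShapeResult` are hypothetical stand-ins; the real `checkEarsShape` in `src/spec/ears.ts` is not shown in this diff, and its signature and matching rules are assumptions here.

```typescript
// Hypothetical result shape: which required clause, if any, is missing.
interface ShapeResult {
  ok: boolean;
  missing: string[];
}

// Assumed rule: a `complex` requirement opens with a `while` precondition
// and also contains a `when` trigger; both are checked and reported by name.
function checkComplexShape(text: string): ShapeResult {
  const normalized = text.trim().toLowerCase();
  const missing: string[] = [];
  if (!/^while\b/.test(normalized)) missing.push("while");
  if (!/\bwhen\b/.test(normalized)) missing.push("when");
  return { ok: missing.length === 0, missing };
}

const requirement =
  "While the aircraft is on the ground, when reverse thrust is commanded, the system shall …";
console.log(checkComplexShape(requirement)); // ok: true, missing: []
console.log(checkComplexShape("While the aircraft is on the ground, the system shall …").missing); // [ 'when' ]
```

Checking both clauses independently, rather than keying on the first trigger word, is what allows the validator to name the missing one.
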
## [0.6.3] — 2026-06-26 — Honest Status

**In one line:** the one-line-per-feature index that agents grep (and that feeds