feat(gate): emit self JUnit + reclassify doc/scaffold test_refs as evidence_refs #220
Open
yuyu04 wants to merge 1 commit into
… evidence_refs

cladding's UNVERIFIED_AC detector verifies that each done AC's test_refs actually RAN and PASSED (via a JUnit report), but cladding never emitted one for itself, so on its own gate the check degraded to existence-only and gave ZERO self-benefit. Enabling emission first surfaced 21 warnings: 15 test_refs across 10 features pointed at things that are not executable tests (docs, underscore-prefixed scaffolds, .gitignore, spec.yaml dogfood markers), so they have no observed result in any JUnit report.

- Wire vitest to emit JUnit to .cladding/test-report.junit.xml (a DEFAULT_REPORT_CANDIDATE the detector auto-discovers; gitignored, so it never pollutes git status). In CI `npm test` runs before `clad check`, so the gate reads a fresh, complete report and now genuinely verifies AC↔test↔pass.
- Reclassify the 15 non-test refs test_refs → evidence_refs. MISSING_TESTS is satisfied by EITHER test_refs or evidence_refs, so each AC stays verified; the refs are simply recorded as the doc/artifact evidence they actually are (the v0.2.2 "lump everything into test_refs" dishonesty, recurring). One AC (F-d8223c) kept its real self-consistency test_ref and moved only the scaffold.

Result: UNVERIFIED_AC 0 / UNTESTED_AC 0 / MISSING_TESTS 0, strict gate GREEN with the report present. The self-gate now actually exercises its own JUnit feature instead of perpetually degrading.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
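For orientation, the vitest wiring from the first bullet might look roughly like the sketch below. Only the `reporters` entry and the output path come from this PR; the use of `defineConfig` from `vitest/config` and the absence of any other options are assumptions, since the repo's full `vitest.config.ts` is not shown here.

```typescript
// vitest.config.ts (sketch; the real file likely carries more options)
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    reporters: [
      // Keep the normal console output for humans.
      'default',
      // Also write a JUnit XML report where the detector looks for one.
      ['junit', { outputFile: '.cladding/test-report.junit.xml' }],
    ],
  },
});
```

Keeping `'default'` in the list matters because setting `reporters` replaces the default reporter rather than adding to it.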
What

cladding's `UNVERIFIED_AC` detector verifies that each done AC's `test_refs` actually ran and passed, via a JUnit report. But cladding never emitted one for itself, so on its own self-gate the check silently degraded to existence-only: zero self-benefit (the gap flagged while triaging #201/#202).

This wires the emission and fixes what enabling it exposed.

Changes

1. Emit JUnit (`vitest.config.ts`)

`reporters: ['default', ['junit', {outputFile: '.cladding/test-report.junit.xml'}]]`. The output path is a `DEFAULT_REPORT_CANDIDATE` the detector auto-discovers, under `.cladding/` (gitignored, never pollutes git status). In CI `npm test` runs before `clad check`, so the gate reads a fresh, complete report and now genuinely verifies AC↔test↔observed-pass.

2. Reclassify 15 mis-categorized `test_refs` → `evidence_refs` (10 features)

Turning emission on first surfaced 21 warnings: 15 `test_refs` pointed at things that are not executable tests and so have no observed result in any JUnit report: docs (`docs/ab-evaluation*/**.md`), `.gitignore`, `spec.yaml` (dogfood markers), and underscore-prefixed scaffolds (`_vanilla-sim.ts`, `_shared-scaffold.ts`, `_drift-injection.ts`, `_curator.ts`, `_size-budgets.ts`). `MISSING_TESTS` is satisfied by either `test_refs` or `evidence_refs`, so each AC stays verified; the refs are now recorded as the doc/artifact evidence they actually are (undoing the v0.2.2 "lump everything into test_refs" dishonesty that had recurred). One AC (F-d8223c) kept its real `self-consistency.test.ts` test_ref and moved only the scaffold.

Verification

UNVERIFIED_AC 0 / UNTESTED_AC 0 / MISSING_TESTS 0; strict gate GREEN with the report present. The self-gate now exercises its own JUnit feature instead of perpetually degrading.

`npm test`, conformance, `clad check --tier=pre-commit --strict`.

Base note

Branched off `develop` (independent of #219). Both re-stamp `spec/attestation.yaml`, so whichever of #219/this merges second needs a trivial `clad sync` + re-attest; no logical conflict.

Closes the ② half of the #202 follow-up (① = detector count, shipped in #219).
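To make the reclassification logic concrete, here is a small self-contained sketch of why moving a non-test ref from `test_refs` to `evidence_refs` clears `UNVERIFIED_AC` without tripping `MISSING_TESTS`. Everything in it is hypothetical: the `AcceptanceCriterion` shape, the `checkAc` and `reclassify` helpers, the filename heuristic, and the `tests/` paths are illustrations only, not cladding's actual detector code or spec layout.

```typescript
// Hypothetical shapes; cladding's real types are not shown in this PR.
interface AcceptanceCriterion {
  id: string;
  testRefs: string[];
  evidenceRefs: string[];
}

type Finding = "MISSING_TESTS" | "UNVERIFIED_AC";

// Assumed heuristic mirroring the categories listed in this PR:
// docs, dotfiles, spec.yaml, and underscore-prefixed scaffolds are not tests.
function isExecutableTestRef(ref: string): boolean {
  const base = ref.split("/").pop() ?? ref;
  if (base.startsWith("_") || base.startsWith(".")) return false;
  if (base.endsWith(".md") || base === "spec.yaml") return false;
  return true;
}

// MISSING_TESTS: the AC has neither test_refs nor evidence_refs.
// UNVERIFIED_AC: some test_ref has no observed pass in the JUnit report.
function checkAc(ac: AcceptanceCriterion, passedFiles: Set<string>): Finding[] {
  const findings: Finding[] = [];
  if (ac.testRefs.length === 0 && ac.evidenceRefs.length === 0) {
    findings.push("MISSING_TESTS");
  }
  if (ac.testRefs.some((ref) => !passedFiles.has(ref))) {
    findings.push("UNVERIFIED_AC");
  }
  return findings;
}

// Move non-executable refs to evidence_refs; keep real tests where they are.
function reclassify(ac: AcceptanceCriterion): AcceptanceCriterion {
  return {
    ...ac,
    testRefs: ac.testRefs.filter(isExecutableTestRef),
    evidenceRefs: [
      ...ac.evidenceRefs,
      ...ac.testRefs.filter((ref) => !isExecutableTestRef(ref)),
    ],
  };
}

// Example: one real test plus one scaffold lumped into test_refs.
const before: AcceptanceCriterion = {
  id: "AC-example",
  testRefs: ["tests/self-consistency.test.ts", "tests/_shared-scaffold.ts"],
  evidenceRefs: [],
};
// Files the (hypothetical) JUnit report recorded as passing.
const passed = new Set(["tests/self-consistency.test.ts"]);

console.log(checkAc(before, passed));
console.log(checkAc(reclassify(before), passed));
```

In this sketch the scaffold can never appear in the report because the test runner does not execute it as a test, so it stays flagged until it is recorded as evidence instead.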
🤖 Generated with Claude Code