perf(gate): run the test suite once per gate — share coverage run with unit stage (closes #215)#216
Open
yuyu04 wants to merge 2 commits into
…(F-bfe14aac)

The TypeScript type and lint gates re-ran from scratch on every gate. They now reuse a build cache: `tsc --noEmit --incremental` (build-info file) and `eslint --cache`, both under `.cladding/cache/` (already gitignored, so a managed project's tree stays clean).

Measured on cladding's own repo (unchanged re-run, i.e. the local pre-commit/pre-push loop): tsc 2.7s → 1.1s, eslint 2.5s → 0.6s (~3.4s saved).

SOUND, not a shortcut: with a stale build-info present, a newly introduced type error in an included file is STILL caught (verified: tsc rebuilds the affected program slice; `eslint --cache` keys on file+config hash). Cold runs (fresh CI checkout) just rebuild the cache, so there is no regression.

Test execution is deliberately NOT scoped: a gate must certify the whole tree, so changed-files / test-selection (unsound for a gate) was avoided. The dominant test cost (~20s) and the unit+coverage double-run (~9.5s) are noted as separate follow-ups.

Existing toolchain arg-pins updated; blind-authored test (incremental-gate.test.ts, 5).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
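As a rough illustration of the cached invocations described in this commit, the argument lists might be assembled as below. This is a minimal sketch, not the PR's code: the helper names `typeGateArgs` and `lintGateArgs` and the file names inside the cache directory are assumptions. Only the flags themselves (`--noEmit`, `--incremental`, `--tsBuildInfoFile` for tsc; `--cache`, `--cache-location` for eslint) are standard tool options.

```typescript
// Hypothetical helpers: the function names and the cache file names are
// illustrative assumptions, not taken from the PR.
const DEFAULT_CACHE_DIR = ".cladding/cache";

// Arguments for an incremental, no-emit type check.
// tsc stores its build info in the given file and reuses it on the next run.
function typeGateArgs(cacheDir: string = DEFAULT_CACHE_DIR): string[] {
  return [
    "--noEmit",
    "--incremental",
    "--tsBuildInfoFile",
    `${cacheDir}/tsc.tsbuildinfo`,
  ];
}

// Arguments for a cached lint run over the current directory.
// eslint skips files whose cache entry is still valid.
function lintGateArgs(cacheDir: string = DEFAULT_CACHE_DIR): string[] {
  return ["--cache", "--cache-location", `${cacheDir}/eslintcache`, "."];
}
```

Keeping both cache files under one gitignored directory is what lets a cold checkout simply rebuild them without touching the managed project's tree.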
… with the unit stage (F-97abf5db)

pre-push ran BOTH stage_2.1 (`vitest run`) and stage_2.2 (`vitest run --coverage`), executing the full suite TWICE (~9.5s + ~10.7s). The coverage run already runs every test, so the unit run was redundant.

- src/stages/test-run-cache.ts (new): gate-scoped memo (mirrors spec cache F-cd0415): primeTestRunCache(on) + memoizeTestRun(cwd, run) + the pure unitActionFromCoverage decision.
- unit.ts: in a primed gate, trigger + share the coverage run; on GREEN reuse it (no second suite run); on non-green fall back to a tests-only run.
- cov.ts: read the shared (memoized) coverage run.
- clad.ts: prime around the stage loop, clear in finally.

SOUND attribution: reuse-pass is returned ONLY for a green coverage run, so a failing test can never surface as a unit pass. Verified: a failing test reds BOTH stages; a passing suite greens both with one run.

Test SELECTION (changed-files) intentionally avoided: a gate must run the whole suite.

Measured (cladding's own repo): clad check --tier=pre-push ~40.4s → ~30.1s (-~10s).

Blind-authored tests (tests/stages/test-run-dedup.test.ts, 13).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
8c3ac81 to
6e8a5cf
Summary
`pre-push` ran the full test suite twice: `stage_2.1` (`vitest run`) and `stage_2.2` (`vitest run --coverage`). The coverage run already runs every test, so a gate-scoped memo now shares ONE run across both stages. Closes #215.

A/B (cladding's own repo, worktrees)

`clad check --tier=pre-push`: ~40.4s → ~30.1s (about 10s saved, as reported in the commit message)

Soundness (verified: a gate must never pass a broken tree)

`unitActionFromCoverage` returns `reuse-pass` only for a green coverage run (`exitCode === 0`); every non-green case → `fallback` (tests-only re-run). Direct integration check on a temp fixture: a failing test reds both stages, and a passing suite greens both with one run. A coverage-threshold miss (tests pass) → coverage fails, unit's tests-only fallback passes → correct attribution. A failing test can never be reported as a unit pass via reuse.
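The decision rule above is small enough to sketch directly. The version below is a minimal illustration under stated assumptions: the `CoverageRun` shape, the nullable `exitCode`, and the exact signature are guesses for the sake of the example. Only the rule itself (reuse on `exitCode === 0`, fall back otherwise) comes from the description.

```typescript
// Assumed result shape: only exitCode matters for the decision.
// A null exit code (e.g. killed by a signal) and a missing run are both
// treated as non-green in this sketch.
interface CoverageRun {
  exitCode: number | null;
}

type UnitAction = "reuse-pass" | "fallback";

// Pure decision: reuse the coverage run as the unit result only when it is
// green; every other case asks for a tests-only re-run.
function unitActionFromCoverage(run: CoverageRun | undefined): UnitAction {
  return run !== undefined && run.exitCode === 0 ? "reuse-pass" : "fallback";
}
```

Because the only path to a unit pass without a second run is a green coverage run, a coverage-threshold miss (non-zero exit with passing tests) lands in `fallback`, and the tests-only re-run then attributes the failure to coverage alone.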
What's in the box
- `src/stages/test-run-cache.ts` (new): `primeTestRunCache(on)`, `memoizeTestRun(cwd, run)` (generic, keyed by resolved cwd), and the pure `unitActionFromCoverage` decision.
- `unit.ts`: in a primed gate, trigger + share the coverage run; reuse on green, tests-only fallback on non-green. Unprimed (standalone / MCP) → unchanged.
- `cov.ts`: read the shared (memoized) coverage run.
- `clad.ts`: prime around the stage loop, clear in `finally`.

Scope
Test selection (changed-files) is deliberately NOT done — a gate certifies the whole tree. This removes only the duplicate full run, not coverage of any test.
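For readers who want a picture of the gate-scoped memo listed under "What's in the box", a minimal sketch follows. It assumes that priming with `false` also clears the cache and that runs are promises. The bodies and the `runGate` wrapper are hypothetical illustrations, not the PR's implementation.

```typescript
import { resolve } from "node:path";

// Module-level state: the memo is only active while a gate has primed it.
let primed = false;
const runs = new Map<string, Promise<unknown>>();

// Assumption: toggling the flag also drops any cached runs, so nothing
// leaks from one gate into the next.
function primeTestRunCache(on: boolean): void {
  primed = on;
  runs.clear();
}

// Share one run per resolved cwd while primed; when unprimed (standalone
// stage invocation), just execute the run as before.
function memoizeTestRun<T>(cwd: string, run: () => Promise<T>): Promise<T> {
  if (!primed) return run();
  const key = resolve(cwd);
  const cached = runs.get(key);
  if (cached !== undefined) return cached as Promise<T>;
  const started = run();
  runs.set(key, started);
  return started;
}

// Hypothetical gate loop: prime around the stages, clear in finally.
async function runGate(stages: Array<() => Promise<void>>): Promise<void> {
  primeTestRunCache(true);
  try {
    for (const stage of stages) await stage();
  } finally {
    primeTestRunCache(false);
  }
}
```

Storing the promise rather than the settled result means whichever stage asks first starts the run and the other simply awaits the same one, so the order of the unit and coverage stages does not matter.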
Feature cycle
`spec/features/test-run-dedup-97abf5db.yaml` (F-97abf5db, 4 ACs) → implement → blind tests (`tests/stages/test-run-dedup.test.ts`, 13) → `clad done` GREEN (cladding's own gate ran with the dedup and passed).

🤖 Generated with Claude Code