Per-plan PR model — each plan ships as its own stacked/parallel PR#26
Open
andrewemark wants to merge 3 commits into
## Motivation: slowing down to speed up

We're introducing a deliberately more broken-out PR process (one PR per plan, each gated by per-plan assess → docs → open-pr) so that we can review each plan's actual output in isolation and see where the agent's implementation or our planning falls short. A feature-level PR makes those gaps easy to miss; a per-plan PR forces them to the surface.

The eventual goal is to NOT need this much process. Once we trust the agent's planning + implementation cycle enough that per-plan review stops catching meaningful gaps, we can collapse this back toward fewer, larger PRs. Until then, slowing down here lets us:

- See where the agent under-specifies plans or over-builds implementations
- Revise the SDLC process itself based on real friction, not speculation
- Get features across the line in the architectural form we actually want, not the form that emerged from accumulated drift across one large PR

TL;DR: slowing down to speed up.

## What changes

Each plan in a feature now ships as its own pull request with self-contained PR-body documentation, instead of a single feature-level PR at the end. PR bases form a DAG derived from plan depends_on: independent plans open parallel PRs against the feature parent; dependent plans stack on their upstream plan's branch.

Per-plan PR gate (REQUIRED before next plan):

    /drvr:assess <plan> → /drvr:docs-artifacts <plan> → /drvr:open-pr <plan>

Each gate step is surfaced explicitly in the orchestration and implementation skills so the agent doesn't collapse the chain or skip ahead to the next plan.

Key changes:

- CLAUDE.md: lifecycle diagram, folder layout, DAG-of-bases mental model
- /drvr:feature: collects feature parent + branch prefix (not single feature branch)
- planning-guidance: per-plan Environment (Base/Feature branch from depends_on); PR Stack table in overview; branch confirmation in Step 4.5
- implementation-guidance: per-plan branch validation in pre-flight 2.3; Step 5.5 surfaces the per-plan gate explicitly with the assess→docs→open-pr chain
- sdlc-orchestration: per-plan gate transitions; per-plan FEATURE_LOG events (assessment_complete_<plan>, pr_created_<plan>, pr_merged_<plan>, etc.); DAG-aware "next plan" hints (parallel vs stacked)
- /drvr:assess: scoped to one plan; output is assessment/<plan>-test-curation.md
- /drvr:docs-artifacts: writes driver-docs/<plan>/* with mandatory Stack Position section so PR body is self-contained for reviewers without DAG context
- /drvr:open-pr: per-plan; base from plan Environment; asks the user for a new base if recorded Base Branch was deleted by a merged upstream

Files: 8 changed, +605 / -291

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The assess command previously biased toward KEEP in four places ("when
uncertain, KEEP", "this is a pruning pass, not a purge", failure-handling
notes, etc.) on the rationale that false-keeps cost less than false-prunes.
In practice this lets tautological scaffolding ship — e.g., tests that
assert "the enum has three variants" against a three-variant enum literal.
That assertion passes iff the implementation is unchanged and catches no
real bug; it's the canonical PRUNE case. The KEEP-by-default rule defeated
the assess phase's purpose.
Replace with a shape-based rule: judge by what the test asserts.
- Structural (counts, enum membership, types, mock call shapes,
internal state) → PRUNE
- Behavioral (inputs → outputs, error modes, contract boundaries) → KEEP
Tautological structural assertions are called out explicitly as a PRUNE
signal in Step 3, with the enum-variant example. The "only-coverage-of-an-
edge-case" exception is preserved but reframed to require *behavioral*
coverage — structural-only coverage is not coverage. Step 7's post-prune
failure handling (restore if tests fail) is unchanged — that's a defensive
mechanism, not a default-bias issue.
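
To make the shape-based rule concrete, here is a minimal sketch of the distinction. The `Status` enum and `parse_status` function are hypothetical stand-ins invented for illustration; they are not taken from the plugin or from any test suite in this repository.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical three-variant enum, standing in for the example above.
    PENDING = "pending"
    ACTIVE = "active"
    CLOSED = "closed"


def parse_status(raw: str) -> Status:
    """Hypothetical function under test: map raw user input to a Status."""
    try:
        return Status(raw.strip().lower())
    except ValueError:
        raise ValueError(f"unknown status: {raw!r}") from None


def test_status_has_three_variants():
    # Structural and tautological: it restates the enum literal. PRUNE.
    assert len(Status) == 3


def test_parse_status_normalizes_input():
    # Behavioral: input maps to output. KEEP.
    assert parse_status("  Active ") is Status.ACTIVE


def test_parse_status_rejects_unknown():
    # Behavioral: error mode at the contract boundary. KEEP.
    try:
        parse_status("archived")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unknown status")
```

The first test only fails when someone edits the enum literal, so it guards nothing a reviewer would not already see in the diff. The other two pin down behavior that callers depend on.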
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous Step 7 said: "If tests fail after changes, investigate: a pruned test was the only coverage for a real behavior → restore it as KEEP." This doesn't track:

- A deleted test cannot fail (it's gone), and pruning test X does not cause unrelated test Y to fail in any normal sense
- The only way a pruned test's removal would surface as a test-suite failure is via test interdependence (shared state, ordering), and the right fix there is to extract the dependency into a fixture, not to restore the pruned test
- "The pruned test was the only coverage for a real behavior" is not a signal a passing/failing test suite can give you: the surviving tests pass and the behavior is silently uncovered

The "restore as KEEP" rule was a vestige of the KEEP-by-default bias removed in 5f2013a, repackaged as defensive recovery. Replace with the actual failure modes that arise after Step 7:

- Promoted test rewritten incorrectly → fix the rewrite (common case)
- Surviving test depended on pruned test's setup/state → extract a fixture
- Unrelated regression in the curation commit → revert and redo cleanly
- Pre-existing unrelated failure → address separately

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
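
As an illustration of the "extract a fixture" failure mode, the sketch below moves setup that might once have lived inside a pruned test into `setUp`, so that each surviving test builds its own state. The `Registry` class and the test names are hypothetical, and the standard-library `unittest` module is used here only as an assumption about the test framework.

```python
import unittest


class Registry:
    """Hypothetical system under test."""

    def __init__(self):
        self._items = {}

    def register(self, name, value):
        self._items[name] = value

    def lookup(self, name):
        return self._items[name]


class RegistryLookupTest(unittest.TestCase):
    def setUp(self):
        # Fixture: state that a pruned test used to create as a side effect.
        # Every surviving test now gets a fresh, explicitly built registry.
        self.registry = Registry()
        self.registry.register("alpha", 1)

    def test_lookup_returns_registered_value(self):
        self.assertEqual(self.registry.lookup("alpha"), 1)

    def test_lookup_unknown_name_raises(self):
        with self.assertRaises(KeyError):
            self.registry.lookup("missing")
```

With the setup in a fixture, the surviving tests no longer depend on execution order or on any other test existing.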
Why this change: slowing down to speed up
We're introducing a deliberately more broken-out PR process (one PR per plan, each gated by `/drvr:assess` → `/drvr:docs-artifacts` → `/drvr:open-pr`) so we can review each plan's actual output in isolation and see where the agent's implementation or our planning falls short. A feature-level PR makes those gaps easy to miss; a per-plan PR forces them to the surface.

The eventual goal is to NOT need this much process. Once we trust the agent's planning + implementation cycle enough that per-plan review stops catching meaningful gaps, we can collapse this back toward fewer, larger PRs. Until then, slowing down here lets us:

- See where the agent under-specifies plans or over-builds implementations
- Revise the SDLC process itself based on real friction, not speculation
- Get features across the line in the architectural form we actually want, not the form that emerged from accumulated drift across one large PR

TL;DR: slowing down to speed up.
What this PR does
Restructures the SDLC post-implementation flow so each plan in a feature ships as its own pull request with self-contained PR-body documentation, instead of one feature-level PR at the end.
Mental model: DAG of bases (not a linear stack)
Each plan's PR Base Branch is derived from its `depends_on`:

- `depends_on: []` → Base = feature parent (e.g. `main`). Independent plans get parallel PRs.
- `depends_on: [N]` → Base = upstream plan N's Feature Branch. Dependent plans get stacked PRs.
- `depends_on: [N, M]` → user picks one as the branch parent; others satisfied by interface contracts.

Plans can be implemented in any dependency-respecting order. Plan 8 before Plan 2 is fine if they're independent: both PRs target the feature parent in parallel.
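
The derivation rule above can be sketched as a small function. This is an illustration only: the function name, its parameters, the `pick_parent` callback, and the branch names in the usage lines are all hypothetical, since the plugin expresses this rule in skill documentation and not in code.

```python
from typing import Callable, Mapping, Optional, Sequence


def derive_base_branch(
    depends_on: Sequence[int],
    feature_parent: str,
    feature_branches: Mapping[int, str],
    pick_parent: Optional[Callable[[Sequence[int]], int]] = None,
) -> str:
    """Return the PR Base Branch for a plan, given its depends_on list."""
    if not depends_on:
        # Independent plan: parallel PR against the feature parent.
        return feature_parent
    if len(depends_on) == 1:
        # Single upstream: stack on that plan's Feature Branch.
        return feature_branches[depends_on[0]]
    # Several upstreams: the user picks one as the branch parent; the
    # others are satisfied by interface contracts, not by branch ancestry.
    if pick_parent is None:
        raise ValueError("multiple upstream plans: a branch parent must be chosen")
    chosen = pick_parent(depends_on)
    if chosen not in depends_on:
        raise ValueError(f"plan {chosen} is not in depends_on {list(depends_on)}")
    return feature_branches[chosen]


# Hypothetical branch names, for illustration only.
branches = {1: "feat/01-schema", 2: "feat/02-api"}
independent = derive_base_branch([], "main", branches)
stacked = derive_base_branch([1], "main", branches)
multi = derive_base_branch([1, 2], "main", branches, pick_parent=lambda deps: deps[-1])
```

Keeping the user's choice behind a callback mirrors the rule that the multi-dependency case is a human decision, not something inferred automatically.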
Per-plan PR gate (REQUIRED before next plan)
`/drvr:assess <plan>` → `/drvr:docs-artifacts <plan>` → `/drvr:open-pr <plan>`

Each step is surfaced explicitly in `sdlc-orchestration` and `implementation-guidance` so the agent doesn't collapse the chain or skip ahead.

Per-plan handoff doc structure
Each per-plan `feature-overview.md` has a mandatory Stack Position section (base/feature branch, upstream/downstream plans, link to cross-plan rollup) so a reviewer who hasn't seen the rest of the DAG can still evaluate the PR.

Edge cases handled
- If a plan's recorded Base Branch was deleted by a merged upstream, `/drvr:open-pr` checks `git ls-remote` and asks the user which branch to target instead (usually the feature parent). The plan's recorded Base Branch is updated to match.
- Per-plan `FEATURE_LOG.md` events (`pr_created_<plan>`, `pr_merged_<plan>`, etc.) record each plan's PR state separately. Plan 01 merged, Plan 02 in review, Plan 03 mid-implementation is a valid simultaneous state.

Files changed
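| File | Change |
| --- | --- |
| `CLAUDE.md` | Lifecycle diagram, folder layout, DAG-of-bases mental model |
| `commands/feature.md` | Collects feature parent + branch prefix (not single feature branch) |
| `skills/planning-guidance/SKILL.md` | Per-plan Environment (Base/Feature branch from `depends_on`); PR Stack table in overview; branch confirmation in Step 4.5 |
| `skills/implementation-guidance/SKILL.md` | Per-plan branch validation in pre-flight 2.3; Step 5.5 surfaces the per-plan gate explicitly |
| `skills/sdlc-orchestration/SKILL.md` | Per-plan gate transitions; per-plan FEATURE_LOG events; DAG-aware "next plan" hints (parallel vs stacked) |
| `commands/assess.md` | Scoped to one plan; output is `assessment/<plan>-test-curation.md`; gates `/drvr:docs-artifacts <plan>` |
| `commands/docs-artifacts.md` | Writes `driver-docs/<plan>/*` with mandatory Stack Position section; updates cross-plan rollup; gates `/drvr:open-pr <plan>` |
| `commands/open-pr.md` | Per-plan; base from plan Environment; asks the user for a new base if the recorded Base Branch was deleted by a merged upstream |

Stats: 8 files changed, +605 / −291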
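
The deleted-base edge case listed under "Edge cases handled" could be checked along these lines. This is a sketch under assumptions: `/drvr:open-pr` is a prompt-driven command, so the function names, the `origin` remote, and the `ask_user` callback are hypothetical. The only external behavior relied on is that `git ls-remote --exit-code` exits with status 2 when no matching ref is found.

```python
import subprocess
from typing import Callable


def remote_branch_exists(branch: str, remote: str = "origin", run=subprocess.run) -> bool:
    """Return True if refs/heads/<branch> exists on the remote."""
    result = run(
        ["git", "ls-remote", "--exit-code", "--heads", remote, f"refs/heads/{branch}"],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 2:
        # --exit-code: status 2 means no matching refs on the remote.
        return False
    raise RuntimeError(f"git ls-remote failed: {result.stderr.strip()}")


def resolve_pr_base(
    recorded_base: str,
    ask_user: Callable[[], str],
    exists: Callable[[str], bool] = remote_branch_exists,
) -> str:
    """Keep the recorded Base Branch if it still exists; otherwise ask the user."""
    if exists(recorded_base):
        return recorded_base
    # Recorded base is gone (for example, deleted after an upstream merge).
    # The caller would also write the answer back to the plan's Environment.
    return ask_user()
```

The `run` and `exists` parameters are injectable so the logic can be exercised without a real remote.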
Test plan
This plugin can't be unit-tested end-to-end without running through a real feature, so the verification here is documentation review:
- `CLAUDE.md` matches the per-plan gate described in `sdlc-orchestration`
- `planning-guidance` Step 4.5 and Step 5 prescribe the same Base Branch derivation rule (from `depends_on`)
- `implementation-guidance` Step 5.5 names the same three gate commands (`assess`, `docs-artifacts`, `open-pr`) in the same order as `sdlc-orchestration` Bookkeeping → Per-Plan PR Gate
- `commands/open-pr.md` Step 5 correctly handles the merged-upstream case (asks user, updates plan Environment)
- `commands/docs-artifacts.md` includes the Stack Position section as a required section

Out of scope (follow-ups if needed)

- `implementation-guidance` pre-flight 2.3 detects missing Feature Branch and suggests `git checkout -b <branch> <base>`, but doesn't run it. Could be automated later if we trust the convention.
- `/drvr:rebase-on-upstream` command.

🤖 Generated with Claude Code