Skip to content

Add prose write test iterations#91

Draft
rawwerks wants to merge 1 commit into
feature/prose-write-runfrom
feature/prose-write-test-iterations-v2
Draft

Add prose write test iterations#91
rawwerks wants to merge 1 commit into
feature/prose-write-runfrom
feature/prose-write-test-iterations-v2

Conversation

@rawwerks
Copy link
Copy Markdown
Contributor

@rawwerks rawwerks commented May 21, 2026

Summary

Adds a bounded generated-test loop for apply-enabled prose write CLI runs.

  • --test-iterations=1 is the default for prose write --out ... --apply and --run.
  • --test-iterations=0 disables the loop.
  • --test-iterations=3 allows up to three generated-test attempts with apply-enabled repair authoring passes between failed attempts.

Use Case

When prose write authors a package that includes kind: test files, the CLI should dogfood those tests before handing the result back or starting the --run follow-up. This is the sister PR to #90: #90 adds write/apply/run, and this PR adds the test feedback loop on top.

Design Boundary

This stays a CLI host-adapter macro. --test-iterations is parsed by the CLI and is not passed into prose-author. The CLI invokes ordinary prose test <generated-test> after successful apply-enabled authoring, then invokes a new apply-enabled authoring pass only to repair failed generated tests.

Plain package-only prose write "..." is unchanged. Plain in-session prose write still routes to prose-author; non-CLI hosts that cannot perform an explicit --test-iterations macro must reject it before authoring.

The loop fails closed if a repair pass deletes an initially generated failing test instead of making it pass. Test discovery skips symlinks and tolerates file/directory races while walking generated files.

Repair prompts include captured failing test output capped at 12 KB. The side-effect ban and target-path repair instruction are prepended before captured output to reduce prompt-injection risk.

If the loop exhausts its iteration bound, the CLI returns the final failing prose test exit code rather than a distinct sentinel.

Examples

prose write --out src/reviewer --apply "draft a reviewer system"
prose write --out src/reviewer --apply --test-iterations=0 "draft a reviewer system"
prose write --out src/reviewer --apply --test-iterations=3 "draft a reviewer system"
prose write --out src/reviewer --run "draft a reviewer system"

Testing

  • pnpm --filter @openprose/prose-cli test -- tests/prose/command-model.test.ts tests/cli/cli.test.ts tests/skills/open-prose.test.ts passed: 140 tests.
  • pnpm --filter @openprose/prose-cli typecheck passed.
  • pnpm --filter @openprose/prose-cli build passed.
  • git diff --check passed.
  • ubs tools/cli/src/commands/base.ts tools/cli/src/prose/command-model.ts tools/cli/tests/prose/command-model.test.ts found no critical issues; remaining warnings were existing heuristic warnings in tests/general code patterns.
  • git rebase origin/feature/prose-write-run completed after preserving both the Add prose write run option #90 --out prompt coverage and this PR's test-iteration coverage.

Residual Risk

This PR is stacked on #90 and should be reviewed/merged after or with that branch. The local full CLI suite still has the known tests/prose/crash-window-replay.test.ts timeout observed on #90/main-local testing; the focused slice and CI-equivalent typed/build checks for this feature are green locally. GitHub checks are rerunning on the latest pushed head.

@rawwerks rawwerks force-pushed the feature/prose-write-test-iterations-v2 branch from 518824e to 328af05 Compare May 21, 2026 22:42
@rawwerks
Copy link
Copy Markdown
Contributor Author

Applied the advisory #91 polish on top of the rebase to PR #90 head in 328af05:

  • clearer missing-value UX for --test-iterations --apply
  • race-tolerant generated test discovery/read paths
  • README/body notes that --apply now defaults to one generated-test attempt, repair prompts include capped captured test output, and exhausted iterations return the final failing prose test exit code

Focused tests, typecheck, build, git diff --check, and UBS critical=0 passed locally. GitHub checks are green on the latest head.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant