Add prose write test iterations#91
Draft
rawwerks wants to merge 1 commit into
Draft
Conversation
518824e to
328af05
Compare
Contributor
Author
|
Applied the advisory #91 polish on top of the rebase to PR #90 head in
Focused tests, typecheck, build, |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a bounded generated-test loop for apply-enabled
prose writeCLI runs.--test-iterations=1is the default forprose write --out ... --applyand--run.--test-iterations=0disables the loop.--test-iterations=3allows up to three generated-test attempts with apply-enabled repair authoring passes between failed attempts.Use Case
When
prose writeauthors a package that includeskind: testfiles, the CLI should dogfood those tests before handing the result back or starting the--runfollow-up. This is the sister PR to #90: #90 adds write/apply/run, and this PR adds the test feedback loop on top.Design Boundary
This stays a CLI host-adapter macro.
--test-iterationsis parsed by the CLI and is not passed intoprose-author. The CLI invokes ordinaryprose test <generated-test>after successful apply-enabled authoring, then invokes a new apply-enabled authoring pass only to repair failed generated tests.Plain package-only
prose write "..."is unchanged. Plain in-sessionprose writestill routes toprose-author; non-CLI hosts that cannot perform an explicit--test-iterationsmacro must reject it before authoring.The loop fails closed if a repair pass deletes an initially generated failing test instead of making it pass. Test discovery skips symlinks and tolerates file/directory races while walking generated files.
Repair prompts include captured failing test output capped at 12 KB. The side-effect ban and target-path repair instruction are prepended before captured output to reduce prompt-injection risk.
If the loop exhausts its iteration bound, the CLI returns the final failing
prose testexit code rather than a distinct sentinel.Examples
Testing
pnpm --filter @openprose/prose-cli test -- tests/prose/command-model.test.ts tests/cli/cli.test.ts tests/skills/open-prose.test.tspassed: 140 tests.pnpm --filter @openprose/prose-cli typecheckpassed.pnpm --filter @openprose/prose-cli buildpassed.git diff --checkpassed.ubs tools/cli/src/commands/base.ts tools/cli/src/prose/command-model.ts tools/cli/tests/prose/command-model.test.tsfound no critical issues; remaining warnings were existing heuristic warnings in tests/general code patterns.git rebase origin/feature/prose-write-runcompleted after preserving both the Add prose write run option #90--outprompt coverage and this PR's test-iteration coverage.Residual Risk
This PR is stacked on #90 and should be reviewed/merged after or with that branch. The local full CLI suite still has the known
tests/prose/crash-window-replay.test.tstimeout observed on #90/main-local testing; the focused slice and CI-equivalent typed/build checks for this feature are green locally. GitHub checks are rerunning on the latest pushed head.