fix: apply code-review findings (trace integrity, CLI dedup, docs sync) by Thormatt · Pull Request #15 · Thormatt/orc

Thormatt · 2026-06-13T00:33:24Z

Context

Full-repo review surfaced two correctness bugs, a dead module, duplicated CLI boilerplate, and stale docs. This PR applies the verified findings; each fix landed with a failing test first where behavior changed.

Changes

Correctness

runs/runner.py — write the trace JSON before finalizing the run row. Previously a failed JSON write left a status='ok' row with no trace file, which audit export treats as corruption.
verify_claim.py — decomposed mode now aggregates run.retrieval across all atoms (summed candidates_considered, deduped union of returned chunks) instead of recording only the last atom's retrieval.

Refactor

New cli_commands/_shared.py::resolve_workspace replaces the identical try/except → ClickException wrapper duplicated in 12 command modules.
UnknownDomainError moved into orc.errors with the rest of the hierarchy; routing.py re-exports it.

Chore / docs

Removed dead src/orc/config.py (never imported; its config.toml knobs silently did nothing — the effects allow-list is config.toml's real consumer and is untouched).
README command table gains workspace embed, eval show, approve show, execute, worker, audit export; CHANGELOG's shipped [Unreleased] items moved into 0.2.0.
Documented (CLI help + README) that approve --by is self-reported and unauthenticated — multi-approver gates used for EU AI Act Art. 14(5) need an authenticated surface.

Review findings intentionally NOT applied

lease_one rowcount check: transaction() is BEGIN IMMEDIATE, so the suspected race is unreachable — refuted rather than patched.
Chunker start_char "bug": refuted; window starts advance monotonically.

Testing

401 passed, 2 skipped (gated live-LLM), ruff clean.
New tests: failed-trace-write finalization guard, decomposed-mode retrieval aggregation, shared resolve helper.

🤖 Generated with Claude Code

If write_trace_json failed after finalize_run_row committed, the db showed status='ok' for a run with no trace file — load_trace raised TraceNotFoundError and audit export aborted on what looked like corruption. Writing the JSON first means a failed write leaves the row 'running', which is recoverable. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Each atom's binary sub-run overwrote run.retrieval, so a decomposed run's trace recorded only the last atom's retrieval — under-reporting candidates_considered and the returned set for audit and frozen-replay drift checks. Snapshot each atom's record and merge after the loop: candidates_considered sums across atoms, returned is the deduped union. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The try/except WorkspaceNotFoundError -> ClickException wrapper was duplicated in 12 command modules. One helper in _shared.py means a future change to the error message (e.g. suggesting `orc workspace list`) lands everywhere at once. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Nothing imports orc.config. It promised ~/.orc/config.toml knobs for retrieval_k/retrieval_pool and model overrides that never took effect: skills read manifest defaults and llm/models.py reads the env vars directly. config.toml itself stays — the effects allow-list (effects/registry.py) is its real consumer. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

- README command table gains workspace embed, eval show, approve show, execute, worker, and audit export — all shipped and tested but undocumented. - CHANGELOG: move shipped [Unreleased] items (hybrid retrieval, propose, report, eval, tiered, calibrate) into 0.2.0; [Unreleased] keeps only the Planned list. - Document that `orc approve --by` is self-reported and unauthenticated, in CLI help and README: multi-approver gates used as a compliance control (EU AI Act Art. 14(5)) need an authenticated surface. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Every other user-facing error lives in orc.errors; UnknownDomainError was the one exception, forcing MCP and CLI callers to import from a skill-internal routing module. routing.py re-exports it so existing imports keep working. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Thormatt and others added 6 commits June 12, 2026 20:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: apply code-review findings (trace integrity, CLI dedup, docs sync)#15

fix: apply code-review findings (trace integrity, CLI dedup, docs sync)#15
Thormatt wants to merge 6 commits into
mainfrom
fix/review-findings

Thormatt commented Jun 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Thormatt commented Jun 13, 2026

Context

Changes

Review findings intentionally NOT applied

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant