Skip to content

fix: apply code-review findings (trace integrity, CLI dedup, docs sync)#15

Open
Thormatt wants to merge 6 commits into
mainfrom
fix/review-findings
Open

fix: apply code-review findings (trace integrity, CLI dedup, docs sync)#15
Thormatt wants to merge 6 commits into
mainfrom
fix/review-findings

Conversation

@Thormatt

Copy link
Copy Markdown
Owner

Context

Full-repo review surfaced two correctness bugs, a dead module, duplicated CLI boilerplate, and stale docs. This PR applies the verified findings; each fix landed with a failing test first where behavior changed.

Changes

Correctness

  • runs/runner.py — write the trace JSON before finalizing the run row. Previously a failed JSON write left a status='ok' row with no trace file, which audit export treats as corruption.
  • verify_claim.py — decomposed mode now aggregates run.retrieval across all atoms (summed candidates_considered, deduped union of returned chunks) instead of recording only the last atom's retrieval.

Refactor

  • New cli_commands/_shared.py::resolve_workspace replaces the identical try/except → ClickException wrapper duplicated in 12 command modules.
  • UnknownDomainError moved into orc.errors with the rest of the hierarchy; routing.py re-exports it.

Chore / docs

  • Removed dead src/orc/config.py (never imported; its config.toml knobs silently did nothing — the effects allow-list is config.toml's real consumer and is untouched).
  • README command table gains workspace embed, eval show, approve show, execute, worker, audit export; CHANGELOG's shipped [Unreleased] items moved into 0.2.0.
  • Documented (CLI help + README) that approve --by is self-reported and unauthenticated — multi-approver gates used for EU AI Act Art. 14(5) need an authenticated surface.

Review findings intentionally NOT applied

  • lease_one rowcount check: transaction() is BEGIN IMMEDIATE, so the suspected race is unreachable — refuted rather than patched.
  • Chunker start_char "bug": refuted; window starts advance monotonically.

Testing

  • 401 passed, 2 skipped (gated live-LLM), ruff clean.
  • New tests: failed-trace-write finalization guard, decomposed-mode retrieval aggregation, shared resolve helper.

🤖 Generated with Claude Code

Thormatt and others added 6 commits June 12, 2026 20:24
If write_trace_json failed after finalize_run_row committed, the db
showed status='ok' for a run with no trace file — load_trace raised
TraceNotFoundError and audit export aborted on what looked like
corruption. Writing the JSON first means a failed write leaves the row
'running', which is recoverable.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Each atom's binary sub-run overwrote run.retrieval, so a decomposed
run's trace recorded only the last atom's retrieval — under-reporting
candidates_considered and the returned set for audit and frozen-replay
drift checks. Snapshot each atom's record and merge after the loop:
candidates_considered sums across atoms, returned is the deduped union.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The try/except WorkspaceNotFoundError -> ClickException wrapper was
duplicated in 12 command modules. One helper in _shared.py means a
future change to the error message (e.g. suggesting `orc workspace
list`) lands everywhere at once.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Nothing imports orc.config. It promised ~/.orc/config.toml knobs for
retrieval_k/retrieval_pool and model overrides that never took effect:
skills read manifest defaults and llm/models.py reads the env vars
directly. config.toml itself stays — the effects allow-list
(effects/registry.py) is its real consumer.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- README command table gains workspace embed, eval show, approve show,
  execute, worker, and audit export — all shipped and tested but
  undocumented.
- CHANGELOG: move shipped [Unreleased] items (hybrid retrieval, propose,
  report, eval, tiered, calibrate) into 0.2.0; [Unreleased] keeps only
  the Planned list.
- Document that `orc approve --by` is self-reported and unauthenticated,
  in CLI help and README: multi-approver gates used as a compliance
  control (EU AI Act Art. 14(5)) need an authenticated surface.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Every other user-facing error lives in orc.errors; UnknownDomainError
was the one exception, forcing MCP and CLI callers to import from a
skill-internal routing module. routing.py re-exports it so existing
imports keep working.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant