feat(errors): structured what/why/fix errors + stable exit codes (error-system v2) by alex-mextner · Pull Request #27 · alex-mextner/rig-cli

alex-mextner · 2026-06-16T21:35:18Z

error-system v2 — structured what/why/fix errors + stable exit codes

Born from two same-day prod failures whose errors were thin and undiagnosable:
unknown mcp item(s): review (known: none) (no hint it was a removed slot) and a
dead hook path surfacing only as a generic harness "PreToolUse error". This makes every
rig error a consistent WHAT / WHY / HOW-to-fix block with a stable per-class exit
code (a public contract, documented in rig --help).

What's in

riglib/errors.py — RigError(what/why/fix/exit_code) + per-class codes
(2 config, 3 drift, 4 unknown-item, 5 missing-target, 6 not-a-repo,
127 missing-dep), a single top-level guard() renderer, did-you-mean
(Levenshtein-nearest), and a removed-slot registry (mcp.items.review → names
agent-tools #32 + the exact fix instead of "known: none").
riglib/layers.py — GLOBAL vs REPO classification, single source of truth so
rig status groups drift by layer and names which config file declares each item.
riglib/missing_target.py — proactively scans the harness settings.json for
hook commands pointing at a script gone from disk (the dead-rtk-hook case). Resolves
the invoked script (skips bare/absolute interpreters like /usr/bin/env,
/opt/homebrew/bin/python3; honors -c only as an interpreter flag), checks only that
token so a runtime output arg never false-flags.
cli.py — 3-part error rendering + layer-grouped rig status; a non-git dir
shows not a git repository — repo layer N/A (never the "should be committed" nag) and
degrades gracefully when the agent-tools catalog can't be resolved. rig doctor honors
the documented 127 missing-dependency exit code.
plan.py raises the structured UnknownItemError; config.py adds
primary_config_path for error provenance.

Verification

uv run --with pytest python -m pytest tests/ → 504 passed.
bash tests/smoke.sh → fully green (incl. the non-git, clean-sample, removed-slot,
and dead-hook legs). Fixed a latent smoke bug: the non-git test dir was nested under the
smoke's own git init-ed $TMP, so it wasn't actually non-git.
Reviewed across models with review --staged (Opus / Codex); their findings on the new
modules were addressed (HOME-isolated the new status tests; tightened the missing-target
script resolution for absolute interpreters and -c).

Known follow-ups (out of scope for this PR; CTO to triage)

Per-key config provenance: primary_config_path names the repo file even when the
bad key came from the global layer.
Exit-code class precedence: when rig status has both drift and a dead target, it
returns 3 (drift) and the 5 (missing-target) signal is masked.
The new structured errors.ConfigError exists but config.ConfigError is still what's
raised for malformed config (so YAML errors still render the old one-line form).
The non-git CatalogError fallback skips the missing-target scan (a dead ~/.claude
hook is surfaced in normal status but not in the non-git + no-catalog path).

🤖 Generated with Claude Code

…or-system v2) Born from two same-day prod failures whose errors were thin and undiagnosable: `unknown mcp item(s): review (known: none)` (no hint it was a REMOVED slot) and a dead hook path surfacing only as a generic harness "PreToolUse error". This makes every rig error a consistent WHAT / WHY / HOW-to-fix block with a stable per-class exit code (a PUBLIC CONTRACT documented in `rig --help`). What's in: - riglib/errors.py: RigError(what/why/fix/exit_code) + per-class codes (2 config, 3 drift, 4 unknown-item, 5 missing-target, 6 not-a-repo, 127 missing-dep), a single top-level `guard()` renderer, did-you-mean (Levenshtein), and a removed-slot registry (mcp.items.review -> names agent-tools #32 + the fix). - riglib/layers.py: GLOBAL vs REPO classification — single source of truth so `rig status` groups drift by layer and names WHICH config file declares each item. - riglib/missing_target.py: proactively scans the harness settings.json for hook commands pointing at a script gone from disk (the dead-rtk-hook case). Resolves the invoked SCRIPT (skips bare/absolute interpreters like /usr/bin/env, honors `-c` only as an interpreter flag) and checks only that token, so a runtime output arg never false-flags. - cli.py: `rig status` renders the 3-part errors + layer grouping; a non-git dir shows "not a git repository — repo layer N/A" (never the "should be committed" nag), and degrades gracefully when the agent-tools catalog can't be resolved. plan.py raises the structured UnknownItemError. config.py adds primary_config_path for provenance. `rig doctor` honors the documented 127 missing-dependency exit code. Tests: 504 pytest passing; tests/smoke.sh fully green (incl. the non-git + dead-hook + removed-slot legs). New: test_errors, test_missing_target, test_plan_errors, test_status_layers; smoke gains non-git / clean-sample / removed-slot assertions. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d4de60aaa3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-16T21:39:34Z

+        if self.repo_path is not None:
+            return self.repo_path


Track which layer declared unknown items

When both a global config and a repo rig.yaml are loaded, an unknown item can come solely from ~/.config/rig/config.yaml, but this always reports the repo file because repo_path exists. In that scenario the new structured error tells the user to remove a key from ./rig.yaml even though the key is only in the global config, making the fix path misleading for stale global entries such as removed MCP slots. The loader needs per-layer/key provenance (or at least category provenance) instead of assuming the repo file is the offender.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-16T21:39:34Z

+        if via_interpreter and tok == "-c":
+            return []
+        if _looks_like_script_path(tok):
+            p = Path(tok).expanduser()
+            return [tok] if not p.exists() else []


Skip module invocations when scanning hook commands

For interpreter commands that use -m, e.g. python3 -m my_hook --out /tmp/result.json, there is no script file to verify, but this loop only special-cases -c and then flags the first absolute argument as a missing hook script if it does not exist. That makes healthy module-based Claude hooks produce false missing-target errors and causes rig status/rig doctor to exit non-zero. Treat -m <module> like an interpreter mode and stop looking for a script path there.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-16T21:39:34Z

+    if dead_targets and report.in_sync:
+        # no config↔disk drift, but a dead hook reference IS a problem worth a non-zero exit.
+        print(_dim("\n  (no config↔disk drift, but a missing target above needs attention)"))
+        return errors.EXIT_MISSING_TARGET


Return missing-target exit code even with drift

This only returns EXIT_MISSING_TARGET when the normal drift report is otherwise clean; if a dead hook path is found alongside any config↔disk drift, the function falls through and returns 3. In that mixed case scripts following the documented stable exit-code contract cannot detect the missing-target class until all unrelated drift is fixed, even though the missing target was found and printed. Give missing targets an explicit precedence (or otherwise document a multi-error precedence) so exit code 5 is not masked by ordinary drift.

Useful? React with 👍 / 👎.

chatgpt-codex-connector Bot reviewed Jun 16, 2026

View reviewed changes

alex-mextner mentioned this pull request Jun 16, 2026

feat(rig): provision .claude/worktrees ignore via global core.excludesfile (supersedes #23) #29

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(errors): structured what/why/fix errors + stable exit codes (error-system v2)#27

feat(errors): structured what/why/fix errors + stable exit codes (error-system v2)#27
alex-mextner wants to merge 1 commit into
mainfrom
error-system-v2

alex-mextner commented Jun 16, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

alex-mextner commented Jun 16, 2026

error-system v2 — structured what/why/fix errors + stable exit codes

What's in

Verification

Known follow-ups (out of scope for this PR; CTO to triage)

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant