diff --git a/docs/arch-analysis-2026-06-06-0158/00-coordination.md b/docs/arch-analysis-2026-06-06-0158/00-coordination.md deleted file mode 100644 index be405e5..0000000 --- a/docs/arch-analysis-2026-06-06-0158/00-coordination.md +++ /dev/null @@ -1,71 +0,0 @@ -# 00 — Coordination Plan - -## Analysis Configuration -- **Target**: Legis (`src/legis/`) — git/CI + governance layer of the Weft suite -- **Scope**: `src/legis/` (~7,353 LOC, 63 Python files, ~13 subsystems); cross-reference `tests/` and `docs/` -- **Deliverables**: **Option C — Architect-Ready** (docs 01–06) -- **Strategy**: **PARALLEL** — ≥5 loosely-coupled subsystems; codebase-explorer subagents per subsystem cluster -- **Time constraint**: none stated -- **Complexity estimate**: Medium (clear layering, governance domain complexity) - -## Subsystem inventory (from holistic scan) -| Subsystem | Files | LOC | First-glance responsibility | -|---|---|---|---| -| `api/` | 2 | 831 | FastAPI HTTP surface | -| `enforcement/` | 10 | 1062 | Graded 2×2 enforcement engine | -| `policy/` | 7 | 1072 | Agent-programmable policy grammar | -| `service/` | 6 | 603 | Transport-agnostic service layer (WP-M1) | -| `governance/` | 7 | 585 | Attestations, sign-off, audit | -| `wardline/` | 4 | 386 | Wardline findings integration | -| `identity/` | 4 | 356 | SEI consumption / identity | -| `git/` | 5 | 328 | Branch/commit/PR context, rename feed | -| `store/` | 3 | 217 | Persistence (SQLAlchemy) | -| `checks/` | 3 | 157 | CI check context | -| `filigree/` | 2 | 124 | Filigree issue-lifecycle binding | -| `pulls/` | 3 | 97 | Pull request context | -| `records/` | 2 | 40 | Record types | -| top-level | 5 | — | `cli.py`, `mcp.py`, `canonical.py`, `clock.py`, `__init__.py` | - -## Execution Log -- 2026-06-06 01:58 — Created workspace `docs/arch-analysis-2026-06-06-0158/` -- 2026-06-06 01:58 — User selected **Option C (Architect-Ready)** -- 2026-06-06 01:59 — Holistic scan complete (LOC table, README, pyproject) -- 2026-06-06 01:59 
— Chose PARALLEL orchestration; consulted advisor before dispatch -- 2026-06-06 02:00 — Advisor guidance adopted: (1) cluster ~13 subsystems into 6 explorers along seams; (2) read prior audits first; (3) own cross-subsystem flow synthesis in 04; (4) run real tooling for 05 -- 2026-06-06 02:01 — Tooling run: mypy clean (63 files), coverage 90% TOTAL, ruff = 2 trivial F401 unused-import errors -- 2026-06-06 02:01 — Recovered + read prior audits (deleted in worktree, present in HEAD) into temp/. Comprehensive audit = 3 Critical, 7 High, 14 Medium, 5 Low. Baseline for 05/06. -- 2026-06-06 02:01 — Remediation deltas since audit (2026-06-04): C1 partially closed (07cf54e fail-closed override-rate), M11 closed (b4285dc MCP idempotency). To verify in 05. - -## Orchestration: 6 clustered explorers (PARALLEL) -- **A** Enforcement engine — `enforcement/` -- **B** Policy grammar — `policy/` -- **C** Governance + persistence foundations — `governance/`, `store/`, `records/`, `canonical.py`, `clock.py` -- **D** Service layer + HTTP API — `service/`, `api/` -- **E** Agent/CLI frontends — `cli.py`, `mcp.py`, `__init__.py` -- **F** Suite integrations & git/CI domain — `identity/`, `wardline/`, `filigree/`, `git/`, `checks/`, `pulls/` - -Each writes `temp/catalog-.md` (catalog-entry template, rigorous inbound/outbound deps); cross-subsystem flow trace owned by the 04 synthesis pass. - -## Execution Log (cont.) -- 2026-06-06 02:05 — 6 explorers complete. Headline: all 6 MCP adapter-drift findings (C2,C3,H1,M9,M10,M11) RESOLVED in current tree. New findings: single-secret scope bypass, gaps.py null-deref, M6 unguarded content_hash, unsigned Filigree transport, CLI service bypass. -- 2026-06-06 02:10 — Assembled 02 (catalog), 03 (diagrams w/ dependency DAG), 04 (report + 4 cross-subsystem flows). -- 2026-06-06 02:12 — Live tooling: 480 tests/68 files, coverage 90% (filigree 75% lowest), mypy clean, ruff 2×F401 (not in CI), CI cov-floor 70% vs actual 90%, live Loomweave oracle opt-in. 
-- 2026-06-06 02:14 — Wrote 05 (quality, Q-H1..Q-L8) and 06 (architect handover, 3-tier roadmap + 5-sprint sequencing). -- 2026-06-06 02:15 — Dispatching analysis-validator (Step 7 gate) over 02+04 against the discovery contract. -- 2026-06-06 02:20 — Validation gate: **PASS-WITH-NOTES** (16 confirmed, 1 partial, 0 refuted, 0 BLOCK). All 6 deliverables contract-conformant; all high-stakes claims source-verified. 3 NOTE fixes applied: (N1) M6 relabeled baseline-not-new in 04 §6; (N2) test count 480→492; (N3) Q-M1 citation pointed at unverified-return site `source_binding.py:46-53` + sign site `governance.py:170`. -- 2026-06-06 02:21 — Deliverables 00–06 written; validation report in temp/. -- 2026-06-06 02:30 — Post-validation calibration (advisor-flagged): (a) grepped the *second* audit (AUDIT-readonly.md lines 166-188) — it DOES flag weak operator-scope separation; Q-H1 reframed from "NEW High" to a *sharpening* of that finding with **conditional severity** decided by a product question (is single-secret a split-promising prod mode?). Test contract `tests/api/test_auth.py:100` proves the split is promised/tested ONLY in TOKEN_ACTORS mode; no test promises it in single-secret mode. Recalibrated in 04 §1/§5/§6, 05 (calibration note + verdict), 06 (item 1 decision-gated + sequencing). (b) Confirmed H1 artifact_key plumbing at mcp.py:925-929 → "6/6 adapter-drift RESOLVED" headline now airtight. (c) Stray `480` only in this log's history line (deliverables clean). 
-- 2026-06-06 02:31 — **COMPLETE.** - -## Final status: COMPLETE (Option C — Architect-Ready) -All deliverables durable in `docs/arch-analysis-2026-06-06-0158/`: -| Doc | Status | -|---|---| -| 00-coordination.md | ✅ | -| 01-discovery-findings.md | ✅ | -| 02-subsystem-catalog.md | ✅ 13 subsystems + foundations, edge-cited | -| 03-diagrams.md | ✅ 5 C4/dependency mermaid views | -| 04-final-report.md | ✅ + 4 cross-subsystem flow traces | -| 05-quality-assessment.md | ✅ live tooling + Q-H1..Q-L8 inventory | -| 06-architect-handover.md | ✅ 3-tier roadmap, 5-sprint sequencing | -| temp/ | validation-report.md, AUDIT-*.md, catalog-A..F | diff --git a/docs/arch-analysis-2026-06-06-0158/01-discovery-findings.md b/docs/arch-analysis-2026-06-06-0158/01-discovery-findings.md deleted file mode 100644 index 5d83c05..0000000 --- a/docs/arch-analysis-2026-06-06-0158/01-discovery-findings.md +++ /dev/null @@ -1,71 +0,0 @@ -# 01 — Discovery Findings - -## What Legis is -Legis is the git/CI + governance layer of the **Weft** suite (four federated tools sharing one -substrate keyed on Loomweave's Stable Entity Identity / SEI). Legis answers: *what changed, in -which branch/commit/PR/check context, and what governance/attestation state exists for that change?* - -Its distinguishing surface is a **governance 2×2** — two independent agent-set axes: -- **structure**: simple ↔ complex -- **judge**: off ↔ on - -yielding four cells: **Chill** (simple/off), **Coached** (simple/on), **Structured** (complex/off), -**Protected** (complex/on — HMAC-signed verdicts, decay sweep, override-rate gate). The root invariant -is *agent-first: humans on the loop, not in the loop* — when a policy fires, the cell decides who -answers, and every decision produces an append-only, SEI-keyed audit trail. - -Version `1.0.0rc2`. Python ≥3.12. Deps: FastAPI, SQLAlchemy 2.0, PyYAML, uvicorn. 
- -## Technology stack -| Concern | Choice | -|---|---| -| Language | Python 3.12 | -| HTTP | FastAPI + uvicorn | -| Persistence | SQLAlchemy 2.0 over SQLite (`*.db` files: governance, checks, pulls, binding) | -| Agent surface | Hand-rolled MCP server (`mcp.py`), stdio JSON-RPC, protocol `2024-11-05` | -| CLI | `legis` console script → `legis.cli:main` | -| Crypto | HMAC-signed audit records; canonical JSON (RFC-8785 hardening pending) | -| Build/tooling | uv build backend; pytest + pytest-cov; mypy; ruff | - -## Entry points -- **CLI** — `legis.cli:main` (`legis governance-gate`, `verify-trail`, server run, etc.) -- **HTTP** — `legis/api/app.py` FastAPI app (bearer-auth mutating routes; writer/operator scopes) -- **MCP** — `legis/mcp.py` stdio JSON-RPC server (launch-bound identity) -- All three are intended to converge on the transport-agnostic **service layer** (`service/`, WP-M1). - -## Subsystem inventory (63 files, ~7,353 LOC) -| Subsystem | Files | LOC | Responsibility (first-glance) | -|---|---|---|---| -| `policy/` | 7 | 1072 | Agent-programmable policy grammar, cells, boundary decorator/scan | -| `enforcement/` | 10 | 1062 | 2×2 engine, LLM judge, protected/signoff/decay lifecycle, signing | -| `api/` | 2 | 831 | FastAPI HTTP surface, auth, routing | -| `service/` | 6 | 603 | Transport-agnostic governance/wardline/source-binding helpers | -| `governance/` | 7 | 585 | Attestations, binding ledger, sign-off binding, SEI backfill, gaps | -| `wardline/` | 4 | 386 | Wardline scan ingest + governor (route findings → cells) | -| `identity/` | 4 | 356 | SEI consumption, entity keys, resolver (Loomweave client) | -| `git/` | 5 | 328 | Branch/commit/PR context, working-tree + rename feed | -| `store/` | 3 | 217 | SQLAlchemy audit store + store protocol | -| `checks/` | 3 | 157 | CI check context surface | -| `filigree/` | 2 | 124 | Filigree issue-lifecycle binding client | -| `pulls/` | 3 | 97 | Pull-request context surface | -| `records/` | 2 | 40 | Shared record 
types (`OverrideRecord`) | -| top-level | 5 | — | `cli.py`, `mcp.py`, `canonical.py`, `clock.py`, `__init__.py` | - -## Suite seams (cross-product combinations) -- **Wardline + Legis** (live): agent-defined policy enforced at CI/git boundary; findings route through `wardline/governor.py` into 2×2 cells. -- **Loomweave + Legis** (live, SEI-keyed): attestations key on SEI; git-rename provider contract-locked, pending Loomweave committed-range driving. -- **Filigree + Legis** (live): governed SEI-keyed sign-off binding; closure-gate decision; Filigree retains lifecycle authority. - -## Prior-art baseline -Two read-only audits (2026-06-04, recovered from HEAD into `temp/`): 3 Critical, 7 High, 14 Medium, 5 Low. -Dominant themes: **adapter drift** (MCP omits HTTP/CLI server-side constraints) and **evidence loss / weak -binding** in governance records. Partially remediated since (C1 override-rate fail-closed; M11 MCP idempotency). -These feed `05-quality-assessment.md` and `06-architect-handover.md`. - -## Orchestration decision -**PARALLEL**, 6 clustered explorers along architectural seams (see `00-coordination.md`). Rationale: -≥5 loosely-coupled subsystems, but several are trivial (records 40, pulls 97, filigree 124) — clustering -preserves the wiring that *is* the product rather than fragmenting it across 13 dispatches. - -**Confidence: High** for inventory/stack/entry-points (direct measurement). **Medium** for responsibility -summaries pending per-cluster explorer confirmation. diff --git a/docs/arch-analysis-2026-06-06-0158/02-subsystem-catalog.md b/docs/arch-analysis-2026-06-06-0158/02-subsystem-catalog.md deleted file mode 100644 index 3034406..0000000 --- a/docs/arch-analysis-2026-06-06-0158/02-subsystem-catalog.md +++ /dev/null @@ -1,281 +0,0 @@ -# 02 — Subsystem Catalog - -Consolidated from six parallel codebase-explorer passes (clusters A–F), each reading its -files at 100% and grepping every dependency edge with `file:line`. 
Subsystems are ordered -bottom-up by dependency layer. Per-subsystem confidence is **High** unless noted; the basis -is "all files read, edges grepped" in every case. - -> **Edge convention:** `X -> Y` means module X imports/depends on module Y. - ---- - -## Foundations — `canonical.py`, `clock.py` - -**Responsibility:** Leaf deterministic primitives — canonical JSON + content hashing (the basis of every hash/HMAC in the suite) and an injectable time source for deterministic timestamps. - -**Key Components:** -- `canonical.py` (22 LOC) — `canonical_json` (`sort_keys=True`, tight separators, `ensure_ascii=False`, **`allow_nan=False`**) and `content_hash` (sha256 of canonical JSON). RFC-8785 convergence explicitly deferred (ADR-0001). -- `clock.py` (30 LOC) — `Clock` Protocol, `SystemClock` (UTC ISO), `FixedClock` (deterministic test double). Production never calls `datetime.now()` directly. - -**Dependencies:** Outbound: none (leaf, stdlib only). Inbound (canonical, 9 edges): `store/audit_store`, `enforcement/signing`, `governance/sei_backfill`, `governance/gaps`, `service/wardline`, `identity/resolver`, `mcp`, `policy/decorator`, `policy/boundary_scan`. Inbound (clock): `enforcement/{engine,protected,signoff}`, `governance/{binding_ledger,sei_backfill}`, `mcp`, `cli`, `api`. - -**Patterns:** Leaf-module discipline (bottom of the DAG); single canonicalization choke point (RFC-8785 upgrade = one-file change); DI clock with deterministic double. - -**Concerns:** **M13 partially closed** — `allow_nan=False` present; full RFC-8785 hardening still deferred. `ensure_ascii=False` makes byte output encoding-dependent (consistent today; latent footgun if any caller hashes the `str` differently). - ---- - -## Identity (SEI) — `src/legis/identity/` - -**Responsibility:** Resolve a code locator to an SEI-keyed (or honestly-degraded, locator-keyed) opaque `EntityKey` by consuming Loomweave's SEI HTTP surfaces — never parsing the SEI, never guessing. 
- -**Key Components:** -- `entity_key.py` (40) — `EntityKey` frozen dataclass (`value` + `identity_stable`); factories `from_locator`/`from_sei`; `from_dict` validates `value` is non-empty `str` and `identity_stable` is a `bool` (raises `ValueError` otherwise). -- `resolver.py` (96) — `IdentityResolver.resolve` → `IdentityResolution` (entity_key, alive, content_hash, lineage_snapshot, status). Degrades to locator-keyed on capability-absent / no-client / not-alive / non-dict / transport-exception. Captures REQ-L-01 lineage snapshot `{length, hash}` on stable alive SEI. -- `loomweave_client.py` (219) — `LoomweaveIdentity` Protocol + `HttpLoomweaveIdentity` over stdlib `urllib`. HMAC request signing on protected routes (`X-Weft-Component`/timestamp/nonce); HTTPS-unless-loopback; 1 MB cap; JSON content-type enforcement. - -**Dependencies:** Outbound: `resolver -> canonical.content_hash` (only non-cluster edge; entity_key/client are stdlib-only). Inbound (heavily consumed — 14 edges): `api`, `cli`, `mcp`, `enforcement/{engine,lifecycle,protected,signoff}`, `governance/{binding_ledger,gaps,sei_backfill,signoff_binding}`, `records/override_record`, `service/{governance,wardline}`, `wardline/governor` (type only). - -**Patterns:** SEI opacity (`value` never parsed); honest degradation (`alive` `False` vs `None`); injectable transport seam. - -**Concerns:** **M5 NOT reproduced** — `from_dict` rejects non-`bool` stability; defect closed in current tree. Capability cache is per-instance, never invalidated once `True` (long-lived resolver keeps treating a since-degraded Loomweave as capable). `content_hash` taken verbatim from Loomweave response with no type check. - ---- - -## Records — `src/legis/records/` - -**Responsibility:** The shared core `OverrideRecord` schema (the chill-cell recordable override) that serializes to a flat dict for the record-agnostic audit store; judge/HMAC fields attach via `extensions`. 
- -**Key Components:** `override_record.py` (39) — frozen `OverrideRecord` (policy, entity_key, rationale, agent_id, recorded_at, extensions); `identity_stable` delegates to `EntityKey`; `to_payload()` emits the canonical flat dict. - -**Dependencies:** Outbound: `-> identity.entity_key`. Inbound (all enforcement): `protected`, `judge_factory`, `lifecycle`, `engine`, `judge`, `signoff`. - -**Patterns:** Stable-core / extensible-edge; explicit `to_payload()` serialization boundary; identity delegation. - -**Concerns:** None observed. (`to_payload` does no field-type validation — acceptable for an internal frozen dataclass.) - ---- - -## Store (persistence) — `src/legis/store/` - -**Responsibility:** Record-agnostic, append-only, hash-chained SQLAlchemy audit log with DB-level mutation rejection and a structural integrity verifier; plus the `AppendOnlyStore` protocol consumers depend on. - -**Key Components:** -- `audit_store.py` (186) — `AuditStore` over SQLAlchemy + `NullPool`; SQLite WAL/NORMAL/busy_timeout PRAGMAs; append-only enforced by `BEFORE UPDATE`/`BEFORE DELETE` triggers (`RAISE(ABORT)`); `append` chains `chain_hash = sha256(prev + content_hash)` under `BEGIN IMMEDIATE`; `verify_integrity` re-walks the chain. -- `protocol.py` (30) — `AuditRecordLike` / `AppendOnlyStore` Protocols (the abstraction enforcement types against). - -**Dependencies:** Outbound: `-> canonical`. Inbound — protocol `AppendOnlyStore`: `enforcement/{engine,protected,signoff}`; concrete `AuditStore`: `governance/{sei_backfill,binding_ledger,gaps}`, `api`, `cli`, `mcp` (composition roots). - -**Patterns:** Two integrity layers (DB triggers reject in-band mutation + hash chain detects out-of-band tampering); record-agnostic opaque payloads; protocol-first consumption seam. 
- -**Concerns:** **M6 PARTIALLY closed** — `verify_integrity` guards decode of `read_all()` but the loop body `content_hash(rec.payload)` (L168) is unguarded; `json.loads` accepts `Infinity`/`NaN`, so a directly-tampered payload makes `canonical_json(allow_nan=False)` **raise `ValueError` out of `verify_integrity`** — the exact tamper case it defends against (empirically reproduced). **HMAC framing:** the store is hash-chain *only*; HMAC lives in `enforcement/signing.py`. PRAGMA failures are silently swallowed (no observability). - ---- - -## Policy Grammar — `src/legis/policy/` - -**Responsibility:** The agent-programmable policy-boundary grammar — boundary types evaluating to CLEAR/VIOLATION/UNKNOWN (fail-closed), policy→cell routing, one-off exemptions, and an AST honesty gate verifying a `@policy_boundary` decoration is backed by a real, pinned test that actually exercises the boundary. - -**Key Components:** -- `grammar.py` (123) — `PolicyResult`, `PolicyEvaluation` (carries `provenance_gap`), `BoundaryType` Protocol, append-only `PolicyGrammar` registry (raises `PolicyConflictError` on shadowing); `evaluate()` fails closed (UNKNOWN+gap on unregistered; `except Exception` around boundary calls). -- `cells.py` (99) — `PolicyCellRegistry.cell_for` resolves policy → {chill, coached, structured, protected} (exact rules, then `fnmatch` globs, else `default_cell`). In-code default is `chill`. -- `decorator.py` (212) — `@policy_boundary` decorator + `check_policy_boundary()` runtime honesty gate (metadata-transplant, qualname scope, citation shape, fingerprint drift, then delegates semantics to `evaluate_test_evidence`). -- `evidence.py` (152) — single shared judgement (gate + scanner) enforcing shadowing / exercise / policy-co-occurrence checks. -- `exemptions.py` (128) — `ExemptionRegistry` + YAML/TOML loaders (fail closed on malformed). 
-- `boundary_scan.py` (357) — static `@policy_boundary` scanner (`scan_policy_boundaries`) with strict `tests/*.py` path sandboxing; reuses `evaluate_test_evidence`. Drives CLI `policy-boundary-check`. -- `policy/cells.toml` (repo-root) — runtime routing, `default_cell="structured"`; loaded by `mcp.py`, overriding the in-code `chill`. - -**Dependencies:** Outbound: `-> canonical.content_hash` (only intra-legis edge) + intra-package + `yaml`. Inbound: `mcp` (cells, grammar), `service/governance` (grammar), `service/explain` (cells), `api` (grammar), `cli` (boundary_scan). - -**Patterns:** Provider-seam / open instance set (agents add boundaries, no human config); fail-closed everywhere; single-source-of-truth evidence judgement (gate + scanner can't drift); anti-vibe provenance (decoration-time TypeErrors + pinned test fingerprint). - -**Concerns:** **H6 confirmed** — in-code default cell is self-clearing `chill` (`cells.py:44`); only mitigated when `cells.toml` (`structured`) loads — if config absent, `mcp.py:111` falls back to `chill`. **M7 confirmed** — honesty gate's policy-co-occurrence is a `\b`-substring match in an assert, not a check that the boundary *result* is the assertion subject. **L4 confirmed (narrow)** — runtime gate (`inspect.getsource`+dedent) vs scanner (`get_source_segment`+dedent) can diverge for class-method/decorated test_refs. Grammar-layer exemptions silently flip VIOLATION→CLEAR with `provenance_gap=False` and only fire when `target['value']` is a `str`. - ---- - -## Enforcement Engine — `src/legis/enforcement/` (12 files) - -**Responsibility:** Grade a policy firing through the governance 2×2 (simple/complex × judge off/on), writing exactly one append-only hash-chained audit record per submission and — in the protected cell — binding each verdict to its inspected source with an HMAC signature plus lifecycle gates (decay re-judge + override-rate). 
- -**Key Components:** -- `engine.py` (115) — `EnforcementEngine.submit_override`: chill (`judge=None`) / coached (judge evaluates *before* write). `record_event` for raw governance events. -- `verdict.py` (28) — `Verdict` (ACCEPTED/BLOCKED/OVERRIDDEN_BY_OPERATOR), `SignoffState`, `JudgeOpinion`. -- `judge.py` (111) — `Judge`/`LLMClient` Protocols; `LLMJudge` (structured-JSON-first, fail-closed; BLOCKED wins on ambiguity; untrusted input framed as data). -- `judge_factory.py` (31) — env-wired `OpenRouterLLMClient`, else `FailClosedJudge` (always BLOCKED). -- `llm_client.py` (168) — `OpenRouterLLMClient`; SSRF/transport hardening (HTTPS-or-loopback, no-redirect, 1 MB cap, strict shape validation). -- `protected.py` (288) — `ProtectedGate.submit`/`operator_override`; every record HMAC-signed via `signing_fields()` (binds entity+policy+source fingerprint+ast_path+lineage); `TrailVerifier.verify` (protected-policy set from config/ADR-0002, not the record → no flag-flip downgrade). -- `signoff.py` (151) — `SignoffGate` (structured/protected block+escalate, no LLM); `request` records PENDING (does not clear); `sign_off` records SIGNED_OFF referencing `request_seq` + `request_payload_hash`. -- `lifecycle.py` (122) — `decay_sweep` (re-judges judge-ACCEPTED suppressions), `evaluate_override_rate` (rolling-window; PASS/FAIL/PASS_WITH_NOTICE). -- `signing.py` (47) — keyed HMAC-SHA256 over `canonical_json`; versioned (`v2` default, `v1` legacy); `compare_digest`. - -**Dependencies:** Outbound: `-> clock`, `-> identity.entity_key`, `-> records.override_record`, `-> store.protocol` (protocol, not concrete), `-> canonical`. **No edge to `governance` or `policy`** (one-directional, clean). Inbound: `service/{governance,wardline,explain}`, `mcp`, `api`, `cli`, `wardline/{governor,ingest}` (signing), `governance/{signoff_binding,binding_ledger}` (signing). 
- -**Patterns:** Ports-and-adapters DI (store/clock/judge/LLM all injected Protocols; chill↔coached is one nullable `judge` arg); single-source-of-signed-fields (signer + verifier can't drift); fail-closed everywhere; append-only single trail; config-driven trust boundary (anti-downgrade); security-hardened LLM egress. - -**Concerns:** `TrailVerifier._requires_verification` ORs config protected-set with in-record markers — correct only if the config set is always complete/current. Dual signing-field functions (v1/v2) widen the accept set during the legacy window. `decay_sweep` has no per-record try/except — one malformed `entity_key` row aborts the whole sweep. `record_event` bypasses the judge/verdict path (relies on callers not misusing it for protected policies). HMAC key rotation out of scope. - ---- - -## Governance — `src/legis/governance/` - -**Responsibility:** Tamper-bound binding of sign-offs to Filigree issues, append-only SEI re-keying/backfill of pre-SEI records, lineage-spine gap/divergence detection, and pure closure-gate decisions — layered on the record-agnostic audit store. - -**Key Components:** -- `binding_ledger.py` (93) — `BindingLedger` records signed `issue_binding`s to a dedicated `AuditStore`; `verify()` now checks `store.verify_integrity()` (hash chain) **then** per-record HMAC; `get`/`get_by_issue_id` fail-closed. -- `signoff_binding.py` (74) — `bind_signoff_to_issue`: validate (rejects locator keys) → `filigree.attach` → optional `ledger.record` (non-atomic, documented). -- `sei_backfill.py` (259) — `run_pre_sei_backfill`: appends `SEI_BACKFILL`/`SEI_BACKFILL_UNRESOLVED` events referencing `original_seq` (never rewrites); idempotent; fails closed on integrity failure. -- `gaps.py` (115) — `find_orphan_gaps` (Loomweave `alive:false`); `find_lineage_integrity` (REQ-L-01 prefix-custody: stored snapshot must be a prefix of current lineage). 
-- `filigree_gate.py` (32) — `evaluate_issue_closure` (pure decision; closable only with a verified binding). -- `params.py` (11) — ADR-0002 reviewed constants (`OVERRIDE_RATE_THRESHOLD`, window, min-sample). - -**Dependencies:** Outbound: `-> store.audit_store` (concrete), `-> canonical`, `-> clock`, `-> enforcement.signing`, `-> identity.{entity_key,loomweave_client}`, `-> filigree.client`. Inbound: `cli`, `mcp`, `service/governance` (params), `api`. - -**Patterns:** Fail-closed throughout; append-only migration (never rewrites history); prefix-monotonic custody; pure decision functions separated from I/O; dedicated isolated ledger store. - -**Concerns:** **H5 RESOLVED** — `verify()` now invokes `store.verify_integrity()`. **M12 residual relocated** — enforcement now uses the `AppendOnlyStore` protocol, but `binding_ledger`/`sei_backfill`/`gaps` type against concrete `AuditStore` (can't be unit-tested against a protocol fake). **M6 propagation** — these callers branch on `verify_integrity()` which can *raise* (see Store), turning a tamper signal into an uncaught crash. **gaps.py null-deref** — `_stable_seis`/`find_lineage_integrity` do `payload.get("entity_key", {}).get(...)`; an explicit `"entity_key": null` raises `AttributeError` (inconsistent with `sei_backfill._entity_key` which guards). Non-atomic attach→record window. - ---- - -## Wardline Integration — `src/legis/wardline/` - -**Responsibility:** Ingest an agent-supplied Wardline scan, validate its shape, select the active-defect population, and route each finding into a configured 2×2 cell — Wardline analyses, legis governs. - -**Key Components:** -- `ingest.py` (226) — `WardlineSeverity`, `WardlineFinding.from_wire` (carries `properties` **verbatim**, tier-conformance deliberately not enforced); `active_defects` (defect + active; agent-suppressed states require proof); `MAX_FINDINGS=500`; `verify_wardline_artifact` (optional HMAC provenance when `artifact_key` set). 
-- `governor.py` (142) — `route_findings`: requires exactly one of `policy`/`cell_map`; pre-write validation guard **rejects** batches whose cells span block_escalate AND surface_*; resolves each entity via injected `resolve(qualname)`; dispatches to `signoff.request` / `engine.submit_override` / `engine.record_event`. -- `policy.py` (17) — `resolve_cell` (severity ≥ `fail_on` → gate cell, else SURFACE_ONLY). - -**Dependencies:** Outbound: `ingest -> enforcement.signing.verify`; `governor -> enforcement.{engine,signoff}`, `-> identity.entity_key` (type only — resolution injected via callable, no static resolver edge). Inbound: `api`, `mcp`, `service/wardline` (the orchestrator wiring `resolve`). - -**Patterns:** Single-judge governance (tiers verbatim, never re-derived); properties as write-only evidence; validate-all-before-any-write + cross-store-split rejection; optional artifact authentication. - -**Concerns:** **M3 refined** — across-store version closed by the cross-store-split guard; **intra-store** non-atomicity remains (N sequential appends, no transaction; mid-loop failure persists earlier findings). **Ingest relaxation (bbed0ba)** live — three backward-compatible relaxations; only retained governance control is "agent-suppressed defects must carry proof." Artifact provenance optional by default. - ---- - -## Filigree Integration — `src/legis/filigree/` - -**Responsibility:** Bind a cleared, SEI-keyed sign-off to a Filigree issue as an opaque entity-association (`entity_id` = SEI) so the binding survives rename/move — without mutating Filigree issue lifecycle. - -**Key Components:** `client.py` (123) — `FiligreeClient` Protocol + `HttpFiligreeClient` over stdlib `urllib`; `attach` POSTs `{entity_id, content_hash, actor, signoff_seq?, signature?}`; `associations_for_entity` GETs. (Binding orchestration lives in `governance/signoff_binding.py`.) - -**Dependencies:** Outbound: none to `legis.*` (stdlib only). 
Inbound: `api`, `governance/signoff_binding` (the `attach` caller). - -**Patterns:** Same transport posture as Loomweave client; opaque-pointer binding; authority separation (attests, never mutates issue status). - -**Concerns:** **M4 confirmed** — `bind_signoff_to_issue` rejects locator keys (intentional, avoids rename-orphan), but the consequence is **Filigree binding availability is coupled to Loomweave SEI capability**: a degraded seam silently removes the binding surface for those sign-offs. **Unsigned transport** — `HttpFiligreeClient` carries no Weft-component HMAC (unlike the signed Loomweave client); the `attach` `signature` is an app-level attestation, not transport auth. - ---- - -## Git Domain — `src/legis/git/` - -**Responsibility:** Answer "what changed?" over a real repo by shelling out to `git` (stateless), and produce a structured rename/history feed for Loomweave's SEI matcher; define the injectable forge-PR seam shape. - -**Key Components:** -- `surface.py` (207) — `GitSurface` over `subprocess git -C` (10 s timeout): `branches`, `commit(s)`, `merge_base` (honest `None`), `renames` (committed `-M`), `working_tree_renames` (uncommitted). Every ref/SHA regex-validated + leading-`-` rejected (arg-injection guard). -- `rename_feed.py` (48) — `build_rename_feed`: superset of `GET /git/renames`; `status` (found) vs `worktree_checked` (checked) disambiguation. Contract-locked Loomweave provider. -- `pull_request.py` (27) — `PullRequestSource` Protocol (injectable forge seam). -- `models.py` (45) — passive `BranchInfo`/`CommitInfo`/`RenameEvidence` (path-level only; disclaims symbol-level — that's Loomweave's). - -**Dependencies:** Outbound: none to `legis.*` (internal `surface→models`, `rename_feed→surface`; stdlib subprocess). Inbound: `api`, `mcp`. - -**Patterns:** Stateless reader (git is truth); defensive arg validation; honest tri-state reporting; contract-locked additive provider. 
- -**Concerns:** M2 does **not** apply (reads facts from repo, no untrusted writer). `re` re-imported per method (style nit). `working_tree_renames` shells `hash-object` per file (unbounded for very large rename sets). - ---- - -## Checks — `src/legis/checks/` - -**Responsibility:** Record/serve CI check-run facts in an indexed relational table queryable by commit/branch/PR — deliberately NOT the hash-chained governance audit log. - -**Key Components:** `surface.py` (122) — `CheckSurface` over its **own** SQLAlchemy engine; `check_runs` table; idempotent `recorded_by` migration; `record`/`for_commit`/`for_branch`/`for_pr`/`latest_state`. `models.py` (34) — `CheckOutcome` enum, frozen `CheckRun`. - -**Dependencies:** Outbound: none to `legis.*` (own engine, SQLAlchemy). Inbound: `api`, `mcp`. - -**Patterns:** Operational facts vs governance trail (separate engine); idempotent schema-evolution; last-write-wins. - -**Concerns:** **M2 confirmed (checks half)** — `CheckRun` built from client `model_dump()` with only `recorded_by=actor`; outcome/commit_sha facts accepted on the writer's word, no signature/provenance. By design (operational table), but a consumer treating check outcomes as authoritative governance input trusts an unauthenticated writer. - ---- - -## Pulls — `src/legis/pulls/` - -**Responsibility:** Record/serve forge-reported PR metadata (number/title/base/head/state) in its own relational table. - -**Key Components:** `surface.py` (68) — `PullSurface` over its own engine; `pull_requests` table; idempotent `recorded_by` migration; `record` (delete-then-insert upsert by number)/`get`. `models.py` (23) — `PullRequestState` enum, frozen `PullRequest`. - -**Dependencies:** Outbound: none to `legis.*`. Inbound: `api`, `mcp`. - -**Patterns:** Same operational-table posture as checks; upsert-by-number. 
- -**Concerns:** **M2 confirmed (pulls half)** — `PullRequest` built from client `model_dump()` with only `recorded_by=actor`; PR state/base/head accepted unauthenticated. - ---- - -## Service Layer — `src/legis/service/` - -**Responsibility:** Transport-agnostic governance business logic — the shared decision/enforcement primitives the HTTP, MCP, and CLI frontends route through, raising `ServiceError` subclasses (never `HTTPException`/JSON-RPC) so each adapter owns its error translation. - -**Key Components:** -- `__init__.py` (47) — public re-export contract (`evaluate_policy`, `compute_override_rate`, `submit_override`/`submit_protected_override`/`submit_operator_override`, `request_signoff`, `resolve_for_record`, `verified_records`, `explain_policy`, `route_wardline_scan`, errors). -- `errors.py` (28) — `ServiceError` + `AuditIntegrityError`/`NotEnabledError`/`NotFoundError`/`InvalidArgumentError` (adapters switch on type, never message text). -- `governance.py` (248) — `resolve_for_record` (single resolve-then-key boundary); `verified_records` (fail-closed verified-trail read); `compute_override_rate` (binds ADR-0002 params, not caller input); `submit_override`/`submit_protected_override`/`submit_operator_override` (each protected path gated by source-binding); `request_signoff`; `evaluate_policy`. -- `source_binding.py` (89) — `verify_current_source_binding` (re-hashes on-disk file under `source_root`); `require_verified_source_binding` (fails closed only for `.py`-shaped entities). -- `explain.py` (122) — `explain_policy` (policy→cell explanation; drives MCP `policy_explain`; not consumed by HTTP). - -**Dependencies:** Outbound: `-> enforcement.{engine,lifecycle,protected,signoff}`, `-> governance.params`, `-> identity.{entity_key,resolver}`, `-> policy.{grammar,cells}`, `-> canonical`, `-> wardline.{governor,ingest,policy}`. **No `-> store` edge** (store-agnostic via duck-typed gate/verifier). Inbound: `api`, `mcp`. (`cli` does NOT import service.) 
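
One way an adapter can switch on error type rather than message text is sketched below. The class names come from the `errors.py` entry above; the status-code mapping and `http_status_for` are assumptions for illustration only, since the catalog does not say which error maps to which code.

```python
class ServiceError(Exception):
    """Base class for transport-agnostic service failures."""


class AuditIntegrityError(ServiceError):
    pass


class NotEnabledError(ServiceError):
    pass


class NotFoundError(ServiceError):
    pass


class InvalidArgumentError(ServiceError):
    pass


# Hypothetical mapping owned by the HTTP adapter, not by the service layer.
_STATUS_BY_ERROR: dict[type, int] = {
    NotFoundError: 404,
    NotEnabledError: 404,
    InvalidArgumentError: 422,
    AuditIntegrityError: 500,
}


def http_status_for(exc: ServiceError) -> int:
    """Walk the exception's MRO so subclasses inherit their parent's status."""
    for cls in type(exc).__mro__:
        if cls in _STATUS_BY_ERROR:
            return _STATUS_BY_ERROR[cls]
    return 500  # unknown ServiceError subclasses are treated as server errors
```

An MCP adapter could keep its own table from the same classes to JSON-RPC error codes, so the service layer never needs to know about either transport.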
- -**Patterns:** Explicit DI (no globals); keyword-only args after the positional gate (transposition-proof); fail-closed verification; policy constants from `params` not caller; duck-typing at the enforcement seam. - -**Concerns:** **M1 refined** — `require_verified_source_binding` only enforces for `.py`-shaped entities; a non-`.py`/opaque-SEI protected entity yields `status:unverified` and still produces an HMAC-signed protected record. **M2** — `evaluate_policy` flags `provenance_gap` only on UNKNOWN; writer-supplied `target` facts otherwise trusted. `explain.py` `del entity` — accepted-but-ignored parameter. `NotFoundError` defined/exported but never raised in `service/`. - ---- - -## HTTP API — `src/legis/api/` - -**Responsibility:** FastAPI `create_app` factory exposing git/check read surfaces plus mutating governance surfaces, enforcing bearer auth (writer/operator scopes) and translating `ServiceError` subclasses to HTTP status codes. - -**Key Components:** `app.py` (830) — single `create_app(...)` factory (~16 DI params) with lazy env-driven fallback wiring (builds `AuditStore`/`TrailVerifier`/`ProtectedGate`/`SignoffGate`/`BindingLedger` when `LEGIS_HMAC_KEY` set). Auth: `_token_actor_from_mapping`, `_verify_secret`, `verify_writer`/`verify_operator`. **26 routes** (full table in cluster-D partial), e.g.: read surfaces (`GET /git/*`, `/checks/*`, `/overrides`, `/governance/*`) unscoped; `POST /overrides|/checks|/git/pulls|/policy/evaluate|/wardline/scan-results|/signoff/request` = **writer**; `POST /protected/operator-override`, `POST /signoff/{seq}/sign` = **operator**. - -**Dependencies:** Outbound: `-> service.*` (primary seam), `-> enforcement.{engine,protected,signoff}` (**direct reach-through** for sign-off + trail verify), `-> checks/pulls/git`, `-> governance.{gaps,binding_ledger,signoff_binding,filigree_gate}`, `-> filigree`, `-> identity`, `-> policy.grammar`, `-> wardline`, `-> store/clock/judge_factory` (lazy). 
Inbound: `cli` (launcher via factory string), `mcp` (imports `DEFAULT_GOVERNANCE_DB`/`DEFAULT_CHECK_DB` constants — sibling-frontend coupling). - -**Patterns:** Application factory with exhaustive DI + lazy fallback; adapter error-translation (404/422/500/409); ACCEPTED/BLOCKED → 201/409; server-owned authority (rate constants, wardline cell, recorded actor). - -**Concerns:** **C2/H1 — HTTP is the reference; now has parity with MCP** (server routing wins + forbids caller fields → 403; caller routing behind `LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING=1`; artifact HMAC via `LEGIS_WARDLINE_ARTIFACT_KEY`). **H7 mitigated** — unscoped `TOKEN_ACTORS` entries rejected unless `LEGIS_ALLOW_UNSCOPED_API_TOKENS=1`. **NEW — H7-adjacent (single-secret mode):** `_verify_secret` (`:108-116`) returns the actor on a `LEGIS_API_SECRET` match **without consulting `required_scope`** — so writer and operator routes are satisfied by the same token; the writer/operator split is a real control ONLY in TOKEN_ACTORS mode. **M1/M2 surface here**. **Drift signal** — sign-off routes call `SignoffGate` directly, bypassing the exported `service.request_signoff`, and re-implement the `verified_records` tamper-check inline. Unauthenticated governance read surfaces. - ---- - -## CLI — `src/legis/cli.py`, `__init__.py` - -**Responsibility:** The `legis` console script — an argparse dispatcher (`serve`, `mcp`, `check-override-rate`, `governance-gate`, `sei-backfill`, `policy-boundary-check`) wiring flags into `LEGIS_*` env and deferring to frontends/gates. - -**Key Components:** `build_parser` (6 subcommands); `_check_override_rate` (the override-rate CI gate — **reads the audit store directly**, inlines its own protected-record detection, builds its own `TrailVerifier`, then `evaluate_override_rate`); `_apply_judge_env`. `__init__.py` — `__version__ = "1.0.0rc2"`. 
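
The dispatcher shape described above can be approximated with the following sketch. The six subcommand names and the `--db` flag on `governance-gate` appear in this analysis; the `LEGIS_GOVERNANCE_DB` variable name and the `apply_env` helper are hypothetical stand-ins for however the real CLI maps flags into `LEGIS_*` env.

```python
import argparse

SUBCOMMANDS = (
    "serve",
    "mcp",
    "check-override-rate",
    "governance-gate",
    "sei-backfill",
    "policy-boundary-check",
)


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one sub-parser per subcommand."""
    parser = argparse.ArgumentParser(prog="legis")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in SUBCOMMANDS:
        cmd = sub.add_parser(name)
        if name == "governance-gate":
            cmd.add_argument("--db", default=None)
    return parser


def apply_env(args: argparse.Namespace, env: dict[str, str]) -> None:
    """Copy parsed flags into env; the variable name here is an assumption."""
    db = getattr(args, "db", None)
    if db is not None:
        env["LEGIS_GOVERNANCE_DB"] = db
```

In a real entry point `env` would be `os.environ`; passing a plain dict keeps the wiring testable without mutating process state.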
- -**Dependencies:** Outbound: `-> api.app:create_app` (launcher), `-> mcp.main` (launcher), `-> store.audit_store`, `-> enforcement.{lifecycle,protected}`, `-> governance.{sei_backfill,params}`, `-> identity.loomweave_client`, `-> policy.boundary_scan`, `-> clock`. **`-> service.*` = NONE.** Inbound: console-script entry point only. - -**Patterns:** Env-var seam (flags → `LEGIS_*` → frontend re-reads); lazy local imports in dispatch branches; fail-closed CI posture (missing DB / integrity failure / unverifiable protected records → exit 1, guarded by `CI=true`/`LEGIS_ALLOW_MISSING_GOVERNANCE_DB`). - -**Concerns:** **Service-layer bypass (adapter drift, CLI side)** — `_check_override_rate` routes through no `service.*` function; it hand-rolls parallel copies of `verified_records` + `compute_override_rate`. This duplication already forced a divergent fix (`07cf54e`). MCP's `override_rate_get` *does* go through the service. `print`-only, no structured observability around gate outcomes. - ---- - -## MCP Server — `src/legis/mcp.py` - -**Responsibility:** A stdlib-only, hand-rolled MCP-over-stdio JSON-RPC server (protocols `2024-11-05`/`2025-03-26`) exposing governance + git/CI tools to agents under a launch-bound `agent_id`, mapping governance *decisions* onto `service/` and *reads* onto their owning surfaces. - -**Key Components:** `McpRuntime` (per-launch state); `build_runtime` (wires gates + `TrailVerifier` together under `LEGIS_HMAC_KEY` — no "gate without verifier" hole); `tool_definitions` (schemas, all `additionalProperties:false`); `call_tool` (dispatch, begins with `_validate_argument_keys`); `handle_request`/`run_jsonrpc`/`main`. 
**Tool routing:** the 5 governance-decision tools (`policy_explain`, `override_submit`, `policy_evaluate`, `scan_route`, `override_rate_get`) route through `service/`; read/poll surfaces (`signoff_status_get`, `filigree_closure_gate_get`, `git_*`, `pull_request_get`, `check_list`) reach owning surfaces directly (consistent with HTTP). - -**Dependencies:** Outbound: `-> api.app` (**sibling-frontend coupling** — `DEFAULT_GOVERNANCE_DB`/`DEFAULT_CHECK_DB`), `-> service.{governance,wardline,explain,errors}`, `-> enforcement.*`, `-> governance.{binding_ledger,filigree_gate}`, `-> policy.{cells,grammar}`, `-> wardline.{governor,ingest}`, `-> git/checks/pulls`, `-> store/identity/canonical`. Inbound: `cli` only. - -**Patterns:** Service-for-decisions, direct-surface-for-reads; launch-bound identity (schemas never accept actor identity); lazy resource construction; discriminated outcome envelopes + recovery hints; idempotency-replay machinery. - -**Concerns — adapter-drift audit verdicts (all RESOLVED in current source):** -- **C2 RESOLVED** — `scan_route` rejects caller routing under server routing (`INVALID_CELL_SPEC`), mirroring HTTP; caller routing only behind `LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING=1`. *Caveat: closed in `call_tool`, not the schema (schema still advertises the keys).* -- **C3 RESOLVED** — `_verified_records` → `service.verified_records` → `trail_verifier.verify` raising `AuditIntegrityError`; gate + verifier always co-constructed. -- **H1 RESOLVED** — passes `artifact_key` → `verify_wardline_artifact` requires signed provenance when key set. -- **M9 RESOLVED** — `_validate_argument_keys` rejects unknown keys (`InvalidArgumentError`). -- **M10 RESOLVED** — `poll_handle`/`seq` both integer; `_require_int` tolerant. -- **M11 RESOLVED** (commit `b4285dc`) — request-hash idempotency binding + recorded-outcome replay; rejects key reuse with a different request; replay reads the verified trail. 
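
The M11 behaviour listed above (request-hash binding, replay of the recorded outcome, rejection of key reuse with a different request) can be illustrated with a small in-memory sketch. `IdempotencyLedger` and `IdempotencyConflict` are hypothetical names, and the dict stands in for the verified trail that the real replay is said to read.

```python
import hashlib
import json
from typing import Any, Callable


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different request body."""


def request_hash(request: dict[str, Any]) -> str:
    """Hash a request independently of key order."""
    encoded = json.dumps(request, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


class IdempotencyLedger:
    def __init__(self) -> None:
        # key -> (request hash, recorded outcome); in-memory stand-in only.
        self._entries: dict[str, tuple[str, dict[str, Any]]] = {}

    def run(self, key: str, request: dict[str, Any],
            handler: Callable[[dict[str, Any]], dict[str, Any]]) -> dict[str, Any]:
        digest = request_hash(request)
        if key in self._entries:
            stored_digest, outcome = self._entries[key]
            if stored_digest != digest:
                raise IdempotencyConflict(key)
            return outcome  # replay: the handler is not invoked again
        outcome = handler(request)
        self._entries[key] = (digest, outcome)
        return outcome
```

Binding the key to the request hash is what prevents a caller from reusing an old key to obtain an earlier ACCEPTED outcome for a different request.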
- -**Non-drift concerns:** sibling-frontend coupling to `api.app` (cleanest single coupling to break); hand-rolled JSON-RPC framing with no stdin line-size bound; 464-stmt `call_tool` single if/elif (table-driven candidate as tools grow). diff --git a/docs/arch-analysis-2026-06-06-0158/03-diagrams.md b/docs/arch-analysis-2026-06-06-0158/03-diagrams.md deleted file mode 100644 index 2731e5f..0000000 --- a/docs/arch-analysis-2026-06-06-0158/03-diagrams.md +++ /dev/null @@ -1,271 +0,0 @@ -# 03 — Architecture Diagrams - -C4-style views (Context → Container → Component) plus the internal dependency layering. -All edges are derived from the `file:line` import evidence collected in the cluster passes -(`temp/catalog-*.md`). Rendered as Mermaid. - ---- - -## Level 1 — System Context - -Legis inside the Weft suite. Legis governs *change* and consumes the other tools' authorities. - -```mermaid -graph TB - agent["Coding Agent
(operates & extends)"] - human["Human Operator
(supervises, signs off, governs)"] - - subgraph legis["Legis — git/CI + governance layer"] - L["Governance 2×2 engine
+ git/CI operating picture"] - end - - loom["Loomweave
(SEI authority + structure)"] - ward["Wardline
(policy findings, taint, dossier)"] - fil["Filigree
(issue / workflow state)"] - repo[("Git repository")] - llm["LLM judge provider
(OpenRouter, optional)"] - - agent -->|"override / scan-route / policy-evaluate
(HTTP · MCP · CLI)"| L - L -->|"block + escalate"| human - human -->|"operator sign-off"| L - - L -->|"resolve locator → SEI
(HMAC, HTTPS)"| loom - L -->|"rename/history feed (provider)"| loom - ward -->|"scan results (findings)"| L - L -->|"attach SEI-keyed binding"| fil - L -->|"shell: what changed?"| repo - L -->|"judge override (fail-closed)"| llm -``` - -**Key boundary facts:** Legis is an SEI *consumer* (treats SEI as opaque). Loomweave traffic is -HMAC-signed over HTTPS; **Filigree traffic is unsigned** (app-level attestation only). Wardline -findings are *produced* by Wardline and *routed to cells* by Legis ("one judge, not two"). - ---- - -## Level 2 — Container (frontends → service → domain → foundations) - -Three frontends are *intended* to converge on one transport-agnostic service layer. Solid edges -follow that intent; **dashed red edges are the drift** where a frontend bypasses or cross-couples. - -```mermaid -graph TB - subgraph frontends["Frontends (adapters)"] - api["HTTP API
api/app.py (830)"] - mcp["MCP Server
mcp.py (≈1123)"] - cli["CLI
cli.py (318)"] - end - - svc["Service Layer
service/ — transport-agnostic (WP-M1)"] - - subgraph domain["Domain"] - enf["Enforcement
2×2 engine + judge + protected"] - pol["Policy grammar"] - gov["Governance
binding · backfill · gaps"] - wl["Wardline integration"] - end - - subgraph integ["Integration surfaces"] - idy["Identity (SEI)"] - figc["Filigree client"] - git["Git domain"] - chk["Checks"] - pul["Pulls"] - end - - subgraph found["Foundations"] - store["Store (audit log)"] - rec["Records"] - can["canonical / clock"] - end - - api --> svc - mcp --> svc - api -.->|"direct reach-through:
SignoffGate, trail verify"| enf - cli -.->|"bypasses service:
hand-rolls verified_records
+ compute_override_rate"| enf - cli -.->|"reads store directly"| store - mcp -.->|"sibling-frontend coupling:
DEFAULT_*_DB constants"| api - cli -->|"launches (factory)"| api - cli -->|"launches"| mcp - - svc --> enf - svc --> pol - svc --> wl - svc --> idy - svc --> gov - - enf --> store - enf --> rec - enf --> can - enf --> idy - gov --> store - gov --> enf - gov --> idy - gov --> figc - wl --> enf - wl --> idy - pol --> can - rec --> idy - idy --> can - store --> can - - api --> chk - api --> pul - api --> git - mcp --> chk - mcp --> pul - mcp --> git - - classDef drift stroke:#c0392b,stroke-width:2px,color:#c0392b; -``` - -> The dashed red edges are the report's central architectural finding: **the service layer is a -> partial seam.** It owns governance decisions cleanly for `api` and `mcp`, but `api` reaches past -> it for sign-off, `cli` doesn't use it at all, and `mcp` couples to `api` for shared constants. - ---- - -## Level 3 — Component: the Protected cell (the "full machinery") - -The most security-critical path — a protected override from submission to tamper-evident record. - -```mermaid -graph TB - caller["Frontend
(api / mcp)"] - sgov["service.governance
submit_protected_override"] - sb["service.source_binding
require_verified_source_binding"] - pg["enforcement.protected
ProtectedGate.submit"] - judge["enforcement.judge
LLMJudge (fail-closed)"] - llm["llm_client
OpenRouter (SSRF-hardened)"] - sign["enforcement.signing
HMAC-SHA256 v2"] - can["canonical_json"] - store[("AuditStore
append-only + hash chain")] - tv["TrailVerifier.verify
(read path)"] - - caller --> sgov - sgov --> sb - sb -->|".py entity: re-hash on-disk source"| sgov - sgov --> pg - pg --> judge - judge --> llm - llm -->|"ACCEPTED / BLOCKED"| judge - pg --> sign - sign --> can - pg -->|"signing_fields() →
entity+policy+fingerprint+ast_path+lineage"| store - store -->|"chain_hash = sha256(prev + content_hash)"| store - tv -->|"protected-policy set from config (ADR-0002),
not the record → no flag-flip downgrade"| store -``` - -**Invariants enforced on this path:** judge fails closed (BLOCKED on ambiguity / no provider); -every protected record is HMAC-signed via the *same* `signing_fields()` the verifier reads (signer/verifier -can't drift); the protected-policy set is config-owned so a record can't declare itself unprotected. -**Known gap on this path:** a non-`.py` entity passes source binding as `unverified` yet still gets -signed (M1); `verify_integrity` can raise instead of returning `False` on non-finite-float tampering (M6). - ---- - -## Internal dependency layering (the DAG) - -No import cycles exist. Modules form a clean DAG; the layer index is the longest path to a leaf. - -```mermaid -graph LR - subgraph L0["L0 — leaves"] - can["canonical"] - clk["clock"] - ek["identity.entity_key"] - lwc["identity.loomweave_client"] - figc["filigree.client"] - gitm["git.*"] - chk["checks"] - pul["pulls"] - prm["governance.params"] - end - subgraph L1["L1"] - res["identity.resolver"] - rec["records"] - st["store"] - pol["policy"] - end - subgraph L2["L2"] - enf["enforcement"] - end - subgraph L3["L3"] - gov["governance"] - wl["wardline"] - end - subgraph L4["L4"] - svc["service"] - end - subgraph L5["L5"] - api["api"] - end - subgraph L6["L6"] - mcp["mcp"] - end - subgraph L7["L7"] - cli["cli"] - end - - res --> can - rec --> ek - st --> can - pol --> can - enf --> st - enf --> rec - enf --> can - enf --> clk - enf --> ek - gov --> st - gov --> enf - gov --> figc - wl --> enf - svc --> enf - svc --> pol - svc --> wl - svc --> gov - api --> svc - mcp --> svc - mcp --> api - cli --> api - cli --> mcp -``` - -**Layer-violation notes (not cycles, but smells):** -- `mcp (L6) -> api (L5)` — a frontend depends on a sibling frontend for shared DB-default constants. The only cross-frontend static edge; should resolve to a shared config module. 
-- `cli (L7) -> api/mcp` — launcher edges (acceptable), but `cli` also reaches `enforcement (L2)`/`store (L1)` directly, skipping `service (L4)`. -- `api (L5) -> enforcement (L2)` — direct reach-through for sign-off, skipping its own `service (L4)`. - ---- - -## Trust-boundary map - -```mermaid -graph TB - subgraph untrusted["Untrusted / semi-trusted inputs"] - a1["agent rationale (override)"] - a2["wardline scan payload"] - a3["writer-supplied check/PR facts"] - a4["LLM judge output"] - end - subgraph controls["Controls at the boundary"] - c1["judge: data-framed input, fail-closed parse"] - c2["artifact HMAC (opt-in via key)"] - c3["bearer auth: writer/operator scopes"] - c4["structured-JSON verdict, BLOCKED-wins"] - end - subgraph trail["Tamper-evident record"] - t1[("hash chain + append-only triggers")] - t2["HMAC signature (protected)"] - end - - a1 --> c1 --> t1 - a2 --> c2 --> t1 - a3 --> c3 --> t1 - a4 --> c4 --> t1 - t1 --> t2 -``` - -**Residual boundary weaknesses (carried to 05):** writer/operator split is vacuous in single-secret -mode; check/PR facts are recorded on the writer's word (no fact provenance); Filigree transport is -unsigned; LLM judge output is parsed as gate authority (prompt-injection surface in coached/protected). diff --git a/docs/arch-analysis-2026-06-06-0158/04-final-report.md b/docs/arch-analysis-2026-06-06-0158/04-final-report.md deleted file mode 100644 index f540d62..0000000 --- a/docs/arch-analysis-2026-06-06-0158/04-final-report.md +++ /dev/null @@ -1,211 +0,0 @@ -# 04 — Final Report - -**Target:** Legis `1.0.0rc2` — the git/CI + governance layer of the Weft suite -**Scope:** `src/legis/` (63 files, ~7,353 LOC), cross-referenced against `tests/`, `docs/`, prior audits, and live tooling -**Method:** 6 parallel codebase-explorer passes along architectural seams + synthesis; tooling run live; two prior read-only audits used as a known-issues baseline -**Date:** 2026-06-06 - ---- - -## 1. 
Executive summary - -Legis implements a **governance 2×2** — two agent-set dials (structure: simple/complex; judge: off/on) -yielding four enforcement cells (Chill, Coached, Structured, Protected) — over a tamper-evident, -SEI-keyed audit trail. The codebase is small, disciplined, and architecturally coherent: a clean -dependency DAG with no import cycles, pervasive fail-closed defaults, dependency injection at every -seam, and a single canonicalization/signing choke point. mypy is clean across all 63 files and line -coverage is 90%. - -The architecture's organizing idea is sound and largely realized: **Wardline analyses, Legis governs; -Loomweave owns identity, Legis consumes it; Filigree owns issue lifecycle, Legis attests to it.** Every -governance decision produces one append-only hash-chained record, and the protected cell layers HMAC -signing bound to the inspected source. - -The dominant *architectural* finding is that the **transport-agnostic service layer (WP-M1) is a partial -seam**. It cleanly owns governance decisions for the HTTP and MCP frontends, but three drifts remain: the -HTTP API reaches *past* its own service layer for sign-off, the CLI bypasses the service entirely (hand-rolling -its own trail-verification and override-rate logic), and the MCP server couples to the HTTP module for shared -constants. The prior audits' dominant theme — **adapter drift, where MCP omitted HTTP/CLI server-side -constraints** — has been **substantially remediated**: all six tracked MCP-drift findings (C2, C3, H1, M9, -M10, M11) are RESOLVED in the current tree. The residual drift is now structural (seam discipline), not a -live security bypass. 
- -The remaining *security-relevant* findings cluster around **evidence binding and authentication of inputs**: -protected records for non-`.py` entities sign an `unverified` source binding; check/PR facts are recorded on -the writer's word; the Filigree transport is unsigned; the LLM judge parses model output as gate authority (a -prompt-injection surface in coached/protected); and the writer/operator scope split is enforced only in -`TOKEN_ACTORS` mode, not in single-secret mode (its severity hinges on whether single-secret is a supported -split-promising production mode — see §5/§6). None of these block the rc, but each is a sharp edge an -architect should schedule before GA. - -**Overall assessment: a well-built, honest, internally consistent rc.** The bones are good. The work ahead -is seam-tightening and input-authentication hardening, not rearchitecture. - ---- - -## 2. Subsystem map - -13 subsystems + a foundations pair, in a 7-layer DAG (full catalog in `02`, diagrams in `03`): - -| Layer | Modules | Role | -|---|---|---| -| L0 (leaves) | `canonical`, `clock`, `identity.entity_key`, `identity.loomweave_client`, `filigree.client`, `git.*`, `checks`, `pulls`, `governance.params` | primitives + leaf integration surfaces | -| L1 | `identity.resolver`, `records`, `store`, `policy` | resolution, schema, persistence, grammar | -| L2 | `enforcement` | the 2×2 engine + judge + protected/signoff/lifecycle | -| L3 | `governance`, `wardline` | binding/backfill/gaps; scan-to-cell routing | -| L4 | `service` | transport-agnostic decision layer (WP-M1) | -| L5–L7 | `api`, `mcp`, `cli` | three frontends | - -**Largest / hottest modules:** `policy` (1072 LOC) and `enforcement` (1062 LOC) carry the domain weight; -`api/app.py` (830) and `mcp.py` (~1123) are the dense frontends. `identity`, `canonical`, and `clock` are -the most-depended-upon foundations (14 / 9 / many inbound edges respectively). - ---- - -## 3. 
Cross-subsystem flows (the wiring that *is* the product) - -A bottom-up catalog under-serves a system whose value is the *combination* of its parts. These four -end-to-end traces are the load-bearing paths. - -### 3.1 Agent override → graded cell → tamper-evident record (the core loop) - -``` -agent → [frontend: api POST /overrides | mcp override_submit | (cli is gate-only)] - → service.governance.submit_override / submit_protected_override / request_signoff - → service.resolve_for_record → identity.resolver.resolve(locator) - → Loomweave (HMAC/HTTPS): SEI-keyed EntityKey + alive + content_hash + lineage_snapshot, - or honest locator-keyed degradation - → policy.cells.cell_for(policy) selects the 2×2 cell - → cell dispatch: - chill → enforcement.engine.submit_override(judge=None) → record ACCEPTED_SELF - coached → enforcement.engine.submit_override(judge=LLMJudge) → judge BEFORE write - structured→ enforcement.signoff.SignoffGate.request → PENDING_SIGNOFF (does not clear) - protected → enforcement.protected.ProtectedGate.submit → judge + HMAC sign + source-binding - → store.audit_store.append → content_hash → chain_hash = sha256(prev + content_hash) -``` - -Every branch terminates in exactly one append-only record on the same hash chain. The cell is chosen -**server-side** from policy config, never from caller input — the anti-downgrade guarantee. The chill cell's -"recordable override" is what makes *humans-not-in-the-loop* safe: an attributable event, never a silent pass. - -### 3.2 Wardline finding → governance cell (the "Wardline + Legis" combination) - -``` -Wardline scan payload → [api POST /wardline/scan-results | mcp scan_route] - → service.wardline.route_wardline_scan - → wardline.ingest.verify_wardline_artifact(scan, artifact_key?) 
# HMAC provenance IF key configured - → wardline.ingest.active_defects # kind==defect & suppressed==active; agent-suppressed needs proof - → wardline.governor.route_findings # exactly one of policy|cell_map; rejects block_escalate∪surface_* batch - per finding: resolve(qualname) → EntityKey ; build `wardline` ext (fingerprint, properties verbatim) - dispatch → signoff.request | engine.submit_override | engine.record_event -``` - -This is the unification of two vocabularies into one: Wardline's trust tiers ride **verbatim** into the -record (`properties` write-only), and Legis decides the cell. **Routing ownership is server-side** on both -frontends now (the C2 fix). The seam's weak spot is **intra-store batch non-atomicity** (M3): a multi-finding -same-cell batch is N sequential appends with no surrounding transaction. - -### 3.3 Sign-off → SEI-keyed Filigree binding (the "Filigree + Legis" combination) - -``` -operator → api POST /signoff/{seq}/sign (operator scope) → SignoffGate.sign_off → SIGNED_OFF record -agent → api POST /signoff/{seq}/bind-issue - → governance.signoff_binding.bind_signoff_to_issue - guard: reject identity_stable=False (locator) keys # avoids rename-orphan - → filigree.client.attach(entity_id=SEI, content_hash, signature) # UNSIGNED transport - → governance.binding_ledger.record (signed, dedicated AuditStore) # non-atomic vs attach - later: api GET /filigree/issues/{id}/closure-gate - → governance.filigree_gate.evaluate_issue_closure(ledger) # closable only w/ verified binding -``` - -The binding survives rename because it keys on SEI. The structural consequence (M4): **binding availability -is coupled to Loomweave SEI capability** — when Loomweave is degraded the sign-off can be *recorded* but -cannot be *bound*. And the Filigree HTTP channel itself is unauthenticated (the `signature` is an app-level -attestation, not transport auth). 
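
The locator-key guard in the flow above can be sketched as follows. This is an illustration under assumptions: the `EntityKey` fields shown, `UnstableIdentityError`, and `bind_if_stable` are hypothetical and are not the signatures of `bind_signoff_to_issue` or `filigree.client.attach`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EntityKey:
    value: str
    identity_stable: bool  # True for SEI-keyed, False for locator-keyed


class UnstableIdentityError(Exception):
    """Raised when a binding is requested for a locator-keyed entity."""


def bind_if_stable(key: EntityKey, issue_id: str,
                   attach: Callable[[str, str], dict]) -> dict:
    """Attach a binding only when the key would survive a rename."""
    if not key.identity_stable:
        # A locator-keyed binding would be orphaned by the next rename.
        raise UnstableIdentityError(key.value)
    return attach(key.value, issue_id)
```

The guard makes the M4 coupling concrete: when Loomweave is degraded and only locator keys are available, every call takes the rejecting branch, so no binding can be created for those sign-offs.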
- -### 3.4 The override-rate CI gate — same decision, three implementations - -``` -api GET /governance/override-rate → service.compute_override_rate(service.verified_records(...)) ✅ via service -mcp override_rate_get → service.compute_override_rate(_verified_records(...)) ✅ via service -cli governance-gate → AuditStore.read_all() + own TrailVerifier + inline evaluate_override_rate ❌ bypass -``` - -This is the cleanest illustration of the partial-seam finding: the *same governance computation* is reached -three ways, and the CLI's hand-rolled copy already required a divergent fix (`07cf54e`, "fail closed on -protected override-rate trails") that the service path got for free. - ---- - -## 4. Architectural strengths - -1. **Clean DAG, no cycles.** Enforcement depends on neither governance nor policy; the dependency arrows all point downward to leaves. A genuine layered architecture, not a ball of mud. -2. **Fail-closed as a default discipline.** Unregistered policy → UNKNOWN; no judge provider → `FailClosedJudge` (always BLOCKED); malformed config → error not false-green; ambiguous judge output → BLOCKED. The system's resting state is "deny." -3. **Single-source-of-truth choke points.** One `canonical_json`/`content_hash` underlies every hash and HMAC; `signing_fields()` is shared by signer and verifier so they cannot drift; `evidence.py` is shared by the runtime gate and the static scanner. -4. **Dependency injection everywhere.** Store, clock, judge, LLM transport, identity, forge-PR source — all injected Protocols. The only non-test concretes are the HTTP clients. Highly testable (90% coverage, mypy-clean). -5. **Honest degradation.** Identity resolution distinguishes "not alive" (`False`) from "no capability" (`None`); the rename feed distinguishes "found" from "checked." The system tells the truth about what it doesn't know. -6. 
**Config-owned trust boundary.** The protected-policy set and override-rate constants live in config (ADR-0002), not in the records they govern — a record cannot declare itself unprotected. - ---- - -## 5. Architectural concerns (consolidated; detail + remediation in `05`/`06`) - -| Theme | Finding | Severity | -|---|---|---| -| Seam discipline | Service layer is a partial seam: api reaches past it (sign-off), cli bypasses it entirely, mcp couples to api for constants | High (architectural) | -| Input authentication | Writer/operator scope split enforced only in `TOKEN_ACTORS` mode; single-secret mode does not separate them | High *if* single-secret is a split-promising prod mode, else Medium (§5 calibration) | -| Evidence binding | Protected records for non-`.py` entities sign `source_binding: unverified` (M1) | Medium | -| Input authentication | Check/PR facts recorded on the writer's word, no fact provenance (M2) | Medium | -| Input authentication | Filigree transport unsigned (asymmetric vs signed Loomweave) | Medium | -| Tamper handling | `verify_integrity` can *raise* on non-finite-float tampering instead of returning `False` (M6) | Medium | -| Prompt injection | LLM judge parses model output as gate authority; untrusted rationale embedded (H3 baseline) | Medium | -| Atomicity | Intra-store Wardline batch non-atomicity (M3); non-atomic Filigree attach→record (M4-adjacent) | Medium | -| Robustness | `gaps.py` null-`entity_key` `AttributeError`; `decay_sweep` aborts whole sweep on one bad row | Low–Med | -| Default-open | In-code default cell is self-clearing `chill` (H6); only `cells.toml` makes it `structured` | Medium | -| Honesty gate | Policy-co-occurrence check is substring-in-assert, not semantic (M7) | Low–Med | -| Coupling | Governance modules type against concrete `AuditStore`, not the protocol (M12 residual) | Low | - ---- - -## 6. 
Remediation delta since the 2026-06-04 audits - -The two prior audits (3 Critical, 7 High, 14 Medium, 5 Low) are a moving baseline. Confirmed deltas: - -| Prior finding | Status now | Evidence | -|---|---|---| -| C1 CI gate passes on absent trail | **Mostly closed** | `07cf54e` + `8b15320` — CLI fails closed under `CI=true`/missing-trail unless `LEGIS_ALLOW_MISSING_GOVERNANCE_DB` | -| C2 MCP caller-chosen routing | **RESOLVED** | `mcp.py` server-owned routing guard mirrors HTTP | -| C3 MCP skips HMAC trail verify | **RESOLVED** | `_verified_records` → `service.verified_records` → `TrailVerifier` | -| H1 MCP skips artifact HMAC | **RESOLVED** | `scan_route` passes `artifact_key` | -| H5 BindingLedger skips chain integrity | **RESOLVED** | `verify()` calls `store.verify_integrity()` first | -| H7 unscoped tokens grant operator | **Mitigated** | rejected unless `LEGIS_ALLOW_UNSCOPED_API_TOKENS=1` | -| M9 unknown MCP args accepted | **RESOLVED** | `_validate_argument_keys` | -| M10 poll_handle type mismatch | **RESOLVED** | both integer | -| M11 MCP no idempotency | **RESOLVED** | `b4285dc` request-hash replay | -| M12 enforcement → concrete store | **Partially** | enforcement uses protocol; governance still concrete | -| M13 no `allow_nan` | **Partially** | `allow_nan=False` present; RFC-8785 still deferred | -| M5 EntityKey coerces stability | **Not reproduced** | `from_dict` validates `bool` | -| M1/M2/M3/M4/M7/H3/H6 | **Confirmed live** (M3/M4 refined) | see `05` | - -**New findings surfaced this pass (not in prior audits):** `gaps.py` null-`entity_key` `AttributeError`; -unsigned Filigree transport asymmetry; CLI service-layer bypass as the third drift vector. (Two clarifications -from a post-validation cross-check of *both* prior audits: M6 — the unguarded `content_hash` in the verify -loop — is a *prior-audit* finding, re-confirmed here as only partially closed, not new. 
And **Q-H1** -(single-secret writer/operator split) is a *sharpening/localization* of the readonly audit's scope-separation -finding (AUDIT-readonly §High, lines 166-188), not a net-new discovery; its severity is conditional — see §5.) - ---- - -## 7. Confidence & limitations - -**Confidence: High** on structure, edges, and finding locations — every subsystem read at 100% by its cluster -pass, every dependency edge grepped with `file:line`, mypy/coverage run live, and each prior-audit finding -discriminated against current source (several empirically reproduced). - -**Limitations:** -- The Loomweave / Wardline / Filigree **wire contracts are taken from docstrings and Legis-side clients**, not the sibling repos. Cross-repo conformance (the live oracle test) is opt-in and not exercised here. -- Runtime behavior of injected concretes defined outside a cluster (e.g. an exotic LLM provider) was not executed. -- No tests were run beyond the existing coverage artifact; this is a static + tooling analysis, not a dynamic audit. -- The two prior audits' *severity* judgments were accepted as framing; this pass re-verified *presence*, not re-scored severity from scratch. - -`05-quality-assessment.md` quantifies the quality signals; `06-architect-handover.md` sequences the remediation. diff --git a/docs/arch-analysis-2026-06-06-0158/05-quality-assessment.md b/docs/arch-analysis-2026-06-06-0158/05-quality-assessment.md deleted file mode 100644 index 789f99d..0000000 --- a/docs/arch-analysis-2026-06-06-0158/05-quality-assessment.md +++ /dev/null @@ -1,124 +0,0 @@ -# 05 — Code Quality Assessment - -Quantitative signals run live against the working tree (HEAD `2e69141`), combined with the -finding inventory from the six cluster passes and the two prior read-only audits. - ---- - -## 1. 
Tooling signals (measured this pass) - -| Signal | Result | Notes | -|---|---|---| -| **mypy** (`mypy src/legis`) | ✅ **Clean** — "no issues found in 63 source files" | strict-ish config (`warn_unused_configs`, `show_error_codes`) | -| **ruff** (`ruff check src/`) | ⚠️ **2 errors** — both `F401` unused import (`Hashable` in `policy/grammar.py:15`; one more) | auto-fixable; **ruff is NOT in CI** | -| **Line coverage** | ✅ **90%** (3,453 stmts, 329 missed) | high for a governance codebase | -| **Tests** | **492 test functions across 68 files** | unit + contract + conformance + mcp lanes | -| **pytest warnings** | `filterwarnings = ["error", ...]` | warnings are errors (one scoped Starlette ignore) | - -### Coverage by subsystem (security-critical paths are well covered) - -| Subsystem | Cov | | Subsystem | Cov | -|---|---|---|---|---| -| `records` | 100% | | `store` | 90% | -| `pulls` | 98% | | `api` | 90% | -| `git` | 97% | | `policy` | 88% | -| `checks` | 97% | | `(root: cli+mcp+canonical+clock)` | 85% | -| `identity` | 95% | | **`filigree`** | **75%** ← lowest | -| `enforcement` | 95% | | | | -| `service` | 94% | | | | -| `governance` | 93% | | | | -| `wardline` | 91% | | | | - -The two heaviest single files drag the "root" bucket: `mcp.py` 82%, and `cli.py`'s gate paths. -`filigree/client.py` at 75% is the weakest — and it is also the **unsigned transport** surface, so its -uncovered branches are exactly the error/transport paths a security reviewer cares about. - ---- - -## 2. 
CI pipeline review (`.github/workflows/ci.yml`) - -The pipeline is unusually governance-aware — it runs the project's own gates as CI steps: - -| Step | Assessment | -|---|---| -| `pytest --cov=legis --cov-fail-under=70` | ✅ runs tests + coverage… ⚠️ **threshold 70% while actual is 90%** — 20 points of silent-regression headroom (prior **M14**, still live) | -| SEI conformance oracle (`test_sei_oracle.py`) | ✅ always runs | -| Live Loomweave oracle | ⚠️ **gated on `vars.LOOMWEAVE_URL != ''`** — opt-in; absent var = silently skipped (prior **M14**) | -| `mypy src/legis` | ✅ enforced | -| `legis policy-boundary-check` | ✅ the honesty gate runs in CI (good — dogfoods its own grammar) | -| `legis governance-gate --db sqlite:///legis-governance.db` | ✅ override-rate gate; now fails closed under `CI=true`/missing-trail (prior **C1**, mostly closed by `07cf54e`/`8b15320`) | - -**Gaps:** (1) **no ruff/lint step** — the 2 F401 errors prove lint isn't gating; (2) **coverage threshold (70%) far below reality (90%)** — should be raised, ideally with per-package floors for `enforcement`/`service`/`governance`/`api`/`mcp`; (3) live cross-repo conformance is opt-in, so Loomweave endpoint/header drift passes default CI. - ---- - -## 3. Finding inventory (current tree) - -Severity reflects this pass's re-verification, not the prior audits' original scores. "Status" reconciles -against the 2026-06-04 baseline. - -### High - -| ID | Finding | Location | Status | -|---|---|---|---| -| **Q-H1** | **Single-secret mode does not enforce the writer/operator scope split** — `_verify_secret` returns the actor on a `LEGIS_API_SECRET` match without consulting `required_scope` (`:116`); operator-only routes (`/protected/operator-override` `:559`, `/signoff/{seq}/sign` `:677`) are satisfied by any holder of the single secret. 
**Severity is conditional (see calibration note).** | `api/app.py:103,108-116` | Sharpens AUDIT-readonly scope-separation finding (§High, lines 166-188); the specific single-secret mechanism is newly localized | -| **Q-H2** | **Service layer is a partial seam** — `api` reaches past it for sign-off (`SignoffGate` direct, inline trail-verify); `cli` bypasses it entirely (hand-rolled `verified_records` + `compute_override_rate`); `mcp` couples to `api` for `DEFAULT_*_DB` constants | `api/app.py:588,605-618,680`; `cli.py:170-244`; `mcp.py:115,496,505` | Architectural; partly NEW | -| **Q-H3** | **LLM judge parses model output as gate authority** with untrusted rationale embedded as text — prompt-injection surface in coached/protected | `enforcement/judge.py` | Baseline H3, confirmed (mitigated by structured-JSON-first + BLOCKED-wins, but advisory-as-authority remains) | - -> **Q-H1 severity calibration.** The writer/operator split is a *promised, tested* contract **only in `LEGIS_API_TOKEN_ACTORS` mode** — `tests/api/test_auth.py:100` (`test_scoped_tokens_separate_writer_and_operator_authority`) asserts a writer token gets 403 on `/protected/operator-override` while an operator token succeeds. **No test asserts single-secret mode denies operator routes**; `test_mutating_routes_require_secret_when_configured` (`:91`) only checks that the secret gates *write access*. So single-secret (`LEGIS_API_SECRET` alone) is, as built, a *one-credential* mode that does not offer the split. **Severity therefore depends on a product decision** (carried to `06`): if single-secret is a supported production mode that *promises* operator separation → **High, GA-blocking**; if single-secret means "solo/one-credential deployment" → this is a **Medium documentation-and-gate** item (label the limitation; require `TOKEN_ACTORS` or an explicit operator credential for any deployment relying on the split). This analysis does **not** assert High unconditionally. 
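
The two readings in the calibration note can be made concrete with a small sketch. This is not the code in `api/app.py`: `verify_single_secret`, `verify_scoped`, the `single_secret_scopes` parameter, and the returned actor string are hypothetical names used only for illustration. Only the `writer`/`operator` scope names and the "match returns the actor without consulting `required_scope`" behaviour come from this analysis.

```python
import hmac
from typing import FrozenSet, Optional

WRITER, OPERATOR = "writer", "operator"


def verify_single_secret(presented: str, secret: str, required_scope: str) -> Optional[str]:
    """Behaviour as described in Q-H1 (assumed shape): a secret match returns
    an actor and required_scope is never consulted."""
    if hmac.compare_digest(presented.encode(), secret.encode()):
        return "single-secret-actor"
    return None


def verify_scoped(
    presented: str,
    secret: str,
    required_scope: str,
    single_secret_scopes: FrozenSet[str] = frozenset({WRITER}),
) -> Optional[str]:
    """One possible fix (hypothetical): the single secret satisfies only the
    scopes it was explicitly granted, so operator routes fail closed."""
    if not hmac.compare_digest(presented.encode(), secret.encode()):
        return None
    if required_scope not in single_secret_scopes:
        return None
    return "single-secret-actor"
```

Under the "one-credential mode" reading the first function is acceptable as long as the limitation is documented; under the "split-promising" reading something like the second is needed, plus a test that asserts an operator route is denied in single-secret mode.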
- -### Medium - -| ID | Finding | Location | Status | -|---|---|---|---| -| **Q-M1** | Protected records for **non-`.py` entities sign `source_binding: unverified`** | unverified-return `service/source_binding.py:46-53`; fail-closed guard skips non-`.py` `:82-89`; signed at `service/governance.py:170` | Baseline M1, confirmed | -| **Q-M2** | **Check/PR facts recorded on the writer's word** — no fact provenance/signature | `api/app.py:448,466`; `checks/surface.py`; `pulls/surface.py` | Baseline M2, confirmed | -| **Q-M3** | **`verify_integrity` can raise** (`ValueError`) on non-finite-float tampering instead of returning `False` — unguarded `content_hash(rec.payload)` in the verify loop; propagates into `sei_backfill`/`binding_ledger.verify` | `store/audit_store.py:168` | Baseline M6, PARTIALLY closed | -| **Q-M4** | **Filigree transport unsigned** (asymmetric vs HMAC-signed Loomweave); `attach` `signature` is app-level only | `filigree/client.py` | NEW (audit noted binding non-atomicity, not transport) | -| **Q-M5** | **Intra-store Wardline batch non-atomicity** — N sequential appends, no transaction; mid-loop failure persists earlier findings | `wardline/governor.py:60-65` | Baseline M3, refined | -| **Q-M6** | **Filigree binding availability coupled to Loomweave SEI capability** — degraded seam silently removes the binding surface for locator-keyed sign-offs | `governance/signoff_binding.py:38-42` | Baseline M4, confirmed | -| **Q-M7** | **In-code default cell is self-clearing `chill`** — fails open if `cells.toml` (`structured`) is absent | `policy/cells.py:44`; `mcp.py:111` | Baseline H6, confirmed | -| **Q-M8** | **Honesty-gate policy-co-occurrence is a substring-in-assert match**, not a semantic check that the boundary *result* is asserted | `policy/evidence.py:46-53,135-152` | Baseline M7, confirmed | - -### Low - -| ID | Finding | Location | Status | -|---|---|---|---| -| **Q-L1** | `gaps.py` raises `AttributeError` on explicit `"entity_key": null` (no 
`isinstance(dict)` guard; inconsistent with `sei_backfill`) | `governance/gaps.py:51,75` | NEW | -| **Q-L2** | `decay_sweep` has no per-record try/except — one malformed `entity_key` row aborts the whole sweep | `enforcement/lifecycle.py:55-62` | NEW | -| **Q-L3** | Governance modules type against **concrete `AuditStore`**, not the protocol (can't fake in unit tests) | `governance/{binding_ledger,sei_backfill,gaps}.py` | Baseline M12, residual relocated | -| **Q-L4** | Canonicalization not RFC-8785 hardened (cross-language verify); `ensure_ascii=False` byte-encoding footgun | `canonical.py` | Baseline M13, partially closed | -| **Q-L5** | Fingerprint extraction diverges between runtime gate and static scanner for class-method/decorated test_refs | `decorator.py:125-135` vs `boundary_scan.py:156-159` | Baseline L4, confirmed | -| **Q-L6** | Identity capability cache per-instance, never invalidated once `True` | `identity/resolver.py:42-48` | NEW | -| **Q-L7** | 2× `F401` unused imports; lint not in CI | `policy/grammar.py:15` + 1 | NEW (tooling) | -| **Q-L8** | `mcp.py` `call_tool` is a 464-stmt single if/elif; hand-rolled JSON-RPC has no stdin line-size bound | `mcp.py` | NEW (maintainability) | - ---- - -## 4. Maintainability & design-quality observations - -**Strengths (these are real and worth preserving):** -- **Testability is designed-in.** DI at every seam + Protocol-typed dependencies → 90% coverage and clean mypy are *consequences* of the architecture, not bolt-ons. -- **The fail-closed default** is consistent enough to be a property of the system, not a per-site choice. -- **Single choke points** (`canonical`, `signing_fields`, `evidence`) mean security-relevant changes touch one place. -- **Honest naming and docstrings.** Modules document their own trade-offs (e.g. the non-atomic attach→record window is admitted in-code, not hidden). 
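
As an illustration of the "DI at every seam + Protocol-typed dependencies" strength (and of what Q-L3 asks the governance modules to finish), a minimal sketch follows. The method set on `AppendOnlyStore` is an assumption, not the real Legis protocol, and `FakeStore` and `record_signoff` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Protocol


class AppendOnlyStore(Protocol):
    """Assumed minimal shape; the real protocol in Legis may differ."""

    def append(self, payload: Dict[str, Any]) -> int: ...

    def verify_integrity(self) -> bool: ...


@dataclass
class FakeStore:
    """In-memory fake that satisfies the protocol structurally (no inheritance)."""

    records: List[Dict[str, Any]] = field(default_factory=list)
    intact: bool = True

    def append(self, payload: Dict[str, Any]) -> int:
        self.records.append(payload)
        return len(self.records)  # 1-based sequence number

    def verify_integrity(self) -> bool:
        return self.intact


def record_signoff(store: AppendOnlyStore, request_seq: int) -> int:
    """Hypothetical caller typed against the protocol, not a concrete store."""
    if not store.verify_integrity():
        raise RuntimeError("audit chain failed integrity check")
    return store.append({"signoff_state": "SIGNED_OFF", "request_seq": request_seq})
```

Because `record_signoff` depends only on the protocol, a unit test can pass `FakeStore(intact=False)` to exercise the fail-closed branch without a database. Modules typed against the concrete `AuditStore` cannot be tested this way.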
- -**Debt / friction:** -- **Seam erosion** (Q-H2) is the highest-leverage maintainability debt: three implementations of "read the verified trail," already proven to diverge under fixes. -- **`mcp.py` size** (~1123 lines, 464-stmt dispatch) is the single-file complexity hotspot. -- **Concrete-store coupling in governance** (Q-L3) is the residual of an otherwise-completed protocol migration. -- **Lint not gating** lets trivial debt (unused imports) accumulate. - ---- - -## 5. Quality verdict - -**Grade: B+ / strong rc.** The codebase is well-engineered for its stage: clean types, high coverage, -governance-aware CI, disciplined fail-closed defaults, and a real layered architecture. The recent fix -velocity (six adapter-drift findings closed, C1/H5/M11 closed) shows an active, responsive maintenance loop. - -What separates it from an A is **input-authentication hardening** (Q-M1, Q-M2, Q-M4 — the system trusts -several inputs it records as governance evidence; plus Q-H1's single-secret split *if* that mode is meant to -promise it) and **seam discipline** (Q-H2 — the service layer must become the *only* way to reach a governance -decision). Neither is a rearchitecture; both are scheduling decisions for the path to GA. See -`06-architect-handover.md`. diff --git a/docs/arch-analysis-2026-06-06-0158/06-architect-handover.md b/docs/arch-analysis-2026-06-06-0158/06-architect-handover.md deleted file mode 100644 index c16f1b5..0000000 --- a/docs/arch-analysis-2026-06-06-0158/06-architect-handover.md +++ /dev/null @@ -1,104 +0,0 @@ -# 06 — Architect Handover - -Transition document from *analysis* to *improvement planning*. Sequences the findings from -`05-quality-assessment.md` into a risk-ordered roadmap with concrete entry points, and frames the -open architectural decisions an architect must own before GA. - -**Starting position:** Legis `1.0.0rc2` — a well-built rc (B+). Clean DAG, mypy-clean, 90% coverage, -governance-aware CI, active fix loop. 
The work here is **hardening + seam discipline**, not rearchitecture. - ---- - -## 1. The one architectural decision to make first - -**Decide what the service layer is *for*, then enforce it.** - -Today `service/` (WP-M1) is a *partial* seam: it owns governance decisions for `api` and `mcp`, but -`api` reaches past it (sign-off), `cli` ignores it, and `mcp` couples to `api`. The override-rate gate -exists in **three** implementations (§3.4 of `04`), and that duplication already caused a divergent fix -(`07cf54e`). This is the root cause behind a whole class of future drift. - -**The decision:** is the service layer the *single mandatory path* to every governance decision, or just -a convenience library two of three frontends happen to use? The architecture only pays off under the first -reading. Recommend ratifying **"every governance decision flows through `service/`; frontends are thin -adapters that translate transport ↔ `ServiceError`"** as an explicit invariant, then closing the three -drifts to match. Everything in Tier 1 below assumes this choice. - ---- - -## 2. Risk-ordered roadmap - -### Tier 1 — Before GA (security + the seam invariant) - -| # | Item | Entry point | Effort | Rationale | -|---|---|---|---|---| -| 1 | **Resolve single-secret scope split** (Q-H1) — *decision-gated.* The writer/operator split is tested only in `TOKEN_ACTORS` mode (`tests/api/test_auth.py:100`); single-secret mode does not separate them, and **no test promises it should**. **First decide (checklist item 2): is single-secret a supported split-promising production mode?** If **yes** → make `_verify_secret` consult `required_scope` so a single secret cannot satisfy `operator`; require an explicit operator credential (or opt-in `LEGIS_ALLOW_SINGLE_SECRET_OPERATOR=1` for dev) — **GA-blocking**. 
If **no** → document the limitation (single-secret = one-credential mode; use `TOKEN_ACTORS` for the split) and consider failing closed on operator routes without an operator-scoped credential — **not GA-blocking**. | `api/app.py:103,108-116` | S | Severity hinges on the product decision, not the code (which the validator confirmed). Don't ship the High framing unconditionally. | -| 2 | **Make `service/` the only path to a governance decision** (Q-H2). Route `api` sign-off through `service.request_signoff`/a new `service.sign_off`; replace the inline trail-verify block with `service.verified_records`; rebuild `cli`'s `_check_override_rate` on `service.compute_override_rate(service.verified_records(...))`. | `api/app.py:588,605-618,680`; `cli.py:170-244` | M | Collapses three trail-read implementations to one; kills the drift class at the source. | -| 3 | **Decide the protected source-binding contract** (Q-M1). Either fail closed unless `source_binding.status == "verified"` for source-code policies, or add server-side entity classification so the caller's locator shape can't choose the verification standard. | `service/source_binding.py:82-89`; `service/governance.py:163` | S–M | A protected record can be signed while not bound to current source bytes — "protected" ≠ "source verified." | -| 4 | **Harden `verify_integrity` to never raise** (Q-M3). Guard the loop-body `content_hash(rec.payload)` (catch `ValueError` → return `False`, or raise a domain `AuditIntegrityError`). Align api/cli/mcp error mapping. Add a non-finite-float tamper regression. | `store/audit_store.py:168` | S | The function can crash on exactly the tamper input it exists to detect; propagates into backfill/binding verify. | -| 5 | **Authenticate or quarantine recorded facts** (Q-M2, Q-M4). Split writer authority from forge-reporter authority; require signed webhook/HMAC envelope over check/PR facts, or mark them `provenance: unauthenticated` so consumers can't mistake them for governance evidence. 
Sign the Filigree transport (Weft-component HMAC) to match Loomweave. | `api/app.py:448,466`; `filigree/client.py` | M | Closes the "trust the writer's word" surface; removes the signed/unsigned asymmetry across suite seams. | - -### Tier 2 — Soon after GA (robustness + correctness) - -| # | Item | Entry point | Effort | -|---|---|---|---| -| 6 | **Production-default the policy cell to fail closed** (Q-M7). Make the in-code default `structured` (or a dedicated `unknown` cell), so an absent `cells.toml` can't silently downgrade to self-clear `chill`. | `policy/cells.py:44`; `mcp.py:111` | S | -| 7 | **Atomic Wardline batches** (Q-M5). Wrap `route_findings`' per-finding appends in one transaction, or record a scan-level batch envelope with per-finding status. | `wardline/governor.py:60-65` | M | -| 8 | **Robustness guards** (Q-L1, Q-L2). `isinstance(dict)` guard in `gaps.py`; per-record try/except in `decay_sweep` so one bad row doesn't abort the sweep. | `gaps.py:51,75`; `lifecycle.py:55-62` | S | -| 9 | **Strengthen the honesty gate** (Q-M8). Make the policy-co-occurrence check semantic — the boundary *result* must be the assertion subject, not a substring in a message. | `policy/evidence.py:135-152` | M | -| 10 | **Couple governance to the store protocol** (Q-L3). Type `binding_ledger`/`sei_backfill`/`gaps` against `AppendOnlyStore`, finishing the M12 migration so they're unit-testable against a fake. | `governance/*.py` | S | - -### Tier 3 — Maturity (process + maintainability) - -| # | Item | Entry point | Effort | -|---|---|---|---| -| 11 | **Raise the CI coverage floor** to ~88% global with per-package floors for `enforcement`/`service`/`governance`/`api`/`mcp`; **add ruff as a gating step**. | `.github/workflows/ci.yml:19`; `pyproject.toml` | S | -| 12 | **Make cross-repo conformance non-optional** for releases — a scheduled/pre-release live Loomweave job so endpoint/header drift can't pass default CI. 
| `ci.yml:22-28` | S | -| 13 | **Lift `filigree/client.py` coverage** (75% → parity) — the uncovered branches are the transport/error paths (ties to item 5). | `tests/filigree/` | S | -| 14 | **Tame `mcp.py`** — table-driven `call_tool` dispatch; bound the stdin JSON-RPC line size; lift the `DEFAULT_*_DB` constants into a shared config module (removes the `mcp -> api` edge). | `mcp.py` | M | -| 15 | **RFC-8785 canonicalization** (Q-L4) when cross-language verification is needed; reconcile the gate/scanner fingerprint extraction (Q-L5). | `canonical.py`; `decorator.py`/`boundary_scan.py` | M | -| 16 | **Reduce the LLM-judge attack surface** (Q-H3) — require non-LLM validation (or operator sign-off) for `ACCEPTED` in protected policies; treat the model as advisory, never sole gate authority. | `enforcement/judge.py`, `engine.py` | M | - ---- - -## 3. What NOT to do - -- **Don't rearchitect.** The DAG is clean, the layering is real, the choke points are correct. Resist the urge to "improve" the structure; the structure is the strength. Every Tier-1/2 item is a local change. -- **Don't add a config knob per finding.** Several findings exist because a dev-affordance (single secret, `chill` default, unsafe routing flag) leaks into production posture. Prefer *fail-closed defaults with an explicit opt-in flag* over new always-on configuration. -- **Don't trust the prior audits' severities verbatim.** Six of their findings are already fixed; this handover reflects the *current* tree. Re-verify before acting on any 2026-06-04 line not reconciled in `04 §6`. -- **Don't let `mcp.py` keep absorbing surface area** without the table-driven refactor (item 14) — it's the one file whose complexity is trending the wrong way. - ---- - -## 4. 
Suggested sequencing - -``` -Sprint A (GA-blocking): items 3, 4 (+ item 1 IF the checklist decision makes it GA-blocking) -Sprint B (GA-blocking): item 2 (the seam invariant — the structural fix; do after A so it's not entangled) -Sprint C (GA-blocking): item 5 (fact authentication + Filigree signing) -Sprint D (post-GA): items 6–10 (robustness + fail-closed defaults; item 1's document-and-gate path lands here if not GA-blocking) -Sprint E (maturity): items 11–16 (CI floors, mcp refactor, RFC-8785, judge hardening) -``` - -Items 3, 4 are small, independent security quick wins — a single focused sprint. Item 1's placement is -**decided by checklist item 2** (is single-secret split-promising?): GA-blocking in Sprint A if yes, a -document-and-gate task in Sprint D if no. Item 2 is the structural keystone and should land on its own so the -trail-read consolidation isn't tangled with security edits. Items 5 and 16 both touch suite-seam trust and -benefit from a Wardline/Loomweave/Filigree contract review alongside. - ---- - -## 5. Handover checklist for the receiving architect - -- [ ] Ratify (or reject) the **service-layer-is-mandatory** invariant (§1). Everything in Tier 1 assumes it. -- [ ] Confirm the **single-secret deployment** assumption — is single-secret a supported production mode? If yes, item 1 is GA-blocking; if it's dev-only, document that and gate it. -- [ ] Decide the **protected source-binding policy** for non-`.py` entities (item 3) — is a non-source protected policy a valid concept, or should those fail closed? -- [ ] Decide whether **check/PR facts** are governance-authoritative or operational-only (item 5) — this determines whether they need provenance or just a clear "unauthenticated" label. -- [ ] Schedule a **cross-repo contract review** with Loomweave/Wardline/Filigree owners (the wire contracts here are Legis-side only). 
-- [ ] Set the **CI coverage floor** and add lint (item 11) — cheap, immediate, prevents regression of the quality this analysis measured. - ---- - -*Inputs to this handover: `01`–`05` of this analysis set, the two 2026-06-04 read-only audits -(`temp/AUDIT-*.md`, recovered from HEAD), and live mypy/ruff/coverage runs. All findings carry `file:line` -evidence in `02` and `05`.* diff --git a/docs/arch-analysis-2026-06-06-0158/temp/AUDIT-comprehensive.md b/docs/arch-analysis-2026-06-06-0158/temp/AUDIT-comprehensive.md deleted file mode 100644 index ae992de..0000000 --- a/docs/arch-analysis-2026-06-06-0158/temp/AUDIT-comprehensive.md +++ /dev/null @@ -1,660 +0,0 @@ -# Comprehensive Read-Only Codebase Audit - -Date: 2026-06-04 -Repository: `/home/john/legis` -Mode: Strict read-only audit of codebase behavior. The only repository write performed by the coordinator is this requested markdown artifact. - -## Method - -Seven specialist subagents were dispatched with read-only instructions and explicit prompts to avoid write tools, MCP/connector tools, and escaped double quotes in tool arguments. The available subagent API did not expose literal `enable_write_tools=false` or `enable_mcp_tools=false` fields, so those constraints were enforced through the `explorer` role and prompt instructions. - -Specialist lanes completed: - -- Architecture Critic -- Systems Thinker -- Python Engineer -- Quality Engineer -- Security Architect -- Static Tools Analyst -- MCP and CLI Specialist - -No tests, formatters, or mutating commands were run. Local coordinator inspection used read-only shell commands. There is no shipped `scanner/ast_primitives.py`, `scanner/rules/`, PY-WL-101..111 rule implementation, trust-lattice implementation, SCC implementation, or Tarjan implementation in the current tree; those terms appear in roadmap/planning material and test fixtures, not shipped source. 
- -## Executive Summary - -The highest-risk theme is adapter drift: HTTP and MCP expose overlapping governance capabilities, but MCP omits several server-side constraints present in HTTP/CLI. In particular, MCP can route Wardline scan results using caller-selected cells, skip Wardline artifact HMAC verification, and read protected governance trails without the HMAC verification that HTTP/CLI perform. - -The second major theme is evidence loss or weak binding in governance records. Decay re-judging drops the original source/Loomweave context, signed sign-off approvals do not bind the full request evidence, protected source binding can remain `unverified` while still producing a signed protected record, and the CI governance gate can pass with no governance trail artifact at all. - -## Critical Findings - -### C1. CI governance gate passes when the governance trail is absent - -Locations: - -- [src/legis/cli.py](/home/john/legis/src/legis/cli.py:134) lines 134-140 -- [.github/workflows/ci.yml](/home/john/legis/.github/workflows/ci.yml:23) lines 23-33 -- [.gitignore](/home/john/legis/.gitignore:13) lines 13-14 -- [tests/test_cli.py](/home/john/legis/tests/test_cli.py:92) lines 92-97 - -Evidence: - -`governance-gate` returns success with `PASS_WITH_NOTICE` when the SQLite DB does not exist. The CI job runs `uv run legis governance-gate --db sqlite:///legis-governance.db`, while `*.db` is ignored and the workflow does not create or download that database. - -Impact: - -In a fresh CI checkout, the governance gate can be a successful no-op. The project receives a clean CI signal even when there is no override-rate trail to evaluate. - -Remediation: - -Add an explicit CI mode that fails closed on a missing or empty governance trail. For example, change `_check_override_rate()` to return non-zero when the DB is absent unless a local-development flag such as `--allow-missing-trail` or `LEGIS_ALLOW_MISSING_GOVERNANCE_DB=1` is set. 
Update CI to either produce/download the expected trail artifact or fail before evaluating override rate. Replace `test_governance_gate_missing_sqlite_db_is_pass_with_notice_without_creating_it` with separate dev-mode and CI-mode tests. - -### C2. MCP Wardline routing lets the governed agent choose the governance cell - -Locations: - -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:213) lines 213-226 -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:623) lines 623-665 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:693) lines 693-720 -- [src/legis/wardline/governor.py](/home/john/legis/src/legis/wardline/governor.py:120) lines 120-130 - -Evidence: - -The HTTP API rejects request-owned Wardline routing unless server routing is configured or `LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING=1`. MCP `scan_route` accepts `cell`, `severity_map`, and `fail_on` directly from tool arguments and routes with those values. - -Impact: - -An agent using MCP can downgrade a severe scan finding to a softer path such as `surface_only`, bypassing the server-owned escalation policy that HTTP enforces. - -Remediation: - -Move Wardline routing ownership into a shared service helper used by both HTTP and MCP. Load `LEGIS_WARDLINE_CELL` and `LEGIS_WARDLINE_CELL_BY_SEVERITY` in the MCP runtime. Reject caller-provided routing unless an explicit unsafe dev flag is set. Add MCP tests matching the HTTP server-owned routing tests, including rejection of request-owned `cell`, `severity_map`, and `fail_on` by default. - -### C3. 
MCP protected-trail reads skip HMAC verification - -Locations: - -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:452) lines 452-472 -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:734) lines 734-743 -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:123) lines 123-163 -- [src/legis/service/governance.py](/home/john/legis/src/legis/service/governance.py:78) lines 78-88 -- [src/legis/cli.py](/home/john/legis/src/legis/cli.py:149) lines 149-161 - -Evidence: - -MCP `_verified_records()` checks the unkeyed audit hash chain via `verify_integrity()` but never constructs or calls `TrailVerifier`. HTTP and CLI both have HMAC verification paths for protected records. - -Impact: - -An attacker with DB-file access can edit a protected record, recompute the unkeyed chain, and have MCP read/scoring tools such as `override_rate_get` consume the forged record. HTTP/CLI would fail closed. - -Remediation: - -Add `trail_verifier` to `McpRuntime` and build it from `LEGIS_HMAC_KEY` plus `LEGIS_PROTECTED_POLICIES`. Replace MCP-local `_verified_records()` with the shared `service.governance.verified_records()` path or equivalent HMAC verification. Add an MCP regression that mutates a signed protected record, recomputes the hash chain, and asserts `AUDIT_INTEGRITY_FAILURE`. - -## High Findings - -### H1. MCP skips configured Wardline artifact HMAC verification - -Locations: - -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:655) lines 655-664 -- [src/legis/service/wardline.py](/home/john/legis/src/legis/service/wardline.py:24) lines 24-36 -- [src/legis/wardline/ingest.py](/home/john/legis/src/legis/wardline/ingest.py:67) lines 67-107 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:751) lines 751-765 - -Evidence: - -The shared Wardline service can enforce `artifact_key`, and HTTP passes `LEGIS_WARDLINE_ARTIFACT_KEY`. 
MCP calls `route_wardline_scan()` without an artifact key, so signed-artifact enforcement is bypassed. - -Impact: - -A forged or provenance-stripped Wardline scan can enter governance through MCP even when HTTP deployment policy requires authenticated scanner artifacts. - -Remediation: - -Add `wardline_artifact_key` to `McpRuntime`, populated from `LEGIS_WARDLINE_ARTIFACT_KEY`, and pass it to `route_wardline_scan()`. Map `WardlinePayloadError` to an explicit tool error. Add MCP tests for unsigned rejection and signed acceptance with verified provenance. - -### H2. Decay re-judging drops source and Loomweave context - -Locations: - -- [src/legis/enforcement/lifecycle.py](/home/john/legis/src/legis/enforcement/lifecycle.py:36) lines 36-43 -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:229) lines 229-242 -- [src/legis/service/governance.py](/home/john/legis/src/legis/service/governance.py:151) lines 151-165 - -Evidence: - -`decay_sweep()` reconstructs an `OverrideRecord` with only policy, entity, rationale, agent, and time. It omits original extensions such as `file_fingerprint`, `ast_path`, `source_binding`, Loomweave content hash, and lineage snapshot. - -Impact: - -Renewal decisions can be made on less evidence than the original protected decision. A judge can re-accept or re-block based on incomplete context. - -Remediation: - -Preserve a sanitized copy of original evidence extensions for decay re-judging. Exclude prior verdict and signature fields, but include source binding, file fingerprint, AST path, Wardline provenance, Loomweave content hash, and lineage snapshot. Add a regression with a judge that asserts those fields are present during decay. - -### H3. 
LLM judge verdict is prompt-injectable through untrusted rationale - -Locations: - -- [src/legis/enforcement/judge.py](/home/john/legis/src/legis/enforcement/judge.py:52) lines 52-76 -- [src/legis/enforcement/engine.py](/home/john/legis/src/legis/enforcement/engine.py:80) lines 80-97 - -Evidence: - -The judge prompt embeds attacker-controlled rationale directly into text and parses the model's first-line verdict as the authoritative decision. Accepted model output becomes an accepted governance override. - -Impact: - -In coached/protected deployments with a real LLM judge, a malicious rationale can attempt prompt injection to force `ACCEPTED`. In protected mode, the resulting compromised judgment can then be HMAC-signed as tamper-evident evidence. - -Remediation: - -Treat the model as advisory for high-stakes protected decisions unless backed by deterministic checks or operator sign-off. Use structured output with strict schema validation. Encode user rationale as data, add prompt-injection regression cases, and require non-LLM validation for `ACCEPTED` in protected policies. - -### H4. Signed sign-off approvals do not bind the original request evidence - -Locations: - -- [src/legis/enforcement/signoff.py](/home/john/legis/src/legis/enforcement/signoff.py:28) lines 28-43 -- [src/legis/enforcement/signoff.py](/home/john/legis/src/legis/enforcement/signoff.py:93) lines 93-99 -- [src/legis/enforcement/signoff.py](/home/john/legis/src/legis/enforcement/signoff.py:110) lines 110-119 - -Evidence: - -Pending sign-off requests can carry Loomweave/Wardline extensions. The later `SIGNED_OFF` record includes only `signoff_state` and `request_seq`, yet `signoff_signing_fields()` is designed to include Loomweave evidence when present. - -Impact: - -The signed approval row proves an operator signed a sequence number, but the signature does not directly cover the evidence context from the original request. 
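
A minimal sketch of the gap, under stated assumptions: `canonical`, `signoff_signature`, the `request_hash` field, and the demo key are hypothetical stand-ins and do not reproduce `signoff_signing_fields()` or `canonical.py`. It only shows that a signature over `signoff_state` and `request_seq` is unchanged when the request evidence is altered, while one that also covers a hash of the request payload is not.

```python
import hashlib
import hmac
import json
from typing import Any, Dict

DEMO_KEY = b"demo-key"  # illustrative only, never a real key


def canonical(obj: Dict[str, Any]) -> bytes:
    # Stand-in canonical form: sorted keys, compact separators, no NaN/Infinity.
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), allow_nan=False)
    return text.encode("utf-8")


def signoff_signature(request_seq: int, request_payload: Dict[str, Any], bind_evidence: bool) -> str:
    fields: Dict[str, Any] = {"signoff_state": "SIGNED_OFF", "request_seq": request_seq}
    if bind_evidence:
        # Hypothetical extra field: hash of the original request payload.
        fields["request_hash"] = hashlib.sha256(canonical(request_payload)).hexdigest()
    return hmac.new(DEMO_KEY, canonical(fields), hashlib.sha256).hexdigest()


original = {"policy": "example-policy", "extensions": {"content_hash": "abc"}}
tampered = {"policy": "example-policy", "extensions": {"content_hash": "xyz"}}

# Signing only the sequence number: the tampered evidence goes unnoticed.
unbound_matches = signoff_signature(12, original, False) == signoff_signature(12, tampered, False)

# Binding the request hash: the same tampering changes the signature.
bound_matches = signoff_signature(12, original, True) == signoff_signature(12, tampered, True)
```

Here `unbound_matches` is true and `bound_matches` is false, which is the property a tamper test for this finding would assert.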
- -Remediation: - -Before signing the `SIGNED_OFF` record, bind it to the original request by adding either a canonical request payload hash or a copied immutable evidence block. Include that hash/evidence in `signoff_signing_fields()` and verify it on reads. Add tamper tests that modify the request evidence after sign-off. - -### H5. Binding ledger verification omits append-only hash-chain integrity - -Locations: - -- [src/legis/governance/binding_ledger.py](/home/john/legis/src/legis/governance/binding_ledger.py:59) lines 59-75 -- [src/legis/governance/binding_ledger.py](/home/john/legis/src/legis/governance/binding_ledger.py:76) lines 76-82 -- [src/legis/store/audit_store.py](/home/john/legis/src/legis/store/audit_store.py:161) lines 161-171 -- [src/legis/governance/signoff_binding.py](/home/john/legis/src/legis/governance/signoff_binding.py:54) lines 54-73 - -Evidence: - -`BindingLedger.verify()` validates per-record HMACs but does not call `AuditStore.verify_integrity()`. A deleted binding record can therefore degrade to “no binding” instead of an integrity failure. - -Impact: - -Legis can silently lose the local tamper-bound binding leg after Filigree has already accepted an association. - -Remediation: - -Make `BindingLedger.verify()` fail if the underlying audit store hash chain fails. Add deletion, reorder, and rechaining tamper tests. Consider a reconciliation command that compares Filigree associations with local binding ledger entries. - -### H6. Unmatched policies default to self-clear behavior - -Locations: - -- [policy/cells.toml](/home/john/legis/policy/cells.toml:5) lines 5-13 -- [src/legis/policy/cells.py](/home/john/legis/src/legis/policy/cells.py:33) lines 33-40 -- [src/legis/policy/cells.py](/home/john/legis/src/legis/policy/cells.py:43) lines 43-44 -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:504) lines 504-523 - -Evidence: - -Default policy routing is `chill`; unmatched policies fall through to the default cell. 
MCP `override_submit` treats chill as `ACCEPTED_SELF`. - -Impact: - -A typo, missing registry entry, or incomplete policy deployment silently downgrades governance to self-clear. - -Remediation: - -Introduce a production default that fails closed, such as an `unknown` or `structured` cell. Require explicit policy matches for `override_submit` in production mode. Keep `chill` only for local/dev registries, and add tests for unknown policy behavior. - -### H7. Unscoped API token mappings grant operator authority - -Locations: - -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:59) lines 59-85 -- [tests/api/test_complex_api.py](/home/john/legis/tests/api/test_complex_api.py:110) lines 110-122 - -Evidence: - -The token parser enforces scopes only when the actor spec contains a colon. An entry such as `op-a=token-a` returns `op-a` for any required scope, and a test confirms it can perform a protected operator override. - -Impact: - -A token intended for writer authority can accidentally become an all-scope token, crossing the operator boundary. - -Remediation: - -Reject unscoped `LEGIS_API_TOKEN_ACTORS` entries by default. Require `actor:writer=...`, `actor:operator=...`, or an explicit `actor:*=...` syntax gated behind `LEGIS_ALLOW_UNSCOPED_API_TOKENS=1` for compatibility. Add startup or first-request validation tests. - -## Medium Findings - -### M1. 
Protected source binding can be unverified while the record is still signed - -Locations: - -- [src/legis/service/source_binding.py](/home/john/legis/src/legis/service/source_binding.py:45) lines 45-66 -- [src/legis/service/governance.py](/home/john/legis/src/legis/service/governance.py:151) lines 151-164 -- [src/legis/service/governance.py](/home/john/legis/src/legis/service/governance.py:183) lines 183-197 -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:65) lines 65-77 - -Evidence: - -For non-Python locators, missing `source_root`, or missing source files, source binding returns `status: unverified`. Protected submission signs and records that status, but does not require verification. - -Impact: - -A protected record can be validly signed while not actually bound to current source bytes. The caveat is preserved, but downstream readers may equate “protected” with “source verified.” - -Remediation: - -For source-code policies, fail closed unless `source_binding.status == "verified"`. If non-source protected policies are valid, add server-side policy/entity classification so the caller’s entity-string shape cannot choose the verification standard. - -### M2. CI/check and recorded PR surfaces accept writer-supplied facts without provenance - -Locations: - -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:406) lines 406-430 -- [src/legis/checks/surface.py](/home/john/legis/src/legis/checks/surface.py:50) lines 50-65 -- [src/legis/pulls/surface.py](/home/john/legis/src/legis/pulls/surface.py:27) lines 27-39 - -Evidence: - -`POST /checks` and `POST /git/pulls` record caller-supplied operational facts. Check facts are inserted without signature/webhook proof, and PR metadata is delete-and-replace. - -Impact: - -A compromised writer token can record fake passing CI or rewrite PR metadata. If future gates depend on these surfaces, this becomes a policy bypass. 
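As an illustration of the M2 concern about delete-and-replace PR metadata, the sketch below keeps every reported fact as an append-only event and exposes a trust status to readers. `PullEvent`, `PullEventLog`, and the `trust` values are hypothetical names for this example only; how provenance would actually be verified (signed webhook, forge API check) is left out and represented by a boolean.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PullEvent:
    seq: int
    pr_number: int
    metadata: dict
    reporter: str
    trust: str  # "verified" when provenance was checked, else "unverified"


@dataclass
class PullEventLog:
    _events: list = field(default_factory=list)

    def record(
        self, pr_number: int, metadata: dict, reporter: str, verified: bool
    ) -> PullEvent:
        # Append a new event; earlier events are never edited or removed.
        event = PullEvent(
            seq=len(self._events) + 1,
            pr_number=pr_number,
            metadata=dict(metadata),
            reporter=reporter,
            trust="verified" if verified else "unverified",
        )
        self._events.append(event)
        return event

    def current(self, pr_number: int) -> Optional[PullEvent]:
        # Latest event wins, but its trust status travels with it.
        for event in reversed(self._events):
            if event.pr_number == pr_number:
                return event
        return None

    def history(self, pr_number: int) -> list:
        return [e for e in self._events if e.pr_number == pr_number]
```

A gate that later depends on this surface could then require `current(pr).trust == "verified"` instead of trusting whatever the most recent writer supplied.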
- -Remediation: - -Split writer authority from CI/forge reporter authority. Require signed webhook ingestion, forge API verification, or an HMAC envelope over check/PR facts. Store PR/check changes as append-only provenance events and expose trust status to readers. - -### M3. Mixed Wardline batches are not atomic across governance stores - -Locations: - -- [src/legis/wardline/governor.py](/home/john/legis/src/legis/wardline/governor.py:58) lines 58-64 -- [src/legis/wardline/governor.py](/home/john/legis/src/legis/wardline/governor.py:88) lines 88-130 - -Evidence: - -The code comment explicitly states successful mixed batches can span engine and signoff stores, and mid-loop runtime failure leaves prior writes permanently persisted. - -Impact: - -A scan can produce a partial governance picture where early findings are recorded and later findings disappear. - -Remediation: - -Record a scan-level batch envelope with per-finding statuses, or route through an outbox/reconciliation process. Add tests where the second finding fails after the first write would succeed, and assert either all-or-nothing behavior or explicit partial-failure records. - -### M4. Locator-keyed sign-offs can later fail rename-stable Filigree binding - -Locations: - -- [src/legis/identity/resolver.py](/home/john/legis/src/legis/identity/resolver.py:66) line 66 -- [src/legis/governance/signoff_binding.py](/home/john/legis/src/legis/governance/signoff_binding.py:38) lines 38-42 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:587) lines 587-601 -- [src/legis/governance/sei_backfill.py](/home/john/legis/src/legis/governance/sei_backfill.py:44) line 44 - -Evidence: - -Identity degradation can create locator-keyed records. Filigree binding rejects locator keys. The bind endpoint uses the original sign-off request’s entity key, while backfill appends separate events rather than rewriting originals. 
- -Impact: - -A temporary Loomweave outage can permanently block a later issue binding, even after backfill, unless bind-time lookup accounts for backfill records. - -Remediation: - -For policies that require Filigree binding, fail closed when stable identity is unavailable. Alternatively, teach bind-time lookup to resolve through backfill events and document the rebinding contract. - -### M5. `EntityKey.from_dict()` coerces malformed stability values to true - -Locations: - -- [src/legis/identity/entity_key.py](/home/john/legis/src/legis/identity/entity_key.py:33) lines 33-34 -- [src/legis/governance/gaps.py](/home/john/legis/src/legis/governance/gaps.py:48) lines 48-54 - -Evidence: - -`EntityKey.from_dict()` uses `bool(d["identity_stable"])`, so a string like `"false"` becomes `True`. - -Impact: - -Malformed decoded payloads can be treated as stable SEI-keyed identities, affecting lineage/gap logic and binding decisions. - -Remediation: - -Validate that `value` is a non-empty string and `identity_stable` is exactly a `bool`. Raise `ValueError` for anything else. Add malformed payload tests. - -### M6. Audit integrity verification can raise decode exceptions instead of a controlled integrity failure - -Locations: - -- [src/legis/store/audit_store.py](/home/john/legis/src/legis/store/audit_store.py:130) lines 130-144 -- [src/legis/store/audit_store.py](/home/john/legis/src/legis/store/audit_store.py:161) lines 161-171 -- [src/legis/cli.py](/home/john/legis/src/legis/cli.py:142) lines 142-145 - -Evidence: - -`verify_integrity()` iterates `read_all()`, which JSON-decodes every payload before integrity checks. Malformed JSON can raise before returning `False`. - -Impact: - -Tampering can bypass the documented boolean integrity-failure path and produce inconsistent API/MCP/CLI errors. - -Remediation: - -Make `verify_integrity()` read raw rows or catch `json.JSONDecodeError` and return `False` or raise a domain `AuditIntegrityError`. 
Align HTTP, CLI, and MCP error mapping around that domain error. - -### M7. Policy-boundary AST honesty gate can accept weak or shadowed evidence - -Locations: - -- [src/legis/policy/decorator.py](/home/john/legis/src/legis/policy/decorator.py:198) lines 198-212 -- [src/legis/policy/decorator.py](/home/john/legis/src/legis/policy/decorator.py:214) lines 214-228 -- [tests/policy/test_honesty_gate.py](/home/john/legis/tests/policy/test_honesty_gate.py:10) line 10 - -Evidence: - -The gate walks the entire test AST and accepts any call whose name/attribute matches the boundary function, plus any string/name reference to a suppressed policy. It does not resolve bindings or prove the assertion is tied to the call result. - -Impact: - -A test can satisfy the gate without proving boundary behavior, especially with local shadows, helper calls, or tautological string references. - -Remediation: - -Make traversal scope-aware. Ignore nested functions/classes unless explicitly targeted. Resolve the call to the decorated function binding, and require an assertion or exception path connected to the call result and suppressed policy. - -### M8. Test suite autouse fixture enables unsafe auth and unsafe Wardline routing globally - -Locations: - -- [tests/conftest.py](/home/john/legis/tests/conftest.py:18) lines 18-22 -- [tests/api/test_auth.py](/home/john/legis/tests/api/test_auth.py:45) lines 45-90 - -Evidence: - -Every test starts with `LEGIS_UNSAFE_DEV_AUTH=1` and `LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING=1`. Auth coverage relies on a manually maintained route matrix. - -Impact: - -New mutating routes can be tested under unsafe mode and omitted from the auth matrix. - -Remediation: - -Replace the autouse unsafe fixture with explicit unsafe-client fixtures. Add route-introspection tests asserting every mutating route denies unauthenticated writes by default. - -### M9. 
MCP tool schemas claim `additionalProperties: false` but unknown arguments are accepted - -Locations: - -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:155) lines 155-161 -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:300) lines 300-309 -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:488) lines 488-514 -- [tests/mcp/test_server.py](/home/john/legis/tests/mcp/test_server.py:164) lines 164-170 - -Evidence: - -Schemas reject additional properties, but dispatch does not enforce allowed key sets. Tests show `agent_id` can be supplied and ignored. - -Impact: - -The structural “launch-bound identity only” invariant is weaker than the schema suggests. Future sensitive fields could be silently accepted. - -Remediation: - -Validate arguments against each tool schema before dispatch and reject unexpected keys with `INVALID_ARGUMENT`. Add tests for unknown fields on every mutating MCP tool. - -### M10. MCP sign-off poll handle has a type mismatch - -Locations: - -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:199) lines 199-202 -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:328) lines 328-336 -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:552) lines 552-553 -- [tests/mcp/test_server.py](/home/john/legis/tests/mcp/test_server.py:319) lines 319-321 - -Evidence: - -`override_submit` returns `poll_handle` as an integer, but `signoff_status_get` declares and requires `seq` as a string. - -Impact: - -An agent mechanically passing the advertised handle into the advertised poll tool gets an invalid argument error. - -Remediation: - -Make `seq` an integer schema and accept integers, or return `poll_handle` as a string. Add a round-trip MCP test using the returned handle without manual conversion. - -### M11. 
MCP `override_submit` has no idempotency protection - -Locations: - -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:181) lines 181-186 -- [src/legis/service/governance.py](/home/john/legis/src/legis/service/governance.py:106) lines 106-133 -- [src/legis/service/governance.py](/home/john/legis/src/legis/service/governance.py:200) lines 200-219 - -Evidence: - -`override_submit` and sign-off request flows are side-effecting, but the tool schema has no idempotency key and the description does not warn about retry duplication. - -Impact: - -Host or agent retries after timeout can duplicate audit records or human sign-off requests. - -Remediation: - -Add an idempotency key to side-effecting MCP tools, stored in extensions and de-duplicated before append. If v1 intentionally excludes idempotency, update tool descriptions to warn that retries create new records and return correlation data for clients. - -### M12. Core enforcement modules depend directly on the SQLAlchemy-backed audit store - -Locations: - -- [src/legis/enforcement/engine.py](/home/john/legis/src/legis/enforcement/engine.py:25) line 25 -- [src/legis/enforcement/signoff.py](/home/john/legis/src/legis/enforcement/signoff.py:19) line 19 -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:23) line 23 -- [src/legis/store/audit_store.py](/home/john/legis/src/legis/store/audit_store.py:22) lines 22-33 - -Evidence: - -Domain gates import concrete `AuditStore`, which imports SQLAlchemy and creates schema in its constructor. - -Impact: - -The enforcement layer is coupled to persistence and database lifecycle. This weakens package boundaries and undermines “dependency-free core” expectations. - -Remediation: - -Define a minimal audit-log protocol for append/read/verify behavior. Depend on that protocol in enforcement modules, and keep SQLAlchemy inside `store.audit_store`. - -### M13. 
Protected signing canonicalization is not hardened for cross-version/cross-language guarantees - -Locations: - -- [src/legis/canonical.py](/home/john/legis/src/legis/canonical.py:3) lines 3-5 -- [src/legis/canonical.py](/home/john/legis/src/legis/canonical.py:15) lines 15-18 -- [src/legis/enforcement/signing.py](/home/john/legis/src/legis/enforcement/signing.py:30) lines 30-34 - -Evidence: - -The code comments state RFC 8785 is future hardening before protected cryptographic guarantees ship. Current signing uses `json.dumps()` without `allow_nan=False`. - -Impact: - -Signatures are deterministic in this Python process, but less robust for cross-language verification and can encode non-standard `NaN`/`Infinity` values if they appear in `Any` extensions. - -Remediation: - -Introduce a versioned canonicalizer based on RFC 8785 or a documented strict subset. Reject non-standard JSON values with `allow_nan=False`. Keep compatibility verification for existing signatures. - -### M14. Critical-path coverage and live Loomweave conformance are not enforced in default CI - -Locations: - -- [pyproject.toml](/home/john/legis/pyproject.toml:19) lines 19-23 -- [.github/workflows/ci.yml](/home/john/legis/.github/workflows/ci.yml:18) lines 18-21 -- [tests/conformance/test_live_loomweave_oracle.py](/home/john/legis/tests/conformance/test_live_loomweave_oracle.py:16) line 16 - -Evidence: - -CI runs pytest and mypy, but there is no coverage dependency, branch coverage threshold, or required live Loomweave job. The live Loomweave oracle is opt-in. - -Impact: - -Security/governance behavior can regress without a measurable coverage signal, and Loomweave endpoint/header drift can pass default CI. - -Remediation: - -Add coverage tooling with branch thresholds for `api`, `mcp`, `service`, `enforcement`, `governance`, and `wardline`. Add a scheduled or pre-release live Loomweave job with `LOOMWEAVE_URL`, locator fixture, and HMAC credentials. - -## Low Findings - -### L1. 
MCP protocol lifecycle handling is permissive and version-pinned - -Locations: - -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:750) lines 750-776 - -Evidence: - -`handle_request()` does not validate `jsonrpc`, initialize params, or initialized lifecycle state, and hardcodes protocol version `2024-11-05`. - -Impact: - -Newer MCP clients may negotiate unexpectedly or proceed through malformed protocol use. - -Remediation: - -Validate initialize params, track initialized state before normal operations, and negotiate/echo a supported requested protocol version where possible. - -### L2. MCP tool errors lack recovery metadata - -Locations: - -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:274) lines 274-298 - -Evidence: - -Tool errors return only `error_code` and `message`. - -Impact: - -Agents cannot reliably distinguish retryable, user-fixable, and stop-and-escalate failures without hardcoded knowledge. - -Remediation: - -Add stable fields such as `category`, `retryable`, and `recovery`, preserving existing `error_code`. - -### L3. MCP runtime construction can create local state for read-oriented use - -Locations: - -- [src/legis/mcp.py](/home/john/legis/src/legis/mcp.py:101) lines 101-152 -- [src/legis/store/audit_store.py](/home/john/legis/src/legis/store/audit_store.py:85) lines 85-86 - -Evidence: - -`build_runtime()` constructs stores during startup, and `AuditStore.__init__()` creates tables/triggers. - -Impact: - -Starting an MCP process for read tools can create DB files, making “trail exists” ambiguous for other logic. - -Remediation: - -Split open-existing read handles from write-capable initializing handles. Make DB creation explicit in server/write mode. - -### L4. 
Indented test sources can fail policy-boundary AST parsing inconsistently - -Locations: - -- [src/legis/policy/decorator.py](/home/john/legis/src/legis/policy/decorator.py:187) lines 187-196 - -Evidence: - -`fingerprint()` dedents source before parsing elsewhere, but `check_policy_boundary()` reparses `inspect.getsource(test_fn)` without dedenting. - -Impact: - -Nested/local test functions can fingerprint successfully but fail the later AST heuristic. - -Remediation: - -Apply the same `textwrap.dedent()` and newline normalization in the second parse path. - -### L5. Runtime bytecode artifacts exist in the working tree - -Locations: - -- `/home/john/legis/src/legis/**/__pycache__/` -- `/home/john/legis/tests/**/__pycache__/` - -Evidence: - -`find` shows `__pycache__` and `.pyc` files under `src/` and `tests/`. `git ls-files '*__pycache__*' '*.pyc'` returned no tracked files, so this is working-tree hygiene rather than tracked-source corruption. - -Impact: - -Generated artifacts can pollute review context and packaging if ignore rules regress. - -Remediation: - -Clean local bytecode artifacts before releases and keep `.gitignore` enforcement in place. Consider a CI cleanliness check if release packaging consumes the working tree. - -## Cross-Cutting Notes - -### Static Analysis Scope - -The requested scanner-specific items are not present in shipped source: - -- No `scanner/ast_primitives.py` -- No `scanner/rules/` -- No local PY-WL-101..111 rule implementations -- No local taint propagation lattice -- No SCC/Tarjan implementation - -Closest shipped components are: - -- `wardline/ingest.py`: validates external Wardline scan payloads and trust-tier names. -- `wardline/governor.py`: routes external findings into governance cells. -- `policy/decorator.py`: performs AST-based policy-boundary evidence checks. -- `service/source_binding.py`: verifies current source fingerprints for recognized relative Python locators. 
- -### Main Trust Boundaries - -- HTTP clients to FastAPI: mutating routes use bearer auth, with writer/operator split weakened by unscoped token entries. -- MCP host/agent to stdio JSON-RPC: identity is launch-bound, but MCP skips several HTTP/CLI enforcement checks. -- Wardline scan payloads to governance: HTTP can enforce signed artifacts; MCP currently cannot. -- LLM judge to enforcement: model output is parsed as gate authority when a judge is wired. -- Loomweave to identity resolver: HTTPS is required except loopback or explicit insecure override; HMAC headers are used when key material exists. -- Filigree binding: binding tuples can be HMAC-signed, but local ledger and remote attach are not transactional. -- SQLite governance store: hash-chain plus HMAC for protected records, but MCP and binding ledger do not consistently apply all verification layers. - -## Prioritized Remediation Plan - -1. Fail closed in CI when the governance DB is missing, or explicitly provision the trail artifact before `governance-gate`. -2. Make MCP use the same Wardline routing ownership and artifact HMAC verification as HTTP. -3. Add protected-trail HMAC verification to MCP reads and regression-test rechained tampering. -4. Bind sign-off approval signatures to the original request evidence and fix decay re-judging to preserve source/Loomweave context. -5. Require explicit API token scopes and fail closed for unknown production policies. -6. Decide whether protected source-code policies require `source_binding.status == "verified"` and enforce that decision server-side. -7. Harden binding ledger integrity, audit-store malformed JSON handling, and `EntityKey.from_dict()` validation. -8. Add idempotency and stricter argument validation for side-effecting MCP tools. -9. Replace global unsafe test fixtures with explicit fixtures and add route-introspection auth tests. -10. Add coverage thresholds and a scheduled/pre-release live Loomweave conformance job. 
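Item 7 of the plan includes `EntityKey.from_dict()` validation (finding M5). The sketch below shows why `bool(...)` coercion is unsafe and what a strict decoder might look like. The `EntityKey` class here is a stand-in with only the two fields named in the finding; the real class in `identity/entity_key.py` may carry more fields and different error handling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityKey:
    # Stand-in for the real class: only the fields named in finding M5.
    value: str
    identity_stable: bool

    @classmethod
    def from_dict(cls, d: dict) -> "EntityKey":
        value = d.get("value")
        stable = d.get("identity_stable")
        if not isinstance(value, str) or not value:
            raise ValueError("EntityKey.value must be a non-empty string")
        # isinstance(..., bool) rejects "false", 0, 1, and None alike.
        if not isinstance(stable, bool):
            raise ValueError("EntityKey.identity_stable must be a bool")
        return cls(value=value, identity_stable=stable)


# Any non-empty string is truthy, so bool() coercion turns "false" into True.
coerced = bool("false")
```

A malformed payload test would assert that `{"value": "x", "identity_stable": "false"}` raises `ValueError` rather than decoding as a stable identity.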
- -## Verification Limits - -This was a read-only audit. No tests were run, no formatters were run, and no application servers were started. Findings are based on source inspection by seven specialized agents plus coordinator validation of cited code locations. diff --git a/docs/arch-analysis-2026-06-06-0158/temp/AUDIT-readonly.md b/docs/arch-analysis-2026-06-06-0158/temp/AUDIT-readonly.md deleted file mode 100644 index f7e6be8..0000000 --- a/docs/arch-analysis-2026-06-06-0158/temp/AUDIT-readonly.md +++ /dev/null @@ -1,612 +0,0 @@ -# Legis Read-Only Codebase Audit - -Date: 2026-06-04 - -Repository: `/home/john/legis` - -Mode: strictly read-only audit of source/test/config surfaces. The only write performed was creation of this requested markdown artifact. - -## Method - -Seven specialized read-only subagents reviewed the codebase: - -- Architecture Critic -- Systems Thinker -- Python Engineer -- Quality Engineer -- Security Architect -- Static Tools Analyst -- MCP and CLI Specialist - -All subagents were instructed to operate with `enable_write_tools=false` and `enable_mcp_tools=false`, avoid write-generating commands, and avoid MCP tools. No test suite, mypy, formatter, server, or package build was run because those can create caches, sqlite files, or other artifacts. - -## Scope Notes - -- `src/legis/scanner/ast_primitives.py`, `src/legis/scanner/rules/`, PY-WL-101..111 rule implementations, SCC/Tarjan logic, and a trust-lattice engine are not present in this repository. The closest live surfaces are Wardline finding ingestion/routing, policy decorator/grammar, check facts, and governance records. -- A YAML policy surface is not present. The closest implementation is TOML exemption loading via `tomllib`. -- An MCP server implementation is not present. The repository has a transport-agnostic service layer and design notes for a future MCP adapter, but no stdio JSON-RPC server or MCP tool registry. 
- -## Executive Summary - -The highest risks are concentrated at trust boundaries where Legis records governance facts from request-body data supplied by the actor being governed. The main pattern is: caller-provided static-analysis payloads, caller-provided routing choices, caller-provided source bindings, and caller-provided identities become audit evidence without enough independent validation. - -The protected-cell cryptographic story also has material gaps. Some fields that readers will treat as audit evidence are not HMAC-bound, protected verification skips one malformed record class, and protected sign-off binding can use unsigned Loomweave metadata. - -No source code was changed during this audit. - -## Findings By Severity - -### Critical - -#### C1. Wardline governance can be bypassed or distorted by caller-shaped scan and routing input - -Locations: - -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:110) lines 110-114 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:517) lines 517-560 -- [src/legis/wardline/ingest.py](/home/john/legis/src/legis/wardline/ingest.py:45) lines 45-65 -- [src/legis/wardline/governor.py](/home/john/legis/src/legis/wardline/governor.py:76) lines 76-88 - -Evidence: - -- `ScanResultsIn.scan` is an untyped `dict`. -- `/wardline/scan-results` accepts `cell` or `cell_by_severity` from the same request that supplies the scan. -- `active_defects()` trusts `kind` and `suppressed` fields and drops anything not `kind == "defect"` and `suppressed == "active"`. -- `WardlineFinding.from_wire()` indexes required fields directly, so malformed payloads can become uncaught `KeyError` or `TypeError` instead of controlled validation failures. -- Per-severity routing defaults unmapped severities to `SURFACE_OVERRIDE`. - -Impact: - -A caller can omit a finding, mark it suppressed, change its kind/severity, choose softer routing, or submit malformed data that crashes the endpoint. 
For a governance system, that means critical findings can disappear or become soft audit events without an independently verifiable Wardline artifact. - -Remediation: - -1. Replace `ScanResultsIn.scan: dict` with typed Pydantic models for scan, finding, severity, suppression, rule id, fingerprint, qualname, and properties. -2. Make routing policy server-owned. Request bodies should not decide whether a finding is `surface_only`, `surface_override`, or `block_escalate` unless the caller has an explicitly authenticated policy-management scope. -3. Require a signed, hash-pinned, or otherwise authenticated Wardline artifact with scanner identity, commit/tree identity, rule-set version, finding count, active count, and suppression proof. -4. Record a raw scan digest and filtered-count provenance in every routed batch. -5. Treat unknown or unsupported suppression states as fail-closed, either rejected or recorded as a provenance gap. -6. Require total `cell_by_severity` mappings or an explicit configured default. Do not silently map omitted severities to `SURFACE_OVERRIDE`. - -Acceptance tests: - -- Posting a CRITICAL finding with `suppressed: "waived"` and no suppression proof must not disappear; it should reject or create a provenance-gap/block-escalate record. -- Posting `findings: "bad"`, a missing `rule_id`, and severity `BOGUS` should return 422 and write no governance record. -- A request attempting to route CRITICAL to `surface_only` contrary to server policy should be rejected or overridden by server policy. -- A partial severity map containing only `CRITICAL: block_escalate` plus an ERROR finding must not silently route the ERROR to `surface_override`. - -#### C2. 
Protected verdicts sign caller-supplied source bindings that the judge never evaluated - -Locations: - -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:75) lines 75-81 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:343) lines 343-355 -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:48) lines 48-59 -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:183) lines 183-201 - -Evidence: - -- `ProtectedIn` accepts `file_fingerprint` and `ast_path` from the request body. -- `ProtectedGate.submit()` builds the `OverrideRecord` sent to the judge without `file_fingerprint`, `ast_path`, or Loomweave extension context. -- `_record_signed()` later signs those fields into the stored payload. - -Impact: - -The HMAC proves Legis wrote a record containing those source-binding fields, but it does not prove the judge evaluated that source node or source bytes. A caller can bind a judge verdict to a different AST path or fingerprint and create misleading cryptographic audit evidence. - -Remediation: - -1. Stop treating `file_fingerprint` and `ast_path` as caller-authoritative fields. -2. Compute them inside Legis or accept them only as part of a trusted Loomweave/Wardline artifact whose digest is verified. -3. Include source-binding context and Loomweave lineage/content context in the `OverrideRecord` before `judge.evaluate()`. -4. Sign exactly the judged record and reject any mismatch between judged fields and persisted fields. -5. Add a typed `JudgedProtectedRecord` or equivalent value object so judge input and signed payload cannot drift. - -Acceptance tests: - -- A spy judge should receive the exact `file_fingerprint`, `ast_path`, and Loomweave context later signed. -- A request with a fingerprint not matching trusted current content should fail before HMAC signing. 
-- Mutating source-binding fields between judge evaluation and persistence should be impossible by construction or detected by a unit test. - -#### C3. Protected tamper evidence omits audit fields and skips malformed protected records - -Locations: - -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:39) lines 39-59 -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:76) lines 76-117 -- [src/legis/enforcement/signoff.py](/home/john/legis/src/legis/enforcement/signoff.py:60) lines 60-70 -- [src/legis/records/override_record.py](/home/john/legis/src/legis/records/override_record.py:30) lines 30-38 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:181) lines 181-187 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:413) lines 413-420 - -Evidence: - -- Protected override payloads store `agent_id` and `extensions.judge_rationale`, but `signing_fields()` does not sign either field. -- Protected sign-off signatures omit `extensions.loomweave`; later `/signoff/{seq}/bind-issue` reads `extensions.loomweave.content_hash` from the stored sign-off request. -- `TrailVerifier.verify()` skips protected-policy records lacking `entity_key` before requiring a signature. -- `LEGIS_PROTECTED_POLICIES` defaults to an empty set; `/protected/overrides` can write a signed record whose policy the verifier later skips because the policy is not configured as protected. - -Impact: - -An attacker with DB-file access can edit attribution, judge rationale, or sign-off Loomweave content hash, recompute the unkeyed hash chain, and still pass HMAC verification for some cases. A malformed protected record missing `entity_key` can be skipped entirely. This undermines the protected cell's non-repudiation and tamper-evidence guarantees. - -Remediation: - -1. 
Introduce `hmac-sha256:v2` signing fields for protected overrides that include `agent_id`, `judge_rationale`, source binding, Loomweave content/lineage fields, policy, entity, verdict, model, rationale, and recorded timestamp. -2. Introduce matching v2 signing fields for protected sign-offs, including Loomweave content hash and lineage snapshot where present. -3. For protected policies, missing required structural fields should raise `TamperError`; never `continue`. -4. Reject `/protected/overrides` for policies outside the configured protected set, or sign and verify an explicit protected-tier marker independent of policy-name configuration. -5. Before binding a sign-off to Filigree, verify the protected trail and verify the sign-off request payload whose content hash is being used. -6. Add migration/version handling so existing v1 records are either grandfathered explicitly or re-signed. - -Acceptance tests: - -- Tamper `agent_id` and `extensions.judge_rationale`, re-chain sqlite, and assert `TrailVerifier.verify()` plus `GET /overrides` fail closed. -- Remove `entity_key` and signature from a protected record, re-chain, and assert verified reads return an integrity error. -- Mutate a signed sign-off request's `extensions.loomweave.content_hash`, re-chain, and assert binding fails without calling Filigree. -- With HMAC enabled and an empty protected-policy set, protected writes should fail startup or fail the request; tampering a protected-endpoint record must still fail verified reads. - -### High - -#### H1. 
Mutating HTTP endpoints are broadly unauthenticated, and actor identity is request-controlled - -Locations: - -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:52) lines 52-61 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:68) lines 68-114 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:301) lines 301-316 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:367) lines 367-379 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:446) lines 446-454 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:492) lines 492-507 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:517) lines 517-560 -- [src/legis/enforcement/signoff.py](/home/john/legis/src/legis/enforcement/signoff.py:93) lines 93-110 - -Evidence: - -- `verify_operator()` only checks a shared bearer token when `LEGIS_API_SECRET` is set, otherwise it returns `"operator"` without requiring credentials. -- Only operator override and sign-off signing use `verify_operator()`. -- `POST /checks`, `/overrides`, `/protected/overrides`, `/signoff/request`, `/signoff/{seq}/bind-issue`, `/policy/evaluate`, and `/wardline/scan-results` can write state or trigger integrations without required credentials. -- `agent_id` and `operator_id` come from request bodies and are recorded into audit records. -- The authenticated operator value returned by `verify_operator()` is ignored; the body `operator_id` is recorded instead. - -Impact: - -Any reachable client can pollute CI facts, append governance records, create sign-off requests, submit protected override attempts, bind cleared sign-offs to issues, or spoof agent/operator attribution. This weakens audit non-repudiation and lets the governed actor choose who appears responsible. - -Remediation: - -1. Require authentication for every mutating route by default. -2. 
Add separate scopes/roles for agent writes, operator authority, CI/check ingestion, policy evaluation, Wardline ingestion, and Filigree binding. -3. Derive `agent_id` and `operator_id` from authenticated claims, mTLS identity, launch context, or token mapping. Do not accept them as authoritative JSON fields. -4. Use `hmac.compare_digest()` for shared-token comparison while a shared-token mode exists. -5. If unauthenticated local development is required, gate it behind an explicit unsafe-dev flag and make it noisy at startup. - -Acceptance tests: - -- With `LEGIS_API_SECRET` set, unauthenticated POSTs to every mutating route return 401/403 and write nothing. -- A token/claim for `op-a` with body `operator_id: op-b` records `op-a` or rejects with 403. -- MCP or future adapter schemas must not expose `agent_id` or `operator_id` as ordinary tool arguments. - -#### H2. Loomweave lineage failures silently degrade to clean-looking audit state - -Locations: - -- [src/legis/identity/resolver.py](/home/john/legis/src/legis/identity/resolver.py:38) lines 38-52 -- [src/legis/identity/resolver.py](/home/john/legis/src/legis/identity/resolver.py:55) lines 55-72 -- [src/legis/service/governance.py](/home/john/legis/src/legis/service/governance.py:21) lines 21-42 -- [src/legis/governance/gaps.py](/home/john/legis/src/legis/governance/gaps.py:56) lines 56-82 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:474) lines 474-488 - -Evidence: - -- Capability errors are cached as `_capable = False`. -- `IdentityResolver._snapshot()` returns `None` on lineage failure. -- `find_lineage_divergence()` skips records with no snapshot and catches lineage probe exceptions with `continue`. -- API lineage surfaces return empty lists when no client is configured and do not distinguish clean from unverified. - -Impact: - -A Loomweave outage, malformed response, or lineage failure can produce locator-keyed records or SEI-keyed records without snapshots. 
Later integrity checks can report no divergences even though lineage custody was unavailable. - -Remediation: - -1. Add explicit `identity_resolution_status` and `lineage_snapshot_status` fields to recorded governance extensions. -2. For protected/complex writes, decide whether lineage custody is mandatory. If mandatory, fail closed when unavailable. -3. Change lineage APIs to return statuses such as `verified`, `unavailable`, `unverified`, and `divergent`; do not conflate unavailable with clean. -4. Avoid permanently caching transient capability failures as incapable without TTL or retry semantics. -5. Validate lineage snapshot shape before use. - -Acceptance tests: - -- A fake Loomweave client resolving an alive SEI but raising on `lineage()` should cause protected writes to fail or record an explicit provenance gap. -- `/governance/lineage-integrity` should report an unavailable/unverified condition when lineage cannot be fetched, not `{"divergences": []}`. - -#### H3. Wardline block-escalate sign-offs drop Loomweave and Wardline metadata - -Locations: - -- [src/legis/wardline/governor.py](/home/john/legis/src/legis/wardline/governor.py:18) lines 18-21 -- [src/legis/wardline/governor.py](/home/john/legis/src/legis/wardline/governor.py:83) lines 83-95 -- [src/legis/wardline/governor.py](/home/john/legis/src/legis/wardline/governor.py:96) lines 96-116 -- [src/legis/enforcement/signoff.py](/home/john/legis/src/legis/enforcement/signoff.py:75) lines 75-90 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:413) lines 413-420 - -Evidence: - -- `route_findings()` resolves `loomweave_ext` and builds `wardline_ext`. -- `SURFACE_OVERRIDE` and `SURFACE_ONLY` merge and persist those extensions. -- `BLOCK_ESCALATE` calls `signoff.request()` without extensions even though `SignoffGate.request()` accepts them. -- The module docstring still says carrying Wardline tiers is deferred because `SignoffGate.request` has no extensions field, which is now stale. 
- -Impact: - -The highest-friction human sign-off path loses fingerprint, severity, trust tiers, Loomweave content hash, and lineage snapshot. Later Filigree binding may fall back to an empty content hash, and lineage-integrity checks cannot inspect what was signed off. - -Remediation: - -1. Pass `extensions={**loomweave_ext, "wardline": wardline_ext}` into the `BLOCK_ESCALATE` branch. -2. Update the stale docstring. -3. Require or explicitly record missing content hash when binding sign-offs to Filigree. -4. Add regression coverage at both unit and API levels. - -Acceptance tests: - -- Route a critical finding through `block_escalate` with an SEI/Loomweave resolver and assert the pending sign-off contains `extensions.loomweave` and `extensions.wardline`. -- Binding that sign-off should use the Loomweave content hash from the signed record. - -#### H4. `check-override-rate` can create an empty database and pass - -Locations: - -- [src/legis/cli.py](/home/john/legis/src/legis/cli.py:43) lines 43-51 -- [src/legis/cli.py](/home/john/legis/src/legis/cli.py:84) lines 84-119 -- [src/legis/store/audit_store.py](/home/john/legis/src/legis/store/audit_store.py:53) lines 53-86 -- [src/legis/store/audit_store.py](/home/john/legis/src/legis/store/audit_store.py:88) lines 88-104 -- [src/legis/enforcement/lifecycle.py](/home/john/legis/src/legis/enforcement/lifecycle.py:73) lines 73-102 - -Evidence: - -- `AuditStore.__init__()` always creates tables and installs triggers. -- `check-override-rate` constructs `AuditStore(args.db)` before verifying and reading. -- Empty records verify cleanly and evaluate as `PASS_WITH_NOTICE`; CLI returns nonzero only for `FAIL`. - -Impact: - -A CI gate pointed at a missing or wrong SQLite path can silently create an empty governance trail and return success. This can make a misconfigured governance gate look clean. - -Remediation: - -1. 
Add an `AuditStore.open_existing_readonly(url)` mode that refuses missing DB files, missing schema, and write-capable side effects. -2. Use the read-only/open-existing mode in `check-override-rate` and read-only verification paths. -3. Make missing governance DBs a configuration error for CI. -4. Consider a distinct exit code for `PASS_WITH_NOTICE` in CI mode. - -Acceptance tests: - -- Running `check-override-rate` against a nonexistent sqlite URL should exit nonzero and not create a file. -- Running against an existing valid DB should preserve current evaluation behavior. - -#### H5. Policy honesty gate accepts string mentions as behavioral evidence - -Locations: - -- [src/legis/policy/decorator.py](/home/john/legis/src/legis/policy/decorator.py:188) lines 188-229 -- [tests/policy/test_honesty_gate.py](/home/john/legis/tests/policy/test_honesty_gate.py:9) lines 9-12 -- [tests/policy/test_honesty_gate.py](/home/john/legis/tests/policy/test_honesty_gate.py:33) lines 33-36 - -Evidence: - -- `check_policy_boundary()` treats `ast.Name` and string constants containing the function or policy name as evidence that a test exercises a boundary. -- The existing positive test only assigns a string containing `handler` and `no-eval` and asserts the string contains `no-eval`. - -Impact: - -A policy boundary can pass the honesty gate with a pinned test that never calls the decorated function and never asserts behavior at the boundary. That weakens the anti-vibe guarantee the decorator is intended to provide. - -Remediation: - -1. Remove string-constant fallback as positive proof for function calls. -2. Require an actual `ast.Call` to the decorated function or a configured helper known to exercise it. -3. Require at least one meaningful assertion path tied to the suppressed policy. -4. Keep fingerprint pinning, but treat it as freshness proof, not behavioral proof. - -Acceptance tests: - -- The current string-only fake test should fail. 
-- A real test that calls the decorated function and asserts the relevant policy behavior should pass. - -#### H6. MCP server is absent and the service layer is too narrow for MCP/HTTP parity - -Locations: - -- [pyproject.toml](/home/john/legis/pyproject.toml:15) lines 15-16 -- [src/legis/cli.py](/home/john/legis/src/legis/cli.py:11) lines 11-52 -- [CHANGELOG.md](/home/john/legis/CHANGELOG.md:40) lines 40-45 -- [src/legis/service/__init__.py](/home/john/legis/src/legis/service/__init__.py:1) lines 1-26 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:343) lines 343-455 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:492) lines 492-561 - -Evidence: - -- The only console script is `legis = "legis.cli:main"`. -- CLI subcommands are `serve` and `check-override-rate`; no `legis mcp` exists. -- No `src/legis/mcp.py` or `src/legis/mcp/` implementation exists. -- Changelog states MCP WP-M2..M6 are not yet built. -- The service layer exports only resolution, verified records, override rate, and simple submit override. Protected overrides, sign-off, binding, policy evaluation, Wardline routing, git/check reads, and many error mappings remain inline in FastAPI closures. - -Impact: - -There is no MCP-over-stdio protocol to audit: no `initialize`, `tools/list`, `tools/call`, tool schemas, launch-bound agent identity, or structured MCP error mapping. If implemented now by reusing HTTP route code directly, MCP would likely duplicate logic or expose a partial behavior surface. - -Remediation: - -1. Extract service functions for protected override, operator override, sign-off request/sign, binding, policy evaluation, Wardline scan routing, git reads, and check reads. -2. Make HTTP and MCP thin transport adapters over the same services. -3. Add `legis mcp` and a stdlib JSON-RPC server with an explicit tool registry and schemas. -4. Bind MCP `agent_id` at process launch or authenticated session context, not per tool-call JSON. -5. 
Add table-driven parity tests comparing service, HTTP, and MCP mapped outcomes. - -Acceptance tests: - -- Spawn `legis mcp --agent-id agent-1`, send `initialize` and `tools/list`, and assert expected tools appear while operator-authority tools do not. -- MCP tool schemas should not include `agent_id` or `operator_id`. -- Disabled protected cell, pending sign-off, unknown policy, invalid Wardline cell, and tampered audit trail should map consistently across service, HTTP, and MCP. - -### Medium - -#### M1. Static-analysis trust tiers and check facts are loosely validated mutable assertions - -Locations: - -- [src/legis/wardline/ingest.py](/home/john/legis/src/legis/wardline/ingest.py:15) lines 15-18 -- [src/legis/wardline/ingest.py](/home/john/legis/src/legis/wardline/ingest.py:53) lines 53-54 -- [src/legis/wardline/governor.py](/home/john/legis/src/legis/wardline/governor.py:87) lines 87-88 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:121) lines 121-132 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:281) lines 281-285 -- [src/legis/checks/surface.py](/home/john/legis/src/legis/checks/surface.py:50) lines 50-67 -- [src/legis/checks/surface.py](/home/john/legis/src/legis/checks/surface.py:101) lines 101-105 - -Evidence: - -- `TRUST_TIERS` is declared but not enforced. -- `properties` is copied verbatim and recorded as `wardline.tiers`. -- `/checks` accepts caller-supplied `commit_sha`, `run_id`, `rule_set`, and `policy_version`; `latest_state()` is last-write-wins by `check_name`. - -Impact: - -Governance records can contain non-lattice tier values, and check facts can be spoofed or overwritten unless the deployment adds external authentication and integrity controls. - -Remediation: - -1. Parse known Wardline trust fields into a typed structure. -2. Validate every tier against `TRUST_TIERS`; preserve unknown fields separately as untrusted metadata. -3. Authenticate CI/check writers and require run identity uniqueness. -4. 
Validate commit SHAs against the git surface where practical. -5. Model supersession explicitly instead of implicit last-write-wins. - -Acceptance tests: - -- `properties={"actual_return": "ROOT"}` should be rejected or recorded as invalid/untrusted, not as `extensions.wardline.tiers`. -- Unauthenticated duplicate `wardline` pass for a failed commit should be rejected or retained only as a non-authoritative separate event. - -#### M2. External URL, DB, secret, and response boundaries are weakly confined - -Locations: - -- [src/legis/cli.py](/home/john/legis/src/legis/cli.py:14) lines 14-40 -- [src/legis/cli.py](/home/john/legis/src/legis/cli.py:66) lines 66-79 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:157) lines 157-168 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:177) lines 177-224 -- [src/legis/identity/loomweave_client.py](/home/john/legis/src/legis/identity/loomweave_client.py:40) lines 40-55 -- [src/legis/filigree/client.py](/home/john/legis/src/legis/filigree/client.py:31) lines 31-45 - -Evidence: - -- CLI/env strings flow directly into SQLAlchemy URLs and urllib base URLs. -- Loomweave and Filigree clients only `rstrip("/")` the base URL. -- `urlopen(...).read()` loads full response bodies. -- `--hmac-key` accepts a raw signing secret on the command line and copies it to `LEGIS_HMAC_KEY`. - -Impact: - -Misconfiguration or compromised launch environment can point Legis at unexpected services or DB locations. Large responses can consume memory. Raw HMAC keys can leak through shell history, process lists, or CI logs. - -Remediation: - -1. Validate URL scheme and host at startup; require HTTPS except explicit loopback/dev allowlist. -2. Add auth headers or mTLS for Loomweave/Filigree where deployments are not strictly loopback. -3. Add response byte caps and content-type checks before JSON parsing. -4. Validate DB URLs and optionally confine state paths. -5. 
Remove `--hmac-key`; replace with env-only, secret file with strict permissions, or secret manager/KMS integration. - -Acceptance tests: - -- `file://` Loomweave/Filigree URLs should fail at client construction. -- Non-allowlisted remote hosts should fail unless an explicit remote opt-in is set. -- Oversized responses should raise controlled client errors. -- Parser should reject `--hmac-key`; `--hmac-key-file` should reject group/world-readable files. - -#### M3. Git read error mapping and DoS controls are incomplete - -Locations: - -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:252) lines 252-265 -- [src/legis/git/surface.py](/home/john/legis/src/legis/git/surface.py:26) lines 26-31 -- [src/legis/git/surface.py](/home/john/legis/src/legis/git/surface.py:127) lines 127-130 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:68) lines 68-114 -- [src/legis/wardline/ingest.py](/home/john/legis/src/legis/wardline/ingest.py:58) lines 58-65 - -Evidence: - -- `/git/commits/{sha}` catches `GitError`, but `/git/branches` and `/git/renames` do not. -- `GitSurface.renames()` raises `GitError` for invalid revision ranges. -- Git subprocesses have no timeout. -- Request models have no string length, body size, or findings count limits. - -Impact: - -Invalid git inputs can become 500s instead of structured 4xx errors. Large scan bodies, broad rename ranges, or slow git operations can consume CPU, memory, sqlite writes, or process slots. - -Remediation: - -1. Catch `GitError` consistently for all git endpoints and map invalid refs/ranges to 400 or 422. -2. Add subprocess timeouts and, if needed, max output caps. -3. Add Pydantic `Field` length constraints and batch size limits. -4. Add request body limits and rate limits at the ASGI/server layer. -5. Ensure batch routing either prevalidates all findings or has transactional/no-partial-write semantics. 
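
A minimal sketch of M3 remediation items 1 and 2 (consistent `GitError` mapping and subprocess timeouts), not the project's implementation. The `GitError` class here is a local stand-in for the one in `legis.git.surface`, and `run_bounded` and `validate_rev_range` are hypothetical helper names introduced only for illustration.

```python
import subprocess
import sys


class GitError(Exception):
    """Stand-in for the project's GitError; the real class may differ."""


def validate_rev_range(rev_range: str, max_len: int = 200) -> str:
    # Reject empty, over-long, or option-like values (e.g. "--version") before
    # they reach the git command line, so they surface as a 4xx, not a 500.
    if not rev_range or len(rev_range) > max_len or rev_range.startswith("-"):
        raise GitError("invalid revision range")
    return rev_range


def run_bounded(argv: list[str], timeout_s: float = 5.0,
                max_output_bytes: int = 1_000_000) -> bytes:
    # subprocess.run kills the child and raises TimeoutExpired on timeout.
    try:
        proc = subprocess.run(argv, capture_output=True, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired as exc:
        raise GitError(f"command timed out after {timeout_s}s") from exc
    if proc.returncode != 0:
        raise GitError(f"command failed with exit code {proc.returncode}")
    # Checked after the fact: output is already buffered in memory here.
    # A true streaming cap would need subprocess.Popen and incremental reads.
    if len(proc.stdout) > max_output_bytes:
        raise GitError("command output exceeded the configured cap")
    return proc.stdout
```

A route handler could then catch `GitError` in one place and map it to a 400 or 422 response for every git endpoint, rather than per endpoint.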
- -Acceptance tests: - -- `GET /git/renames?rev_range=--version` should return a stable 4xx JSON error, not 500. -- Oversized Wardline scan input should return 413/422 without partial writes. -- A deliberately slow git command in a controlled test double should time out with a structured error. - -#### M4. Loomweave and Filigree JSON response shapes are not validated at the transport seam - -Locations: - -- [src/legis/identity/loomweave_client.py](/home/john/legis/src/legis/identity/loomweave_client.py:40) lines 40-49 -- [src/legis/identity/loomweave_client.py](/home/john/legis/src/legis/identity/loomweave_client.py:71) lines 71-74 -- [src/legis/identity/resolver.py](/home/john/legis/src/legis/identity/resolver.py:59) lines 59-72 -- [src/legis/filigree/client.py](/home/john/legis/src/legis/filigree/client.py:31) lines 31-40 -- [src/legis/filigree/client.py](/home/john/legis/src/legis/filigree/client.py:56) lines 56-59 - -Evidence: - -- `_urllib_fetch()` is annotated as returning `dict`, but `json.loads()` can decode any JSON type. -- `resolve()` uses `res["sei"]` when `alive` is true without validating required fields. -- `lineage()` and `associations_for_entity()` call `.get()` without checking decoded body is a mapping. - -Impact: - -Malformed upstream responses can cause raw `AttributeError`, `KeyError`, or incorrect degradation instead of controlled client errors or documented fail-closed behavior. - -Remediation: - -1. Validate decoded JSON type immediately in `_urllib_fetch()`. -2. Add response-shape validators for capability, resolve, SEI resolve, lineage, attach, and associations. -3. Convert malformed responses into `LoomweaveError` or `FiligreeError`. -4. Decide which call sites should degrade and which should fail closed. - -Acceptance tests: - -- Fake fetch returning `[]`, `{"alive": true}`, and `{"lineage": "not-list"}` should produce controlled errors or explicit degradation, never raw key/attribute errors. - -#### M5. 
API composition is tightly coupled and uses private/internal state across layers - -Locations: - -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:26) lines 26-47 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:141) lines 141-236 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:301) lines 301-560 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:303) line 303 -- [src/legis/service/governance.py](/home/john/legis/src/legis/service/governance.py:63) lines 63-67 -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:72) lines 72-74 -- [src/legis/enforcement/protected.py](/home/john/legis/src/legis/enforcement/protected.py:120) lines 120-127 - -Evidence: - -- `api/app.py` imports and orchestrates almost every domain package. -- The API reads `trail_verifier._protected`. -- The service layer uses `getattr(protected_gate, "_store", None)` to verify hash-chain integrity. - -Impact: - -Future adapters can easily duplicate or bypass behavior. Fake or alternate gate/verifier implementations can satisfy visible methods but skip protected-policy rejection or hash-chain verification because those requirements live in private attributes. - -Remediation: - -1. Extract a runtime/application service layer that owns workflow orchestration. -2. Add public contracts such as `TrailVerifier.protected_policies`, `ProtectedGate.verify_integrity()`, or a `VerifiedTrail` protocol. -3. Keep HTTP route handlers limited to request parsing and transport error mapping. -4. Add import-boundary tests for API modules. - -Acceptance tests: - -- Fake gate/verifier implementations exposing only public protocols should still let `/overrides` reject protected policies and verified reads fail closed on hash-chain failure. -- Route tests should be able to use injected runtime/service fakes without importing concrete stores. - -#### M6. 
Public typing surface is not ready for `py.typed` - -Locations: - -- [pyproject.toml](/home/john/legis/pyproject.toml:18) lines 18-22 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:105) lines 105-114 -- [src/legis/service/governance.py](/home/john/legis/src/legis/service/governance.py:21) lines 21-49 -- [src/legis/checks/surface.py](/home/john/legis/src/legis/checks/surface.py:69) lines 69-83 - -Evidence: - -- `src/legis/py.typed` exists, but `pyproject.toml` has no mypy config or mypy dev dependency. -- Several boundaries use bare `dict`, bare `list`, untyped `records`, untyped `whereclause`, and untyped row parameters. - -Impact: - -Downstream consumers will treat Legis as a typed package, but important APIs leak implicit `Any` and strict checking will be noisy or unreliable. - -Remediation: - -1. Add mypy or pyright configuration and dev dependency. -2. Replace bare containers with `dict[str, Any]`, `Mapping[str, Any]`, `TypedDict`, Pydantic models, or Protocols. -3. Type store record iterables and SQLAlchemy row conversions. -4. Gate CI on the chosen type checker once baseline is clean. - -Acceptance tests: - -- `uv run mypy src/legis` or the chosen equivalent should pass under the agreed config. -- Built distributions should include `legis/py.typed`. - -#### M7. 
CI and test hygiene gaps reduce regression protection - -Locations: - -- [.github/workflows/override-rate.yml](/home/john/legis/.github/workflows/override-rate.yml:15) lines 15-17 -- [pyproject.toml](/home/john/legis/pyproject.toml:28) lines 28-36 -- [tests/enforcement/test_regressions.py](/home/john/legis/tests/enforcement/test_regressions.py:42) lines 42-58 -- [tests/enforcement/test_regressions.py](/home/john/legis/tests/enforcement/test_regressions.py:61) lines 61-139 -- [src/legis/api/app.py](/home/john/legis/src/legis/api/app.py:170) lines 170-211 -- [src/legis/store/audit_store.py](/home/john/legis/src/legis/store/audit_store.py:85) lines 85-86 - -Evidence: - -- The GitHub workflow installs the package and runs only `legis check-override-rate`. -- There is no CI `pytest`, lint, or static type job. -- Some tests mutate `os.environ` directly and `pop` keys instead of restoring prior values. -- One regression test enables HMAC without setting governance/binding DB env vars, so `create_app()` can fall back to repo-root DB files. -- Pytest has no marker split, while tests mix sqlite, threading, git subprocesses, and FastAPI clients. - -Impact: - -Substantial local test coverage is not a merge gate. Tests can become order-dependent, pollute local repo state, or hide regressions in skipped CI surfaces. - -Remediation: - -1. Add CI jobs for `uv run pytest` and selected static checks. -2. Use `monkeypatch` for environment changes. -3. Clear or isolate all `LEGIS_*`, `LOOMWEAVE_API_URL`, and `FILIGREE_API_URL` settings per test. -4. Point default DBs to `tmp_path` in tests that create app state. -5. Add pytest markers such as `unit`, `integration`, `api`, and `contract`. - -Acceptance tests: - -- A PR with a deliberately failing test fails CI. -- Pre-seeded environment variables are restored after app setup tests. -- Running targeted app setup tests creates no repo-root `.db` files. -- Pytest collection can select pure unit tests separately from sqlite/git/API tests. 
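
The environment isolation called for in M7 remediation items 2 and 3 can be sketched as follows. In a pytest suite, `monkeypatch.setenv` and `monkeypatch.delenv` would be the usual tools. This version uses the standard library's `unittest.mock.patch.dict` only so the example is self-contained, and `isolated_legis_env` is a hypothetical helper name.

```python
import os
from contextlib import contextmanager
from unittest.mock import patch

# Settings named in the remediation list; the prefix match covers all LEGIS_* keys.
ISOLATED_PREFIXES = ("LEGIS_",)
ISOLATED_NAMES = ("LOOMWEAVE_API_URL", "FILIGREE_API_URL")


@contextmanager
def isolated_legis_env(**overrides: str):
    # patch.dict snapshots os.environ and restores it on exit, so keys removed
    # or added inside the block do not leak into later tests.
    with patch.dict(os.environ, clear=False):
        for name in list(os.environ):
            if name.startswith(ISOLATED_PREFIXES) or name in ISOLATED_NAMES:
                del os.environ[name]
        os.environ.update(overrides)
        yield
```

Usage: wrap app construction in `with isolated_legis_env(LEGIS_HMAC_KEY="test-key"):` so a pre-seeded `LEGIS_API_SECRET` is hidden during the test and restored afterwards.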
- -## Remediation Roadmap - -1. Secure the evidence boundary first: type and authenticate Wardline scan ingestion, remove caller-owned routing, and record artifact digests/provenance. -2. Repair protected-cell HMAC semantics: v2 signing fields, no verifier skips for malformed protected records, sign-off Loomweave metadata bound into signatures, and protected endpoint/config alignment. -3. Move actor identity out of request bodies: authenticated agent/operator/CI scopes, adapter launch context for MCP, and default-deny mutating endpoints. -4. Decide Loomweave fail-closed policy: record explicit identity/lineage statuses, and make protected/complex writes fail or loudly mark provenance gaps when custody is unavailable. -5. Extract shared services before MCP implementation: protected override, sign-off, binding, policy evaluation, Wardline routing, git/check reads, and structured errors. -6. Add read-only store open modes and fix CI gate behavior for missing DB/schema. -7. Harden integration and transport inputs: URL allowlists, response caps, DB path validation, git subprocess timeouts, and request size limits. -8. Add missing regression tests and CI: pytest, type checking, API error mapping, HMAC tamper cases, Wardline metadata preservation, and MCP/HTTP parity tests once MCP lands. - -## Residual Risks - -- This audit did not run tests or dynamic probes by request, so execution-time behavior was inferred from source. -- Live Loomweave and Filigree contract drift was not tested; current tests use fakes. -- The absent scanner/rules and MCP server mean those specific implementations could not be audited; only the closest present code and design seams were reviewed. 
diff --git a/docs/arch-analysis-2026-06-06-0158/temp/catalog-A-enforcement.md b/docs/arch-analysis-2026-06-06-0158/temp/catalog-A-enforcement.md deleted file mode 100644 index 7aa6ceb..0000000 --- a/docs/arch-analysis-2026-06-06-0158/temp/catalog-A-enforcement.md +++ /dev/null @@ -1,54 +0,0 @@ -## Enforcement Engine -**Location:** `src/legis/enforcement/` -**Responsibility:** Grades a policy firing through the governance 2×2 (simple/complex × judge off/on), writing exactly one append-only, hash-chained audit record per submission and — in the protected cell — binding each verdict to its inspected source with an HMAC signature plus lifecycle gates (decay re-judge + override-rate). - -**Key Components:** -- `engine.py` (115 LOC) — `EnforcementEngine.submit_override`: the simple-tier chill/coached cells. `judge=None` → chill (record accepted as-is); `judge` present → coached (judge evaluates *before* write; verdict + model + rationale stamped into `extensions`, `accepted = verdict is ACCEPTED`). Also `trail()`, `records()`, `record_event()` (raw governance events e.g. UNKNOWN_POLICY). `EnforcementResult` dataclass. -- `verdict.py` (28 LOC) — shared value types: `Verdict` str-enum (ACCEPTED / BLOCKED / OVERRIDDEN_BY_OPERATOR), `SignoffState` str-enum (PENDING_SIGNOFF / SIGNED_OFF), `JudgeOpinion` dataclass (verdict, model, rationale). -- `judge.py` (111 LOC) — `Judge`/`LLMClient` Protocols; `LLMJudge` (structured-JSON-first, fail-closed). `build_prompt` frames request data as untrusted input. `parse_verdict` / `_parse_structured_response`: BLOCKED wins on any ambiguity; legacy free-text parse only behind `allow_legacy_text`. -- `judge_factory.py` (31 LOC) — `build_judge_from_env`: wires `OpenRouterLLMClient` from env, else returns `FailClosedJudge` (always BLOCKED) when no provider configured. Surface-scoped fallback rationale. -- `llm_client.py` (168 LOC) — deployable `OpenRouterLLMClient` + `llm_client_config_from_env`. 
SSRF/transport hardening: HTTPS-or-loopback-only base URL, no-redirect opener, 1 MB response cap, strict response-shape validation, `LLMTransportError` on any malformed reply. Injectable `Fetch` seam for tests. -- `protected.py` (288 LOC) — the protected cell. `ProtectedGate.submit` (judge-gated) / `operator_override` (human bypass → OVERRIDDEN_BY_OPERATOR, no model). Every record HMAC-signed via `signing_fields()` (single source of the signed dict, binds entity+policy+source fingerprint+ast_path+loomweave lineage). `TrailVerifier.verify`: load-time signature check; protected-policy set comes from config (ADR-0002) not the record, so a flag-flip can't downgrade. `legacy_signing_fields` for v1 records. `TamperError`. -- `signoff.py` (151 LOC) — `SignoffGate`: structured/protected block+escalate, **no LLM in path**. `request` records PENDING_SIGNOFF (does NOT clear); `sign_off` records SIGNED_OFF referencing `request_seq` + `request_payload_hash` and clears. Optional `signer`+`key` → tamper-bound signed sign-off via `signoff_signing_fields`. `is_cleared` / `request_record` scan the trail. -- `lifecycle.py` (122 LOC) — protected-cell lifecycle gates over the read-only trail. `decay_sweep`: re-judges only judge-ACCEPTED suppressions (strips prior decision fields before re-judging), flags any that no longer pass. `evaluate_override_rate`: `OVERRIDDEN_BY_OPERATOR / (ACCEPTED+OVERRIDDEN_BY_OPERATOR)` over recent `window`; `PASS`/`FAIL`/`PASS_WITH_NOTICE` (small-sample). `GateStatus`, `GateResult`, `DecayFlag`. -- `signing.py` (47 LOC) — keyed HMAC-SHA256 tamper-evidence over `canonical_json(fields)`. Versioned prefixes (`v2` default, `v1` legacy). `sign` / `verify` (verify accepts v2 or v1; `compare_digest` constant-time). -- `__init__.py` (1 LOC) — package docstring only. 
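
A rough sketch of the scheme the `signing.py` entry describes: a keyed HMAC-SHA256 over a canonical JSON encoding, a versioned prefix, and a constant-time comparison. It is an illustration under assumptions, not the module's code. The `canonical_json` shown here (sorted keys, compact separators) and the exact `prefix:digest` signature layout are assumed; the real `legis.canonical` encoding and signature format may differ, and v1 fallback is omitted.

```python
import hashlib
import hmac
import json

PREFIX_V2 = "hmac-sha256:v2"  # prefix string as named in the audit; layout is assumed


def canonical_json(fields: dict) -> bytes:
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(fields, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def sign(fields: dict, key: bytes, prefix: str = PREFIX_V2) -> str:
    digest = hmac.new(key, canonical_json(fields), hashlib.sha256).hexdigest()
    return f"{prefix}:{digest}"


def verify(fields: dict, key: bytes, signature: str) -> bool:
    # Split off the trailing digest; anything without the v2 prefix is rejected.
    prefix, sep, _digest = signature.rpartition(":")
    if not sep or prefix != PREFIX_V2:
        return False
    expected = sign(fields, key, prefix)
    # Compare as bytes so non-ASCII input cannot raise inside compare_digest.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Because the writer and the verifier both derive the signed bytes from the same `fields` dict, changing any signed field after the fact (for example `agent_id`) makes `verify` return `False`.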
- -**Dependencies:** -- Inbound: - - `legis.service.governance` -> enforcement — imports EnforcementEngine/EnforcementResult, evaluate_override_rate, ProtectedGate/ProtectedResult/TamperError, SignoffGate/SignoffResult (`src/legis/service/governance.py:14-17`) - - `legis.service.wardline` -> enforcement — EnforcementEngine, SignoffGate (`src/legis/service/wardline.py:9-10`) - - `legis.service.explain` -> enforcement — EnforcementEngine (`src/legis/service/explain.py:8`) - - `legis.mcp` -> enforcement — EnforcementEngine, build_judge_from_env, ProtectedGate/TrailVerifier/TamperError, SignoffGate, SignoffState/Verdict (`src/legis/mcp.py:23-27`) - - `legis.api.app` -> enforcement — EnforcementEngine, ProtectedGate/TamperError/TrailVerifier, SignoffGate, build_judge_from_env (`src/legis/api/app.py:31-33,325,333-334,341`) - - `legis.cli` -> enforcement — GateStatus/evaluate_override_rate, TrailVerifier/TamperError (`src/legis/cli.py:172,228`) - - `legis.wardline.governor` -> enforcement — EnforcementEngine, SignoffGate (`src/legis/wardline/governor.py:33-34`) - - `legis.wardline.ingest` -> enforcement — signing.verify (`src/legis/wardline/ingest.py:14`) - - `legis.governance.signoff_binding` -> enforcement — signing.sign (`src/legis/governance/signoff_binding.py:20`) - - `legis.governance.binding_ledger` -> enforcement — signing.sign, signing.verify (`src/legis/governance/binding_ledger.py:19`) -- Outbound: - - enforcement -> `legis.clock` (Clock) — engine.py:20, protected.py:16, signoff.py:15 - - enforcement -> `legis.identity.entity_key` (EntityKey) — engine.py:23, protected.py:21, signoff.py:18, lifecycle.py:17 - - enforcement -> `legis.records.override_record` (OverrideRecord) — engine.py:24, judge.py:17, judge_factory.py:12, protected.py:22, signoff.py:19, lifecycle.py:18 - - enforcement -> `legis.store.protocol` (AppendOnlyStore) — engine.py:25, protected.py:23, signoff.py:20 - - enforcement -> `legis.canonical` (canonical_json, content_hash) — signing.py:15, 
signoff.py:14 - - NOTE: cluster does NOT import `legis.governance` or `legis.policy` — those depend on enforcement, not vice versa (one-directional, clean). - -**Patterns Observed:** -- Dependency injection / ports-and-adapters: store (`AppendOnlyStore` protocol), `Clock`, `Judge` and `LLMClient` are all injected Protocols; the only non-test concrete is `OpenRouterLLMClient`. The chill/coached distinction is literally a single nullable `judge` arg (engine.py:42,70). -- Single-source-of-signed-fields: `signing_fields` / `signoff_signing_fields` are called by both the writing gate and the reading `TrailVerifier`, so signer and verifier cannot drift (protected.py:40,206,150; signoff.py:29,81,138). -- Fail-closed everywhere: unreadable/ambiguous judge output → BLOCKED (judge.py:40,106); unconfigured provider → `FailClosedJudge` (judge_factory.py:30); structurally malformed protected record → `TamperError` (protected.py:151). -- Append-only single trail: every submission, every governance event, and every sign-off step is one immutable hash-chained record; no silent path (engine.py:12 docstring, record_event). -- Config-driven trust boundary: protected-policy set lives in config not the record (ADR-0002), preventing flag-flip downgrade (protected.py:96-102). -- Layered verdict provenance: simple verdicts stamp extensions; protected layers HMAC over the same extensions; lifecycle reads the trail read-only without re-writing. -- Security-hardened egress: HTTPS/loopback-only, no-redirect, size-capped, shape-validated LLM transport (llm_client.py:76-129). - -**Concerns:** -- Verifier coupling to `extensions` shape: `TrailVerifier._requires_verification` keys off in-record markers (`file_fingerprint`, `ast_path`, `protected_cell`, signature presence) in *addition* to the config protected set (protected.py:112-121). 
The config set is the authoritative anti-downgrade guard, but the OR-with-record-markers means a record that omits both the protected policy and all markers is treated as unprotected — correct only if the config protected-policy set is always complete/current. Coupling between signing-field layout and verifier is implicit (dict-shape, not a typed schema). -- Dual signing-field functions (`signing_fields` vs `legacy_signing_fields`, v1/v2 prefixes) create a migration surface: `verify` tries v2 then falls back to legacy v1 fields (protected.py:155-159), widening the accept set during the legacy window. Acceptable as transitional but worth a deprecation/removal milestone. -- `EntityKey.from_dict(p["entity_key"])` in `decay_sweep` and `sign_off` will raise (`KeyError` or similar) on a malformed historical record; decay_sweep has no per-record try/except, so one bad row aborts the whole sweep (lifecycle.py:55-62). The protected write path guards this (TamperError) but the lifecycle read path does not. -- `evaluate_override_rate` and `decay_sweep` silently include/exclude records by `judge_verdict` extension presence; a protected record missing that key is simply skipped — denominator/sweep coverage depends on upstream always stamping it. -- HMAC key lifecycle (rotation, provenance) is out of cluster scope — `key: bytes` is injected; no rotation/versioned-key support visible here (signing.py only versions the algorithm, not the key). -- `record_event` (engine.py:107) bypasses the judge/verdict path entirely for raw events; if a protected-policy event were routed here it would not be signed — relies on callers not misusing it. - -**Confidence:** High — Read all 10 files in `src/legis/enforcement/` end-to-end (engine.py 115, protected.py 288, signoff.py 151, lifecycle.py 122, judge.py 111, llm_client.py 168, judge_factory.py 31, signing.py 47, verdict.py 28, __init__.py 1; judge_factory.py and llm_client.py are mode 0600 but readable). 
Outbound edges cross-verified by `grep -n '^from legis\.'` over the cluster (5 distinct targets, zero governance/policy imports). Inbound edges grepped across `src/` with file:line for all 10 importing modules. The only uncertainty is runtime behaviour of injected concretes defined outside the cluster (store impls, Clock, EntityKey internals), which were not read. diff --git a/docs/arch-analysis-2026-06-06-0158/temp/catalog-B-policy.md b/docs/arch-analysis-2026-06-06-0158/temp/catalog-B-policy.md deleted file mode 100644 index 662cb21..0000000 --- a/docs/arch-analysis-2026-06-06-0158/temp/catalog-B-policy.md +++ /dev/null @@ -1,40 +0,0 @@ -## Policy Grammar -**Location:** `src/legis/policy/` -**Responsibility:** Defines the agent-programmable policy-boundary grammar — boundary types that evaluate a target to CLEAR/VIOLATION/UNKNOWN (fail-closed), the policy→governance-cell routing, one-off exemptions, and an AST-based honesty gate that verifies a `@policy_boundary` decoration is backed by a real, pinned test that actually exercises the boundary. - -**Key Components:** -- `grammar.py` (123 LOC) — Core contract. `PolicyResult` (CLEAR/VIOLATION/UNKNOWN), `PolicyEvaluation` (frozen, carries `provenance_gap`), `BoundaryType` Protocol, and `PolicyGrammar` registry. `register()` is append-only and raises `PolicyConflictError` on shadowing (grammar.py:53-60); `evaluate()` returns UNKNOWN+gap for unregistered policies, and wraps boundary calls in `except Exception` to fail closed on garbage/raises (grammar.py:74-85). Applies exemptions only on VIOLATION when `target['value']` is a str (grammar.py:86-97). Ships `AllowlistBoundary` builtin and `default_grammar()` preloading `import-allowlist` ⇒ {json, os, sys}. -- `cells.py` (99 LOC) — `PolicyCellRegistry.cell_for(policy)` resolves a policy to one of {chill, coached, structured, protected}: exact-pattern rules first, then glob rules (`fnmatch.fnmatchcase`), else `default_cell` (cells.py:33-40). 
`default_policy_cells()` sets default `chill` (cells.py:44). `load_policy_cells()` parses TOML and fails closed on malformed `[[policy]]` entries (cells.py:47-77). -- `decorator.py` (212 LOC) — `@policy_boundary` strict-passthrough decorator attaching frozen `PolicyBoundaryMetadata` (source/suppresses/invariant/test_ref/test_fingerprint); decoration-time TypeErrors on empty source/suppresses/invariant and on stacking (decorator.py:62-83). `check_policy_boundary()` is the runtime honesty gate: checks metadata-transplant (object identity, decorator.py:157-159), qualname scope (161-162), citation shape via `_CITATION_RE` (36, 165), presence of invariant/test_ref/test_fingerprint, resolves the test via a caller-supplied `resolver`, recomputes `fingerprint()` and rejects drift (185-186), then delegates the semantic check to `evaluate_test_evidence` (209). -- `evidence.py` (152 LOC) — Single shared judgement used by BOTH the runtime gate and the static scanner so they cannot drift. `evaluate_test_evidence()` enforces three checks: (1) shadowing — boundary name rebound as def/arg/assign/for-target ⇒ fail (evidence.py:81-126); (2) exercise — boundary call must appear outside uninvoked nested defs (`_walk_without_nested_definitions`, 56-61, 69-75); (3) policy co-occurrence — a suppressed-policy reference must appear inside the same `assert` as boundary evidence (135-152). -- `exemptions.py` (128 LOC) — `Exemption` (policy/value/reason with entity/rationale aliases), `ExemptionRegistry` keyed by (policy, value), plus two loaders: `ExemptionAllowlist.from_file` (YAML, requires policy/entity/rationale, missing file exempts nothing) and `load_exemptions` (TOML `[[exemption]]`). Both fail closed on malformed entries (exemptions.py:79-82, 123-126). -- `boundary_scan.py` (357 LOC) — Static `@policy_boundary` scanner (`scan_policy_boundaries`) emitting `BoundaryFinding`s with rule IDs. 
`_BoundaryVisitor` walks the AST, requires literal-only decorator kwargs (179-210), validates `suppresses`, resolves `test_ref` with strict path sandboxing (must be relative `tests/*.py`, no traversal, must resolve under repo_root — `_resolve_test_ref`, 243-322), recomputes the fingerprint from `get_source_segment`, and reuses `evaluate_test_evidence` for the semantic verdict (169). Driven by CLI `policy-boundary-check`. -- `policy/cells.toml` (repo-root data file) — Local startup routing: `default_cell = "structured"`, with `import-allowlist`⇒coached, `protected.*`⇒protected, `human.*`⇒structured. Note: overrides the in-code `chill` default; loaded by `mcp.py:_load_policy_cell_registry`. - -**Dependencies:** -- Inbound: - - `legis.mcp` imports `PolicyCellRegistry, default_policy_cells, load_policy_cells` (mcp.py:30-34) and `PolicyGrammar, default_grammar` (mcp.py:35); builds runtime cell registry from `policy/cells.toml` (mcp.py:101-111, 161, 165). Surfaces `policy_explain`/`policy_evaluate`/`override_submit`. - - `legis.service.governance` imports `PolicyEvaluation, PolicyGrammar, PolicyResult` (governance.py:21); `evaluate_policy()` calls `grammar.evaluate` and records UNKNOWN provenance gaps (governance.py:230-239). - - `legis.service.explain` imports `PolicyCellRegistry` (explain.py:9); `explain_policy()` calls `registry.cell_for` (explain.py:72). - - `legis.api.app` imports `PolicyGrammar, default_grammar` (app.py:52) and re-exports `evaluate_policy` from the service (app.py:45). - - `legis.cli` imports `scan_policy_boundaries` (cli.py:11); wired to the `policy-boundary-check` subcommand (cli.py:132-138, 305-313). -- Outbound: - - `legis.canonical.content_hash` — used by `decorator.py:23` and `boundary_scan.py:11` for test fingerprints. ONLY non-stdlib intra-legis outbound dependency. 
- - Intra-package: `grammar.py:20` → `exemptions.ExemptionRegistry`; `decorator.py:24` → `evidence.evaluate_test_evidence`; `boundary_scan.py:12-13` → `decorator.get_normalized_ast_str` + `evidence.evaluate_test_evidence`. - - Third-party/stdlib: `yaml` (exemptions.py:17); stdlib `ast`, `re`, `tomllib`, `fnmatch`, `functools`, `inspect`, `textwrap`. - -**Patterns Observed:** -- Provider-seam / open-instance-set: `BoundaryType` Protocol + append-only registry mirrors Wardline `TaintSourceProvider` / Loomweave `Transport` (grammar.py docstring), letting agents add boundaries with no human config. -- Fail-closed everywhere: unregistered policy, raising boundary, non-`PolicyResult` return, malformed TOML/YAML all collapse to UNKNOWN/error rather than false-green (grammar.py:65-99; cells.py/exemptions.py loaders). -- Single-source-of-truth for evidence judgement: `evidence.py` is deliberately shared by runtime gate and static scanner to prevent gate drift (evidence.py module docstring; consumed at decorator.py:209 and boundary_scan.py:169). -- Anti-vibe provenance: decoration-time TypeErrors reject empty source/invariant/suppresses; gate enforces citation shape + pinned test fingerprint + metadata-transplant/qualname scope checks. -- Two-tier (exact-then-glob) declarative routing with strict cell-name validation against a closed `VALID_CELLS` set. - -**Concerns:** -- (Confirmed, prior H6) In-code default cell is self-clearing `chill`: `default_policy_cells()` returns `default_cell="chill"` (cells.py:44), so any unmatched policy falls through to the least-governed cell. This is the failure-open default in the code path; mitigated only when `policy/cells.toml` (default `structured`) is loaded (mcp.py:101-111). If config is absent/unset, `_load_policy_cell_registry` falls back to `default_policy_cells()` ⇒ chill (mcp.py:111). 
-- (Confirmed, prior M7) Honesty gate's policy co-occurrence check is weak / not semantically scope-aware: `_contains_policy_reference` matches the suppressed policy name as any `\b`-bounded substring inside a string constant (or a bare Name) co-located in the same `assert` as a boundary call/result (evidence.py:46-53, 135-152). It does not verify the boundary's *result* is what is asserted, nor that the policy string is the assertion subject — a test asserting boundary truthiness with the policy name merely mentioned in a message string passes. The shadow + exercise checks raise the bar but the assertion-meaning check remains shallow. -- (Confirmed, narrow, prior L4) Fingerprint is computed from two different extraction paths that can diverge: the runtime gate uses `inspect.getsource(test_fn)` then `textwrap.dedent` (decorator.py:125-135), while the static scanner uses `ast.get_source_segment(...)` then `textwrap.dedent` (boundary_scan.py:156-159). For top-level test functions these agree; for class-method test_refs or decorator-bearing tests the segment vs. full-source extraction (and dedent of a segment whose first line is not least-indented) can mismatch, producing a `POLICY_BOUNDARY_TEST_FINGERPRINT_MISMATCH` in one gate but not the other. -- Exemption application in `grammar.evaluate` only fires when `"value" in target` and is a `str` (grammar.py:86-91); a VIOLATION on a target keyed differently than `value` can never be exempted, and exemptions silently flip VIOLATION→CLEAR with `provenance_gap=False` (grammar.py:94-96) — a deliberate but un-logged self-clear at the grammar layer. -- `get_normalized_ast_str` strips docstrings before hashing (decorator.py:104-114): editing only a test's docstring will not change its fingerprint, so docstring-only drift is invisible to the gate (likely intentional, noted for completeness). 
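The exact-then-glob routing and the default-cell concern (H6) described in this catalog can be illustrated with a minimal sketch. `CellRegistry` is a hypothetical stand-in for `PolicyCellRegistry`, not the code in `cells.py`; in particular, how the real registry tells exact rules from glob rules is an assumption here (a pattern containing `*`, `?` or `[` is treated as a glob).

```python
from fnmatch import fnmatchcase

VALID_CELLS = frozenset({"chill", "coached", "structured", "protected"})


class CellRegistry:
    """Hypothetical stand-in for PolicyCellRegistry (illustration only)."""

    def __init__(self, rules, default_cell):
        # Strict cell-name validation against the closed set.
        for cell in [default_cell] + [c for _, c in rules]:
            if cell not in VALID_CELLS:
                raise ValueError(f"unknown cell: {cell!r}")
        # Assumption: a pattern with glob metacharacters is a glob rule.
        self._exact = {p: c for p, c in rules if not set(p) & set("*?[")}
        self._globs = [(p, c) for p, c in rules if set(p) & set("*?[")]
        self._default = default_cell

    def cell_for(self, policy):
        # Tier 1: exact match wins.
        if policy in self._exact:
            return self._exact[policy]
        # Tier 2: first matching glob, case-sensitive.
        for pattern, cell in self._globs:
            if fnmatchcase(policy, pattern):
                return cell
        # Tier 3: whatever default the registry was built with.
        return self._default


# Routing as described for policy/cells.toml.
configured = CellRegistry(
    [("import-allowlist", "coached"), ("protected.*", "protected"), ("human.*", "structured")],
    default_cell="structured",
)
# In-code fallback when no config is loaded: the default is "chill".
fallback = CellRegistry([], default_cell="chill")
```

With the configured registry an unmatched policy lands in `structured`; with the fallback registry the same policy lands in `chill`, which is the fail-open path the H6 concern describes.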
- -**Confidence:** High — Read 100% of all 7 source files (grammar.py, cells.py, decorator.py, evidence.py, exemptions.py, boundary_scan.py, __init__.py) and the `policy/cells.toml` data file in full. Outbound deps verified by reading the imports; inbound deps cross-checked with grep across `src/` and confirmed by reading the consumer call sites in mcp.py, service/governance.py, service/explain.py, api/app.py, cli.py with line numbers. All three prior-audit concerns (H6 cells.py:44, M7 evidence.py:46-53/135-152, L4 decorator.py:125-135 vs boundary_scan.py:156-159) verified against current source. (Advisor consult attempted but unavailable this turn.) diff --git a/docs/arch-analysis-2026-06-06-0158/temp/catalog-C-governance.md b/docs/arch-analysis-2026-06-06-0158/temp/catalog-C-governance.md deleted file mode 100644 index f833097..0000000 --- a/docs/arch-analysis-2026-06-06-0158/temp/catalog-C-governance.md +++ /dev/null @@ -1,160 +0,0 @@ -# Cluster C — Governance & Persistence Foundations - -Catalog for the foundational governance + persistence layer of Legis (Weft suite). -Four separate entry blocks: Governance, Store, Records, Foundations. - ---- - -## Governance - -**Location:** `src/legis/governance/` - -**Responsibility:** Tamper-bound binding of sign-offs to Filigree issues, append-only SEI re-keying/backfill of pre-SEI records, lineage-spine gap/divergence detection, and pure closure-gate decisions — all layered on the record-agnostic audit store. - -**Key Components:** -- `binding_ledger.py` (93 lines) — `BindingLedger` records signed (`issue_binding`) bindings to a *dedicated* `AuditStore` and verifies them at read time. `verify()` (L59–76) now checks `store.verify_integrity()` first (hash chain) then HMAC-verifies each record's signing fields. `get`/`get_by_issue_id` (L78–93) are fail-closed: they call `verify()` before returning. `BindingError` raised on tamper/forgery. Signing fields fixed by `binding_signing_fields` (L30–37). 
-- `signoff_binding.py` (74 lines) — `bind_signoff_to_issue` (L28–74): validate (rejects `identity_stable=False` locator keys, L38) → `filigree.attach` → optional `ledger.record`. Returns `binding_seq`. Documents the non-atomic attach-then-record trade-off (L64–73): no compensating delete; orphaned attach surfaced by ledger `verify()`. -- `sei_backfill.py` (259 lines) — `run_pre_sei_backfill` (L44): scans audit records, finds locator-keyed (`identity_stable=False`, non-SEI) records, resolves via Loomweave batch, and **appends** `SEI_BACKFILL` / `SEI_BACKFILL_UNRESOLVED` events referencing `original_seq` (never rewrites). Idempotent via `_backfilled_original_sequences` (L152). Fails closed on integrity failure (L58). `SeiBackfillReport` dataclass. -- `gaps.py` (115 lines) — `find_orphan_gaps` (L57): SEIs Loomweave reports `alive: false`. `find_lineage_integrity` (L68): REQ-L-01 Option-3 custody — verifies stored `lineage_snapshot` is still a *prefix* of current lineage (`content_hash(current[:n]) == snap["hash"]`, L105); prefix-break = divergence, growth is legitimate. Returns `LineageIntegrity` (divergences + unavailable). -- `filigree_gate.py` (32 lines) — `evaluate_issue_closure` (L14): pure decision; closable only if ledger holds a verified binding. Missing binding → structured `allowed: False`; tampered ledger → `BindingError` propagates. -- `params.py` (11 lines) — Reviewed governance constants (ADR-0002): `OVERRIDE_RATE_THRESHOLD`, `_WINDOW`, `_MIN_SAMPLE`. Policy, read server-side only. -- `__init__.py` (1 line) — package docstring. 
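The prefix-custody rule attributed to `find_lineage_integrity` (`content_hash(current[:n]) == snap["hash"]`) can be sketched as below. This is an illustration under assumptions, not the code in `gaps.py`: the snapshot's `length` key, the helper names `take_snapshot` and `lineage_diverged`, and the treatment of a shortened lineage as divergence are all hypothetical.

```python
import hashlib
import json


def content_hash(value):
    # Stand-in for legis.canonical.content_hash: sha256 over sorted-key JSON.
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"), allow_nan=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def take_snapshot(lineage):
    # Hypothetical snapshot shape: prefix length plus the hash of that prefix.
    return {"length": len(lineage), "hash": content_hash(lineage)}


def lineage_diverged(snapshot, current):
    n = snapshot["length"]
    # Assumption: a lineage shorter than the snapshot counts as divergence.
    if len(current) < n:
        return True
    # Growth past n is legitimate; only a changed prefix is a break.
    return content_hash(current[:n]) != snapshot["hash"]


snap = take_snapshot(["sei-1", "sei-2"])
grown = lineage_diverged(snap, ["sei-1", "sei-2", "sei-3"])
rewritten = lineage_diverged(snap, ["sei-1", "sei-X", "sei-3"])
```

Here `grown` is `False` because the stored prefix is intact, while `rewritten` is `True` because the second entry no longer matches the snapshot.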
- -**Dependencies:** -- Inbound: - - `cli.py:9` → `sei_backfill.run_pre_sei_backfill`; `cli.py:173` → `governance.params` - - `mcp.py:29` → `binding_ledger.BindingError`; `mcp.py:146` → `BindingLedger`; `mcp.py:969` → `filigree_gate.evaluate_issue_closure` - - `service/governance.py:18` → `governance.params` - - `api/app.py:37` → `gaps.find_lineage_integrity, find_orphan_gaps`; `api/app.py:39` → `binding_ledger.BindingError, BindingLedger`; `api/app.py:40` → `signoff_binding.bind_signoff_to_issue`; `api/app.py:345` → `BindingLedger`; `api/app.py:664` → `filigree_gate.evaluate_issue_closure` -- Outbound: - - `binding_ledger.py:18` → `legis.clock.Clock`; `:19` → `legis.enforcement.signing.sign, verify`; `:20` → `legis.identity.entity_key.EntityKey`; `:21` → `legis.store.audit_store.AuditStore` - - `signoff_binding.py:20` → `enforcement.signing.sign`; `:21` → `filigree.client.FiligreeClient`; `:22` → `governance.binding_ledger.BindingLedger`; `:23` → `identity.entity_key.EntityKey` (intra-cluster edge: signoff_binding → binding_ledger) - - `sei_backfill.py:14` → `legis.canonical.content_hash`; `:15` → `clock.Clock`; `:16` → `identity.loomweave_client.LoomweaveIdentity`; `:17` → `identity.entity_key.EntityKey`; `:18` → `store.audit_store.AuditRecord, AuditStore` - - `gaps.py:17` → `legis.canonical.content_hash`; `:18` → `identity.loomweave_client.LoomweaveIdentity`; `:19` → `store.audit_store.AuditRecord` - - `filigree_gate.py` — none (takes `ledger: Any`, structurally typed) - -**Patterns Observed:** -- Fail-closed throughout: integrity failure raises before any data is returned (`binding_ledger.get*` L79/87, `sei_backfill` L58, `filigree_gate` propagates `BindingError`). -- Append-only migration: SEI re-keying never rewrites history; new events reference `original_seq` (`sei_backfill` L97–127, L195–217). -- Prefix-monotonic custody: lineage growth is legitimate, only a broken prefix is tamper (`gaps` L105). 
-- Pure decision functions separated from I/O (`filigree_gate`). -- Dedicated isolated ledger store so binding rows never pollute the override/gap trail (`binding_ledger` docstring L9–11). - -**Concerns:** -- **H5 — RESOLVED.** `BindingLedger.verify()` now invokes `store.verify_integrity()` (binding_ledger.py:60) before the per-record HMAC pass; the prior hash-chain omission is closed. -- **M12 — residual relocated to governance.** M12-as-flagged (enforcement → concrete `AuditStore`) is addressed: enforcement now imports the `AppendOnlyStore` protocol (engine.py:25, protected.py:23, signoff.py:20). The concrete coupling now lives *here*: `binding_ledger.py:21`, `sei_backfill.py:18`, and `gaps.py:19` type against concrete `AuditStore`/`AuditRecord` rather than the protocol — so these modules cannot be unit-tested against a protocol fake. (Concrete *construction* in api/app.py, cli.py, mcp.py is the composition root, not a violation.) -- **M6 propagation (governance impact).** `sei_backfill.run_pre_sei_backfill` (L58) and `binding_ledger.verify` (L60) both branch on `if not store.verify_integrity()`. Because `verify_integrity` can still *raise* on non-finite-float tampering (see Store block), these callers would receive an unexpected `ValueError`/exception instead of a clean `False`/`BindingError` — turning a tamper signal into an uncaught crash. -- **gaps.py null-entity_key crash.** `_stable_seis` (L51) and `find_lineage_integrity` (L75) do `payload.get("entity_key", {}).get(...)`. If a payload contains an explicit `"entity_key": null`, the `{}` default is not used: the first `.get` returns `None` and the chained `.get` raises `AttributeError`. Inconsistent with `sei_backfill._entity_key` (L144) which guards `isinstance(raw, dict)`. Real robustness inconsistency between sibling modules. -- **signoff_binding non-atomic attach→record.** Acknowledged in-code (L64–73): if `ledger.record()` raises after `filigree.attach()` succeeds, Filigree holds a pointer with no local ledger entry; no compensating delete. 
Surfaced by `verify()`, but a runtime inconsistency window exists. - -**Confidence:** High — read all 7 files in full (binding_ledger.py:1–94, signoff_binding.py:1–75, sei_backfill.py:1–260, gaps.py:1–116, filigree_gate.py:1–33, params.py, __init__.py); cross-checked outbound imports against actual `from`-lines and inbound via repo-wide grep; empirically reproduced the M6 propagation path (`json.loads('{"x": Infinity}')` → `content_hash` raises `ValueError`). - ---- - -## Store (persistence) - -**Location:** `src/legis/store/` - -**Responsibility:** Record-agnostic, append-only, hash-chained SQLAlchemy audit log with DB-level mutation rejection and a structural integrity verifier; plus the `AppendOnlyStore`/`AuditRecordLike` protocols that consumers depend on. - -**Key Components:** -- `audit_store.py` (186 lines) — `AuditStore` over SQLAlchemy + `NullPool` (L57). SQLite PRAGMAs (WAL/NORMAL/busy_timeout) via connect listener (L60–71). Append-only enforced by `BEFORE UPDATE`/`BEFORE DELETE` triggers raising `RAISE(ABORT…)` (L88–104); no mutation method exists. `append` (L106): computes `content_hash`, reads last `chain_hash` (genesis if empty), inserts `chain_hash = sha256(prev_hash + content_hash)` under `BEGIN IMMEDIATE` (L110). `verify_integrity` (L161): re-walks chain checking content_hash, prev_hash linkage, and `_chain`. `AuditRecord` frozen dataclass; `read_all`/`read_by_seq`/`get_latest_sequence_and_hash`. -- `protocol.py` (30 lines) — `AuditRecordLike` and `AppendOnlyStore` `Protocol`s (append/read_all/read_by_seq/verify_integrity). This is the abstraction enforcement modules type against. -- `__init__.py` (1 line) — package docstring. 
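The append and verify logic described for `audit_store.py` (`chain_hash = sha256(prev_hash + content_hash)`, then a re-walk of the chain) can be sketched in memory as follows. This is not the SQLAlchemy implementation: the `GENESIS` value, the row layout, and the helper names are assumptions, and the `try/except` inside the loop is one possible way to make a non-finite float report `False` instead of raising.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical genesis hash; the real value is not stated


def _content_hash(payload):
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), allow_nan=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def _chain(prev_hash, content_hash):
    return hashlib.sha256((prev_hash + content_hash).encode("utf-8")).hexdigest()


def append(log, payload):
    # Link each new row to the previous row's chain hash (genesis if empty).
    prev_hash = log[-1]["chain_hash"] if log else GENESIS
    ch = _content_hash(payload)
    log.append({
        "seq": len(log) + 1,
        "payload": payload,
        "content_hash": ch,
        "prev_hash": prev_hash,
        "chain_hash": _chain(prev_hash, ch),
    })


def verify_integrity(log):
    prev_hash = GENESIS
    for row in log:
        try:
            ch = _content_hash(row["payload"])
        except ValueError:
            # Non-finite float in a tampered payload: report, do not crash.
            return False
        if (ch != row["content_hash"]
                or row["prev_hash"] != prev_hash
                or _chain(prev_hash, ch) != row["chain_hash"]):
            return False
        prev_hash = row["chain_hash"]
    return True


log = []
append(log, {"policy": "import-allowlist"})
append(log, {"policy": "protected.secrets"})
intact = verify_integrity(log)
log[0]["payload"]["policy"] = "edited"  # simulate out-of-band tampering
tampered = verify_integrity(log)
```

The sketch omits the database triggers entirely; it covers only the hash-chain layer, which is the one that detects edits made outside the application.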
- -**Dependencies:** -- Inbound: - - Concrete `AuditStore`: `governance/sei_backfill.py:18`, `governance/binding_ledger.py:21`, `governance/gaps.py:19` (AuditRecord), `api/app.py:318`, `api/app.py:373`, `api/app.py:345` (BindingLedger ctor path), `cli.py:12`, `cli.py:174`, `mcp.py:54` - - Protocol `AppendOnlyStore`: `enforcement/engine.py:25`, `enforcement/protected.py:23`, `enforcement/signoff.py:20` -- Outbound: - - `audit_store.py:35` → `legis.canonical.canonical_json, content_hash` (intra-cluster: store → foundations) - - external: `sqlalchemy`, `hashlib`, `json` - - `protocol.py` — stdlib `typing`/`collections.abc` only - -**Patterns Observed:** -- Two complementary integrity layers: DB triggers (reject in-band mutation) + hash chain (detect out-of-band file tampering) — documented L7–12. -- Record-agnostic boundary: store persists opaque `dict` payloads; schema knowledge lives in `records`/`governance`. -- Protocol-first consumption seam (`protocol.py`) — enforcement layer depends on the abstraction, not the concretion. -- `NullPool` + `BEGIN IMMEDIATE` for clean, lock-minimal append semantics. - -**Concerns:** -- **M6 — PARTIALLY closed.** `verify_integrity` wraps `read_all()` in `try/except (JSONDecodeError, TypeError, ValueError)` (L163–166), so decode-time malformed JSON now returns `False` cleanly. BUT the loop body `content_hash(rec.payload)` (L168) is **unguarded**, and `read_all` uses default `json.loads`, which accepts `Infinity`/`NaN` literals. A directly-tampered `payload` column containing `{"x": Infinity}` decodes fine, then `content_hash` → `canonical_json(allow_nan=False)` raises `ValueError` *inside the loop* — propagating out of `verify_integrity` instead of returning `False`. Empirically reproduced. This is exactly the tamper case `verify_integrity` is meant to flag, so the function can crash on the input it exists to defend against. -- **HMAC framing correction.** `AuditStore` itself holds **no HMAC** — it is hash-chain only. 
HMAC tamper-evidence lives in `enforcement/signing.py` and is applied by `BindingLedger`/protected-verdict callers writing *into* the store; the store persists the signature as just another payload field. The cluster brief's "HMAC for protected records [in store]" is slightly off: the store provides chaining + append-only triggers, not keyed signing. -- **Pragma failures silently swallowed.** The PRAGMA block (L64–69) catches and `pass`es all exceptions, so a WAL/busy_timeout misconfiguration is invisible (no log/observability). - -**Confidence:** High — read audit_store.py:1–187 and protocol.py:1–30 in full; traced append/verify chain logic line-by-line; empirically confirmed the M6 raise path (`json.loads('{"x": Infinity}')` decodes to `inf`, `content_hash` raises `ValueError`); inbound/outbound verified by grep against actual import lines. - ---- - -## Records - -**Location:** `src/legis/records/` - -**Responsibility:** Defines the shared core `OverrideRecord` schema (the chill-cell recordable override) that serializes to a flat dict for the record-agnostic audit store, with judge/HMAC fields attaching via `extensions`. - -**Key Components:** -- `override_record.py` (39 lines) — `OverrideRecord` frozen dataclass: `policy`, `entity_key: EntityKey`, `rationale`, `agent_id`, `recorded_at`, `extensions`. `identity_stable` property (L26) delegates to `entity_key`. `to_payload` (L30) emits the canonical flat dict (entity_key via `to_dict()`, copies extensions). -- `__init__.py` (1 line) — package docstring. - -**Dependencies:** -- Inbound (all in `enforcement/`): - - `enforcement/protected.py:22`, `judge_factory.py:12`, `lifecycle.py:18`, `engine.py:24`, `judge.py:17`, `signoff.py:19` → `OverrideRecord` - - (No governance/store module imports records — records is consumed by enforcement, which writes payloads into the store.) 
-- Outbound: - - `override_record.py:14` → `legis.identity.entity_key.EntityKey` - -**Patterns Observed:** -- Stable-core / extensible-edge: core schema fixed across the 2×2 cell matrix; Sprint-2 judge and Sprint-3 HMAC fields attach via `extensions` (docstring L1–7). -- Frozen dataclass + explicit `to_payload()` serialization boundary; record never touches the store directly (record → dict → store handoff). -- Identity delegation: `identity_stable` derived from `EntityKey`, single source of truth. - -**Concerns:** -- None observed (verified: schema immutability via `frozen=True`; serialization boundary explicit; extensions defensively copied at L38; no I/O, validation, or resource concerns in scope). One note: `to_payload` performs no validation of field types — it trusts construction-time correctness (acceptable for an internal frozen dataclass). - -**Confidence:** High — read override_record.py:1–39 and __init__.py in full; all 6 inbound edges confirmed by grep; single outbound (EntityKey) confirmed at L14. - ---- - -## Foundations (canonical + clock) - -**Location:** `src/legis/canonical.py`, `src/legis/clock.py` - -**Responsibility:** Leaf-level deterministic primitives — canonical JSON + content hashing (the basis of every hash/HMAC in the suite) and an injectable time source for deterministic, test-friendly timestamps. - -**Key Components:** -- `canonical.py` (22 lines) — `canonical_json` (L15): `json.dumps` with `sort_keys=True`, tight separators, `ensure_ascii=False`, **`allow_nan=False`**. `content_hash` (L21): sha256 of canonical JSON. Leaf module — no `legis` imports. v1 sorted-key; RFC-8785 convergence explicitly deferred (docstring L1–6, ADR-0001). -- `clock.py` (30 lines) — `Clock` Protocol (`now_iso`), `SystemClock` (UTC ISO via `datetime.now(timezone.utc)`), `FixedClock` (deterministic test injection). Production never calls `datetime.now()` directly. 
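The canonicalization settings listed above, together with the `Infinity` case this catalog reports reproducing (`json.loads('{"x": Infinity}')` followed by `content_hash`), can be shown in a minimal sketch. The function bodies are reconstructed from the description only; the exact separators and the hex-digest return type are assumptions.

```python
import hashlib
import json


def canonical_json(payload):
    # Sorted keys, tight separators, non-ASCII kept as-is, NaN/Infinity rejected.
    return json.dumps(
        payload,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        allow_nan=False,
    )


def content_hash(payload):
    # sha256 over the UTF-8 bytes of the canonical form (hex digest assumed).
    return hashlib.sha256(canonical_json(payload).encode("utf-8")).hexdigest()


# Key order does not affect the hash.
order_independent = content_hash({"a": 1, "b": 2}) == content_hash({"b": 2, "a": 1})

# json.loads accepts the non-standard Infinity literal by default.
decoded = json.loads('{"x": Infinity}')

# Hashing the decoded payload raises ValueError because allow_nan=False.
try:
    content_hash(decoded)
    raised = False
except ValueError:
    raised = True
```

The asymmetry is the point: decoding is permissive and encoding is strict, so any caller that hashes stored payloads without catching `ValueError` can crash on a payload that decoded cleanly.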
- -**Dependencies:** -- Inbound (canonical — foundation layer, many edges): - - `store/audit_store.py:35` → `canonical_json, content_hash` - - `enforcement/signing.py:15` → `canonical_json` - - `governance/sei_backfill.py:14` → `content_hash` - - `governance/gaps.py:17` → `content_hash` - - `service/wardline.py:8` → `content_hash` - - `identity/resolver.py:15` → `content_hash` - - `mcp.py:19` → `content_hash` - - `policy/decorator.py:23` → `content_hash` - - `policy/boundary_scan.py:11` → `content_hash` -- Inbound (clock): - - `enforcement/protected.py:16`, `enforcement/engine.py:20`, `enforcement/signoff.py:15` → `Clock` - - `governance/binding_ledger.py:18`, `governance/sei_backfill.py:15` → `Clock` - - `mcp.py:22`, `cli.py:8`, `api/app.py:317`, `api/app.py:372` → `SystemClock` -- Outbound: none (both are leaf modules; stdlib only — `hashlib`, `json`, `datetime`, `typing`). - -**Patterns Observed:** -- Leaf-module discipline: zero intra-`legis` imports, so they sit at the bottom of the dependency DAG (the foundation every hash/HMAC and timestamp resolves to). -- Dependency-injected clock with a deterministic test double (`FixedClock`) — same discipline cited from elspeth. -- Single canonicalization choke point: all content hashing routes through one function, so an RFC-8785 upgrade is a one-file change. - -**Concerns:** -- **M13 — PARTIALLY closed.** `canonical_json` already passes `allow_nan=False` (canonical.py:17), so the specific "no `allow_nan=False`" finding is addressed. The broader M13 — full RFC-8785 hardening — remains open and is explicitly deferred (docstring L3–6, ADR-0001). Until then, canonicalization is not interoperable with elspeth's RFC-8785 form and Unicode/number-edge normalization is not guaranteed. Note `ensure_ascii=False` makes byte-output encoding-dependent; the suite consistently `.encode("utf-8")` (audit_store L50, signing L33), so consistent today but a latent footgun if any caller hashes the str differently. 
-- `clock.py`: no concerns observed (Protocol + two trivial implementations; verified determinism via `FixedClock`). - -**Confidence:** High — read canonical.py:1–22 and clock.py:1–30 in full; confirmed `allow_nan=False` present at L17 (refining the prior M13 wording); enumerated all 9 canonical inbound edges and all clock inbound edges by grep against actual import lines. - ---- - -## Cross-cluster note (HMAC location) - -The HMAC tamper-evidence layer is **not** in this cluster's store — it lives in `src/legis/enforcement/signing.py` (`sign`/`verify`, versioned `hmac-sha256:v2:`, canonical-JSON v1). `BindingLedger` (governance) and protected-verdict writers apply it and persist the signature as an ordinary payload field. The store provides only hash-chaining + append-only triggers. diff --git a/docs/arch-analysis-2026-06-06-0158/temp/catalog-D-service-api.md b/docs/arch-analysis-2026-06-06-0158/temp/catalog-D-service-api.md deleted file mode 100644 index b13d2e1..0000000 --- a/docs/arch-analysis-2026-06-06-0158/temp/catalog-D-service-api.md +++ /dev/null @@ -1,121 +0,0 @@ -# Cluster D — Service Layer + HTTP API - -## Service Layer -**Location:** `src/legis/service/` -**Responsibility:** Transport-agnostic governance business logic — the shared decision/enforcement primitives that the HTTP, MCP, and CLI frontends all route through, raising `ServiceError` subclasses (never `HTTPException`/JSON-RPC) so each adapter owns its own error translation. - -**Key Components:** -- `__init__.py` (47 LOC) — Public re-export surface; defines the contract both adapters import (`evaluate_policy`, `compute_override_rate`, `submit_override`/`submit_protected_override`/`submit_operator_override`, `request_signoff`, `resolve_for_record`, `verified_records`, `explain_policy`, `route_wardline_scan`, error types). 
-- `errors.py` (28 LOC) — Domain exception taxonomy: `ServiceError` base + `AuditIntegrityError` (HTTP 500 / MCP `AUDIT_INTEGRITY_FAILURE`), `NotEnabledError` (gate not wired → 404), `NotFoundError`, `InvalidArgumentError` (→ 422). Adapters switch on type, never message text (`errors.py:8-28`). -- `governance.py` (248 LOC) — Core enforcement wrappers. `resolve_for_record` (`:29`) is the single resolve-then-key boundary (SEI-keyed via Loomweave `IdentityResolver`, locator-keyed standalone, emits `loomweave` extension with alive/content_hash/lineage). `verified_records` (`:63`) is the fail-closed verified-trail read (protected gate owns trail when wired, else simple-tier engine; `verify_integrity()` + `TrailVerifier.verify()` → `AuditIntegrityError` on tamper). `compute_override_rate` (`:95`) binds threshold/window/floor to ADR-0002 `params` constants — NOT caller input. `submit_override` (`:109`) wraps `EnforcementEngine.submit_override` (simple-tier chill/coached). `submit_protected_override` (`:140`) + `submit_operator_override` (`:174`) wrap `ProtectedGate.submit`/`.operator_override`, each gated by `verify_current_source_binding` + `require_verified_source_binding`. `request_signoff` (`:207`) wraps `SignoffGate.request`. `evaluate_policy` (`:230`) wraps `PolicyGrammar.evaluate` and records an `UNKNOWN_POLICY` provenance-gap event when result is UNKNOWN. -- `source_binding.py` (89 LOC) — Current-source fingerprint verification for protected submissions. `verify_current_source_binding` (`:31`) re-hashes the on-disk file under `source_root`, rejecting stale fingerprints (`InvalidArgumentError`) and path escapes (`:24-28`); returns `{status: verified|unverified}`. `require_verified_source_binding` (`:82`) fails closed only for source-shaped (`.py` locator) entities. 
-- `explain.py` (122 LOC) — `explain_policy` (`:57`) maps a policy→cell (chill/coached/structured/protected) into a `PolicyExplanation` (judge_inline, self_clearable, human_in_loop, enabled, available_moves, required_inputs). Pure discovery; drives the MCP `policy_explain` tool. Not consumed by the HTTP API. - -**Dependencies:** -- Inbound: - - `src/legis/api/app.py:43-51` — HTTP adapter imports `compute_override_rate`, `evaluate_policy`, `resolve_for_record`, `submit_override`, `submit_protected_override`, `submit_operator_override`, `verified_records`, `route_wardline_scan`, and the three error types. - - `src/legis/mcp.py:37-53` — MCP adapter imports the error types, `explain_policy`, the governance helpers (`:45`), and `route_wardline_scan` (`:53`). Note: MCP additionally imports `DEFAULT_GOVERNANCE_DB`/`DEFAULT_CHECK_DB` constants *from* `legis.api.app` (`mcp.py:115,496,505`) — an api→service-peer coupling worth flagging. - - `cli.py` does NOT import `legis.service` directly; it launches the HTTP app (`cli.py:270` `legis.api.app:create_app`). CLI reaches the service layer transitively through HTTP, not in-process. 
-- Outbound (all file:line in `service/`): - - `service -> legis.enforcement.engine` (`governance.py:14` EnforcementEngine/EnforcementResult; `explain.py:8`) - - `service -> legis.enforcement.lifecycle` (`governance.py:15` evaluate_override_rate) - - `service -> legis.enforcement.protected` (`governance.py:16` ProtectedGate/ProtectedResult/TamperError) - - `service -> legis.enforcement.signoff` (`governance.py:17`, `wardline.py:10` SignoffGate) - - `service -> legis.governance.params` (`governance.py:18` ADR-0002 rate constants) - - `service -> legis.identity.entity_key` (`governance.py:19`, `wardline.py:11` EntityKey) - - `service -> legis.identity.resolver` (`governance.py:20`, `wardline.py:12` IdentityResolver) - - `service -> legis.policy.grammar` (`governance.py:21` PolicyGrammar/PolicyEvaluation/PolicyResult) - - `service -> legis.policy.cells` (`explain.py:9` PolicyCellRegistry) - - `service -> legis.canonical` (`wardline.py:8` content_hash) - - `service -> legis.wardline.governor` (`wardline.py:14` WardlineCellPolicy/route_findings) - - `service -> legis.wardline.ingest` (`wardline.py:15` verify_wardline_artifact/active_defects/wardline_artifact_fields/WardlineSeverity) - - `service -> legis.wardline.policy` (`wardline.py:21` resolve_cell) - - Internal: `governance.py:22-26` imports `service.errors` + `service.source_binding`; `wardline.py:13` imports `service.governance.resolve_for_record`. - - No outbound dependency on `legis.store` (the engine/gate own their stores); service stays store-agnostic via duck-typed `protected_gate`/`trail_verifier` in `verified_records`. - -**Patterns Observed:** -- Explicit-dependency injection: every helper takes its gates/engine/identity as parameters (no globals, no closures) — `governance.py:1-6` docstring states this as a rule. -- Keyword-only args after the positional gate (`submit_override(engine, *, ...)`) to prevent same-typed field transposition at the call site (`governance.py:126-128`). 
-- Fail-closed verification: `verified_records` and `require_verified_source_binding` raise rather than degrade. -- Policy constants sourced from `governance.params`, not caller input — gate-tuning resistance (`governance.py:98-106`). -- Duck-typing at the enforcement seam to avoid coupling to concrete gate types (`governance.py:77-80`). - -**Concerns:** -- **M1 (source binding can be `unverified` yet still sign a protected record)** — REFINED. `require_verified_source_binding` (`source_binding.py:82-89`) only enforces verification when `_source_path_from_entity` returns non-None, i.e. the locator's pre-`:` segment ends in `.py`. A protected entity whose locator is NOT a `.py` source path (e.g. an opaque SEI or non-`.py` locator) yields `status: unverified` and passes the guard, then `submit_protected_override` (`governance.py:163`) still produces an HMAC-signed protected record carrying `source_binding={status: unverified, reason: "entity is not a Python source locator"}`. Provenance is recorded honestly, but the "current-source must match before signing" invariant only binds `.py`-shaped entities. Confirmed. -- **M2 (provenance gaps)** — `evaluate_policy` records an `UNKNOWN_POLICY` event with `provenance_gap: True` only when grammar returns UNKNOWN (`governance.py:239-247`); writer-supplied `target` facts are otherwise trusted without provenance. The gap-flagging is grammar-driven, not provenance-of-input-driven. -- `explain.py:71` `del entity` — the ratified tool contract accepts `entity` but v1 registry routes by policy only; a no-op parameter that could mislead callers into thinking entity affects routing (documented at `:67-70`). -- Error-type completeness: `NotFoundError` is exported and defined but not raised anywhere in `service/` (only `NotEnabledError`/`InvalidArgumentError`/`AuditIntegrityError` are). Reserved for adapter use. 
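The M1 gap described above can be illustrated with a minimal sketch. It is not the code in `source_binding.py`: the helper bodies, the exception raised on an unverified binding, and the example locators are assumptions made for illustration. Only the rule "the guard binds when the locator's pre-`:` segment ends in `.py`" comes from the analysis.

```python
from typing import Optional


class InvalidArgumentError(Exception):
    """Stand-in for the service-layer error type of the same name."""


def source_path_from_entity(locator: str) -> Optional[str]:
    """Return the pre-':' segment when it looks like a Python source path."""
    path = locator.split(":", 1)[0]
    return path if path.endswith(".py") else None


def require_verified_source_binding(locator: str, binding: dict) -> None:
    """Fail closed only for source-shaped locators (the behaviour M1 describes)."""
    if source_path_from_entity(locator) is None:
        # Non-.py entity: the guard does not apply, so an 'unverified'
        # binding passes through and the record can still be signed.
        return
    if binding.get("status") != "verified":
        # Assumption: the real helper may raise a different error type.
        raise InvalidArgumentError("source binding is not verified")
```

Under these assumptions, a call such as `require_verified_source_binding("opaque-sei-123", {"status": "unverified"})` returns without raising, while the same binding on `"src/pkg/mod.py:func"` is rejected.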
- -**Confidence:** High — read 100% of all 6 service files; cross-validated inbound importers via grep across `src/` (`api/app.py:43-51`, `mcp.py:37-53`, `cli.py:270`) and outbound imports line-by-line. M1/M2 confirmed against `source_binding.py:82-89` and `governance.py:230-248`. - ---- - -## HTTP API -**Location:** `src/legis/api/` -**Responsibility:** The FastAPI application factory (`create_app`) exposing the git/check operating-picture read surfaces plus the mutating governance surfaces (overrides, protected/operator overrides, sign-off, wardline scan routing, binding, closure-gate), enforcing bearer auth with writer/operator scopes and translating `ServiceError` subclasses into HTTP status codes. - -**Key Components:** -- `__init__.py` (1 LOC) — package marker. -- `app.py` (830 LOC) — Single `create_app(...)` factory (`:277`); ~16 keyword DI params (repo_path, check/pull surfaces, enforcement engine, protected/signoff gates, trail_verifier, grammar, identity, filigree, binding_ledger, binding_key, pull sources). Lazy env-driven fallback wiring (`:296-347`): builds `IdentityResolver`, `FiligreeClient`, and — when `LEGIS_HMAC_KEY` is set — `AuditStore`, `TrailVerifier`, `ProtectedGate`, `SignoffGate`, `BindingLedger`. Auth helpers `_token_actor_from_mapping` (`:61`), `_verify_secret` (`:100`), `verify_writer`/`verify_operator` (`:138-143`). Pydantic request models `:150-225`. 
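The "switch on type, never message text" translation that this adapter performs can be sketched as a type-keyed lookup. The class names and the 500/404/422 codes come from the analysis; the lookup table, the `http_status_for` helper, and the 500 default for unmapped errors are assumptions for illustration, not the structure of `app.py`.

```python
class ServiceError(Exception):
    """Stand-in base class for the service-layer taxonomy."""


class AuditIntegrityError(ServiceError):
    pass


class NotEnabledError(ServiceError):
    pass


class NotFoundError(ServiceError):
    pass


class InvalidArgumentError(ServiceError):
    pass


# Status codes named in the analysis. NotFoundError is deliberately left
# unmapped because the analysis does not state its status code.
_STATUS_BY_TYPE = {
    AuditIntegrityError: 500,
    NotEnabledError: 404,
    InvalidArgumentError: 422,
}


def http_status_for(exc: ServiceError) -> int:
    """Pick a status from the exception's type, never from its message."""
    for cls in type(exc).__mro__:
        if cls in _STATUS_BY_TYPE:
            return _STATUS_BY_TYPE[cls]
    # Assumption: unmapped service errors surface as a server error.
    return 500
```

Walking the MRO means a subclass of a mapped error inherits its parent's status, so new error types can be added without touching the adapter.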
- -**Routes table** (METHOD PATH | scope | delegates-to): - -| METHOD PATH | scope | delegates-to | -|---|---|---| -| GET /health | none | inline (`:389`) | -| GET /git/branches | none | `GitSurface.branches` (`:395`) | -| GET /git/commits/{sha} | none | `GitSurface.commit` (`:402`) | -| GET /git/renames | none | `GitSurface.renames` (`:409`) | -| GET /git/rename-feed | none | `git.rename_feed.build_rename_feed` (`:416`) | -| GET /git/pull-requests/{number} | none | `PullRequestSource.get` + `checks().for_pr` (`:432`) | -| POST /git/pulls | **writer** | `PullSurface.record` (`:444`) | -| GET /git/pulls/{number} | none | `PullSurface.get` + `checks().for_pr` (`:452`) | -| POST /checks | **writer** | `CheckSurface.record` (`:464`) | -| GET /checks/commit/{sha} | none | `CheckSurface.for_commit` (`:470`) | -| GET /checks/branch/{name} | none | `CheckSurface.for_branch` (`:474`) | -| GET /checks/pr/{pr} | none | `CheckSurface.for_pr` (`:478`) | -| POST /overrides | **writer** | `service.submit_override` (`:484`) | -| GET /overrides | none | `service.verified_records` (`:522`) | -| POST /protected/overrides | **writer** | `service.submit_protected_override` (`:528`) | -| POST /protected/operator-override | **operator** | `service.submit_operator_override` (`:558`) | -| POST /signoff/request | **writer** | `SignoffGate.request` directly (NOT via `service.request_signoff`) (`:583`) | -| POST /signoff/{request_seq}/bind-issue | **writer** | `governance.bind_signoff_to_issue` (`:597`) | -| GET /signoff/{request_seq}/binding | none | `BindingLedger.get` (`:650`) | -| GET /filigree/issues/{issue_id}/closure-gate | none | `governance.filigree_gate.evaluate_issue_closure` (`:662`) | -| POST /signoff/{request_seq}/sign | **operator** | `SignoffGate.sign_off` directly (`:676`) | -| GET /governance/override-rate | none | `service.compute_override_rate` + `verified_records` (`:687`) | -| GET /governance/identity-gaps | none | `governance.gaps.find_orphan_gaps` + `verified_records` 
(`:704`) | -| GET /governance/lineage-integrity | none | `governance.gaps.find_lineage_integrity` (`:711`) | -| POST /policy/evaluate | **writer** | `service.evaluate_policy` (`:733`) | -| POST /wardline/scan-results | **writer** | `service.route_wardline_scan` (`:750`) | - -**Dependencies:** -- Inbound: - - `src/legis/cli.py:270` — `legis serve` launches `legis.api.app:create_app` via uvicorn (factory=True). CLI is the only in-process caller; it is a *launcher*, not a consumer. - - `src/legis/mcp.py:115,496,505` — imports the `DEFAULT_GOVERNANCE_DB`/`DEFAULT_CHECK_DB` constants from `api.app` (constant reuse, not a runtime call). Flag: a sibling adapter depending on the HTTP adapter's module for shared defaults. -- Outbound (file:line in `app.py`): - - `api -> legis.service.*` — `:43` errors, `:44-50` governance helpers, `:51` `route_wardline_scan` (primary business-logic seam). - - `api -> legis.enforcement.engine` (`:31`), `legis.enforcement.protected` (`:32` ProtectedGate/TamperError/TrailVerifier), `legis.enforcement.signoff` (`:33` SignoffGate) — **direct reach-through**: the API constructs and calls these gates directly for sign-off (`:588`,`:680`) and trail verification (`:605-618`). - - `api -> legis.checks.{models,surface}` (`:29-30`), `legis.pulls.{models,surface}` (`:53-54`), `legis.git.{pull_request,rename_feed,surface}` (`:34-36`). - - `api -> legis.governance.*` — `gaps` (`:37`), `binding_ledger` (`:39`), `signoff_binding` (`:40` bind_signoff_to_issue), `filigree_gate` (lazy `:664`). - - `api -> legis.filigree.client` (`:38`), `legis.identity.{entity_key,resolver}` (`:41-42`), `legis.policy.grammar` (`:52`), `legis.wardline.{governor,ingest}` (`:55-56`). - - `api -> legis.store.audit_store` (lazy `:318,373`), `legis.clock.SystemClock` (lazy `:317,372`), `legis.enforcement.judge_factory` (lazy `:333`). 
- -**Patterns Observed:** -- Application factory with exhaustive DI and lazy env-fallback construction; a no-arg app creates no state until a route needing a store is hit (`:358-384` lazy `checks()`/`pulls()`/`engine()`/`grammar_()`). -- Adapter error-translation: `NotEnabledError → 404`, `InvalidArgumentError → 422`, `AuditIntegrityError → 500`, `WardlinePayloadError → 422`, gate `ValueError → 409` (`:544-547`, `:824-827`, `:519-520`). -- ACCEPTED/BLOCKED → 201/409 status mapping so agents get the judge rationale either way (`:502-512`). -- Server-owned authority: override-rate constants, wardline routing cell, and the recorded actor are server-decided, not caller-supplied. -- Scope-gated dependencies via FastAPI `Depends(verify_writer|verify_operator)` — but the writer/operator split is enforced only in `LEGIS_API_TOKEN_ACTORS` mode; single-secret mode collapses both to one credential (see Concerns H7-adjacent). - -**Concerns:** -- **C2/H1 (server-owned wardline routing + artifact HMAC) — HTTP is the reference and now has PARITY with MCP.** HTTP enforces: server routing wins and forbids caller routing fields (`:757-760` → 403); when no server routing, caller routing requires the unsafe escape hatch `LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING=1` (`:761-766` → 403); artifact HMAC via `LEGIS_WARDLINE_ARTIFACT_KEY` (`:818-822`, verified in `wardline.py:36` `verify_wardline_artifact`). CROSS-CHECK (HTTP-authoritative; MCP is another cluster's read): verification itself lives in the shared `route_wardline_scan` (`wardline.py:36`), so any caller of the seam gets artifact HMAC. A grep of `mcp.py:863-928` SUGGESTS MCP now mirrors all three (server_cell/server_routing gate, same `LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING` escape hatch, same artifact_key plumbing) — but this is a grep, not a full read of that cluster. Synthesis owns confirming the prior MCP-skips-this gap is actually closed; do not treat it as closed on my word. 
-- **H7 (unscoped API token entries grant operator authority) — REFINED/MITIGATED.** `_token_actor_from_mapping` (`:80-91`): a `LEGIS_API_TOKEN_ACTORS` entry with NO `:scope` segment is now REJECTED with 403 (`:82-86`) UNLESS `LEGIS_ALLOW_UNSCOPED_API_TOKENS=1` is set. With that flag, an unscoped entry returns the actor for ANY `required_scope` (the `if scope_sep and required_scope not in scopes` check at `:87` is skipped when `scope_sep` is falsy) — so an unscoped token still grants operator authority, but only behind an explicit opt-in flag. Residual risk gated by env opt-in. Confirmed. -- **H7-adjacent (single-secret mode has NO scope split — same vulnerability class, more common deployment).** The `LEGIS_API_SECRET` branch of `_verify_secret` (`:108-116`) returns `LEGIS_API_ACTOR`/default actor on a `compare_digest` match WITHOUT ever consulting `required_scope`. So when a deployment uses a single shared secret (no `LEGIS_API_TOKEN_ACTORS` mapping), `verify_operator` (required_scope=`operator`, `:142`) and `verify_writer` (required_scope=`writer`, `:138`) are satisfied by the *same* token — the operator-only routes (`POST /protected/operator-override`, `POST /signoff/{seq}/sign`) are reachable by any holder of the writer secret. The writer/operator scope split is therefore a real control ONLY in TOKEN_ACTORS mode; in single-secret mode it is vacuous and the secret grants operator authority. Confirmed against `:104-116`. -- **M1 surfaces here** — `POST /protected/overrides` (`:528`) and `POST /protected/operator-override` (`:558`) pass `source_root` to the service, but non-`.py` entities still produce signed records with `source_binding: unverified` (see Service-layer M1). The HTTP layer adds no extra guard beyond the service helper. 
-- **M2 surfaces here** — `POST /checks` (`:464`), `POST /git/pulls` (`:444`), and `POST /policy/evaluate` (`:733`) accept writer-supplied facts (CheckRun outcome, PR state, policy target) with `recorded_by=actor` provenance but no fact-provenance attestation; a writer can record arbitrary check/PR outcomes. -- **Drift signal — sign-off bypasses the service seam.** `POST /signoff/request` (`:588`) and `POST /signoff/{seq}/sign` (`:680`) call `SignoffGate.request`/`.sign_off` directly rather than `service.request_signoff` (which exists and is exported, `__init__.py:42`). The bind-issue trail-verification block (`:605-618`) also re-implements the `verified_records` tamper-check pattern inline instead of reusing the service helper. This is the same class of HTTP↔service divergence the audit watches for — here the HTTP adapter reaches past its own service layer. -- Unauthenticated read surfaces (`GET /overrides`, `/governance/*`, `/signoff/{seq}/binding`) expose governance trail/binding data with no scope; acceptable for an operating-picture read API but worth noting governance records are readable by any client. -- `LEGIS_UNSAFE_DEV_AUTH=1` (`:130-131`,`:117`) bypasses auth entirely when no secret/token is configured — fail-open dev path; the default with nothing configured is 401 (`:119-123`), so this is opt-in. - -**Confidence:** High — read 100% of `app.py` (830 LOC) and enumerated every `@app.` decorator with its `Depends`/scope and delegate. Auth logic (`:61-143`) and wardline routing (`:750-828`) read in full. H7/C2/H1 cross-validated against `mcp.py:863-928` and `wardline.py:36`. Inbound importers confirmed via grep. 
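The difference between the two auth modes in H7 and H7-adjacent can be sketched as follows. The `actor` or `actor:scope1,scope2` entry format, the `Forbidden` exception, and both function names are hypothetical; only the control flow (unscoped entries rejected unless opted in, single-secret mode never consulting the required scope) follows the analysis.

```python
import hmac


class Forbidden(Exception):
    """Stand-in for the adapter's 403 response."""


def actor_from_mapping(entry: str, required_scope: str, allow_unscoped: bool) -> str:
    """Token-actors mode: scopes are checked when the entry carries them."""
    actor, scope_sep, scope_text = entry.partition(":")
    if not scope_sep and not allow_unscoped:
        raise Forbidden("unscoped token entry")
    scopes = set(scope_text.split(",")) if scope_sep else set()
    if scope_sep and required_scope not in scopes:
        raise Forbidden("token lacks the required scope")
    # An unscoped entry that reaches this point satisfies ANY required scope.
    return actor


def actor_from_single_secret(presented: str, secret: str, required_scope: str,
                             actor: str = "api") -> str:
    """Single-secret mode: required_scope is accepted but never consulted."""
    if not hmac.compare_digest(presented.encode(), secret.encode()):
        raise Forbidden("bad secret")
    return actor
```

In this sketch a writer-scoped entry cannot satisfy `operator`, whereas the single shared secret satisfies both scopes, which is the vacuous split the concern describes.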
diff --git a/docs/arch-analysis-2026-06-06-0158/temp/catalog-E-frontends.md b/docs/arch-analysis-2026-06-06-0158/temp/catalog-E-frontends.md deleted file mode 100644 index 1b65931..0000000 --- a/docs/arch-analysis-2026-06-06-0158/temp/catalog-E-frontends.md +++ /dev/null @@ -1,138 +0,0 @@ -# Cluster E — Agent/CLI Frontends - -Two of the three Legis frontends. The HTTP API (`api/app.py`) is the third, -covered by another explorer. All three are *supposed* to route governance -decisions through the transport-agnostic `service/` layer. - ---- - -## CLI Frontend - -**Location:** `src/legis/cli.py` (~161 stmts), `src/legis/__init__.py` - -**Responsibility:** Provides the `legis` console script — an argparse dispatcher that runs the HTTP server, launches the MCP stdio server, executes governance CI gates (override-rate, policy-boundary), and runs the SEI backfill — wiring CLI flags into the environment variables the frontends read. - -**Key Components:** -- `cli.py:build_parser` (32–143) — declares six subcommands: `serve`, `mcp`, `check-override-rate`, `governance-gate`, `sei-backfill`, `policy-boundary-check`. - - `serve` (36–63, dispatch 254–271) — sets `LEGIS_*`/`LOOMWEAVE_API_URL`/`FILIGREE_API_URL` env from flags, then `uvicorn.run("legis.api.app:create_app", factory=True)`. - - `mcp` (65–87, dispatch 287–303) — requires `--agent-id`, sets env, then calls `legis.mcp.main(agent_id)`. This is the launch-bound identity boundary for the MCP server. - - `check-override-rate` / `governance-gate` (91–106, dispatch 273–274) — both route to `_check_override_rate`; exit 1 on FAIL for CI. - - `sei-backfill` (107–130, dispatch 276–285) — resolves legacy locator-keyed records through Loomweave batch resolve (dry-run unless `--execute`). - - `policy-boundary-check` (132–141, dispatch 305–314) — fails when `@policy_boundary` metadata lacks current behavioural evidence; text or json output. -- `cli.py:_check_override_rate` (170–244) — the override-rate CI gate. 
**Reads the audit store directly** (`AuditStore(db_url).read_all()`, 194/199), inlines its own protected-record detection (`_requires_protected_verification`, 206–215), builds its own `TrailVerifier` and calls `verify()` (228–231), then `evaluate_override_rate` (236). Fail-closed on missing DB under CI (177–192) and on protected records without `LEGIS_HMAC_KEY` (220–226). -- `cli.py:_apply_judge_env` (159–167) — maps `--judge-*` flags onto `LEGIS_JUDGE_*` env for both `serve` and `mcp`. -- `__init__.py` (3) — `__version__ = "1.0.0rc2"`; consumed by `mcp.py` serverInfo. - -**Dependencies:** -- Inbound: console-script entry point (`legis = legis.cli:main`); top-level operator/CI invocation. No in-tree importers. -- Outbound (module-level + dispatch-time): - - `cli -> uvicorn` (`cli.py:6`, run target at 270) - - `cli -> legis.api.app:create_app` (`cli.py:270`, sibling frontend, by factory string) - - `cli -> legis.mcp.main` (`cli.py:301`, sibling frontend — CLI launches the MCP server) - - `cli -> legis.clock.SystemClock` (`cli.py:8`) - - `cli -> legis.governance.sei_backfill.run_pre_sei_backfill` (`cli.py:9`) - - `cli -> legis.identity.loomweave_client` (`cli.py:10`) - - `cli -> legis.policy.boundary_scan.scan_policy_boundaries` (`cli.py:11`) - - `cli -> legis.store.audit_store.AuditStore` (`cli.py:12`, also 194) - - `cli -> legis.enforcement.lifecycle` (GateStatus, evaluate_override_rate) (`cli.py:172`) - - `cli -> legis.governance.params` (`cli.py:173`) - - `cli -> legis.enforcement.protected` (TrailVerifier, TamperError) (`cli.py:228`) - - `cli -> legis.service.*` — **NONE** (verified: `grep legis.service src/legis/cli.py` → 0 hits). - -**Patterns Observed:** -- Env-var seam: every subcommand translates flags into `LEGIS_*` env vars, then defers to a frontend/service that re-reads env. Flags never pass through function arguments to the server, so server and CLI share one configuration surface. 
-- Lazy local imports inside dispatch branches (`enforcement.lifecycle`, `enforcement.protected`, `legis.mcp`) keep import cost and store side-effects off subcommands that do not use them. -- Fail-closed CI posture: missing DB, integrity-chain failure, and unverifiable protected records all return exit 1 (guarded by `CI=true` / `LEGIS_ALLOW_MISSING_GOVERNANCE_DB`). - -**Concerns:** -- **Service-layer bypass (adapter drift, CLI side).** `_check_override_rate` (170–244) routes through *no* `service.*` function. It hand-rolls a parallel copy of `service.verified_records` (store read + `TrailVerifier.verify`, 199/228–231) and of `service.compute_override_rate` (inline `evaluate_override_rate` with the `params.*` constants, 236–241). MCP's `override_rate_get` (mcp.py:1023) *does* go through `service.compute_override_rate(_verified_records(...))`. So the CLI and MCP read the same gate two different ways. This duplication already forced a divergent fix: commit `07cf54e "fix(cli): fail closed on protected override-rate trails"` patched the CLI's inline protected-verification path alone. Recommend collapsing `_check_override_rate` onto `service.verified_records` + `service.compute_override_rate`. -- `import os` appears inside dispatch branches (255, 288) and helpers (89, 160, 171) rather than at module top; harmless but inconsistent. -- No structured logging/observability around gate outcomes; results are `print`-only. - -**Confidence:** High — Read cli.py in full (318 lines) and `__init__.py` in full. Verified the service-bypass claim with `grep legis.service src/legis/cli.py` (0 hits) and cross-checked the MCP counterpart at mcp.py:1023. Every dependency edge is a literal import statement cited by line. Cross-referenced commit `07cf54e` to confirm the duplication already drove a CLI-only fix. 
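The fail-closed verified read that the CLI duplicates might look roughly like the sketch below if it lived in one shared helper. The three-argument shape mirrors the `verified_records(protected_gate, trail_verifier, lambda: [])` call quoted in this analysis; the gate's `records()` method, the `DemoGate`/`DemoVerifier` doubles, and failing closed on a missing verifier are assumptions, not the real service code.

```python
from typing import Callable, List


class TamperError(Exception):
    """Stand-in for the enforcement-layer tamper signal."""


class AuditIntegrityError(Exception):
    """Stand-in for the service-layer integrity error."""


def verified_records(protected_gate, trail_verifier,
                     read_simple: Callable[[], List[dict]]) -> List[dict]:
    """Return the trail only if it verifies; raise instead of degrading."""
    if protected_gate is None:
        # No protected gate wired: the simple-tier reader owns the trail.
        return list(read_simple())
    if trail_verifier is None:
        # Assumption: a gate without a verifier is treated as unverifiable.
        raise AuditIntegrityError("protected trail has no verifier")
    records = list(protected_gate.records())  # hypothetical accessor
    try:
        trail_verifier.verify(records)
    except TamperError as exc:
        raise AuditIntegrityError("protected trail failed verification") from exc
    return records


class DemoGate:
    """Test double that exposes a fixed trail."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows

    def records(self) -> List[dict]:
        return list(self._rows)


class DemoVerifier:
    """Test double that flags any record marked as tampered."""

    def verify(self, records: List[dict]) -> None:
        if any(r.get("tampered") for r in records):
            raise TamperError("signature mismatch")
```

With a helper of this shape, both the CLI gate and the MCP tool could call the same function and receive the same fail-closed behaviour, so a future fix would land in one place.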
- ---- - -## MCP Server Frontend - -**Location:** `src/legis/mcp.py` (~464 stmts — the largest module in the cluster) - -**Responsibility:** A stdlib-only, hand-rolled MCP-over-stdio JSON-RPC server (protocols `2024-11-05` / `2025-03-26`) that exposes Legis governance + git/CI read tools to agents under a launch-bound `agent_id`, mapping each tool call onto the transport-agnostic `service/` layer (or, for read surfaces, directly onto the owning surface). - -**Key Components:** -- `McpRuntime` dataclass (81–98) — per-launch state: `agent_id`, lazily-built engine/gates/surfaces, `trail_verifier`, `wardline_artifact_key`, `binding_ledger`. -- `build_runtime` (114–173) — wires gates only when `LEGIS_HMAC_KEY` is present: `TrailVerifier`, `ProtectedGate`, `SignoffGate`, and `BindingLedger` are all constructed together under the same key (133–152), so there is no "gate without verifier" hole. -- `tool_definitions` (185–307) — JSON schemas; every schema is built via `_schema` (176–182) with `additionalProperties: False`. -- `call_tool` (676–1036) — the dispatch table. Begins with `_validate_argument_keys` (678). -- `handle_request` / `run_jsonrpc` / `main` (1039–1123) — JSON-RPC framing, `initialize` gating, protocol negotiation. - -**MCP tools and their routing (Task #1):** - -| Tool | Routes through `service/`? 
| Target | -|------|---------------------------|--------| -| `policy_explain` | service | `service.explain.explain_policy` (680) | -| `override_submit` | service | `service.governance.submit_override` / `submit_protected_override` / `request_signoff` (743/771/808) | -| `policy_evaluate` | service | `service.governance.evaluate_policy` (848) | -| `scan_route` | service | `service.wardline.route_wardline_scan` (916) | -| `override_rate_get` | service | `service.governance.compute_override_rate` over `_verified_records` (1023–1024) | -| `signoff_status_get` | **direct** | `runtime.signoff_gate` (`enforcement.signoff`) — `request_record`/`is_cleared` (831–845) | -| `filigree_closure_gate_get` | **direct** | `governance.filigree_gate.evaluate_issue_closure` over `binding_ledger` (968–975) | -| `git_branch_list` / `git_commit_get` / `git_rename_list` | **direct** | `git.surface.GitSurface` (936–954) | -| `git_rename_feed_get` | **direct** | `git.rename_feed.build_rename_feed` (956–966) | -| `pull_request_get` | **direct** | `pulls.surface.PullSurface` (+ `checks.surface`) (977–990) | -| `check_list` | **direct** | `checks.surface.CheckSurface` (992–1021) | - -The five governance-decision tools all route through `service/`. The read/poll surfaces (`signoff_status_get`, `filigree_closure_gate_get`, `git_*`, `pull_request_get`, `check_list`) reach their owning surface directly — consistent with the HTTP adapter, which does the same for read surfaces. - -**Dependencies:** -- Inbound: `legis.cli` only (`cli.py:301 from legis.mcp import main`). The MCP server is launched exclusively by the CLI's `mcp` subcommand. -- Outbound (module-level unless noted): - - `mcp -> legis.api.app` — **sibling-frontend coupling.** Imports `DEFAULT_GOVERNANCE_DB` (`mcp.py:115`, `mcp.py:496`) and `DEFAULT_CHECK_DB` (`mcp.py:505`) from the *HTTP adapter* module for default DB URLs. (See Concerns.) 
- - `mcp -> legis.service.governance` (compute_override_rate, evaluate_policy, submit_override, submit_protected_override, request_signoff, verified_records) (`mcp.py:45`) - - `mcp -> legis.service.wardline.route_wardline_scan` (`mcp.py:53`) - - `mcp -> legis.service.explain.explain_policy` (`mcp.py:44`) - - `mcp -> legis.service.errors` (`mcp.py:37`) - - `mcp -> legis.enforcement.engine.EnforcementEngine` (`mcp.py:23`, 499) - - `mcp -> legis.enforcement.protected` (ProtectedGate, TrailVerifier, TamperError) (`mcp.py:25`) - - `mcp -> legis.enforcement.signoff.SignoffGate` (`mcp.py:26`) - - `mcp -> legis.enforcement.judge_factory.build_judge_from_env` (`mcp.py:24`) - - `mcp -> legis.enforcement.verdict` (SignoffState, Verdict) (`mcp.py:27`) - - `mcp -> legis.governance.binding_ledger` (BindingError; BindingLedger lazy at 146) (`mcp.py:29`) - - `mcp -> legis.governance.filigree_gate.evaluate_issue_closure` (lazy, `mcp.py:969`) - - `mcp -> legis.policy.cells` / `legis.policy.grammar` (`mcp.py:30–35`) - - `mcp -> legis.wardline.governor` / `legis.wardline.ingest` (`mcp.py:55–56`) - - `mcp -> legis.git.surface.GitSurface`, `legis.git.rename_feed.build_rename_feed` (`mcp.py:28`, lazy 957) - - `mcp -> legis.pulls.surface.PullSurface`, `legis.checks.surface.CheckSurface`, `legis.checks.models.CheckRun` (`mcp.py:36/20/21`) - - `mcp -> legis.store.audit_store.AuditStore` (`mcp.py:54`) - - `mcp -> legis.identity.*` (lazy in build_runtime, `mcp.py:122`) - - `mcp -> legis.canonical.content_hash` (`mcp.py:19`) - -**Patterns Observed:** -- Service-routing for decisions, direct-surface for reads (table above). Governance writes always cross the `service/` seam; cheap reads do not. -- Launch-bound identity: `agent_id` is supplied once at process start; tool schemas never accept actor identity (module docstring 1–7, enforced because every `submit_*` call passes `agent_id=runtime.agent_id`). 
-- Lazy resource construction (`_engine`/`_checks`/`_pulls`/`_git`, 486–518) so a protected-only deployment never initialises the simple-tier store. -- Discriminated outcome envelopes + structured recovery hints (`_tool_error` / `_recovery_for`, 317–345); per-cell payload shapers (`_judged_result_payload`, 532–559). -- Idempotency-replay machinery: request-hash binding + recorded-outcome replay (`_override_idempotency_request_hash` 562–583, `_existing_idempotent_record` 586–598, `_idempotent_override_response` 601–631). - -**Concerns:** - -*Adapter-drift audit verdicts (against current source — most important output):* - -- **C2 — RESOLVED.** MCP `scan_route` no longer blindly honors caller-chosen `cell`/`severity_map`/`fail_on`. The handler reads server routing from `LEGIS_WARDLINE_CELL` / `LEGIS_WARDLINE_CELL_BY_SEVERITY` (863–864) and, when server routing is configured, rejects any caller-supplied `cell`/`severity_map`/`fail_on` with `INVALID_CELL_SPEC` (872–876). Caller-chosen routing is only reachable behind the `LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING=1` escape hatch (878–894). This mirrors the HTTP handler `app.py:752–777` line-for-line. *Caveat:* the bypass is closed **behaviorally in `call_tool`**, not at the schema — the `scan_route` inputSchema still advertises `cell`/`severity_map`/`fail_on` as accepted properties (241–249), and the M9 key-validator therefore lets them through to the runtime guard. The guard, not the schema, is what enforces server-owned routing. - -- **C3 — RESOLVED.** Protected-trail reads now go through the HMAC `TrailVerifier`. `_verified_records` (649–673), when `protected_gate` is wired, delegates to `service.governance.verified_records(protected_gate, trail_verifier, lambda: [])` (651), which calls `trail_verifier.verify(records)` and raises `AuditIntegrityError` on `TamperError` (service/governance.py:86–90). 
`build_runtime` always constructs `trail_verifier` together with `protected_gate` under the same key (141–143), so there is no "gate set, verifier None" gap. The unkeyed-hash-chain-only read path is gone. - -- **H1 — RESOLVED.** MCP now passes the configured Wardline artifact key into routing. `scan_route` supplies `artifact_key=runtime.wardline_artifact_key or os.environ["LEGIS_WARDLINE_ARTIFACT_KEY"]` (925–932); `route_wardline_scan` calls `verify_wardline_artifact(scan, artifact_key)` (service/wardline.py:36), which, when a key is present, *requires* signed scanner/rule-set/commit/tree provenance and a verifying `artifact_signature`, raising `WardlinePayloadError` otherwise (ingest.py:86–107). Matches the HTTP path (app.py:818–822). - -- **M9 — RESOLVED.** Schemas claim `additionalProperties:false` (`_schema`, 179) *and* dispatch enforces it. `call_tool` calls `_validate_argument_keys(name, args)` as its first action (678); that helper diffs supplied keys against the schema's declared properties and raises `InvalidArgumentError("unexpected argument(s) …")` for any extra (375–382). Unknown keys are now rejected rather than silently ignored. - -- **M10 — RESOLVED.** The handle/seq type contract is now internally consistent. `override_submit` returns `poll_handle: signoff.seq` (791) where `SignoffResult.seq: int` (enforcement/signoff.py:25), and `signoff_status_get` declares `seq` as `{"type":"integer"}` (224 via the shared `integer` schema, 187). The reader `_require_int` (413–426) additionally tolerates an integer-valued *string*, so a caller round-tripping the int handle (or a stringified copy) both validate. No int-vs-string mismatch remains. - -- **M11 — RESOLVED.** `override_submit` now has idempotency protection (commit `b4285dc "fix: scope MCP idempotency replays"`, mcp.py +57 lines). 
When an `idempotency_key` is supplied, the handler computes a request hash binding agent/policy/entity/rationale/cell/fingerprint/ast_path (562–583), looks for a prior record with the same key (734–741), replays the recorded outcome on match (`_idempotent_override_response`, 601–631), and raises `InvalidArgumentError` if the same key is reused for a *different* request (595–597). Replay lookups read the verified trail (`_verified_records`, 589), so the protection is fail-closed against tampering. - -*Non-drift concerns:* -- **Sibling-frontend coupling.** MCP imports DB-default constants (`DEFAULT_GOVERNANCE_DB`, `DEFAULT_CHECK_DB`) from `legis.api.app` (115/496/505) — the HTTP adapter. Two peer frontends should not depend on each other for shared configuration; these constants belong in a shared config/store module. Architecturally the cleanest single coupling to break in this cluster. -- Hand-rolled JSON-RPC framing (`run_jsonrpc`, 1101–1118) with no message-size bound on a stdin line; acceptable for launch-bound local stdio but worth noting. -- `call_tool`, the bulk of the ~464-stmt module, is a single long if/elif dispatch (676–1034); readable but a candidate for table-driven dispatch as the tool count grows. - -**Confidence:** High — Read mcp.py in full (1123 lines). Each adapter-drift verdict was cross-validated against the actual enforcement target: C2 against the HTTP handler (app.py:752–777); C3 against `service/governance.py:81–91`; H1 against `service/wardline.py:36` + `wardline/ingest.py:67–107`; M10 against `enforcement/signoff.py:25`; M11 against commit `b4285dc` (`git show --stat`). Tool-routing table built by reading every dispatch branch. The `api.app` coupling confirmed with `grep "from legis.api" src/legis/mcp.py`. 
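The idempotency-replay behaviour described under M11 can be sketched in a few lines. The bound field names follow the list in the analysis; hashing with SHA-256 over canonical JSON, the in-memory `trail` list, the record layout, and the `submit_idempotent` helper are assumptions for illustration and are not the functions in `mcp.py`.

```python
import hashlib
import json
from typing import Callable, List


class InvalidArgumentError(Exception):
    """Stand-in for the service-layer error type of the same name."""


# Fields the analysis says are bound into the request hash.
BOUND_FIELDS = ("agent_id", "policy", "entity", "rationale",
                "cell", "fingerprint", "ast_path")


def request_hash(request: dict) -> str:
    """Hash only the bound fields, in a key-order-independent encoding."""
    bound = {name: request.get(name) for name in BOUND_FIELDS}
    payload = json.dumps(bound, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def submit_idempotent(trail: List[dict], key: str, request: dict,
                      decide: Callable[[dict], dict]) -> dict:
    """Replay a recorded outcome, or decide once and record it."""
    digest = request_hash(request)
    for record in trail:
        if record["idempotency_key"] == key:
            if record["request_hash"] != digest:
                raise InvalidArgumentError(
                    "idempotency_key reused for a different request")
            return record["outcome"]  # replay, no second decision
    outcome = decide(request)
    trail.append({"idempotency_key": key,
                  "request_hash": digest,
                  "outcome": outcome})
    return outcome
```

Binding the key to a hash of the request is what lets the handler distinguish a genuine retry from a key collision. In the real server the lookup reads the verified trail rather than a plain list.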
diff --git a/docs/arch-analysis-2026-06-06-0158/temp/catalog-F-integrations.md b/docs/arch-analysis-2026-06-06-0158/temp/catalog-F-integrations.md deleted file mode 100644 index 37dc775..0000000 --- a/docs/arch-analysis-2026-06-06-0158/temp/catalog-F-integrations.md +++ /dev/null @@ -1,207 +0,0 @@ -# Catalog F — Suite Integrations & Git/CI Domain - -Cluster F covers the suite-seam integrations (Legis ↔ Loomweave / Wardline / -Filigree) plus the git and CI/PR domain surfaces. Read 100% of all 21 source -files in the six packages. Dependency edges grepped exhaustively across `src/`. - ---- - -## Identity (SEI) - -**Location:** `src/legis/identity/` - -**Responsibility:** Resolve a code locator to an SEI-keyed (or honestly-degraded, locator-keyed) opaque `EntityKey` by consuming Loomweave's SEI HTTP surfaces, never parsing the SEI and never guessing. - -**Key Components:** -- `entity_key.py` (40 lines) — `EntityKey` frozen dataclass: `value` (opaque locator or SEI) + `identity_stable` (False for locator, True for SEI). Factories `from_locator`/`from_sei`; `to_dict`/`from_dict`. `from_dict` (lines 34-40) validates `value` is a non-empty str and `identity_stable` is a `bool`, raising `ValueError` otherwise. -- `resolver.py` (96 lines) — `IdentityResolver.resolve(locator)` → `IdentityResolution` (entity_key, alive, content_hash, lineage_snapshot, two status strings). Probes capability once per instance (line 33, 40-48); on capability absent / no client / not-alive locator / non-dict response / transport exception, returns a locator-keyed degraded resolution. On a stable alive SEI, captures the REQ-L-01 lineage snapshot `{length, hash}` (lines 50-55). -- `loomweave_client.py` (219 lines) — HTTP transport seam. `LoomweaveIdentity` Protocol (capability/resolve_locator/resolve_batch/resolve_sei/lineage); `HttpLoomweaveIdentity` over stdlib `urllib` with injectable `fetch`. 
HMAC request signing (`sign_loomweave_request`, lines 67-87) emits `X-Weft-Component: loomweave:` + `X-Weft-Timestamp` + `X-Weft-Nonce` on protected (signed) routes; capability probe is unsigned (line 185). Base-URL validation requires HTTPS unless loopback (lines 143-150); 1 MB response cap; JSON-content-type enforcement. - -**Dependencies:** -- Inbound (heavily consumed foundation — 14 edges): - - `api/app.py:41` (`entity_key.EntityKey`), `:42` (`resolver.IdentityResolver`), `:299-300` (lazy `HttpLoomweaveIdentity`+`loomweave_hmac_key_from_env`, `IdentityResolver`) - - `cli.py:10` (`HttpLoomweaveIdentity`, `loomweave_hmac_key_from_env`) - - `mcp.py:122-123` (lazy `HttpLoomweaveIdentity`+key, `IdentityResolver`) - - `enforcement/engine.py:23`, `enforcement/lifecycle.py:17`, `enforcement/protected.py:21`, `enforcement/signoff.py:18` (all `entity_key.EntityKey`) - - `governance/binding_ledger.py:20` (`EntityKey`), `governance/gaps.py:18` (`LoomweaveIdentity`), `governance/sei_backfill.py:16-17` (`LoomweaveIdentity`, `EntityKey`), `governance/signoff_binding.py:23` (`EntityKey`) - - `records/override_record.py:14` (`EntityKey`) - - `service/governance.py:19-20` (`EntityKey`, `IdentityResolver`), `service/wardline.py:11-12` (`EntityKey`, `IdentityResolver`) - - `wardline/governor.py:35` (`EntityKey` type only) -- Outbound: `identity/resolver.py:15 → legis.canonical.content_hash` (lineage snapshot hashing). No other non-cluster outbound. `loomweave_client.py` and `entity_key.py` import only stdlib. - -**Patterns Observed:** -- SEI opacity discipline — `value` never parsed by legis; locator→SEI is a value change with no schema change (entity_key.py docstring). -- Honest degradation — every non-stable path returns `identity_stable=False` with an explicit status string; `alive` distinguishes `False` (known not-alive) from `None` (no capability/decision). 
-- Capability probed once per resolver instance, but a probe exception transiently degrades without caching (resolver.py:44-48), permitting retry on next resolve. -- Transport seam injectable (`fetch`) for offline tests; stdlib-only, no added dependency. - -**Concerns:** -- **M5 not reproduced (prior audit claim does not match current source).** `EntityKey.from_dict` (entity_key.py:38-39) rejects a non-`bool` `identity_stable` with `ValueError` rather than coercing malformed stability to `True`. Grep for any constructor bypassing the factories/`from_dict` (`EntityKey(` minus `from_*`) returns nothing — no path reconstructs an `EntityKey` while skipping validation. The malformed-stability-coerces-true defect is closed in the current tree. -- Capability cache is per-instance and never invalidated once `True` is latched (resolver.py:42-48): a Loomweave that loses the `sei` capability mid-life keeps being treated as capable by a long-lived resolver until a later call raises. Low severity (capability rarely revoked), but worth noting for long-lived service resolvers. -- `content_hash` field on a stable resolution is taken verbatim from the Loomweave response (`res.get("content_hash")`, resolver.py:92) with no type check (unlike `sei`). - -**Confidence:** High — read all 4 files (entity_key, resolver, loomweave_client, `__init__`) at 100%; cross-verified the 14 inbound edges by grep with file:line; ran the M5 bypass grep (clean). HMAC/degradation paths traced line-by-line. - ---- - -## Wardline Integration - -**Location:** `src/legis/wardline/` - -**Responsibility:** Ingest an agent-supplied Wardline MCP scan response, validate its shape, select the active-defect gate population, and route each finding into a configured 2×2 governance cell (surface+override / block+escalate / surface+only) — Wardline analyses, legis governs. - -**Key Components:** -- `ingest.py` (226 lines) — payload validation. `WardlineSeverity` (CRITICAL…NONE, ranked). 
`WardlineFinding.from_wire` validates required fields, severity enum, non-empty strings, optional `qualname`; carries `properties` **verbatim** (write-only evidence, tier-conformance deliberately NOT enforced — comment lines 142-145). `active_defects` selects `kind == "defect"` + `suppressed == "active"`; agent-suppressed states (`waived`/`suppressed`) require suppression proof (top-level or nested in `properties`), non-agent states (`baselined`/`judged`) are silently excluded, any other state rejected. `MAX_FINDINGS = 500` batch cap. `verify_wardline_artifact` optionally HMAC-verifies scanner/rule-set/commit/tree provenance when an `artifact_key` is configured; without a key it records supplied metadata as `artifact_status: "unverified"`. -- `governor.py` (142 lines) — `route_findings`. Requires exactly one of `policy` (whole-scan single cell) or `cell_map` (per-severity, every present severity must be mapped). Pre-write validation guard (lines 59-89) confirms engine/signoff presence and **rejects** any batch whose cells span block_escalate AND a surface_* cell (lines 86-89). Each finding resolves its entity via injected `resolve(qualname)` callable, builds a `wardline` extension (fingerprint, properties verbatim, severity, batch_provenance) merged with the loomweave lineage ext, and dispatches to `signoff.request` / `engine.submit_override` / `engine.record_event`. -- `policy.py` (17 lines) — `resolve_cell`: severity ≥ `fail_on` → `gate_cell`, else `SURFACE_ONLY`. 
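The threshold rule attributed to `policy.py` can be sketched as below. This is an illustrative reconstruction, not the legis source: the intermediate severity names, the `Cell` enum values, and the `CellPolicy` dataclass (a stand-in for `WardlineCellPolicy`) are assumptions. Only the rule itself comes from the catalog entry: severity at or above `fail_on` maps to `gate_cell`, anything lower to `SURFACE_ONLY`.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class Severity(IntEnum):
    """Ranked severities; the names between NONE and CRITICAL are assumed."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


class Cell(Enum):
    """Hypothetical labels for the governance cells named in the catalog."""
    SURFACE_ONLY = "surface_only"
    SURFACE_OVERRIDE = "surface_override"
    BLOCK_ESCALATE = "block_escalate"


@dataclass(frozen=True)
class CellPolicy:
    fail_on: Severity  # threshold at or above which the gate cell applies
    gate_cell: Cell    # cell used for findings that meet the threshold


def resolve_cell(severity: Severity, policy: CellPolicy) -> Cell:
    """Return the gate cell at or above `fail_on`, otherwise surface-only."""
    return policy.gate_cell if severity >= policy.fail_on else Cell.SURFACE_ONLY
```

Using an `IntEnum` keeps the ranking comparison a plain integer comparison, so the rule stays a one-liner whatever the actual severity ladder looks like.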
- -**Dependencies:** -- Inbound: - - `api/app.py:55-56` (`WardlineCellPolicy`; `WardlinePayloadError`, `WardlineSeverity`) - - `mcp.py:55-56` (same) - - `service/wardline.py:14-15,21` (`WardlineCellPolicy`, `route_findings`; ingest symbols; `policy.resolve_cell`) — the orchestrator that wires the `resolve` callable from `IdentityResolver` -- Outbound: - - `wardline/ingest.py:14 → legis.enforcement.signing.verify` (artifact signature) - - `wardline/governor.py:33 → legis.enforcement.engine.EnforcementEngine`, `:34 → legis.enforcement.signoff.SignoffGate`, `:35 → legis.identity.entity_key.EntityKey` (type only) - - `wardline/policy.py` and `wardline/governor.py` import sibling `wardline.ingest`/`wardline.governor` - - Note: governor's identity coupling is the `EntityKey` *type* import only. Resolution arrives via the injected `resolve` callable (wired in `service/wardline.py`), NOT a static `IdentityResolver` import — there is no governor→resolver static edge. - -**Patterns Observed:** -- Single-judge governance: Wardline produces, legis decides the cell; trust tiers carried verbatim as the one suite vocabulary, never re-derived. -- Properties-as-write-only-evidence: tiers + diagnostics ride untyped into the record; nothing reads the values back. -- Validate-all-dependencies-before-any-write guard, plus an explicit cross-store-split rejection to keep a routed batch single-store. -- Optional artifact authentication: provenance verified only when a key is configured; otherwise honestly labelled unverified. - -**Concerns:** -- **M3 — refined (across-store version largely closed; intra-store non-atomicity remains).** The guard at governor.py:86-89 rejects any batch whose cells span block_escalate (signoff store) and surface_* (engine store), so a *routed* batch is structurally single-store — the across-stores M3 is closed by that guard. 
What remains (and is admitted in the comment at governor.py:60-65) is **intra-store** non-atomicity: a multi-finding same-cell batch performs N sequential appends to one append-only store, and a mid-loop runtime failure leaves the earlier findings permanently persisted. There is no transaction wrapping the loop. -- **Ingest validator relaxation (commit bbed0ba, 2026-06-05) — current state.** Three conscious, backward-compatible relaxations are live: (1) `properties` carried verbatim with tier-conformance dropped (ingest.py:139-145); (2) `baselined`/`judged` accepted as non-active without proof (lines 173, 221-222); (3) suppression proof read top-level OR in `properties` (lines 176-193). Structural validation (required fields, defect/active semantics, batch cap, signature-when-keyed) is unchanged. Net: the validator now accepts strictly more shapes; the only governance-relevant control retained is "agent-suppressed defects must carry proof." -- Artifact provenance is optional by default — when no `artifact_key` is configured, scanner/commit/tree provenance is accepted unverified (ingest.py:86-87). The verified path exists but is opt-in. - -**Confidence:** High — read all 4 files at 100%; traced `from_wire`, `active_defects`, and `route_findings` end-to-end; cross-checked commit bbed0ba's stated relaxations against the current source lines; verified the cross-store guard and the entity_key-type-only coupling by reading governor imports and `service/wardline.py` edges. - ---- - -## Filigree Integration - -**Location:** `src/legis/filigree/` - -**Responsibility:** Bind a cleared, SEI-keyed governance sign-off to a Filigree issue as an opaque entity-association (`entity_id` = SEI), so the code↔governance binding survives rename/move — without mutating Filigree issue lifecycle. - -**Key Components:** -- `client.py` (123 lines) — `FiligreeClient` Protocol (`attach`, `associations_for_entity`) and `HttpFiligreeClient` over stdlib `urllib` with injectable `fetch`. 
`attach` POSTs `{entity_id, content_hash, actor, signoff_seq?, signature?}` to `/api/issue/{id}/entity-associations`; `associations_for_entity` GETs `/api/entity-associations?entity_id=…`. Same base-URL HTTPS-unless-loopback validation, 1 MB cap, and JSON-content-type enforcement as the Loomweave client. -- (The binding orchestration lives outside this package, in `governance/signoff_binding.py:bind_signoff_to_issue` — read for the M4 trace below.) - -**Dependencies:** -- Inbound: - - `api/app.py:38` (`FiligreeClient`), `:308` (lazy `HttpFiligreeClient`) - - `governance/signoff_binding.py:21` (`FiligreeClient`) — the caller of `attach` -- Outbound: none to other `legis.*` modules. `client.py` imports only stdlib. - -**Patterns Observed:** -- Same transport posture as the Loomweave client (stdlib urllib, injectable fetch, no added dependency). -- Opaque-pointer binding: SEI handed as `entity_id`; Filigree never parses it; drift comparison stays legis's job (docstring). -- Authority separation: legis attaches an attestation but never mutates Filigree issue status (locked decision 5). - -**Concerns:** -- **M4 confirmed — deliberate rejection with a coupling consequence.** `bind_signoff_to_issue` (governance/signoff_binding.py:38-42) raises `ValueError` on any `identity_stable=False` (locator) key. This is intentional (docstring: an unstable binding would orphan on rename). The cataloguable consequence: when Loomweave is degraded or the locator has no alive SEI, the resolver returns a locator key, and the sign-off — though it can be *recorded* — **cannot be bound to Filigree at all**. Filigree binding availability is therefore coupled to Loomweave SEI capability; a degraded suite seam silently removes the binding surface for those sign-offs. The signoff_binding docstring acknowledges the rejection but not this availability coupling. 
-- **Transport is unsigned (asymmetry vs Loomweave).** `HttpFiligreeClient` carries no Weft-component HMAC — unlike `loomweave_client.py`, which signs protected routes with `X-Weft-Component`/timestamp/nonce. The `signature` passed to `attach` is an *application-level binding attestation* (produced by `enforcement.signing.sign` in `signoff_binding.py:44-53`), not transport authentication. The Filigree HTTP channel itself is unauthenticated. -- `attach`/`record` ordering in the caller is validate→attach→record with no compensating delete (signoff_binding.py:64-73): if the ledger `record` raises after a successful `attach`, Filigree holds a pointer with no local ledger entry (accepted trade-off — surfaced by the ledger's `verify()`). - -**Confidence:** High — read `client.py` and `__init__` at 100%, plus `governance/signoff_binding.py` (the M4 site) at 100%; cross-verified both inbound edges and the unsigned-transport asymmetry against the Loomweave client. - ---- - -## Git Domain - -**Location:** `src/legis/git/` - -**Responsibility:** Answer "what changed?" over a real repository by shelling out to `git` (stateless, repo-as-source-of-truth), and produce a structured rename/history feed for Loomweave's SEI identity matcher; also define the injectable forge-PR seam shape. - -**Key Components:** -- `surface.py` (207 lines) — `GitSurface` over `subprocess` `git -C`, 10 s timeout. `branches()` (ahead/behind via `rev-list --left-right`), `commit()`/`commits()` (numstat, US-delimited `--format`), `merge_base()` (honest `None` on no ancestor), `renames(rev_range)` (committed, `-M --diff-filter=R`, captures old/new blob SHAs), `working_tree_renames(base)` (uncommitted, hash-object for new blob). Every ref/SHA argument is regex-validated and rejects leading `-` (arg-injection guard, e.g. surface.py:80, 118, 137, 177). -- `rename_feed.py` (48 lines) — `build_rename_feed`: superset of `GET /git/renames`. Bundles base/head + committed renames, optionally working-tree renames. 
`status` reflects what was *found*; separate `worktree_checked` flag reflects what was *checked* (clean-vs-unchecked disambiguation). Contract-locked provider for Loomweave (committed-only consumer ignores worktree fields). -- `pull_request.py` (27 lines) — `PullRequestContext` dataclass + `PullRequestSource` Protocol: an injectable forge seam (no baked-in GitHub HTTP). -- `models.py` (45 lines) — passive `BranchInfo`, `CommitInfo`, `RenameEvidence` (path-level rename evidence; docstring explicitly disclaims symbol-level detection — that is Loomweave's). - -**Dependencies:** -- Inbound: - - `api/app.py:34` (`PullRequestSource`), `:35` (`build_rename_feed`), `:36` (`GitError`, `GitSurface`) - - `mcp.py:28` (`GitError`, `GitSurface`), `:957` (lazy `build_rename_feed`) -- Outbound: none to other `legis.*` modules. Internal only: `git/surface.py:13 → git.models`; `git/rename_feed.py:23 → git.surface`. Depends on stdlib `subprocess`/`re`/`pathlib`. - -**Patterns Observed:** -- Stateless reader; git is the source of truth, no added dependency. -- Defensive arg validation — regex + leading-dash rejection on every ref/range argument before it reaches `git`. -- Honest tri-state reporting (`status` found vs `worktree_checked` checked) so consumers never infer "clean" from "unchecked". -- Contract-locked additive provider: `rename_feed` is a superset of the committed-only endpoint; existing consumers unaffected. - -**Concerns:** -- **M2 (writer-facts-without-provenance) — does not apply to the git surface.** `GitSurface` reads facts directly from the repo, so there is no untrusted writer; the M2 concern is a checks/pulls property (see those blocks), not a git-domain one. -- `surface.py` re-imports `re` inside several methods (surface.py:79, 117, 124, 136, 176) rather than once at module scope — minor style nit, no correctness impact. 
-- `working_tree_renames` shells `hash-object` per renamed file with no batch (surface.py:190); fine at PR scale, unbounded with a very large working-tree rename set. - -**Confidence:** High — read all 5 files (surface, rename_feed, pull_request, models, `__init__`) at 100%; traced rename committed + worktree paths and the arg-injection guards; both inbound edges grepped with file:line; confirmed git has no non-cluster outbound legis edge. - ---- - -## Checks - -**Location:** `src/legis/checks/` - -**Responsibility:** Record and serve CI check-run facts (named check ran against a code state → outcome), in an indexed relational table queryable by commit / branch / PR — deliberately NOT the hash-chained governance audit log. - -**Key Components:** -- `surface.py` (122 lines) — `CheckSurface` over its own SQLAlchemy `create_engine` (NullPool). `check_runs` table (indexed on check_name/commit_sha/branch/pr); idempotent additive migration adds `recorded_by` (lines 52-59). `record`, `for_commit`/`for_branch`/`for_pr`, `latest_state` (last write per check_name wins). -- `models.py` (34 lines) — `CheckOutcome` enum (pass/fail/skipped/timeout); frozen `CheckRun` (check_name, run_id, commit_sha, outcome, optional branch/pr/ran_against/rule_set/policy_version/timestamps/recorded_by). - -**Dependencies:** -- Inbound: `api/app.py:29-30` (`CheckOutcome`,`CheckRun`; `CheckSurface`), `mcp.py:20-21` (`CheckRun`; `CheckSurface`). -- Outbound: none to `legis.*`. External: SQLAlchemy; instantiates its **own** engine per surface (not the shared audit store). - -**Patterns Observed:** -- Operational facts vs governance trail: indexed queryable table, explicitly separated from the Sprint-0 append-only hash-chained audit log (docstring). -- Idempotent schema-evolution via `PRAGMA table_info` + conditional `ALTER TABLE`. -- Immutable fact records (frozen dataclass), but rows are mutable in practice (last-write-wins via `latest_state`). 
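The idempotent schema-evolution pattern listed above (`PRAGMA table_info` followed by a conditional `ALTER TABLE`) can be sketched with the standard-library `sqlite3` module. The source uses SQLAlchemy, so this shows the technique only: `ensure_column` is a hypothetical helper, and the two-column starting table in the usage lines is invented for the demo.

```python
import sqlite3


def ensure_column(conn: sqlite3.Connection, table: str, column: str, decl: str) -> bool:
    """Add `column` to `table` only if it is missing; return True if it was added.

    `table`, `column`, and `decl` must be trusted constants, never user input,
    because SQLite cannot bind identifiers as query parameters.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) rows.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


# Usage: running the migration twice is safe, the second call is a no-op.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE check_runs (check_name TEXT, commit_sha TEXT)")
ensure_column(conn, "check_runs", "recorded_by", "TEXT")
ensure_column(conn, "check_runs", "recorded_by", "TEXT")
```

Checking the live schema instead of tracking a version number keeps the migration additive and re-runnable, which suits a single optional column such as `recorded_by`.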
- -**Concerns:** -- **M2 confirmed (the checks half).** `CheckRun` is constructed from the API client's `model_dump()` with only `recorded_by=actor` attached (`api/app.py:466`). The check *outcome/commit_sha/run_id facts themselves are accepted on the writer's word* — no signature, no provenance verification, unlike the signed Wardline artifact path or the hash-chained audit log. `recorded_by` records *who submitted*, not that the fact is true. Architecturally this is by design (operational table, own engine, not the tamper-evident trail), but a consumer treating check outcomes as authoritative governance input would be trusting an unauthenticated writer. - -**Confidence:** High — read both files (surface, models) and `__init__` at 100%; confirmed the M2 write path at `api/app.py:466`; verified own-engine instantiation and the deliberate separation from the audit store. - ---- - -## Pulls - -**Location:** `src/legis/pulls/` - -**Responsibility:** Record and serve forge-reported pull-request metadata (number/title/base/head/state) in its own relational table — facts legis records, not local git. - -**Key Components:** -- `surface.py` (68 lines) — `PullSurface` over its own SQLAlchemy engine (NullPool). `pull_requests` table keyed on `number` (indexed base/head/state); idempotent `recorded_by` migration. `record` is delete-then-insert (upsert by number); `get`. -- `models.py` (23 lines) — `PullRequestState` enum (open/closed/merged); frozen `PullRequest` (number, title, base, head, state, optional url/recorded_by). -- `__init__.py` — re-exports `PullRequest`, `PullRequestState`, `PullSurface`. - -**Dependencies:** -- Inbound: `api/app.py:53-54` (`PullRequest`,`PullRequestState`; `PullSurface`), `mcp.py:36` (`PullSurface`). -- Outbound: none to `legis.*`. External: SQLAlchemy; own engine per surface. - -**Patterns Observed:** -- Same operational-table posture as checks; own engine, separate from the audit trail. 
-- Upsert-by-number via delete-then-insert in one transaction. - -**Concerns:** -- **M2 confirmed (the pulls half).** `PullRequest` is built from the client's `model_dump()` with only `recorded_by=actor` (`api/app.py:448`); PR state/base/head are accepted unauthenticated, same posture as checks. By design (recorded forge facts, not governance trail), but the writer's word is the only provenance. - -**Confidence:** High — read all 3 files at 100%; confirmed the M2 write path at `api/app.py:448`; verified own-engine instantiation. - ---- - -## Cross-Block Confidence / Risk / Gaps / Caveats - -**Confidence Assessment:** High across all six blocks. All 21 source files read at 100% (none exceed 226 lines). Every dependency edge grepped with file:line. The four prior-audit concerns (M2/M3/M4/M5) were each discriminated against current source: M5 not reproduced (with a confirming bypass-grep), M3 refined to intra-store, M4 confirmed with a coupling consequence, M2 confirmed at two precise write sites. - -**Risk Assessment:** Low risk in the read itself. The synthesis-relevant risks in the code: (1) intra-store non-atomic Wardline batches (governor.py:60-65); (2) Filigree binding availability coupled to Loomweave SEI capability (signoff_binding.py:38-42); (3) checks/pulls accept unauthenticated writer facts (api/app.py:448,466); (4) unsigned Filigree transport vs signed Loomweave transport. - -**Information Gaps:** Did not read the `service/wardline.py` orchestrator, `api/app.py`, or `mcp.py` bodies in full — only the specific edge/write lines (448, 466, 299-308, governor wiring). The exact shape of the injected `resolve` callable that `route_findings` receives was inferred from the governor signature + the service edge, not read end-to-end in the service layer. Loomweave/Wardline/Filigree wire contracts are taken from docstrings, not from the sibling repos. 
- -**Caveats:** "M5 not reproduced" and "M3 refined" reflect the tree at commit 2e69141 (current HEAD); the prior audit may have run against an earlier tree where the defects were live. The git-domain blocks disclaim symbol-level rename detection (that is Loomweave's matcher); `RenameEvidence` is path-level only. diff --git a/docs/arch-analysis-2026-06-06-0158/temp/validation-report.md b/docs/arch-analysis-2026-06-06-0158/temp/validation-report.md deleted file mode 100644 index 7cef8a5..0000000 --- a/docs/arch-analysis-2026-06-06-0158/temp/validation-report.md +++ /dev/null @@ -1,83 +0,0 @@ -# Validation Report — arch-analysis-2026-06-06-0158 - -**Validator:** independent analysis-validation gate (read-only) -**Date:** 2026-06-06 -**Target of validation:** `docs/arch-analysis-2026-06-06-0158/` deliverables 01–06, evidence base `temp/catalog-*.md` and `temp/AUDIT-*.md` -**Method:** source-level spot-check of highest-stakes claims (Read/Grep), live tooling re-run (ruff, coverage), internal-consistency sweep across 02/04/05, contract-conformance checklist, citation/metric hallucination hunt. - ---- - -## Overall verdict: **PASS-WITH-NOTES** - -The analysis is **evidence-backed and accurate** on every high-stakes structural and security claim spot-checked. Every required claim verified to `confirmed` against source at the cited (or adjacent) `file:line`. No claim refuted. No subsystem, finding, or metric was hallucinated. Tooling metrics (mypy-clean, 90% coverage / 3,453 stmts / 329 missed, 2 ruff F401, 63 files, ~7,353 LOC) reproduce against the live tree. - -Three **NOTE-level** issues hold it back from a clean PASS — all are label/metric/citation imprecision, none refutes a finding or breaks a contract section, none is BLOCK-level: - -- **N1 (consistency):** `04 §6` mislabels finding **M6** as "new this pass / not in prior audits" while `05` and `02` correctly call it a prior-audit baseline. The prior audit *does* contain it (`AUDIT-comprehensive.md:340`). 
Internal contradiction; underlying defect is source-confirmed. -- **N2 (metric):** `05` reports **480 test functions**; live count is **492** `def test_` across the same 68 files. Minor over-precision; direction (492>480) rules out parametrize-expansion as the explanation. -- **N3 (citation precision):** `05` cites Q-M1 at `service/source_binding.py:82-89`, which is the fail-closed *guard*; the actual "signs unverified" mechanism is the early-return at `:46-50` + write at `governance.py:170`. Substance correct, citation adjacent-not-exact. - ---- - -## Spot-checked claims (evidence-based) - -| Claim | Verdict | Evidence (file:line) | -|---|---|---| -| **Q-H1** `_verify_secret` returns actor on `LEGIS_API_SECRET` match **without** consulting `required_scope` | **Confirmed** | `api/app.py:108-116` — secret path returns `LEGIS_API_ACTOR`/default at :116; `required_scope` param (:103) never read on this branch | -| **Q-H1** `/protected/operator-override` is operator-scoped | **Confirmed** | `api/app.py:558-559` route → `Depends(verify_operator)`; `verify_operator`→`_verify_secret(...,"operator")` :142-143 | -| **Q-H1** `/signoff/{seq}/sign` is operator-scoped | **Confirmed** | `api/app.py:677` `post_signoff_sign(... 
operator=Depends(verify_operator))` — both operator routes thus reachable by a writer secret | -| **C3 RESOLVED** mcp `_verified_records` routes through `service.verified_records`/`TrailVerifier` | **Confirmed** | `mcp.py:649-651` `_verified_records`→`service_verified_records` (import alias :51); `TrailVerifier` imported :25, constructed :141 | -| **M11 RESOLVED** `override_submit` has idempotency-key handling | **Confirmed** | `mcp.py:562` `_override_idempotency_request_hash`; :690-736 override_submit reads `idempotency_key`, computes request-hash, replays via :587-596 | -| **C2 RESOLVED** mcp Wardline routing is server-owned (not caller-chosen) | **Confirmed** | `mcp.py:872-881` rejects caller routing — "Wardline routing is server-owned"; mirrors HTTP | -| **M9 RESOLVED** unknown mcp args rejected | **Confirmed** | `mcp.py:375` `_validate_argument_keys`, invoked :678 | -| **M10 RESOLVED** `poll_handle` integer | **Confirmed** | `mcp.py:620,791` `poll_handle` = integer `seq` | -| **Q-M3 / M6** verify_integrity loop-body `content_hash(rec.payload)` unguarded while `read_all()` guarded | **Confirmed** | `store/audit_store.py:163-166` try/except wraps `read_all()`; :168 `content_hash(rec.payload)` is OUTSIDE the try, inside the loop — `allow_nan=False` raises `ValueError` on tampered non-finite payload | -| **Dependency** enforcement does NOT import `legis.governance` or `legis.policy` | **Confirmed** | `grep src/legis/enforcement/` → 0 matches for governance/policy; all imports are canonical/clock/records/identity/store/intra-enforcement | -| **mcp → api coupling** mcp imports `DEFAULT_GOVERNANCE_DB`/`DEFAULT_CHECK_DB` from `legis.api.app` | **Confirmed** | `mcp.py:115,496` `from legis.api.app import DEFAULT_GOVERNANCE_DB`; :505 `DEFAULT_CHECK_DB` (defined `api/app.py:146-147`) | -| **Q-M1** non-`.py` protected entities sign `source_binding: unverified` (guard fails to catch) | **Confirmed** (substance) | `service/source_binding.py:46-50` returns 
`status:"unverified"` for non-`.py`; `require_verified_source_binding` :84-85 early-returns (no-op) when not a `.py` locator; `governance.py:157-170` writes that binding into signed extensions. **Cited :82-89 is the guard, not the signing site → N3.** | -| **Q-M6** signoff binding rejects `identity_stable=False` (locator) keys | **Confirmed** | `governance/signoff_binding.py:38-42` exact reject at cited lines | -| **Q-M1 mitigation** `.py` entities DO fail closed on unverified | **Confirmed** | `service/source_binding.py:82-89` raises `InvalidArgumentError` when a `.py` locator isn't verified | -| **ruff** 2 × F401 incl. `Hashable` in `policy/grammar.py:15` "+ one more" | **Confirmed** | live `ruff check src/` → 2 errors: `grammar.py:15` Hashable + `api/app.py:56` `WardlinePayloadError` | -| **coverage** 90% / 3,453 stmts / 329 missed | **Confirmed** | live `coverage report` TOTAL 3453 / 329 / 90% | -| **LOC** mcp 1123, api 830, policy 1072, enforcement 1062, 63 files, ~7,353 total | **Confirmed** | `wc -l`: mcp.py 1123, api/app.py 830, policy 1072, enforcement 1062; `find` → 63 files / 7,353 total | -| **test count** 480 test functions / 68 files | **Partially confirmed** | 68 test-module files correct; `def test_` count is **492**, not 480 → **N2** | - -**Tally: 16 confirmed · 1 partially-confirmed (test count) · 0 refuted · 0 unverifiable.** - ---- - -## Internal-consistency findings - -| # | Status | Detail | -|---|---|---| -| **N1** | **Contradiction (NOTE)** | **M6 provenance.** `04 §6` (line ~190) lists "M6 unguarded `content_hash` in the verify loop" under *"New findings surfaced this pass (not in prior audits)"* — yet the same `04 §6` table (line 187) calls M6 a baseline finding "Confirmed live," and `05` Q-M3 + `02` Store concern both label it "Baseline M6, PARTIALLY closed." Prior audit `AUDIT-comprehensive.md:340` ("M6. Audit integrity verification can raise decode exceptions") confirms M6 IS a prior-audit finding. 
So `04 §6`'s "new" tag is wrong; `05`/`02` are correct. Defect itself is source-confirmed (`audit_store.py:168`); only the new-vs-baseline label is inconsistent. | -| ✓ | Consistent | Finding-ID mapping Q-M3↔M6, Q-M1↔M1, Q-M6↔M4, Q-M7↔H6, Q-H1↔H7-adjacent is applied uniformly across 04/05/02. | -| ✓ | Consistent | Resolved/live status agrees across docs for C1/C2/C3/H1/H5/M9/M10/M11 (resolved), M1/M2/M7/H3/H6 (live), M5/M12/M13 (not-reproduced / partial). | -| ✓ | Consistent | `04 §3.4` three-implementation override-rate claim matches `05` Q-H2, `06` item 2, and the diagram dashed CLI-bypass edges (`03:85-86`). | -| ✓ | Consistent | Diagram ↔ catalog: `03` L0–L7 layering (canonical/clock/identity.*/filigree.client/governance.params @L0; resolver/records/store/policy @L1; enforcement @L2; governance/wardline @L3; service @L4; api/mcp/cli @L5–7) matches `02`/`04 §2` exactly. | -| ~ | Minor | `01` lists `api/` 831 LOC; `04`/`wc` use 830 (`api/app.py` 830, package incl. `__init__` 831). Off-by-one, harmless. 
| - ---- - -## Contract conformance (Option-C / Architect-Ready) - -| Deliverable | Required | Verdict | -|---|---|---| -| `02` catalog | Location · Responsibility · Dependencies (bidirectional, file:line) · Concerns · Confidence per subsystem | **PASS** — every subsystem carries all five; edges grepped with `file:line`; inbound+outbound both stated; per-subsystem confidence noted | -| `03` diagrams | present, abstraction-appropriate (C4 levels), match catalog | **PASS** — 5 mermaid: L1 Context, L2 Container (with central partial-seam finding), protected-flow Component, L4 dependency-layer; subsystems/layers match `02` | -| `04` final report | exec summary · subsystem map · cross-flows · strengths · concerns · remediation delta · confidence/limits | **PASS** (with N1 label inconsistency in §6) — all sections present, cross-flows are the load-bearing addition; limitations section honest about cross-repo wire contracts | -| `05` quality | real tooling signals (measured), finding inventory, CI review, verdict | **PASS** (with N2 metric) — mypy/ruff/coverage/CI signals are live-measured and reproduce; per-subsystem coverage table; severity-tiered inventory with status reconciliation | -| `06` handover | risk-ordered roadmap, concrete entry points, architect decisions | **PASS** — Tier 1/2/3 risk-ordered, every item has `file:line` entry point + effort, sequencing + receiving-architect checklist | -| `01` discovery | inventory, stack, entry points, orchestration decision | **PASS** — inventory/LOC/entry-points verified by direct measurement | - ---- - -## BLOCK-level issues - -**None.** No claim refuted, no contract section missing, no hallucinated subsystem/finding/metric. The single internal contradiction (N1) is a provenance label, not a defect-existence error, and the defect is source-confirmed. - -## Must-fix (NOTE) before downstream consumption - -1. 
**N1** — reconcile M6's new-vs-baseline label in `04 §6` to match `05`/`02` (it is a prior-audit baseline finding, partially closed). -2. **N2** — correct the `05` test-function count (live: 492, not 480) or document the counting method. -3. **N3** — repoint the Q-M1 citation in `05` from `source_binding.py:82-89` (the guard) to the unverified-return site (`:46-50`) and/or `governance.py:170` (the signing-into-extensions site). diff --git a/docs/arch-analysis-2026-06-28-2142/00-coordination.md b/docs/arch-analysis-2026-06-28-2142/00-coordination.md new file mode 100644 index 0000000..8b8c13b --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/00-coordination.md @@ -0,0 +1,45 @@ +# 00 — Coordination Plan + +## Analysis Configuration +- **Subject:** Legis — the git/CI + governance layer of the Weft suite. +- **Scope:** `src/legis/` (~16,585 LOC, ~78 Python files, ~17 cohesive subsystems). +- **Deliverables:** **Option C — Architect-Ready** = discovery + full subsystem catalog + C4 diagrams + final report + code-quality assessment + architect handover. +- **Strategy:** **PARALLEL** — ≥5 independent subsystems, ~16.5K LOC, mostly loosely coupled around a central `service/` hub. Eight reviewer agents over coherent clusters, then validation, then synthesis. +- **Branch analyzed:** `fix/policy-boundary-containment` (off `main` `25d64e2`) — current working tree. +- **Complexity estimate:** High (crypto/audit-chain, 2×3-way error mapping, federation seams, three transports). + +## Scope limitations (documented, honest) +- `governance_read.v1` (service fn + transports) and the contents of `plainweave_preflight/` are **release-only / in-flight on PR #21** (`release/1.3.0-federation-reads`), NOT in this `main`-based checkout. They are described from CLAUDE.md/PDRs where relevant but not analyzed from source here. `warpline_preflight/` IS present (143 LOC). 
+- A pre-existing, unrelated dirty working-tree state exists under `docs/arch-analysis-2026-06-06-0158/` (a June-6 analysis showing deleted-but-tracked files). Left untouched; out of scope. +- All analysis artifacts are kept **untracked** (not committed) so nothing pollutes open PR #22 (the security fix) or PR #21 (the release). + +## Orchestration — eight reviewer clusters +| # | Cluster | Paths | ~LOC | +|---|---------|-------|------| +| 1 | Service layer (governance truth hub) | `service/` | 1497 | +| 2 | Enforcement & Policy (the 2×2 engine) | `enforcement/`, `policy/` | 2553 | +| 3 | Identity & Persistence (SEI + audit chain) | `identity/`, `store/`, `canonical.py`, `weft_signing.py`, `provenance.py`, `records/` | ~1416 | +| 4 | Transports (thin adapters) | `mcp.py`, `api/`, `cli.py` | ~4547 | +| 5 | Posture (floor + operator key + elevation) | `posture/` | 1331 | +| 6 | Federation seams | `wardline/`, `filigree/`, `governance/`, `warpline_preflight/` | ~1663 | +| 7 | Git/CI surfaces | `git/`, `checks/`, `pulls/` | 615 | +| 8 | Runtime/Ops | `install.py`, `doctor.py`, `hooks.py`, `config.py` | ~2930 | + +## Execution Log +- [2142] Created workspace `docs/arch-analysis-2026-06-28-2142/`. +- [2142] User selected **Option C (Architect-Ready)**. +- [2142] Holistic scan: LOC/file counts, entry point, cross-subsystem import matrix captured. +- [2143] Strategy = PARALLEL; 8 reviewer clusters defined (above). +- [next] Write `01-discovery-findings.md`; dispatch 8 parallel reviewers → `temp/catalog-*.md`; merge to `02-subsystem-catalog.md`; validate; synthesize `03`–`06`. + +## Execution Log (continued) +- [2143] Wrote 01-discovery-findings.md. +- [2144] Dispatched 8 parallel codebase-explorer reviewers → temp/catalog-*.md. All 8 returned with file:line-cited entries. +- [2145] Assembled 02-subsystem-catalog.md (687 lines). +- [2145] Controller-verified 2 surprising findings (posture read_floor = main-checkout artifact; keychain = deliberate fail-closed stub). 
+- [2146] Validation subagent STALLED mid-stream (API infra error, no report). Completed validation as controller with documented file:line evidence → temp/validation-catalog.md. Verdict PASS_WITH_NOTES; 4 high-severity-sounding concerns reclassified (none a live false-green). +- [2147] Synthesized 03-diagrams.md, 04-final-report.md, 05-quality-assessment.md, 06-architect-handover.md from the validated catalog. +- [DONE] Option C deliverables complete. Synthesis docs derive from the validated catalog (the load-bearing layer); reclassifications applied throughout. + +## Validation-gate note +The mandatory multi-subsystem validation ran against the catalog (the source-of-truth document). It stalled as a subagent (infra), so the controller completed it with per-claim evidence — a deviation from the independent-subagent gate, recorded transparently. The four synthesis documents (03–06) are derived from the validated catalog and apply its reclassifications; they were not separately subagent-validated. diff --git a/docs/arch-analysis-2026-06-28-2142/01-discovery-findings.md b/docs/arch-analysis-2026-06-28-2142/01-discovery-findings.md new file mode 100644 index 0000000..dd2346b --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/01-discovery-findings.md @@ -0,0 +1,64 @@ +# 01 — Discovery Findings (Holistic Assessment) + +> Evidence base: directory/LOC scan, `pyproject.toml`, cross-subsystem import matrix, and `CLAUDE.md`/`README` framing, all verified against `src/legis/` on the analyzed checkout. Subsystem-level depth is produced by the parallel reviewers (`02-subsystem-catalog.md`). + +## 1. What Legis is +The git/CI + **governance** layer of the Weft four-tool suite (Loomweave, Wardline, Legis, Filigree). It answers *"what changed, in which branch/commit/PR/check context, and what governance/attestation state exists for that change?"* and enforces agent-defined policy at the git/CI boundary via a graded **2×2 enforcement model**. 
Defining stance: a **governance-honesty** tool — the cardinal sin is a **false-green** (reporting clearance when nothing was governed); code paths fail **closed**. + +## 2. Technology stack +- **Language/runtime:** Python ≥3.12, `uv`-managed. +- **Persistence:** SQLite (5 stores under `.weft/legis/`), append-only HMAC-signed audit chain (v3 chain-position binding). +- **Crypto:** HMAC signing (`weft_signing.py`, `canonical.py` byte-for-byte JSON contract with Wardline), OS-keychain operator key (posture). +- **Transports:** HTTP (FastAPI-style `api/app.py`), MCP stdio (`mcp.py`), CLI (`cli.py`). +- **Quality gates:** ruff (scoped `E4,E7,E9,F`), mypy, pytest+coverage (global 88 floor + stricter per-package floors), SEI conformance oracle, a self-hosted governance-honesty policy-boundary scanner, override-rate governance-gate. + +## 3. Organization model +**Hybrid layered + domain.** A central `service/` layer holds governance truth; three thin transport adapters sit above it; domain subsystems (enforcement, policy, identity, store, posture, federation seams, git/CI surfaces) sit beside/below it; a foundation of leaf utilities (`canonical.py`, `clock.py`, `weft_signing.py`, `provenance.py`, `config.py`, `records/`) underpins everything. + +## 4. Entry points +- **CLI / all transports:** `legis = legis.cli:main` (`pyproject.toml [project.scripts]`). The CLI also launches the MCP server (`legis mcp`) and the HTTP app. +- **HTTP:** `api/app.py` (imports nearly every surface + `mcp`). +- **MCP:** `mcp.py` (`build_runtime`, `call_tool`, `tool_definitions`) — ~21+ tools. +- **Install/runtime bootstrap:** `install.py` (`legis install`), `doctor.py` (`legis doctor [--fix]`), `hooks.py` (SessionStart). + +## 5. 
Subsystem inventory (by LOC) +| Subsystem | LOC | Role (one line) | +|-----------|-----|-----------------| +| `mcp.py` | 2748 | MCP stdio adapter; ~21 tools; ServiceError→error-envelope mapping | +| `install.py` | 1503 | Stand-up: instruction block, skill pack, hooks, `.mcp.json`, posture genesis | +| `service/` | 1497 | **Single source of governance truth**; adapters call into it | +| `enforcement/` | 1367 | The 2×2 engine: engine, judge, protected, signing, signoff, verdict, lifecycle | +| `posture/` | 1331 | Posture floor, OS-keychain operator key, sudo-style elevation sessions, ledger | +| `policy/` | 1186 | Policy grammar, cells (2×2 routing), boundary scanner, `@policy_boundary` decorator | +| `doctor.py` | 1002 | Operator health/repair; tags problems `[auto-fixable]`/`[operator]` | +| `api/` | 955 | HTTP adapter (ServiceError→status codes) | +| `cli.py` | 844 | CLI adapter (ServiceError→exit codes) + subcommands | +| `wardline/` | 728 | Ingest Wardline findings; route into enforcement cells | +| `store/` | 667 | SQLite stores, head anchor, audit protocol | +| `governance/` | 607 | SEI-keyed sign-off binding, filigree gate | +| `identity/` | 542 | Loomweave SEI seam (client, resolver, entity_key) — consumer only | +| `git/` | 328 | Branch/commit/PR/check context + rename-feed provider | +| `hooks.py` | 222 | SessionStart hook | +| `config.py` | 203 | Env/`LEGIS_*_DB` store resolution | +| `filigree/` | 185 | Filigree consumer (bind/closure) | +| `checks/` | 175 | CI check surface | +| `warpline_preflight/` | 143 | Advisory preflight consumer | +| `pulls/` | 112 | PR surface | +| leaf utils | ~240 | `weft_signing.py`, `canonical.py`, `clock.py`, `provenance.py`, `records/` | + +## 6. Dependency overview (from the import matrix) +- **Hub:** `service/` → enforcement, identity, governance, wardline, policy, warpline_preflight, canonical. 
All three transports import `service/` (HTTP 15, MCP 6, CLI 2) — the "adapters over a truth layer" architecture is real, not aspirational. +- **Foundation (imported, imports little):** `canonical.py`, `clock.py`, `weft_signing.py`, `provenance.py`, `config.py`, `records/`. +- **`git/` is self-contained** (imports no other legis subsystem) — clean. +- **Coupling signals to probe in the quality pass (candidate concerns):** + 1. **`store/ ↔ enforcement/` bidirectional** — `store` imports `enforcement` and `enforcement` imports `store`. Possible layering tangle. + 2. **`policy/ → service/`** — policy (a lower-level grammar/scanner) imports the service hub; inversion risk (the boundary-scan containment fix used a *deferred* `service.errors` import precisely to avoid a load-time cycle — corroborates that this edge is delicate). + 3. **`posture/ → install/`** — posture imports the 1503-LOC ops module; heavy/odd dependency direction. + 4. **`api/ → mcp`** — the HTTP adapter imports the MCP module (transport-on-transport). + 5. **`mcp.py` (2748) and `install.py` (1503) are very large single files** — god-module risk; candidates for the quality pass. + +## 7. Orchestration decision +**PARALLEL**, eight reviewer clusters (see `00-coordination.md`). Rationale: ≥5 independent subsystems, ~16.5K LOC, a clean hub-and-adapters shape with only a handful of coupling edges to watch — well suited to independent per-cluster review followed by central synthesis and a validation gate. + +## 8. Confidence +**High** on structure, entry points, and dependency direction (directly measured). **Medium** pending the reviewers on internal patterns, invariants, and concerns per subsystem. Two release-only areas (`governance_read.v1`, `plainweave_preflight/` contents) are out of scope on this checkout (§ limitations in `00-coordination.md`). 
diff --git a/docs/arch-analysis-2026-06-28-2142/02-subsystem-catalog.md b/docs/arch-analysis-2026-06-28-2142/02-subsystem-catalog.md new file mode 100644 index 0000000..3a9b5e3 --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/02-subsystem-catalog.md @@ -0,0 +1,687 @@ +# 02 — Subsystem Catalog + +> Produced by 8 parallel codebase-explorer reviewers over the clusters in 00-coordination.md. Each entry cites file:line evidence. Validation report: temp/validation-catalog.md. + +## Service layer + +**Location:** `src/legis/service/` + +**Responsibility:** Acts as the single, transport-agnostic governance decision authority that all three transports (HTTP `api/app.py`, MCP `mcp.py`, CLI `cli.py`) call into, raising typed `ServiceError` subclasses on failure so each adapter owns its own error-shape translation. + +**Key Components:** +- `__init__.py` — Public surface of the layer; re-exports 8 error types, 2 data classes, and 14 service functions as the defined contract adapters import from (`service/__init__.py:9–63`). +- `errors.py` — `ServiceError` taxonomy: 9 typed subclasses covering audit integrity (`AuditIntegrityError`), enablement gates (`NotEnabledError`), resource absence (`NotFoundError`, `NoSuchRequestError`), state conflicts (`NotClearedError`, `BindingUnavailableError`), bad input (`InvalidArgumentError`, `UnresolvedInputError`), routing failures (`WardlineRoutingError` with `SERVER_MISCONFIGURED`/`SERVER_OWNED`/`MALFORMED` kind discriminator), and key-absent protected reads (`ProtectedKeyRequiredError`). Adapters switch on type and `.kind`, never on message text (`errors.py:1–99`). 
+- `governance.py` — Core decision logic: `resolve_for_record`/`resolve_for_entry` (the single SEI-on-entry resolve boundary, failing closed when Loomweave absent or SEI mismatches, `governance.py:43–154`); `verified_records` (full O(N) trail verification on every call, `governance.py:157–206`); `compute_override_rate` / `evaluate_override_rate_gate` (threshold/window hardcoded from ADR-0002 constants, not caller input, `governance.py:209–409`); `submit_override`, `submit_protected_override`, `submit_operator_override`, `request_signoff`, `sign_off`, `bind_signoff_issue` (all wired through `resolve_for_entry` and gate-null checks, `governance.py:412–747`); `read_identity_gaps`, `read_lineage_integrity` (GOV-1/GOV-2 honesty reads — always `"unavailable"` vs `"checked"`, never an empty list that reads as all-clear, `governance.py:556–632`); `read_sei_attestations` (forge-proof discriminator for operator_override/signoff_cleared; asymmetric error rule: ambiguous → omit, never surface, `governance.py:223–350`); `evaluate_policy` (records UNKNOWN provenance gaps, `governance.py:749–767`). +- `explain.py` — `explain_policy`/`explain_cell` data types and logic: routes through `FlooredRegistry.cell_for` (not raw rule.cell) so posture floor is respected; `policy_known` boolean distinguishes configured policy from hallucinated/unconfigured name (`explain.py:87–108`); `explain_cell` is the single source of truth for per-cell `enabled`/`available_moves`, ensuring `policy_list` and `policy_explain` cannot disagree (`explain.py:111–175`). 
+- `wardline.py` — `resolve_scan_routing` (the single home for the server-owned-vs-request routing decision, raising `WardlineRoutingError` with kind for adapter mapping, `wardline.py:58–171`); `route_wardline_scan` (verifies artifact provenance, extracts active defects, resolves entity keys, routes findings through enforcement, returns `RoutedScan` with `artifact_status` + `artifact_status_reason` always present — no posture without provenance, `wardline.py:196–249`). +- `preflight.py` — `read_warpline_preflight`: advisory Warpline read; unconfigured/unreachable → `"unavailable"` with reason, never an empty affected-set that reads as "nothing impacted" (`preflight.py:16–38`). +- `source_binding.py` — `verify_current_source_binding` / `require_verified_source_binding`: fail-closed SHA-256 fingerprint check for protected submissions from Python source-path locators; non-path entities record honest `"unverified"` rather than being rejected; `source_binding_status` is folded into HMAC-signed fields so consumers can distinguish (`source_binding.py:31–107`). + +**Dependencies:** +- Inbound: `legis.api.app` (HTTP transport), `legis.mcp` (MCP transport), `legis.cli` (CLI transport) — all three import only from `service/` for governance decisions. +- Outbound: `legis.enforcement` (engine, lifecycle, protected gate, signoff gate, verdict), `legis.identity` (resolver, entity_key, loomweave_client), `legis.policy` (grammar, cells/FlooredRegistry), `legis.wardline` (governor, ingest, policy), `legis.warpline_preflight.client`, `legis.canonical` (content_hash), `legis.governance` (params, gaps, signoff_binding). + +**Patterns Observed:** +- Gate-null fail-closed: every function that requires a gate (`protected_gate`, `signoff_gate`, `filigree`) checks for `None` first and raises `NotEnabledError` naming the operator knob, before any computation (`governance.py:466–470`, `543–545`, `688–693`). 
+- Single resolve boundary: all identity resolution flows through `resolve_for_entry` (SEI-on-entry, L1/L2 paths) or `resolve_for_record` (record-side), never re-derived in gate/engine layers (`governance.py:43–154`). +- Asymmetric error rules: false-positive safety is the cheaper failure mode; `read_sei_attestations` omits any ambiguous record, `read_identity_gaps`/`read_lineage_integrity` always discriminate `"unavailable"` vs `"checked"`/`"verified"`, `read_warpline_preflight` always has a reason alongside `"unavailable"` (`governance.py:229–232`, `preflight.py:1–9`). +- Adapter isolation: `errors.py` docs map each error type to HTTP status codes and MCP error codes, but the service layer raises only `ServiceError` subclasses with structured attributes (`.kind`, `.cause`, `.fix`) — adapters switch on type, never text (`errors.py:1–6`). +- Policy constants hardcoded out of reach: override-rate threshold/window/floor sourced from `params` module constants, not caller input (`governance.py:214–220`). +- Full trail verification on every interactive read: deliberate O(N) cost; comment explicitly rejects incremental verification as a tamper window (`governance.py:180–193`). +- `UnresolvedInputError` in `errors.py` is NOT re-exported from `__init__.py` (line 9–18): it is raised internally by `governance.py` but absent from the public surface, meaning adapters that import only from `service/` cannot `except UnresolvedInputError` by name without an additional import. `WardlineRoutingError`, `ProtectedKeyRequiredError` are similarly absent from `__all__`. + +**Concerns:** +- `UnresolvedInputError`, `WardlineRoutingError`, and `ProtectedKeyRequiredError` are defined in `errors.py` and raised from service functions but are NOT listed in `__init__.py`'s `__all__` or imports (`__init__.py:9–63`). Adapters relying only on `from legis.service import ...` cannot catch these by name without an extra `from legis.service.errors import ...`. 
This is a latent import discipline gap: the MCP and HTTP adapters presumably do import them directly from `errors`, but the omission from the declared public surface is inconsistently documented and risks future callers missing them. +- `sign_off` (operator sign-off on a pending request, `governance.py:724–746`) is implemented in `governance.py` but is not exported from `__init__.py` (absent from `__all__` and import list). This means it is not reachable via the declared public surface (`service/__init__.py:37–63`). If an adapter calls it via `from legis.service.governance import sign_off` it works, but the public-surface contract is violated. +- Verified: error handling is present throughout. No resource handles opened at service layer. No function returns `None` or `[]` on a failure path that an adapter could read as a governance pass — all failure paths raise. Warpline/identity-gap/lineage reads use explicit `"unavailable"` status, never silent empty. Source binding is recorded honestly (`"unverified"`) for non-path entities rather than rejected, with the status folded into HMAC fields — consumer read-side discipline is noted in comments but not enforced at this layer. + +**Confidence:** High — read 100% of all 7 files in `src/legis/service/` (errors.py, __init__.py, governance.py, explain.py, wardline.py, preflight.py, source_binding.py). Cross-validated export surface against `__init__.py:9–63` and function definitions. Dependency claims verified against import statements in each file. Honesty patterns confirmed by reading all decision paths and inline comments. + +--- + +## Enforcement + +**Location:** `src/legis/enforcement/` + +**Responsibility:** Implements the governance 2x2 enforcement engine: routes overrides through the appropriate cell (chill, coached, structured, or protected), appends every decision to an HMAC-signed append-only audit trail, and provides lifecycle gates (decay sweep, override-rate check) for the protected cell. 
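The HMAC-signed trail with chain-position binding can be illustrated with a minimal sketch. It assumes a JSON canonicalization with sorted keys and `ensure_ascii=False`; the function names, the `fields` layout, and the message shape are hypothetical and are not taken from `signing.py`.

```python
import hashlib
import hmac
import json


def canonical_json(payload: dict) -> bytes:
    # Hypothetical stand-in for the shared canonical serializer:
    # sorted keys, compact separators, non-ASCII kept as-is.
    return json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def sign_record(key: bytes, fields: dict, chain_seq: int) -> str:
    # Bind the chain position into the signed bytes, so a record cannot be
    # moved to a different position without invalidating its signature.
    message = canonical_json({"chain_seq": chain_seq, "fields": fields})
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_record(key: bytes, fields: dict, chain_seq: int, signature: str) -> bool:
    # Recompute with the same field set and compare in constant time.
    expected = sign_record(key, fields, chain_seq)
    return hmac.compare_digest(expected, signature)
```

Because write and verify share one signing function, the two paths cannot drift apart. A signature produced at position 3 fails verification when the same content is presented at position 2.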
+ +**Key Components:** +- `engine.py` (120 lines) — Simple-tier engine (chill + coached cells). `EnforcementEngine.submit_override` is the single entry point; when `judge=None` the record appends unconditionally (chill), otherwise the `LLMJudge` evaluates before append (coached). Every submission appends exactly one record — no silent path. +- `judge.py` (186 lines) — Coached-cell judge. Defines `LLMJudge`, `parse_verdict` (fail-closed: anything not an explicit ACCEPTED is BLOCKED), and `_parse_structured_response` (JUDGE-3: rejects `OVERRIDDEN_BY_OPERATOR` from model output). Prompt injection defense: 8192-char serialized-request cap (JUDGE-1); JSON-serialized request prevents structural key injection (JUDGE-2). +- `protected.py` (422 lines) — Protected cell. `ProtectedGate.submit` routes through the LLM judge then requires a deterministic non-LLM `ProtectedValidator` to confirm ACCEPTED (JUDGE-3 / Q-H3); any validator exception is a veto. `_record_signed` computes an HMAC-SHA256 signature over a defined field set including `chain_seq` (v3 / AUD-1 position binding). `TrailVerifier` re-checks signatures on read and optionally checks `HeadAnchor` for tail-truncation detection. +- `signing.py` (62 lines) — HMAC-SHA256 signing primitive. v2 binds record content only; v3 additionally binds `chain_seq` to close delete-and-rechain forgery. `canonical_json` from `legis.canonical` is the serialization contract (ensure_ascii=False is intentional per the cross-tool HMAC contract with Wardline). +- `signoff.py` (194 lines) — Structured/protected sign-off gate. `SignoffGate.request` writes `PENDING_SIGNOFF` (does NOT clear the gate); `sign_off` writes `SIGNED_OFF` referencing the request by seq and payload hash. Protected sign-offs are HMAC-signed with v3 chain_seq binding. Anchor advance is batch-aware (Q-M5). +- `verdict.py` (51 lines) — Value types shared across the engine. 
`Verdict.model_emittable()` is the single source of truth for what an LLM may return (ACCEPTED or BLOCKED only); `Verdict.accepting()` is the single source of truth for what counts as cleared. Both are checked by name, not re-listed. +- `lifecycle.py` (135 lines) — Protected-cell lifecycle gates. `decay_sweep` re-judges each ACCEPTED suppression via the live judge (skips BLOCKED and OVERRIDDEN_BY_OPERATOR). `evaluate_override_rate` computes operator-override share over the most recent window; PASS_WITH_NOTICE when below `min_sample`. +- `judge_factory.py` (36 lines) — Runtime wiring. `FailClosedJudge` is the default when no LLM is configured (always returns BLOCKED). `build_judge_from_env` returns the real `LLMJudge` or `FailClosedJudge`, never None from the coached/protected wiring paths. +- `llm_client.py` (169 lines) — OpenRouter-backed LLM client. Validates HTTPS (loopback exception), blocks HTTP redirects, caps response size at 1 MB. API key read from `OPENROUTER_API_KEY` environment variable at config time. + +**Dependencies:** +- Inbound: `service/` (wires the engine/gates into governance decisions), `api/`, `mcp.py`, `cli.py` (thin adapters that consume `service/`) +- Outbound: `legis.canonical` (signing serialization), `legis.clock` (timestamp injection), `legis.identity.entity_key` (SEI-keyed records), `legis.records.override_record` (record schema), `legis.store.protocol` / `legis.store.head_anchor` (append-only store + tail-truncation anchor) + +**Patterns Observed:** +- Protocol-typed injection for all external seams (`AppendOnlyStore`, `Clock`, `Judge`, `LLMClient`, `ProtectedValidator`) — every dependency is testable without a network or real store. +- Every verdict path appends exactly one record (no silent governance path); the only way to not append is to raise, which is fail-closed. 
+- Single sources of truth: `Verdict.model_emittable()` for LLM-emittable verdicts, `Verdict.accepting()` for what counts as cleared, `signing_fields()` shared by write and verify paths so they cannot drift. +- v3 chain position binding: signature includes `chain_seq` from the database column (not a payload field), closing delete-and-rechain attacks. +- Validator exception handling in `ProtectedGate.submit` (protected.py:355-358): any exception from the user-supplied validator is caught and treated as a veto (fail-closed), preventing an unexpected record shape from surfacing as a fail-open 500. +- `FailClosedJudge` sentinel: the coached/protected paths never degrade to a nil judge on misconfiguration; absence of LLM config produces an always-BLOCKED fallback, not an open path. + +**Concerns:** +- `TrailVerifier._requires_verification` (protected.py:133-142) derives verification requirement from in-record fields (`protected_cell`, `file_fingerprint`, etc.). An actor with raw DB write access can strip these markers and downgrade a protected record to unsigned — the verifier then skips it. This is a documented residual (protected.py:100-113) of the raw-file-write threat tier, mitigated only by the opt-in `HeadAnchor`. The `HeadAnchor` itself has a documented anchor-replay caveat (re-writing the anchor to match a truncated trail). Both are known, stated residuals, not silent gaps. +- `SignoffGate.is_cleared` (signoff.py:163-170) performs a full linear scan of all governance records for each is-cleared query. Not a correctness issue, but a potential performance concern on long-lived trails. No pagination or index is used. +- The `ProtectedValidator` callable type alias (`protected.py:203`) has no interface contract beyond `Callable[[OverrideRecord], bool]`. There is no documented precondition about what fields of `OverrideRecord` the validator may trust. 
The exception-as-veto guard mitigates unexpected inputs, but validator authors have no formal contract to code against. +- `llm_client.py` API key (`OPENROUTER_API_KEY`) is read from the environment at `llm_client_config_from_env()` call time (llm_client.py:44), which is at server startup. No rotation/reload mechanism is visible in this module; a key rotation requires a server restart. + +**Confidence:** High — Read 100% of all 8 files in the subsystem. Cross-verified verdict-path claims by tracing `submit_override` (engine.py:52-97), `ProtectedGate.submit` (protected.py:303-387), and `TrailVerifier.verify` (protected.py:144-197). Cross-verified signing field set against both write path (`_record_signed`, protected.py:241-301) and verify path (`TrailVerifier.verify`, protected.py:144-197). Checked `Verdict.model_emittable()` usage in `_parse_structured_response` (judge.py:107) and in `TrailVerifier` comment context. + +--- + +## Policy + +**Location:** `src/legis/policy/` + +**Responsibility:** Provides the agent-programmable policy grammar (boundary type registry, evaluation, and fail-closed UNKNOWN semantics), the policy-to-cell routing registry (loaded from `policy/cells.toml`), and the `@policy_boundary` self-honesty gate (decorator + static scanner + test-evidence evaluator) that enforces Legis's own governance honesty over its source. + +**Key Components:** +- `grammar.py` (109 lines) — `PolicyGrammar` registry: `register` is conflict-safe (no shadowing), `evaluate` is fail-closed (unregistered policy, exception from boundary, or non-`PolicyResult` return all yield `UNKNOWN` with `provenance_gap=True` — never CLEAR). `AllowlistBoundary` is the builtin. `default_grammar()` preloads builtins. +- `cells.py` (138 lines) — `PolicyCellRegistry` for policy-to-cell routing. Glob-capable (`fnmatch`) with exact-pattern precedence. `default_policy_cells()` defaults to `chill` (dev/test only, explicitly documented as unsafe for production). 
`fail_closed_policy_cells()` defaults to `structured`. `load_policy_cells()` reads `policy/cells.toml` (committed default). Validation rejects unknown cell names at load time. +- `decorator.py` (254 lines) — `@policy_boundary` decorator (metadata-only passthrough), `fingerprint_source` (shared canonicalization for runtime and static scanner — Q-L5 parity), `check_policy_boundary` (runtime honesty gate: verifies citation, invariant, test_ref resolution, fingerprint match, and test evidence quality). `_stable_ast_repr` avoids `ast.dump` version instability across Python 3.12/3.13. +- `boundary_scan.py` (456 lines) — Static AST scanner. `scan_policy_boundaries` walks all `.py` files, parses each, runs `_BoundaryVisitor`. Fail-degraded-not-dead: parse errors and RecursionError/MemoryError on a per-file basis produce a finding and continue (per dogfood-4 A2). `count_source_files` is the single source of truth for scope — a gate must not report PASS on zero files scanned. `assert_within_boundary` blocks path-traversal attacks on caller-supplied scan roots (deferred import of `service.errors` to avoid load-time cycle). +- `evidence.py` (233 lines) — `evaluate_test_evidence`: shared logic for both the static scanner and runtime gate (Q-L5 parity). Checks in order: disabled marker detection (POLICY-1), exercise (excluding calls inside uninvoked nested helpers), shadowing, and policy co-occurrence inside the same `assert` condition (Q-M8). Documented honest residuals: module-level `pytestmark`, aliased skip markers, fixture-mediated skips. +- `__init__.py` (1 line) — Module docstring only; no public re-exports. 
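As a rough illustration of the cell routing described for `cells.py`, the sketch below resolves a policy name with exact-pattern precedence over `fnmatch` globs and rejects unknown cell names at construction. The class name `CellRegistry`, the dict-based rule shape, and the first-match ordering among globs are assumptions, not the actual `PolicyCellRegistry` implementation.

```python
from fnmatch import fnmatchcase

VALID_CELLS = {"chill", "coached", "structured", "protected"}


class CellRegistry:
    """Hypothetical policy-to-cell router (not the real PolicyCellRegistry)."""

    def __init__(self, rules: dict[str, str], default_cell: str) -> None:
        # Reject unknown cell names up front, mirroring load-time validation.
        for cell in (*rules.values(), default_cell):
            if cell not in VALID_CELLS:
                raise ValueError(f"unknown cell: {cell!r}")
        self._rules = dict(rules)
        self._default = default_cell

    def cell_for(self, policy: str) -> str:
        # An exact pattern beats any glob that would also match.
        if policy in self._rules:
            return self._rules[policy]
        # Assumption: among globs, the first match in insertion order wins.
        for pattern, cell in self._rules.items():
            if fnmatchcase(policy, pattern):
                return cell
        # Unlisted policies fall through to the default cell.
        return self._default
```

With rules `{"sec.*": "protected", "sec.lint": "coached"}` and a `structured` default, `sec.lint` routes to `coached`, `sec.crypto` to `protected`, and any unlisted policy to `structured`.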
+ +**Dependencies:** +- Inbound: `service/` (wires policy grammar and cell registry into governance decisions, calls `scan_policy_boundaries` and `count_source_files` for the honesty gate), `enforcement/` (consumes `PolicyCellRegistry.cell_for` to select the enforcement gate per policy) +- Outbound: `legis.canonical` (`content_hash` used in `decorator.fingerprint_source`); `legis.service.errors.InvalidArgumentError` via a deferred call-time import in `boundary_scan.assert_within_boundary` (to avoid a load-time cycle — `service/__init__` imports `policy/`) + +**Patterns Observed:** +- Fail-closed by design at every ambiguous point: unregistered policy → UNKNOWN (not CLEAR), boundary exception → UNKNOWN, zero-file scan is explicitly distinguishable from a scan with zero findings. +- Deferred import pattern in `boundary_scan.assert_within_boundary` (boundary_scan.py:116-117) breaks a load-time cycle between `policy/` and `service/` without restructuring the dependency graph. +- Shared canonicalization (`fingerprint_source`) ensures the runtime gate and the static scanner compute identical fingerprints for the same source, preventing divergence (Q-L5). +- `evaluate_test_evidence` is the single evidence-judgement implementation used by both the static scanner and the runtime gate, so the two gates cannot have different evidence standards. +- `_stable_ast_repr` (decorator.py:104-126) is a forward-compatibility measure: `ast.dump` output changed between Python 3.12 and 3.13 (`show_empty` default), which would have invalidated pinned fingerprints on a Python bump. The stable serializer walks `_fields` explicitly. +- Cell routing precedence: exact patterns beat globs (cells.py:44-56); unlisted policies fall through to `default_cell`. The committed `policy/cells.toml` ships `default_cell = "structured"` (fail-closed for production). + +**Concerns:** +- `default_policy_cells()` (cells.py:64-71) defaults to `chill` and is documented as "NOT a safe production default". 
The code relies on composition roots to choose `fail_closed_policy_cells()` or `load_policy_cells()` instead. If a composition root omits the production selection, governance silently self-clears. The comment and docstring warn against this but there is no runtime guard preventing it. +- The `policy/` → `service/` layering edge (boundary_scan.py:116-117, deferred import) is a structural inversion: `policy/` reaches into `service/` for `InvalidArgumentError`. This is mitigated by the call-time import (no load-time cycle), but it means `policy/` is not independently deployable from `service/`. A dedicated `errors.py` module shared by both layers would close this without restructuring. +- `count_source_files` (boundary_scan.py:84-97) is a separate filesystem walk from `scan_policy_boundaries`. A race between the two (a file appearing or disappearing between the count and the scan) could produce a count mismatch. The gate compares count > 0 vs. findings, not exact file-set equality. In practice this is a non-issue for a local source scan, but is a gap for a hostile or rapidly changing filesystem. +- `evidence.py` documents three residual false-green classes (module-level `pytestmark`, aliased skip markers, fixture-mediated skips) that the gate structurally cannot detect while maintaining Q-L5 runtime/static parity. These are stated honestly, not silently absent. The live exposure is noted as nil at current decoration-site count. + +**Confidence:** High — Read 100% of all 6 files in the subsystem (grammar.py, cells.py, decorator.py, boundary_scan.py, evidence.py, __init__.py). Read the committed `policy/cells.toml`. Verified fail-closed path in `PolicyGrammar.evaluate` (grammar.py:62-85), the deferred import location (boundary_scan.py:116-117), the zero-scope guard (`count_source_files`, boundary_scan.py:84-97), and the shared `fingerprint_source` call in both decorator.py:144-162 and boundary_scan.py:237. 
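The fail-closed evaluation path verified above can be sketched as follows. This is a minimal illustration, assuming a result type with an outcome and a `provenance_gap` flag; the names `Grammar` and `Outcome`, and the `VIOLATION` member, are hypothetical and not taken from `grammar.py`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    CLEAR = "CLEAR"
    VIOLATION = "VIOLATION"  # hypothetical name for a non-clear result
    UNKNOWN = "UNKNOWN"


@dataclass(frozen=True)
class PolicyResult:
    outcome: Outcome
    provenance_gap: bool = False


class Grammar:
    def __init__(self) -> None:
        self._boundaries: dict[str, Callable[[dict], object]] = {}

    def register(self, name: str, boundary: Callable[[dict], object]) -> None:
        # Conflict-safe: an existing registration is never shadowed.
        if name in self._boundaries:
            raise ValueError(f"policy already registered: {name!r}")
        self._boundaries[name] = boundary

    def evaluate(self, name: str, context: dict) -> PolicyResult:
        unknown = PolicyResult(Outcome.UNKNOWN, provenance_gap=True)
        boundary = self._boundaries.get(name)
        if boundary is None:
            return unknown  # unregistered policy is never CLEAR
        try:
            result = boundary(context)
        except Exception:
            return unknown  # a crashing boundary is never CLEAR
        if not isinstance(result, PolicyResult):
            return unknown  # a malformed return is never CLEAR
        return result
```

The design point is that `CLEAR` is reachable only through a registered boundary that returns a well-formed result; every other path yields `UNKNOWN` with the gap recorded.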
+ +--- + +## Identity + +**Location:** `src/legis/identity/` + +**Responsibility:** Resolves git locators to Loomweave-minted Stable Entity Identities (SEIs), producing opaque `EntityKey` values that key governance attestations so they survive rename/move events. + +**Key Components:** +- `entity_key.py` (41 lines) — Frozen dataclass holding `value: str` and `identity_stable: bool`; two factory classmethods (`from_locator`, `from_sei`) are the only construction paths; `value` is never parsed downstream — the docstring explicitly states the opacity discipline at line 4. +- `resolver.py` (263 lines) — `IdentityResolver` drives the WP-5.1 upgrade path: returns a locator-keyed (`identity_stable=False`) `EntityKey` on any failure path and an SEI-keyed (`identity_stable=True`) key only on a confirmed-alive Loomweave response; `IdentityResolution` frozen dataclass has a `__post_init__` guard (lines 61–100) that makes it impossible to construct a self-contradictory record (e.g. `alive=True` + status `NOT_ALIVE`). +- `loomweave_client.py` (240 lines) — `HttpLoomweaveIdentity` thin transport wrapper over `urllib`; injectable `fetch` callable for offline tests; HMAC signed via `weft_signing.sign_weft_request` when a key is provisioned; `_validate_base_url` enforces HTTPS for non-loopback hosts (lines 139–159); `_decode_json_response` enforces a 1 MB response cap; SEI strings are URL-quoted but never parsed (lines 224, 232). +- `__init__.py` (1 line) — Module docstring only; no re-exports. 
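The key shape and upgrade path described above can be sketched roughly as below. The `EntityKey` fields and factory names follow the catalog entry; the `key_for` helper and the `{"sei": ..., "alive": ...}` response shape are hypothetical and stand in for the real resolver and Loomweave wire format. Unlike the described class, this plain dataclass does not prevent direct construction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EntityKey:
    value: str  # opaque: never parsed downstream
    identity_stable: bool

    @classmethod
    def from_locator(cls, locator: str) -> "EntityKey":
        return cls(value=locator, identity_stable=False)

    @classmethod
    def from_sei(cls, sei: str) -> "EntityKey":
        return cls(value=sei, identity_stable=True)


def key_for(locator: str, response: Optional[dict]) -> EntityKey:
    # Hypothetical helper: any failure path degrades to a locator key.
    if response is None:
        return EntityKey.from_locator(locator)
    sei = response.get("sei")
    # Strict identity check: truthy non-bool values such as 1 or "true"
    # must not promote an entity to a stable identity.
    if response.get("alive") is not True or not isinstance(sei, str) or not sei:
        return EntityKey.from_locator(locator)
    return EntityKey.from_sei(sei)
```

Only a response carrying the literal `True` for `alive` and a non-empty SEI string yields a stable key; everything else keeps the record locator-keyed with `identity_stable=False`.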
+ +**Dependencies:** +- Inbound: `enforcement/` (uses `IdentityResolver` and `EntityKey`), `governance/signoff_binding.py` (SEI-keyed sign-off), `records/override_record.py` (embeds `EntityKey`) +- Outbound: `canonical.content_hash` (lineage snapshot hashing, resolver.py line 168), `weft_signing` (HMAC transport signing, loomweave_client.py lines 33–38), external Loomweave HTTP service + +**Patterns Observed:** +- Fail-closed on every transport and parse error: capability probe failure clears the capability latch (`_capable = None`, resolver.py lines 148–151) so the next resolve retries rather than trusting a stale positive; locator resolve failures and malformed responses all return the `degraded` value (locator-keyed, `UNAVAILABLE`). +- Capability TTL re-probe (5 min window, resolver.py lines 27, 127–153) prevents the positive-latch-forever bug (Q-L6): a Loomweave that loses SEI capability mid-life is noticed within one TTL window. +- `alive is not True` strict identity check (resolver.py lines 194, 232, comment "ID-SEI-2") rejects non-bool truthy values from a buggy or hostile Loomweave — a string `"false"` or integer `1` cannot promote a dead entity to a stable identity. +- `resolve_supplied_sei` returns `None` (not a degraded locator key) when an agent-supplied SEI cannot be confirmed alive (resolver.py lines 171–209): silently demoting an L1 SEI bind to a locator-keyed record is explicitly refused. +- TLS custody warning is emitted but not enforced when `LEGIS_ALLOW_INSECURE_REMOTE_HTTP=1` is set for non-loopback hosts (loomweave_client.py lines 147–158, comment "ID-SEI-1") — the flag is the documented dev-only escape hatch. + +**Concerns:** +- The capability TTL and latch-clearing pattern is correct, but a capability probe failure at the START of `_capability()` (before a latch is ever set, `_capable is None`) logs a warning and returns `False` — which is the right degraded behavior. 
However, if the Loomweave host is reachable but systematically returns a non-`{"sei": {"supported": true}}` body, `_capable` is set to `False` (not `None`) and latched for the full TTL, suppressing re-probes even when the upstream recovers. This is acceptable but is not documented as a known limitation alongside the Q-L6 fix. +- The `content_hash` field on `IdentityResolution` comes verbatim from the Loomweave response (resolver.py lines 200–201, 252–254) and is not independently verified — Legis trusts Loomweave's assertion. This is architecturally correct (Loomweave is the authority), but an on-path attacker who can forge a response (e.g. via the `LEGIS_ALLOW_INSECURE_REMOTE_HTTP` escape) controls the content axis of governance records, which is not called out in the security model. + +**Confidence:** High — Read 100% of all four files; cross-verified imports (`resolver.py` line 19 imports `content_hash`, line 20 imports `LoomweaveIdentity`; `loomweave_client.py` lines 33–38 import `weft_signing`). Fail-closed and SEI-opacity behaviors verified line-by-line. + + +## Store + +**Location:** `src/legis/store/` + +**Responsibility:** Provides a record-agnostic, append-only, hash-chained SQLite audit store with DB-level UPDATE/DELETE triggers, contiguous-seq verification, and an optional out-of-band head anchor for tail-truncation detection. 
+ +**Key Components:** +- `audit_store.py` (457 lines) — `AuditStore` is the implementation; SQLAlchemy with `NullPool` (no lingering locks); `synchronous=FULL` enforced unconditionally (lines 69–77, not configurable); `journal_mode=WAL`; `BEGIN IMMEDIATE` write locks on every append path; `append_signed` (lines 297–312) hands the signer its `(seq, prev_hash)` under the held lock so the v3 HMAC binds the exact position the row receives (AUD-1); `transaction()` (lines 179–212) provides batched appends with thread-local ambient connection; `_assert_no_batch_in_progress` (lines 222–240) turns mid-batch reads into explicit `RuntimeError`s. +- `head_anchor.py` (143 lines) — `HeadAnchor` sidecar file holding `(head_seq, head_chain_hash)` HMAC-signed with `enforcement.signing`; `update` uses temp-file + `os.replace` for atomic writes (line 93); `check` fails closed on missing file (`AnchorError`, lines 107–113) and on signature mismatch (lines 120–121); the REPLAY LIMITATION (a snapshotting attacker can restore a genuine older anchor) is explicitly documented in the module docstring (lines 34–47). +- `protocol.py` (68 lines) — `AppendOnlyStore` Protocol and `AuditRecordLike` Protocol; `transaction()` docstring (lines 57–68) prohibits reads inside batches and documents that `AuditStore` enforces this; `append_signed` contract (lines 27–37) documents the reserve-sign-insert atomicity guarantee. +- `__init__.py` (1 line) — Module docstring only; no re-exports. 
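The `HeadAnchor` behaviour described above (atomic temp-file + `os.replace` write, fail-closed `check`) could look roughly like this. It is a sketch under stated assumptions: the JSON file layout, the `_mac` helper, and the function names `update_anchor` / `check_anchor` are hypothetical, and the real module signs via `enforcement.signing` rather than a local HMAC helper.

```python
import hashlib
import hmac
import json
import os
import tempfile
from pathlib import Path


class AnchorError(Exception):
    """Raised when the anchor is missing, malformed, or fails verification."""


def _mac(key: bytes, head_seq: int, head_chain_hash: str) -> str:
    # Hypothetical signing helper: HMAC-SHA256 over a sorted-key JSON body.
    body = json.dumps(
        {"head_chain_hash": head_chain_hash, "head_seq": head_seq},
        sort_keys=True, separators=(",", ":"),
    ).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def update_anchor(path: Path, key: bytes, head_seq: int, head_chain_hash: str) -> None:
    record = {"head_seq": head_seq, "head_chain_hash": head_chain_hash,
              "mac": _mac(key, head_seq, head_chain_hash)}
    # Write a temp file in the same directory, then rename it over the target,
    # so a reader never observes a partially written anchor.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(record, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def check_anchor(path: Path, key: bytes, head_seq: int, head_chain_hash: str) -> None:
    # Every failure mode raises AnchorError: missing file, bad JSON,
    # bad signature, or a store head that differs from the anchored head.
    try:
        record = json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError as exc:
        raise AnchorError("anchor file missing") from exc
    except ValueError as exc:
        raise AnchorError("anchor file malformed") from exc
    try:
        expected = _mac(key, int(record["head_seq"]), str(record["head_chain_hash"]))
        stored = str(record["mac"])
    except (KeyError, TypeError, ValueError) as exc:
        raise AnchorError("anchor file malformed") from exc
    if not hmac.compare_digest(stored.encode("utf-8"), expected.encode("utf-8")):
        raise AnchorError("anchor signature mismatch")
    if (record["head_seq"], record["head_chain_hash"]) != (head_seq, head_chain_hash):
        raise AnchorError("store head does not match anchor")
```

Note that this sketch shares the documented replay limitation: an attacker who restores a genuine older anchor together with a matching truncated store still passes `check_anchor`.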
+ +**Dependencies:** +- Inbound: `enforcement/engine.py` (imports `AppendOnlyStore` protocol), `enforcement/protected.py` (imports `AnchorError`, `HeadAnchor`, `AppendOnlyStore`), `enforcement/signoff.py` (imports `HeadAnchor`, `AppendOnlyStore`) +- Outbound: `canonical` (`canonical_json`, `content_hash`; audit_store.py line 40), `enforcement/signing` (`sign`, `verify`; head_anchor.py lines 56–57), `legis.config.ensure_sqlite_parent` (lazy import, audit_store.py line 127) + +**Patterns Observed:** +- Bidirectional coupling between `store/` and `enforcement/`: `enforcement/` imports `store.protocol` and `store.head_anchor`; `store/head_anchor.py` imports `enforcement.signing`. The dependency graph is: `store.head_anchor` → `enforcement.signing` → `canonical`; `enforcement.{engine,protected,signoff}` → `store.{protocol,head_anchor}`. This forms a cycle at the package level (`store ↔ enforcement`) but NOT at the module level (no circular imports in practice because `audit_store.py` does NOT import enforcement and `enforcement.signing` does not import store). +- `verify_integrity` (lines 362–442) is O(N) by design: checks seq contiguity, recomputes `content_hash` for every payload, walks the `prev_hash` chain, and recomputes `chain_hash`. The `allow_nan=False` in `canonical_json` catches a NaN/Infinity-injected payload that `json.loads` would silently accept (lines 403–415). +- DB-level triggers (lines 161–177) reject UPDATE/DELETE at the SQLite engine before any application logic runs; the application layer has no mutation methods — two independent enforcement layers. +- `append` is not safe on an `initialize=False` handle opened against a DB with no log table: `_has_log_table` guards reads (lines 314–323), but `_insert` / `_write` would raise `OperationalError` on the missing table. This is by design — `initialize=False` is for read-only inspection handles.
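The DB-level trigger layer noted above can be illustrated with the standard-library `sqlite3` module. The table name, column set, and trigger names here are assumptions for the sketch, not the schema in `audit_store.py`; only the technique (a `BEFORE UPDATE` / `BEFORE DELETE` trigger that raises) is taken from the description.

```python
import sqlite3

# Hypothetical schema: an append-only log plus triggers that abort any
# UPDATE or DELETE inside the SQLite engine itself.
SCHEMA = """
CREATE TABLE audit_log (
    seq        INTEGER PRIMARY KEY,
    payload    TEXT NOT NULL,
    prev_hash  TEXT NOT NULL,
    chain_hash TEXT NOT NULL
);
CREATE TRIGGER audit_log_no_update BEFORE UPDATE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;
CREATE TRIGGER audit_log_no_delete BEFORE DELETE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO audit_log VALUES (1, '{}', '', 'h1')")


def is_rejected(sql: str) -> bool:
    """Return True if the engine refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


print(is_rejected("UPDATE audit_log SET payload = 'x' WHERE seq = 1"))
print(is_rejected("DELETE FROM audit_log WHERE seq = 1"))
```

The point of the second layer is that a caller who bypasses the Python API and opens the database directly still cannot rewrite history through ordinary SQL, although it does not stop someone who can drop the triggers or replace the file.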
+ +**Concerns:** +- The store↔enforcement bidirectional coupling is real but is NOT a circular import at runtime: `store.head_anchor` → `enforcement.signing` → `canonical` is a clean downward dependency; `enforcement.{engine,protected,signoff}` → `store.protocol` / `store.head_anchor` is also clean downward. The cycle is only at the package-name level. The governance-honesty risk is low: neither direction pulls in logic that could silently change a governance decision. However, if `enforcement.signing` ever gains upward imports back into `store`, the current non-circular property breaks silently — a refactor to extract signing into a shared `crypto/` leaf would close this cleanly. +- The head anchor's REPLAY LIMITATION (a snapshotting attacker can restore a genuine earlier anchor and a paired truncated DB, and the check passes) is honestly documented in `head_anchor.py` lines 34–47 and in the project README. No mitigating control exists for local-filesystem deployments; the documentation correctly names WORM/remote storage or an external monotonicity monitor as the only closure. This is a conceded residual threat, not a gap. +- `transaction()` nested re-entrance silently reuses the outer batch (lines 201–204), which is the correct behavior for avoiding a double commit, but it means a caller who believes it opened a fresh transaction boundary has not — if the outer batch rolls back, so does work the inner caller expected to commit. No warning is emitted on re-entrance. This is low-severity for an append-only store but could surprise future callers. + +**Confidence:** High — Read 100% of all four files; cross-verified the bidirectional coupling by grepping all import statements across both packages (head_anchor.py:56–57, enforcement/engine.py:25, enforcement/protected.py:25–26, enforcement/signoff.py:21–22); read enforcement/signing.py in full to confirm it imports only `canonical` (no back-reference to store).
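The `allow_nan=False` tamper check that `verify_integrity` relies on can be reproduced in a few lines. The sketch below uses the `json.dumps` parameters this analysis documents for `canonical_json`; the hex-digest output format and the `payload_is_intact` helper are assumptions for illustration.

```python
import hashlib
import json


def canonical_json(obj) -> str:
    # Sorted keys, compact separators, raw UTF-8, and no NaN/Infinity.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False)


def content_hash(obj) -> str:
    # Assumed output format: bare SHA-256 hex digest of the UTF-8 bytes.
    return hashlib.sha256(canonical_json(obj).encode("utf-8")).hexdigest()


def payload_is_intact(stored_text: str, stored_hash: str) -> bool:
    """Hypothetical per-row check: re-canonicalize and compare hashes."""
    try:
        # json.loads accepts the NaN and Infinity literals by default ...
        payload = json.loads(stored_text)
        # ... but canonical_json raises ValueError on them, which is
        # reported as tamper instead of propagating as a crash.
        return content_hash(payload) == stored_hash
    except ValueError:
        return False
```

A row whose stored text was edited to contain `NaN` therefore fails verification in the same way as a row whose hash simply does not match.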
+ + +## Crypto/Leaf Primitives + +**Location:** `src/legis/canonical.py`, `src/legis/weft_signing.py`, `src/legis/provenance.py`, `src/legis/records/` + +**Responsibility:** Provide the canonical JSON serialization, content hashing, transport-HMAC signing, provenance vocabulary, and governance record schemas that all upper layers share without creating inter-layer dependencies. + +**Key Components:** +- `canonical.py` (51 lines) — `canonical_json`: `json.dumps` with `sort_keys=True`, `separators=(",",":")`, `ensure_ascii=False`, `allow_nan=False`; `content_hash`: SHA-256 of the UTF-8-encoded canonical form. The `ensure_ascii=False` is the byte-for-byte HMAC contract with Wardline's Python signer; both sides use identical `json.dumps` params, so non-ASCII payloads round-trip without escape divergence. The module docstring (lines 1–34) documents the Q-L4 RFC-8785 deferral, the cross-repo golden vector status, and the condition that triggers the upgrade (a non-Python verifier). +- `weft_signing.py` (91 lines) — `sign_weft_request`: produces `X-Weft-Component`, `X-Weft-Timestamp`, `X-Weft-Nonce` headers; signs `METHOD\npath?query\nsha256(body)\ntimestamp\nnonce`; body canonicalization uses `ensure_ascii=True` (NOT `canonical_json`) to match the transport contract — deliberately different from the audit HMAC contract (lines 38–42, module docstring lines 9–15). `weft_hmac_key_from_env`: channel-specific env var falls back to `LEGIS_HMAC_KEY`. +- `enforcement/signing.py` (61 lines) — `sign`/`verify` for per-record HMACs; version-tagged prefixes (`v2` binds content, `v3` additionally binds `chain_seq`); `verify` accepts both v2 and v3 without ambiguity (lines 57–61); uses `canonical_json` (the `ensure_ascii=False` variant) for HMAC body. This file lives in `enforcement/` but functions as a shared crypto primitive used also by `store/head_anchor.py`. 
+- `provenance.py` (28 lines) — `Provenance` str-Enum; currently one member (`UNAUTHENTICATED`); shared by `checks/` and `pulls/` without either importing the other. +- `records/override_record.py` (40 lines) — `OverrideRecord` frozen dataclass; `extensions: dict` open field for coached- and protected-cell additions without schema migration; `to_payload()` flattens to dict for `AuditStore.append`. +- `records/__init__.py` (1 line) — Module docstring only. + +**Dependencies:** +- Inbound: `canonical` ← `store/audit_store.py`, `enforcement/signing.py`, `identity/resolver.py`, `governance/`, `wardline/` (cross-repo: Wardline's `core/legis.py` replicates the same `json.dumps` call); `weft_signing` ← `identity/loomweave_client.py`, `filigree/`; `enforcement/signing` ← `store/head_anchor.py`, `enforcement/protected.py`, `enforcement/verdict.py`; `records/` ← `enforcement/engine.py`, `governance/` +- Outbound: `canonical.py` — stdlib only (`hashlib`, `json`); `weft_signing.py` — stdlib only (`hashlib`, `hmac`, `json`, `os`, `urllib.parse`); `enforcement/signing.py` — `canonical` only; `provenance.py` — stdlib only; `records/override_record.py` — `identity.entity_key` only + +**Patterns Observed:** +- Two distinct canonicalization contracts coexist intentionally: `canonical_json` (`ensure_ascii=False`) for audit HMACs and content hashes; `weft_body_bytes` (`ensure_ascii=True`) for transport HMACs. Both are documented; the module docstrings explicitly cross-reference each other to prevent accidental unification (canonical.py lines 1–34, weft_signing.py lines 1–27). +- `allow_nan=False` in `canonical_json` is a tamper-detection aid: a payload injected with `Infinity` or `NaN` survives `json.loads` but raises on re-canonicalization, which `verify_integrity` catches as tamper rather than a crash (audit_store.py lines 403–415). 
+- The v2/v3 signature version tag allows the signing primitive to evolve the field set without ambiguity in stored records; `verify` accepts both prefixes, so a store with mixed v2/v3 records verifies correctly. +- `Provenance` as a str-Enum means `json.dumps` / `canonical_json` emit the bare string value without any enum wrapper, keeping wire payloads stable across Python versions and avoiding coercion on read-back (provenance.py lines 14–16). +- `OverrideRecord.extensions` (records/override_record.py line 24) is the deliberate extension point for coached- and protected-cell fields; no schema migration is required to add judge output or HMAC binding to an override record. + +**Concerns:** +- `enforcement/signing.py` is located inside `enforcement/` but is consumed as a shared primitive by `store/head_anchor.py` — it is not co-located with the other leaf primitives (`canonical.py`, `weft_signing.py`) it logically belongs with. This is a naming/location inconsistency rather than a governance-honesty risk, but it means a reader of `store/` must know to look in `enforcement/` for the signing primitive, which the `head_anchor.py` import makes visible but surprising. +- `Provenance` has exactly one member (`UNAUTHENTICATED`) and the docstring states an authenticated path "would add a stronger value here." The enum is not yet used in any decision path — it is recorded into check/pull payloads but nothing currently gates on its value. If policy logic is added that trusts a higher-provenance value, the gap between the recorded `unauthenticated` claim and any actual authentication verification must be explicitly re-examined. +- The cross-repo non-ASCII golden vector is not yet pinned (canonical.py lines 22–27): both Python signers use identical `json.dumps` params, so they agree by construction, but a Wardline-side drift (e.g. 
switching to `ensure_ascii=True` in a refactor) would break cross-repo HMAC verification without a failing test on either side until a non-ASCII payload hits production. The fix (a shared golden HMAC vector with a non-ASCII payload in Wardline's repo) is documented as a Wardline-side follow-up. + +**Confidence:** High — Read 100% of `canonical.py` (51 lines), `weft_signing.py` (91 lines), `enforcement/signing.py` (61 lines), `provenance.py` (28 lines), `records/override_record.py` (40 lines), `records/__init__.py` (1 line). Cross-verified that `canonical.py` has no legis imports (stdlib only); verified `weft_signing.py` has no legis imports (stdlib only); verified `enforcement/signing.py` imports only `canonical` (line 30); verified `records/override_record.py` imports only `identity.entity_key` (line 13). + +--- + +## MCP Stdio Adapter + +**Location:** `src/legis/mcp.py` + +**Responsibility:** Implements the MCP-over-stdio JSON-RPC transport that exposes 23 agent-callable tools, translating tool calls into `service/` calls and mapping all `ServiceError` subclasses to typed `isError` error envelopes without duplicating any governance decision. + +**Key Components:** +- `McpRuntime` dataclass (lines 154–182) — holds all wired dependencies (engine, gates, ledger handles, identity, filigree, warpline) per launch; `posture_ledger` is stored as a handle only (never a cached floor value, D2 discipline) +- `build_runtime(agent_id)` (lines 203–309) — composition root; constructs gates conditionally on `LEGIS_HMAC_KEY`; wires identity, filigree, warpline, posture ledger; fail-closed defaults throughout (e.g. 
missing policy cells → `fail_closed_policy_cells()`) +- `tool_definitions()` (lines 369–1316) — emits JSON Schema for all 23 tools including `outputSchema` for every tool (enforced after dogfood-4 A6 incident where an omitted top-level `"type": "object"` caused Claude Code's zod validator to drop all 21 tools) +- `_TOOL_HANDLERS` dict (lines 2574–2599) — dispatch table mapping tool names to 23 `_tool_*` functions; `call_tool()` (lines 2602–2610) wraps every dispatch in `_service_error()` catch-all +- `_service_error(exc)` (lines 1432–1484) — ServiceError→error_code mapping table; covers 12 typed cases including `NoSuchRequestError` before `NotFoundError` (subclass ordering), `WardlineRoutingError` before generic `ServiceError`, and a logging fall-through to `INTERNAL_ERROR` for unhandled exceptions +- `_recovery_for(code)` (lines 1326–1388) — maps each error code to `{recoverable, next_action}` text; `AUDIT_INTEGRITY_FAILURE` and `INTERNAL_ERROR` are marked `recoverable=False` +- `ERROR_ENVELOPE_SCHEMA` (lines 348–366) — shared schema for all `isError:true` responses; `additionalProperties: False` with required `[error_code, message, recoverable, next_action]` +- `run_jsonrpc()` / `main()` (lines 2705–2748) — stdlib-only stdio loop with `_read_bounded_line()` enforcing a 16 MiB per-request cap (overridable via `LEGIS_MCP_MAX_REQUEST_BYTES`) +- `_load_policy_cell_registry()` (lines 184–200) — resolution order: env var → `policy/cells.toml` → fail-closed (unless `LEGIS_DEV_DEFAULT_CELLS=1`) +- `_floored_registry(runtime)` (lines 1588–1596) — called fresh at every cell-resolution site; missing ledger maps to `_NoLedger` → structured floor, never chill + +**Dependencies:** +- Inbound: `cli.py` (`legis mcp` command bootstraps `build_runtime` and calls `main()`); `api/app.py` (imports `_load_policy_cell_registry` for shared config loading) +- Outbound: `service/governance`, `service/wardline`, `service/explain`, `service/preflight`; `enforcement/` (engine, protected 
gate, signoff gate, trail verifier); `store/audit_store`; `policy/cells`, `policy/grammar`; `identity/resolver`; `filigree/client`; `governance/binding_ledger`; `posture/floor`, `posture/ledger`; `git/surface`; `checks/surface`; `pulls/surface`; `wardline/ingest`; `warpline_preflight/client`; `doctor`, `install`, `hooks` (via `cli.py` best-effort refresh on boot) + +**Patterns Observed:** +- Strict thin-adapter discipline: every tool handler calls a `service/` function and maps the result to the MCP envelope; no governance decision logic is in `mcp.py` itself +- Launch-bound agent identity: `agent_id` is set at startup and propagated to every record; no tool schema accepts an actor argument (enforced by `_validate_argument_keys` against `_allowed_tool_arguments`) +- Idempotency via request-hash scan: `_existing_idempotent_record()` walks the full verified trail (O(N) HMAC cost deliberate — optimization that would skip verification was explicitly declined in rc4 review) +- Posture floor always read fresh per request via `_floored_registry()`; floor-raising during an idempotent replay emits `floor_warning` rather than silently grandfathering past the new floor (D4 discipline) +- `warpline` field annotated `# advisory sibling; NEVER read by a verdict path` (line 181); `warpline_preflight_get` tool description explicitly says "Purely advisory" +- `_one_of()` helper (lines 321–332) injects `"type": "object"` at every discriminated outputSchema to prevent zod validator rejection of entire tools/list + +**Concerns:** +- God-module size (2748 LOC): all 23 tool handlers, the full JSON Schema catalogue, the stdio loop, runtime construction, and multiple utility classes live in one file. No honesty violation, but high change-coupling — adding a tool requires edits across tool_definitions(), _TOOL_HANDLERS, _allowed_tool_arguments(), and the handler itself, all in one file with no enforced co-location. 
+- `api/app.py` imports `_load_policy_cell_registry` directly from `mcp.py` (line 398): a transport module's private helper (`_` prefix) is shared by the HTTP adapter. This is a transport-on-transport dependency; the function belongs in `config.py` or `policy/` to break the coupling (a comment at the site acknowledges it as Q-H2 / "store-location resolvers live in the transport-agnostic config module"). +- `_service_error` fall-through (line 1483): any unhandled exception reaches `INTERNAL_ERROR` with `logger.error`; operators/Sentry see it, but the agent receives only `str(exc)` — no structured payload — which may be less actionable than the typed cases above it. Not a false-green (error is surfaced), but observability gap for novel exception types. +- `WardlineRoutingError` has three HTTP-distinct kinds (`SERVER_MISCONFIGURED` → 500, `SERVER_OWNED` → 403, `MALFORMED` → 422) but MCP collapses all three to `INVALID_CELL_SPEC` (line 1470); this is intentional and documented in `service/errors.py`, but an agent cannot distinguish a server misconfiguration (operator action needed) from a caller error (argument fix needed) by error code alone. + +**Confidence:** High — read 100% of `mcp.py` structurally (build_runtime, McpRuntime, _TOOL_HANDLERS dispatch, _service_error mapping table, _recovery_for, ERROR_ENVELOPE_SCHEMA, run_jsonrpc, all 23 handler function signatures); sampled 8 handler bodies in depth; read `service/errors.py` fully. Cross-verified: `_AGENT_TOOLS` frozenset (line 81) has 23 members matching `_TOOL_HANDLERS` (line 2574) entry count; `NoSuchRequestError` subclass ordering confirmed correct in both the error hierarchy and the mapping. + +--- + +## HTTP API Adapter + +**Location:** `src/legis/api/app.py` + +**Responsibility:** Implements the FastAPI HTTP adapter that exposes Legis governance, git/CI, and wardline surfaces as REST endpoints, translating `ServiceError` subclasses to HTTP status codes and delegating all governance decisions to `service/`. 
+ +**Key Components:** +- `create_app()` factory (lines 314–954) — single application factory injecting all dependencies (engine, gates, identity, filigree, binding ledger, posture ledger) with lazy fallbacks from environment; returns a `FastAPI` instance +- Auth layer: `_verify_secret()` / `_token_actor_from_mapping()` (lines 94–189) — scope-gated bearer token auth; single-secret mode defaults to `writer`-only, requiring explicit scope grant for `operator`; `LEGIS_UNSAFE_DEV_AUTH=1` escape hatch; `_authenticated_actor_configured()` guards whether body-supplied actor is trusted +- `verify_writer` / `verify_operator` dependencies (lines 211–216) — FastAPI `Depends` guards that enforce writer/operator scope split on all write routes; operator routes use a separate `verify_operator` dependency +- `_WARDLINE_ROUTING_STATUS` map (lines 295–299) — three-way HTTP status dispatch for `WardlineRoutingError` kinds (500/403/422), complementing MCP's single-code collapse +- `POST /overrides` unified route (lines 586–715) — cell-dispatched override submission; reads `floored_registry()` per request (D2); branches on `chill`/`coached`/`structured`/`protected` cells; `202` for structured (never `201` — "an old '201 == accepted' reader must not misread it"); `need_inputs` discriminant returns `422` (not a generic error) for protected-cell missing inputs +- `_unresolved_input_http()` (lines 192–208) — structured `422` for non-resolving inline SEI; carries `weft_reason` dict matching MCP's `UNRESOLVED_INPUT` envelope +- `POST /signoff/{seq}/bind-issue` (lines 766–802) — maps 6 service exception types to distinct status codes including `502` for `FiligreeError` (typed, recoverable, not 500) +- `POST /wardline/scan-results` (lines 892–952) — `WardlineDirtyTreeError` → `JSONResponse(409)` (not 2xx); `outcome: ScanOutcome.ROUTED` plus `artifact_status_reason` honesty field + +**Dependencies:** +- Inbound: `cli.py` (`legis serve` sets env vars and calls 
`uvicorn.run("legis.api.app:create_app", factory=True)`) +- Outbound: `service/governance`, `service/explain`, `service/wardline`; `enforcement/` (engine, protected gate, signoff gate, trail verifier); `store/audit_store`; `policy/cells`, `policy/grammar`; `identity/resolver`; `filigree/client`; `governance/binding_ledger`, `governance/filigree_gate`; `posture/floor`, `posture/ledger`; `git/surface`, `git/rename_feed`, `git/pull_request`; `checks/surface`; `pulls/surface`; `wardline/ingest`, `wardline/governor`; `config` (db URL resolvers); `mcp._load_policy_cell_registry` (cross-transport import, line 398) + +**Patterns Observed:** +- Cell dispatch inside the unified `POST /overrides` route mirrors the MCP `_tool_override_submit` dispatch, both delegating to the same `service/` functions; no governance logic duplicated between adapters +- Status codes carry semantic weight: `201` (accepted/recorded), `202` (pending escalation), `409` (blocked/conflict/dirty-tree), `422` (input error/need_inputs), `500` (integrity failure), `502` (upstream unavailable) +- `floored_registry()` called fresh at each request via the `floored()` closure (lines 415–423) with `PostureLedger` handle shared but floor re-read each time (D2 compliance) +- Auth scope model: `writer` for all mutation routes, `operator` exclusively for `POST /protected/operator-override` and `POST /signoff/{seq}/sign`; unscoped tokens rejected by default (AUTH-1 comment, line 118) +- `LEGIS_ALLOW_UNSCOPED_API_TOKENS=1` flag comment (line 123) explicitly notes it grants unscoped tokens full operator authority — intentionally blunt warning in code + +**Concerns:** +- `create_app()` imports `_load_policy_cell_registry` from `legis.mcp` (line 398): this is the transport-on-transport coupling noted under the MCP entry. 
The comment at that line attributes it to Q-H2 ("store-location resolvers live in the transport-agnostic config module") but the function still lives in `mcp.py` with a `_` prefix rather than the identified right home. +- `assert simple_engine is not None` (line 607) inside `post_override` after `simple_engine_for(cell)` returns `None` for coached when no judge is configured: this path would panic with `AssertionError` rather than a clean `ServiceError`. The `assert` is a correctness assumption that the upstream `not explanation.enabled` guard (line 603) would have caught the unwired case — but `simple_engine_for` returns `None` for coached without a judge, and `explanation.enabled` may still be `True` in some edge configurations. This is a potential unguarded assertion that would produce a 500 with stack trace rather than a structured error. (Medium confidence — the `explanation.enabled` path is the primary guard, but the assertion is a backup, not a primary defense.) +- No top-level exception handler is registered on the FastAPI app for unhandled `ServiceError` subclasses: any `ServiceError` that escapes route-level `except` blocks becomes an untyped 500. The individual routes cover their expected exception shapes, but a new `ServiceError` subclass not yet added to a route's except list would surface as a 500 without a structured payload. + +**Confidence:** High — read 100% of `app.py` (954 lines). Cross-verified: ServiceError imports at lines 51–60 match handler `except` clauses in routes; `_WARDLINE_ROUTING_STATUS` keys match `WardlineRoutingError` kind constants in `service/errors.py`; `floored()` closure construction confirmed at lines 415–423. 
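One way to close the untyped-500 gap for new `ServiceError` subclasses is a single fallback mapper that walks the exception's class hierarchy, so an unknown subclass inherits its nearest ancestor's mapping. The sketch below is framework-free and hypothetical: the status codes, error-code strings, and the `to_http` name are assumptions, and the exception classes are local stand-ins for those in `service/errors.py`.

```python
from __future__ import annotations


# Local stand-ins for the service-layer hierarchy (assumed shape).
class ServiceError(Exception): ...
class NotFoundError(ServiceError): ...
class NoSuchRequestError(NotFoundError): ...
class InvalidArgumentError(ServiceError): ...


# Hypothetical mapping; the real routes choose their own codes.
_STATUS: dict[type, tuple[int, str]] = {
    NoSuchRequestError: (404, "NO_SUCH_REQUEST"),
    NotFoundError: (404, "NOT_FOUND"),
    InvalidArgumentError: (422, "INVALID_ARGUMENT"),
    ServiceError: (500, "SERVICE_ERROR"),
}


def to_http(exc: Exception) -> tuple[int, dict]:
    """Map an exception to (status, structured body).

    Walking the MRO means the most specific registered class wins, so
    subclass ordering cannot be broken by the order of `except` clauses.
    """
    for cls in type(exc).__mro__:
        if cls in _STATUS:
            status, code = _STATUS[cls]
            return status, {"error_code": code, "message": str(exc)}
    # Not a ServiceError at all: still return a structured body.
    return 500, {"error_code": "INTERNAL_ERROR", "message": "internal error"}
```

Registered once as an app-level handler, a function of this shape would give every escaped `ServiceError` a structured payload, while the per-route `except` blocks keep their more specific responses.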
+ +--- + +## CLI Adapter + +**Location:** `src/legis/cli.py` + +**Responsibility:** Implements the `legis` command-line interface, providing subcommands for serving, MCP boot, governance gates, install/doctor/posture/operator management, and policy tooling, delegating all governance logic to `service/` and translating outcomes to exit codes. + +**Key Components:** +- `build_parser()` (lines 36–260) — argparse definition for all subcommands: `serve`, `mcp`, `check-override-rate`/`governance-gate`, `sei-backfill`, `policy-boundary-check`, `install`, `session-context`, `doctor`, `posture {show,set,rekey}`, `operator {enable,disable}` +- `main()` (lines 705–844) — top-level dispatch; `serve` sets env vars and calls `uvicorn.run`; `mcp` sets env vars and calls `legis.mcp.main(args.agent_id)`; operator/posture subcommands use full governance paths +- `_check_override_rate()` (lines 287–335) — override-rate gate: reads store → `evaluate_override_rate_gate()` in `service/`; fail-closed in CI (`LEGIS_ALLOW_MISSING_GOVERNANCE_DB` guard); integrity check before scoring; exit 1 on FAIL/missing-in-CI +- `_run_install()` (lines 609–689) — step-runner for `legis install`; catches per-step exceptions broadly (BLE001) to avoid half-applied installs, counts failures, returns 1 if any step failed +- `_run_posture()` / `_run_operator()` (lines 401–606) — operator elevation and posture floor management; `posture set` requires open session and matching key fingerprint; `posture rekey` chains KEY_RESET with no old key needed; env backend refused for rekey (cannot persist from child process) +- `_build_operator_signer()` (lines 365–398) — custody dispatch: `env` → `EnvSigner`, `age-file` → `AgeFileSigner` with passphrase from `LEGIS_OPERATOR_KEY_AGE_PASSPHRASE`; `keychain` raises LOUD (not shipped) +- `_parse_ttl()` (lines 344–362) — fail-closed: empty or non-positive TTL raises `ValueError`; no silent zero-length session windows +- `policy-boundary-check` handler (lines 779–832) — 
zero-scope guard: exits with code 2 (`NO_ROOT`) if scan root is missing or contains no Python files; explicitly blocks a vacuous green PASS on empty input + +**Dependencies:** +- Inbound: process entry point (`legis` console script); no other Legis module imports `cli.py` +- Outbound: `mcp.main()` (for `legis mcp`); `api.app.create_app` (via uvicorn, for `legis serve`); `service/governance.evaluate_override_rate_gate`; `governance/sei_backfill`; `policy/boundary_scan`; `store/audit_store`; `identity/loomweave_client`; `doctor.run_doctor`; `install.*`; `hooks.generate_session_context`, `hooks.refresh_instructions`; `posture.*`; `config` (db URLs) + +**Patterns Observed:** +- Subcommand handlers are private `_run_*()` functions called from `main()`; thin: I/O + env var forwarding + exit code mapping, no governance logic +- Override-rate gate delegates fully to `service.evaluate_override_rate_gate()`; CLI only adds I/O and exit code (the comment at line 323 explicitly says "The detect → require-key → verify → score decision lives in the service layer") +- `policy-boundary-check` has explicit no-scope false-green guard (zero-file = exit 2, not exit 0) mirroring the honesty stance; comment references `weft-ef2e898642 silent-clean-on-zero-scope` +- Operator elevation (D3): `posture set` cannot bypass session — requires a prior `operator enable`; fingerprint mismatch between signer and current epoch is caught before any session opens +- `_refresh_instructions_best_effort()` (lines 692–702) wraps boot-time instruction refresh in broad except with `logger.warning`; explicit comment: "Best-effort: never break the server, but don't vanish silently either" + +**Concerns:** +- `_run_install()` catches all exceptions per step with `except Exception` (line 677, BLE001 suppressed); this prevents tracebacks from aborting a partial install, but means a step can fail silently if the failure string is ambiguous. 
Acceptable tradeoff for install resilience, but could mask custody errors that should be fatal (e.g., a failed posture step printing `[FAIL]` still lets the overall install return 0 if it was a deferred step). +- None observed for governance-honesty discipline: the CLI makes no governance decisions; all decision paths delegate to `service/` or to the modules that own the logic (`install`, `doctor`, `posture`). Error paths uniformly use non-zero exit codes (1 for failures, 2 for usage/vacuous-scan). The operator elevation path is fail-closed at every guard (signer verification, fingerprint match, session persistence before audit append). + +**Confidence:** High — read 100% of `cli.py` (844 lines). Cross-verified: subcommand names in `build_parser()` match dispatch cases in `main()`; `_check_override_rate()` delegates confirmed by reading the service call at line 326 (`evaluate_override_rate_gate`); zero-scope guard at line 795 (`count_source_files`) confirmed to precede `scan_policy_boundaries`. + +--- + +## Posture + +**Location:** `src/legis/posture/` + +**Responsibility:** Maintains a signed, append-only posture floor that sets the minimum governance enforcement cell across all surfaces, enforces that an absent or empty ledger fails closed to `structured` (never `chill`), and gates floor transitions behind a short-lived sudo-style operator elevation session backed by an OS-keychain, age-encrypted file, or (explicitly opted-in) env-var key custody backend. + +**Key Components:** +- `floor.py` (84 lines) — `FlooredRegistry` subclasses `PolicyCellRegistry`; `cell_for` and `default_cell` are raised to the floor via `_max_tier`; `floored_registry()` calls `ledger.read_floor()` at call time (never cached, D2) and maps `None` to `"structured"` (fail-closed). This is the cross-surface chokepoint consumed by all three transports. 
+- `ledger.py` (506 lines) — `PostureLedger` wraps `AuditStore`; exposes `read_floor()` (single descending SQL scan skipping metadata records so an `OPERATOR_SESSION_OPENED` tail cannot lower the floor), `genesis()`, `transition()` (fingerprint-checked, v3 HMAC signed, fail-closed via `append_signed`), `session_opened()`, `rekey()`, and the `set_floor()` change gate. The change gate (lines 400–506) enforces: open session required, epoch fingerprint must match LEDGER (not session field), session audit record must be present, signer must prove custody, any fault refuses with zero records written. +- `session.py` (259 lines) — Persisted elevation session at `.weft/legis/operator_session.json`; atomic write via temp-file + `os.replace`; `load_session()` deletes stale files and returns `None` (fail-closed expiry); session file never holds key plaintext, passphrase, or raw blob — only `unlock_ref` (keychain item id, or `None` for age/env). +- `signing.py` (353 lines) — Three custody backends: `KeychainSigner` (key fetched per call, discarded), `AgeFileSigner` (blob + passphrase callback, key unwrapped per call), `EnvSigner` (plaintext env escape hatch; requires explicit `insecure_env=True` and emits `InsecureEnvKeyWarning`). `PostureSigner` / `PostureVerifier` protocols. `verify_signer_signature()` requires fingerprint match AND HMAC verification — self-attested fingerprint alone is not sufficient. +- `records.py` (54 lines) — Frozen `PostureRecord` dataclass; `to_payload()` deliberately excludes `seq/prev_hash/chain_hash` (added by `AuditStore`) to avoid breaking `verify_integrity`. +- `__init__.py` (75 lines) — Public re-exports of all five modules; constitutes the full public surface of the package. 
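
The fail-closed floor behaviour described for `floor.py` can be illustrated with a minimal sketch. This is not the Legis implementation: the tier order below is an assumption (only `chill` and `structured` are named in this analysis; `strict` is a hypothetical placeholder), and the function names are illustrative stand-ins for `_max_tier` and `floored_registry()`.

```python
from typing import Optional

# Assumed tier order, lowest to highest. Only "chill" and "structured" are
# names from the analysis; "strict" is a hypothetical placeholder.
TIER_ORDER = ("chill", "structured", "strict")


def max_tier(a: str, b: str) -> str:
    """Return whichever of the two cells ranks higher in TIER_ORDER."""
    return a if TIER_ORDER.index(a) >= TIER_ORDER.index(b) else b


def effective_floor(ledger_floor: Optional[str]) -> str:
    """Fail closed: an absent or empty ledger yields "structured"."""
    return "structured" if ledger_floor is None else ledger_floor


def floored_cell(configured: str, ledger_floor: Optional[str]) -> str:
    """Raise a configured cell to the floor; never lower it."""
    return max_tier(configured, effective_floor(ledger_floor))
```

Under these assumptions, a project configured as `chill` with no ledger at all still evaluates as `structured`, which is the property the Responsibility statement calls out.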
+ +**Dependencies:** +- Inbound: `api/app.py` (imports `floored_registry`, `PostureLedger`); `mcp.py` (imports `FlooredRegistry`, `_max_tier`, `floored_registry`, `PostureLedger`); `cli.py` (imports signers, `set_floor`, session functions); `install.py` (imports `select_backend`, `mint_key`, `key_fingerprint`, `wrap_key`); `doctor.py` (imports `PostureLedger`, `signing`, `records`); `hooks.py` (imports `PostureLedger`). +- Outbound: `legis.policy.cells` (`PolicyCellRegistry`, `CELL_TIER_ORDER`, `_validate_cell`); `legis.store.audit_store` (`AuditStore`); `legis.enforcement.signing` (`sign`, `verify`); `legis.config` (`operator_session_path`); `legis.clock` (`Clock`, `SystemClock`); `legis.install` (`OperatorKeyCustodyError` — one deferred import in `ledger.py:344`, inside `rekey()` only). + +**Patterns Observed:** +- Fail-closed at every boundary: `None` floor maps to `"structured"` in `floored_registry()` (floor.py:79–83); `read_floor()` returns `None` for absent file, absent table, or empty store (ledger.py:101–118); `load_session()` returns `None` for absent, malformed, or lapsed file (session.py:199–221); `verify_signer_signature()` returns `False` on any exception (signing.py:333). +- Key-never-resident: all three non-env backends fetch the key into a local variable per `sign` call and discard it; no backend exposes a `key` attribute; `__slots__` used on `_RawKeySigner`, `AgeFileSigner`, `KeychainSigner` to prevent attribute injection. +- Append-only chain with position binding: `transition()` uses `append_signed` with a build callback that folds `chain_seq=seq` into signed fields (v3 HMAC); a raise in the callback leaves no half-write (ledger.py:207–262). +- GENESIS idempotent guard: `genesis()` checks `current_epoch_fingerprint()` and is a no-op if any epoch-opening record exists — prevents double-genesis on reinstall (ledger.py:185–197). 
+- Read-fresh floor (D2): `FlooredRegistry` is constructed per-request in all three transports; the floor is never cached at runtime. +- `rekey()` hands key to custody sink BEFORE appending the `KEY_RESET` record — a custody failure leaves no fingerprint the ledger cannot later sign against (ledger.py:356–358). +- Metadata records (`OPERATOR_SESSION_OPENED`) are explicitly excluded from the `_FLOOR_RECORD_KINDS` set and skipped by the descending `read_floor()` scan, so a session-open tail cannot lower or freeze the effective floor (ledger.py:82, 116). + +**Concerns:** +- `read_floor()` does NOT call `verify_integrity()` before returning the floor value. The task description states "read_floor() gates on verify_integrity()" but the actual implementation (ledger.py:92–118) performs only a descending SQL payload scan with no chain-hash verification. A silently corrupted ledger (raw DB write that keeps SQL rows intact but alters payload bytes) could cause `read_floor()` to return an attacker-lowered floor value. `verify_integrity()` is called separately by `doctor.py:480` during health checks, not inline on floor reads. This is a documented residual threat ("raw-DB-write deletion/truncation are conceded residual threats" per CLAUDE.md), but the gap is worth recording explicitly — the floor-read path trusts payload content without chain verification. +- The `posture -> install` coupling (`ledger.py:344`: `from legis.install import OperatorKeyCustodyError`) is a deferred import inside `rekey()`. `install.py` is a large module (owns CLI install flows, doctor probe logic, config/runtime setup) with the opposite dependency direction expected: install calls posture during setup. The current coupling is narrow (one exception class) but the direction is logically inverted — `OperatorKeyCustodyError` belongs in a shared errors or posture module, not in install. 
This creates a latent risk: changes to `install.py`'s imports or structure can inadvertently affect the posture/rekey path. +- `epoch_reset_unacknowledged()` and `current_epoch_fingerprint()` both call `self.store.read_all()` (full table scans, ledger.py:138, 165). As the ledger grows with session-open and transition records, these become increasingly expensive. `read_floor()` correctly uses a descending early-exit scan; these two read paths do not benefit from the same optimization. +- No logging or observability in the posture package itself. A refused `set_floor()` call (wrong session, fingerprint mismatch, signer fault) returns a structured `PostureSetResult` but nothing is written to an audit trail at the time of refusal — only accepted transitions appear in the ledger. + +**Confidence:** High — Read 100% of all six source files in the package (`__init__.py`, `floor.py`, `ledger.py`, `records.py`, `session.py`, `signing.py`; 1,331 lines total). Verified inbound dependency graph by grepping all legis source for `from legis.posture`. Cross-validated the `read_floor()` fail-closed path against `floored_registry()` (floor.py:78–84) and the `set_floor()` gate (ledger.py:400–506). Confirmed the `install` import is a single deferred call site at ledger.py:344. Verified `verify_integrity()` is absent from the posture read path by direct code inspection. + +--- + +## Wardline Findings Ingestion and Routing + +**Location:** `src/legis/wardline/` + +**Responsibility:** Ingest Wardline scan payloads (agent-supplied, not pulled via HTTP), validate the wire contract and artifact provenance, and route active defect findings into the configured enforcement cell — all without re-adjudicating Wardline's trust/taint verdicts. + +**Key Components:** +- `ingest.py` (534 lines) — Wire validation: `active_defects()` extracts the gate population, `verify_wardline_artifact()` authenticates provenance (HMAC via `LEGIS_WARDLINE_ARTIFACT_KEY` or records `key_absent`). 
Defines `TRUST_TIERS`, `KNOWN_KINDS`, `DEFECT_KIND`, `FINDINGS_KEY`, `WardlineFinding`, `Suppressed` enum, `ArtifactStatus`/`ArtifactStatusReason` enums, canonical-reason carrier, `WardlineDirtyTreeError` (typed amber, not a generic red), and `ScanOutcome`. +- `governor.py` (178 lines) — Routing engine: `route_findings()` maps `WardlineFinding` list to `WardlineCellPolicy` members (`SURFACE_OVERRIDE`, `BLOCK_ESCALATE`, `SURFACE_ONLY`), resolves entity SEI before opening writes, wraps the batch in a single-store transaction (engine or signoff), and delegates to `EnforcementEngine.submit_override`, `SignoffGate.request`, or `EnforcementEngine.record_event`. +- `policy.py` (18 lines) — Thin helper: `resolve_cell()` maps a finding's severity rank against a configured `fail_on` threshold to produce either the gate cell or `SURFACE_ONLY`. +- `__init__.py` (1 line) — Empty aside from module docstring. + +**Dependencies:** +- Inbound: `service/` (scan_route handler), `api/` and `mcp.py` (adapters supply the scan payload and call service) +- Outbound: `enforcement/engine.py` (`EnforcementEngine`), `enforcement/signoff.py` (`SignoffGate`), `enforcement/signing.py` (`verify`), `identity/entity_key.py` (`EntityKey`) + +**Patterns Observed:** +- "Wardline analyses, Legis governs" enforced structurally: `ingest.py` module docstring (line 5) states "legis never re-analyzes — it reads findings and governs"; `TRUST_TIERS` and `KNOWN_KINDS` are explicitly labelled "carried, never re-derived" (ingest.py:16, ingest.py:43). +- Fail-closed on `FINDINGS_KEY` absence: `active_defects()` raises `WardlinePayloadError` rather than defaulting to empty (ingest.py:488–493), guarding against the G1 false-green where a producer key rename produces zero defects under a green status. +- Fail-closed on unknown `kind`: any kind outside `KNOWN_KINDS` is rejected loudly (ingest.py:511–517), closing the G1 twin (value-axis) where a drifted `kind` token silently removes a defect from the gate population. 
+- Agent-suppressed findings require proof: `waived`/`suppressed` findings without `suppression_proof`, `suppression_ticket`, or `suppression_reason` raise `WardlinePayloadError` (ingest.py:521–527). +- Dirty-tree amber is type-distinct from malformed-or-tampered: `WardlineDirtyTreeError` is intentionally not a subclass of `WardlinePayloadError` (ingest.py:191); its `to_payload()` produces `SKIPPED_DIRTY_TREE` so harnesses distinguish "commit first" from "scan is broken". +- Entity resolution before write: `governor.py` resolves all SEIs in `prepared` before opening any write transaction (governor.py:108–111), so identity network calls never run inside a SQLite transaction. +- Cross-store mixing rejected before any write: the guard at governor.py:94–97 rejects a batch that would span engine and signoff stores simultaneously. + +**Concerns:** +- Transaction atomicity is partial: the pre-write guard (governor.py:65–66 comment) explicitly acknowledges that a mid-loop runtime failure after some findings persist leaves those writes permanent. This is accepted but undocumented at the call-site level for callers who may not read the comment. +- The `cell_map` dependency check (governor.py:80–84) is deliberately conservative — it validates all mapped cells, not only those triggered by present findings. The inline comment (governor.py:74–79) flags that narrowing this requires recomputing from present findings, leaving it as acknowledged future work. +- No rate-limit or per-agent throttle on the findings batch beyond `MAX_FINDINGS = 500` (ingest.py:26). A batch at exactly the maximum is accepted without an audit-trail event marking it as a large batch. + +**Confidence:** High — Read all three implementation files (ingest.py 534 lines, governor.py 178 lines, policy.py 18 lines) fully. Cross-verified dependency claims against imports and governance-honesty invariants confirmed in code at cited lines. 
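
The three fail-closed checks listed under Patterns Observed (missing findings key, unknown `kind`, suppression without proof) might be sketched as follows. The concrete values of `FINDINGS_KEY`, `KNOWN_KINDS`, and `DEFECT_KIND`, the `suppressed` field name, and the choice to drop proven suppressions from the gate population are all assumptions made for illustration; the real wire contract lives in `ingest.py`.

```python
class WardlinePayloadError(ValueError):
    """Raised when a scan payload violates the (assumed) wire contract."""


FINDINGS_KEY = "findings"         # assumed key name
KNOWN_KINDS = {"defect", "note"}  # assumed kind vocabulary
DEFECT_KIND = "defect"            # assumed defect token
PROOF_FIELDS = ("suppression_proof", "suppression_ticket", "suppression_reason")


def active_defects(payload: dict) -> list:
    """Return the gate population, failing loudly instead of going green."""
    if FINDINGS_KEY not in payload:
        # A renamed producer key must not read as "zero defects".
        raise WardlinePayloadError(f"payload has no {FINDINGS_KEY!r} key")
    defects = []
    for finding in payload[FINDINGS_KEY]:
        kind = finding.get("kind")
        if kind not in KNOWN_KINDS:
            # A drifted kind token must not silently leave the population.
            raise WardlinePayloadError(f"unknown finding kind: {kind!r}")
        if finding.get("suppressed") in ("waived", "suppressed"):
            if not any(finding.get(field) for field in PROOF_FIELDS):
                raise WardlinePayloadError("suppressed finding carries no proof")
            continue  # assumption: proven suppressions are not active
        if kind == DEFECT_KIND:
            defects.append(finding)
    return defects
```

The design point is that every ambiguous input raises rather than shrinking the list, so an empty result can only mean "scanned and clean".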
+ + +## Filigree Sign-off Binding + +**Location:** `src/legis/filigree/` and `src/legis/governance/` + +**Responsibility:** Bind a cleared, governed sign-off to a Filigree issue via an SEI-keyed entity-association, record the binding in a local HMAC-signed append-only ledger, and gate issue closure on verified ledger evidence — without touching Filigree's issue lifecycle (locked decision 5). + +**Key Components:** +- `filigree/client.py` (185 lines) — `HttpFiligreeClient` and `FiligreeClient` Protocol: HTTP transport to Filigree using stdlib `urllib`, with injectable `fetch` for offline testing. Explicitly omits `X-Weft-*` transport HMAC headers (client.py:6–13, 44–45); the app-level `binding_signature` in the JSON body is the governance evidence. +- `governance/signoff_binding.py` (83 lines) — `bind_signoff_to_issue()`: validates `entity_key.identity_stable` (raises `ValueError` if false — locator-keyed bind is rejected), optionally HMAC-signs the binding payload, calls `filigree.attach()`, then calls `ledger.record()` in validate→attach→record order. +- `governance/binding_ledger.py` (94 lines) — `BindingLedger`: append-only tamper-bound store of `issue_binding` records, each signed with the LEGIS HMAC key. `get()` and `get_by_issue_id()` call `verify()` before returning data (fail-closed: tampered ledger raises `BindingError`, never returns data). +- `governance/filigree_gate.py` (33 lines) — Pure decision: `evaluate_issue_closure()` calls `ledger.get_by_issue_id()` (which verifies the chain) and returns `allowed: False` if no verified binding record exists. +- `governance/gaps.py` (121 lines) — `find_orphan_gaps()` and `find_lineage_integrity()`: scans the governance audit trail for SEI-keyed records, resolves current liveness/lineage from Loomweave, surfaces orphaned attestations and lineage divergences. Prefix-check semantics: lineage appends are legitimate; a removed or mutated prior event is divergence (gaps.py:6–9). 
+- `governance/sei_backfill.py` (269 lines) — Append-only migration: upgrades legacy locator-keyed audit rows to SEI-keyed `SEI_BACKFILL` events without rewriting history. Checks integrity before running; dry-run default. +- `governance/params.py` (10 lines) — Policy constants: `OVERRIDE_RATE_THRESHOLD = 0.2`, `OVERRIDE_RATE_WINDOW = 100`, `OVERRIDE_RATE_MIN_SAMPLE = 20`. Explicitly marked as ADR-0002 policy — not tuneable via request parameters. +- `governance/__init__.py` (1 line) — Empty aside from module docstring. + +**Dependencies:** +- Inbound: `service/` (`bind_signoff_issue`, `read_filigree_closure_gate`), `mcp.py` (`signoff_bind_issue`, `filigree_closure_gate_get`) +- Outbound: `filigree/client.py` → `weft_signing.weft_body_bytes`; `signoff_binding.py` → `enforcement/signing.sign`, `filigree/client.FiligreeClient`, `governance/binding_ledger.BindingLedger`, `identity/entity_key.EntityKey`; `binding_ledger.py` → `clock.Clock`, `enforcement/signing.sign`+`verify`, `identity/entity_key.EntityKey`, `store/protocol.AppendOnlyStore`; `gaps.py` → `canonical.content_hash`, `identity/loomweave_client.LoomweaveIdentity`, `store/protocol.AuditRecordLike`; `sei_backfill.py` → `canonical.content_hash`, `clock.Clock`, `identity/loomweave_client.LoomweaveIdentity`, `identity/entity_key.EntityKey`, `identity/resolver.*`, `store/protocol.*` + +**Patterns Observed:** +- SEI-stability gate is the first check in `bind_signoff_to_issue()` (signoff_binding.py:46–49): a locator key raises `ValueError` before any network call, enforcing ADR-0003 fail-closed semantics. +- App-level `binding_signature` is the governance evidence, not transport HMAC: client.py explicitly omits `X-Weft-*` headers (client.py:9–13) to avoid dead handshake with a non-verifying Filigree route; binding integrity lives in the local ledger's HMAC chain. 
+- Validate→attach→record order in `bind_signoff_to_issue()` (signoff_binding.py:71–82): if `ledger.record()` raises after `filigree.attach()` succeeds, the code comment (signoff_binding.py:72–76) honestly documents the accepted trade-off — a binding with no ledger entry is surfaced by `verify()`, not silently lost. +- Ledger always verifies before reading: both `get()` and `get_by_issue_id()` call `self.verify()` as the first operation (binding_ledger.py:79, 87), so a tampered ledger raises `BindingError` and returns no data. +- Filigree does not own lifecycle: `evaluate_issue_closure()` is a pure read-decision that returns structured `allowed/reason/evidence` without writing to Filigree or mutating issue state (filigree_gate.py:14–32). +- Lineage integrity uses prefix semantics, not whole-list equality: `find_lineage_integrity()` computes `content_hash(current[:n])` against the stored snapshot (gaps.py:110), so appended rename events do not trigger false divergences. +- `params.py` constants cannot be tuned by request (params.py:7–9 comment): the override-rate threshold reads from this file, not from request parameters. + +**Concerns:** +- Uncompensated partial-write window in `bind_signoff_to_issue()`: if `ledger.record()` raises after `filigree.attach()` succeeds, Filigree holds the association pointer but legis has no tamper-bound record of it. The code correctly documents this (signoff_binding.py:72–76), but there is no reconciliation path or operator repair tool identified — a `BindingLedger.verify()` call surfaces the mismatch but cannot heal it. +- `filigree/client.py` response integrity depends on TLS only: the inline comment (client.py:127) acknowledges that `LEGIS_ALLOW_INSECURE_REMOTE_HTTP=1` with a non-loopback Filigree host makes responses forgeable on-path. The escape hatch is guarded by the env flag and log warning, but there is no posture/doctor check that flags a non-loopback HTTP Filigree URL in production. 
+- `get_by_issue_id()` returns the *last* verified binding record for an issue (binding_ledger.py:88–93); if multiple bindings exist for the same `issue_id` (re-bind after a re-sign-off), earlier records are silently shadowed. No audit event marks a supersession. + +**Confidence:** High — Read all 7 files fully (signoff_binding.py 83 lines, binding_ledger.py 94 lines, filigree_gate.py 33 lines, gaps.py 121 lines, sei_backfill.py 269 lines, params.py 10 lines, filigree/client.py 185 lines). Cross-verified transport-boundary claims and SEI-stability guard at signoff_binding.py:46–49. + + +## Warpline Preflight Advisory Consumer + +**Location:** `src/legis/warpline_preflight/` + +**Responsibility:** Provide read-only advisory access to Warpline's impact-radius and reverify-worklist data, surfaced as a sibling informational tool that is structurally isolated from every governance verdict path. + +**Key Components:** +- `client.py` (144 lines) — `HttpWarplineClient` and `WarplineClient` Protocol: two read-only GETs (`impact_radius`, `reverify_worklist`) via stdlib `urllib`, with injectable `fetch`. HTTPS-required for non-loopback, redirect-blocked, response size-capped at 1 MB. No signing — Warpline's advisory responses are not HMAC-authenticated. +- `__init__.py` (1 line) — Empty. +- `service/preflight.py` (39 lines, not in this directory but the sole consumption point) — `read_warpline_preflight()`: returns `{"status": "unavailable", ...}` when client is `None` or raises `WarplineError`; returns `{"status": "checked", ...}` on success. Transport failures are contained as `unavailable` and never propagate as `INTERNAL_ERROR`. 
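
The fail-unavailable contract described for `read_warpline_preflight()` can be sketched roughly as below. The response shapes for `unavailable` and `checked` follow the analysis; the `change_ref` parameter, the keys in the success payload, and the reason strings are assumptions, since the real argument list is not shown here.

```python
from typing import Any, Optional


class WarplineError(Exception):
    """Stand-in for the transport/contract error raised by the client."""


def read_warpline_preflight(client: Optional[Any], change_ref: str) -> dict:
    """Return advisory data, or an explicit 'unavailable' marker."""
    if client is None:
        # Unconfigured Warpline is reported, never rendered as "no impact".
        return {
            "status": "unavailable",
            "unavailable": [{"reason": "warpline client not configured"}],
        }
    try:
        impact = client.impact_radius(change_ref)        # assumed signature
        worklist = client.reverify_worklist(change_ref)  # assumed signature
    except WarplineError as exc:
        # Transport failures are contained; they do not become internal errors.
        return {"status": "unavailable", "unavailable": [{"reason": str(exc)}]}
    return {
        "status": "checked",
        "impact_radius": impact,
        "reverify_worklist": worklist,
    }
```

A caller can therefore branch on `status` alone and never has to guess whether an empty affected-set means "nothing impacted" or "nothing was asked".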
+ +**Dependencies:** +- Inbound: `service/preflight.read_warpline_preflight` (consumed only by `mcp.py:_tool_warpline_preflight_get` — a dedicated MCP tool, not embedded in any governance tool handler) +- Outbound: stdlib only (`urllib`, `json`, `ipaddress`, `os`, `logging`) + +**Patterns Observed:** +- Advisory boundary is enforced by structural isolation: `client.py` module docstring (line 7) states "nothing it returns may reach a governance verdict path"; grep of `service/policy.py`, `enforcement/engine.py`, and all governance tool handlers confirms zero cross-references between warpline preflight and any verdict, gate, sign-off, or honesty-read path. +- Fail-unavailable, not fail-empty: an unconfigured or unreachable Warpline returns `{"status": "unavailable", "unavailable": [{"reason": ...}]}` (preflight.py:21–33), never an empty affected-set that could read as "nothing impacted". +- No request signing: `_transport_fetch` passes an empty headers dict (client.py:129), consistent with the advisory posture — no HMAC contract with Warpline exists. +- `mcp.py` constructs `HttpWarplineClient` lazily at runtime (mcp.py:234–238) and stores it on `McpRuntime.warpline`; a `WarplineError` during construction makes it `None`, triggering the unavailable path. + +**Concerns:** +- Advisory boundary relies on discipline, not a type wall: the `WarplineClient` Protocol and its response dicts are untyped at the governance layer — there is no newtype or sealed return type that would prevent a future developer from accidentally plumbing a Warpline response into a verdict path. The contract is documented in comments but not machine-enforced. +- Response integrity depends on TLS only (same structural gap as the Filigree client): a non-loopback HTTP Warpline URL under `LEGIS_ALLOW_INSECURE_REMOTE_HTTP=1` yields forgeable advisory data (client.py:106–114). 
Since advisory data is never supposed to reach governance verdicts, this is lower risk than the Filigree case, but the doctor check gap is the same. +- None observed for advisory-boundary enforcement in the governance verdict paths — verified by grepping `service/policy.py` and `enforcement/engine.py` for any warpline reference (both returned empty). + +**Confidence:** High — Read `client.py` (144 lines) fully and `service/preflight.py` (39 lines) fully. Advisory boundary verified by negative grep across enforcement and policy service modules. MCP wiring verified at mcp.py:2277–2286 and mcp.py:234–238. + +--- + +## Git/Change Surface + +**Location:** `src/legis/git/` + +**Responsibility:** Provides stateless, read-only access to the local git repository (branches, commits, rename evidence) and defines the injectable forge seam used by adapters that need PR context from an external forge. + +**Key Components:** +- `surface.py` (208 lines) — `GitSurface`: shells out to `git -C ` for all reads; exposes `branches()`, `commit()`, `commits()`, `merge_base()`, `renames()`, and `working_tree_renames()`; validates every ref/SHA against a strict allowlist regex before passing to the shell (`surface.py:81,117,124,127,137`) to prevent injection; raises `GitError` on bad exit codes or timeouts (10 s ceiling) +- `rename_feed.py` (48 lines) — `build_rename_feed()`: composes committed and optional worktree renames into the dict structure consumed by `GET /git/rename-feed` (HTTP) and `git_rename_feed_get` (MCP); emits `worktree_checked` flag to distinguish "checked and clean" from "not checked" (`rename_feed.py:14-16`) +- `models.py` (46 lines) — frozen dataclasses `BranchInfo`, `CommitInfo`, `RenameEvidence`; `RenameEvidence` docstring explicitly scopes the claim to path-level git detection only and defers symbol-level resolution to Loomweave (`models.py:33-38`) +- `pull_request.py` (28 lines) — `PullRequestContext` dataclass and `PullRequestSource` runtime-checkable Protocol; a 
deployment wires a provider (e.g. `gh`-backed); legis bakes in no forge HTTP (`pull_request.py:1-8`) +- `__init__.py` (1 line) — module docstring only; exports nothing (consumers import from submodules directly) + +**Dependencies:** +- Inbound: `api/app.py` (imports `GitSurface`, `build_rename_feed`, `PullRequestSource`); `mcp.py` (imports `GitSurface`, `build_rename_feed`, `GitError`) +- Outbound: stdlib only (`subprocess`, `pathlib`, `re`); no Legis sibling packages + +**Patterns Observed:** +- Strict ref/SHA allowlist validation at every entry point before shell invocation (`surface.py:81,117,124,127,137,178`) — injection defense is in the surface, not in each caller +- Stateless design: every method call reads directly from the real repository; no in-memory cache +- `_run_raw()` vs `_run()` split: error-tolerant reads (e.g. upstream ahead/behind, blob lookups) use `_run_raw` and check returncode; mandatory reads use `_run` and raise `GitError` +- Working-tree renames use the literal sentinel `"WORKTREE"` as `commit_sha` (`surface.py:194`), communicating uncommitted provenance to the consumer +- Forge seam (PR context) uses a `Protocol` injection pattern matching `identity/` and `filigree/` client seams + +**Concerns:** +- None observed for governance honesty: rename evidence carries an explicit docstring boundary (path-level only, `models.py:33-38`); `PullRequestSource` is read-only and injects no forge writes; `working_tree_renames` emits `commit_sha="WORKTREE"` rather than a hash, preventing misinterpretation as a committed ref; `merge_base()` returns `None` (not an empty string that could collide with a ref) when there is no common ancestor (`surface.py:131-132`) +- `_blob()` silently returns `""` when a rev/path cannot be resolved (`surface.py:207`); a consumer receiving `old_blob=""` cannot distinguish "object missing" from "lookup failed" — documented as intentional for Loomweave's matcher, but the emptiness semantics are not explicit in `RenameEvidence` 
+- Test coverage: `test_git_surface.py` (156 lines), `test_rename_feed.py` (47 lines), `test_git_rename_feed_contract.py` (103 lines); ref-validation injection tests verified in `test_git_surface.py` + +**Confidence:** High — read 100% of all 5 source files (surface.py 208 lines, rename_feed.py 48 lines, models.py 46 lines, pull_request.py 28 lines, __init__.py 1 line); cross-verified inbound callers in api/app.py and mcp.py; verified test coverage exists for rename feed and surface + + +--- + +## CI/Check Surface + +**Location:** `src/legis/checks/` + +**Responsibility:** Records CI check runs supplied by writers (agents, CI adapters) into an indexed SQLite store and serves them back queryable by commit SHA, branch name, or PR number, always tagging recorded runs as writer-supplied and unauthenticated. + +**Key Components:** +- `surface.py` (133 lines) — `CheckSurface`: SQLAlchemy Core over SQLite (NullPool); `record()`, `for_commit()`, `for_branch()`, `for_pr()`, `latest_state()`; additive migration via `_ensure_schema()` for `recorded_by` and `provenance` columns (`surface.py:57-66`); `_to_run()` defaults missing `provenance` to `Provenance.UNAUTHENTICATED` for rows written before the column existed (`surface.py:115`) +- `models.py` (43 lines) — `CheckRun` frozen dataclass and `CheckOutcome` str-Enum; comment at `models.py:36-42` explicitly names the governance limit: "a recorded check is a writer-supplied claim, not a forge-verified fact"; `provenance` defaults to `Provenance.UNAUTHENTICATED` +- `__init__.py` (1 line) — module docstring only + +**Dependencies:** +- Inbound: `api/app.py` (imports `CheckSurface`, `CheckRun`, `CheckOutcome`); `mcp.py` (imports `CheckSurface`, `CheckRun`, `CheckOutcome`); `pulls/surface.py` imports `Provenance` (via shared `legis.provenance`, not this package) +- Outbound: `legis.provenance` (shared vocabulary); `legis.config.ensure_sqlite_parent` (parent-dir creation for DB path); SQLAlchemy (storage) + +**Patterns Observed:** +- 
Provenance honesty by construction: the `provenance` field is set to `UNAUTHENTICATED` at the model default (`models.py:42`) and is re-applied on read for pre-migration rows (`surface.py:115`), so no code path can return a check run without an explicit provenance claim +- `check_report` MCP tool echoes `recorded_by` and `provenance` back to the caller in its result (`mcp.py:2433-2440`), explicitly preventing a caller from believing its own report became forge-attested +- `latest_state()` uses last-write-wins by insert order (`surface.py:129-131`), matching a CI model where newer runs supersede older ones for the same check name +- Table is indexed (not append-only) to support dimensional queries, distinct from the HMAC-chained governance audit log (`surface.py:3-7`) +- Additive `ALTER TABLE` migration rather than versioned migrations — suitable for the single-writer, file-local SQLite model + +**Concerns:** +- No deduplication guard: recording the same `(run_id, commit_sha, check_name)` twice produces two rows; `latest_state()` will return the second by insert order, but `for_commit()` / `for_branch()` / `for_pr()` return all rows. An agent that double-reports does not cause a false-green (both rows are unauthenticated claims), but check counts in API/MCP responses may mislead +- `check_report` (MCP write tool) accepts `commit_sha` as a free string with no validation against the actual repo — an agent can record a check against a SHA that does not exist in the repository; there is no proof-of-commit gate +- `provenance` column is `Text` in the DB and is never validated on read — a raw-DB write of an arbitrary string would survive round-trip; the `_to_run()` fallback only guards `NULL`, not arbitrary values (`surface.py:115`) +- No forge-verification path exists today (only `UNAUTHENTICATED`); the field is wired for a future authenticated path (e.g. 
signed webhook), but the extension point is in `provenance.py` only — there is no corresponding routing or validation logic yet + +**Confidence:** High — read 100% of all 3 source files (surface.py 133 lines, models.py 43 lines, __init__.py 1 line); read the MCP check_report and check_list tool implementations (mcp.py:2396-2440); cross-verified provenance defaults in surface._to_run() and models.CheckRun; test coverage in tests/checks/test_check_surface.py (84 lines) + + +--- + +## Pull-Request Surface + +**Location:** `src/legis/pulls/` + +**Responsibility:** Records forge-reported pull-request metadata (writer-supplied, unauthenticated) into a per-PR upsert SQLite store and serves it back, always preserving a provenance label that prevents a consumer from treating a writer-asserted PR state as forge-authoritative. + +**Key Components:** +- `surface.py` (78 lines) — `PullSurface`: SQLAlchemy Core over SQLite (NullPool); `record()` upserts via delete-then-insert keyed on PR number (`surface.py:46-58`); `get()` returns `None` for unknown PRs; `_ensure_schema()` adds `recorded_by` and `provenance` columns additively (`surface.py:34-43`); `get()` defaults `provenance` to `UNAUTHENTICATED` for pre-migration rows (`surface.py:76`) +- `models.py` (30 lines) — `PullRequest` frozen dataclass and `PullRequestState` str-Enum; comment at `models.py:26-29` mirrors the checks provenance honesty contract +- `__init__.py` (3 lines) — explicit `__all__` re-export of `PullRequest`, `PullRequestState`, `PullSurface` + +**Dependencies:** +- Inbound: `api/app.py` (imports `PullRequest`, `PullRequestState`, `PullSurface`); `mcp.py` (imports `PullRequestState`, `PullSurface`); also `git/pull_request.py` defines a parallel `PullRequestContext` / `PullRequestSource` Protocol used by the live-forge injection path (the two are not merged) +- Outbound: `legis.provenance` (shared vocabulary); `legis.config.ensure_sqlite_parent`; SQLAlchemy + +**Patterns Observed:** +- Upsert-by-number 
semantics: each `record()` call replaces any prior row for that PR number atomically within a transaction (`surface.py:46-48`), so the store always reflects the last-known state rather than accumulating history +- `pull_request_get` MCP tool lazily initialises `_checks()` unconditionally to prevent call-order-dependent gaps where a fresh runtime might report no checks for a PR (`mcp.py:2354-2359`) — an explicit governance-honesty fix noted in the comment +- Two parallel PR seams in the HTTP adapter: `GET /git/pull-requests/{number}` uses the injected `PullRequestSource` (live forge); `POST /git/pulls` + `GET /git/pulls/{number}` use `PullSurface` (recorded cache) — clearly separated and documented +- `pull_request_record` intentionally absent from MCP tool surface (per CLAUDE.md: "forge is source of truth, pinned in test") — the MCP surface is read-only for PRs; write is HTTP-only + +**Concerns:** +- The two PR representations (`PullRequestContext` in `git/pull_request.py` and `PullRequest` in `pulls/models.py`) carry overlapping fields (`number`, `title`, `base`, `head`, `state`) but are structurally separate types with no shared base or adapter — a consumer receiving one cannot easily convert to the other without a manual mapping; this is intentional (live-forge vs recorded seam) but the distinction is undocumented at the type level +- `record()` silently overwrites an existing PR's data with whatever the writer provides; if a writer submits a stale state (e.g. `open` after a PR merged), the store will reflect the stale value with no conflict detection or warning +- No test exercises the `provenance` field being set to a non-default value (e.g. 
a hypothetical `"webhook_signed"`); the `UNAUTHENTICATED` default is tested implicitly but the upgrade path is only named in comments, not in any test fixture + +**Confidence:** High — read 100% of all 3 source files (surface.py 78 lines, models.py 30 lines, __init__.py 3 lines); cross-verified both PR seams in api/app.py (lines 523-551); verified MCP pull_request_get implementation (mcp.py:2347-2360); test coverage verified via tests/pulls/test_pull_surface.py (30 lines) and tests/git/test_pull_request_api.py + +--- + +## install.py — Project Installer + +**Location:** `src/legis/install.py` + +**Responsibility:** Stands up legis in a project by injecting a versioned instruction block into CLAUDE.md/AGENTS.md, installing the legis-workflow skill pack, registering a Claude Code SessionStart hook in .claude/settings.json, writing .gitignore rules, minting the posture-ledger GENESIS and operator key, and registering the MCP server entry in .mcp.json — all idempotent, all symlink-escape-guarded. 
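As a minimal sketch of what "idempotent" and "symlink-escape-guarded" can mean for a single installer write, the snippet below pairs a containment check with a temp-file-plus-`os.replace` write. The names `UnsafePathError`, `safe_target`, and `atomic_write_text` are hypothetical stand-ins, not the functions in `install.py`, and the real guards may differ in detail.

```python
import os
import tempfile
from pathlib import Path


class UnsafePathError(Exception):
    """Hypothetical stand-in for the installer's unsafe-path error."""


def safe_target(root: Path, relative: str) -> Path:
    """Return root/relative, refusing symlink targets and escapes from root."""
    target = root / relative
    if target.is_symlink():
        raise UnsafePathError(f"refusing symlink target: {target}")
    resolved_root = root.resolve()
    resolved = target.resolve()
    if resolved != resolved_root and resolved_root not in resolved.parents:
        raise UnsafePathError(f"path escapes project root: {target}")
    return target


def atomic_write_text(target: Path, content: str) -> bool:
    """Write content atomically; return False when the file is already current."""
    if not content:
        # Empty-content guard: never replace a file with nothing.
        raise ValueError("refusing to write empty content")
    if target.exists() and target.read_text(encoding="utf-8") == content:
        return False  # idempotent: a second run changes nothing
    target.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        os.replace(tmp, target)  # atomic swap on the same filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return True
```

Staging the temp file in the target's own directory keeps `os.replace` on one filesystem, which is what makes the swap atomic.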
+ +**Key Components:** +- `project_path` / `ensure_project_dir` / `reject_symlink` (lines 132–160) — symlink-escape guards applied to every installer write; raise `UnsafeInstallPathError` on traversal outside project root +- `inject_instructions` (lines 311–387) — foreign-fence-aware instruction-block injector; uses `_first_own_open_fence_pos` + `_first_foreign_fence_pos` to never delete a co-resident sibling block (wardline/filigree); writes via `_atomic_write_text` (temp + `os.replace`) +- `_atomic_write_text` (lines 278–308) — empty-content guard + mode-preserving atomic write (refuses empty payload, rejects symlink direct target); used for all text writes in install +- `_install_skill_to` (lines 416–483) — concurrent-safe skill-pack copytree with rename race: stages to a temp dir, renames target aside, atomically swaps in new tree; on failure restores the prior pack rather than silently dropping it +- `install_claude_code_hooks` (lines 733–837) — registers `legis session-context` as a SessionStart hook; upgrades stale/bare commands to the resolved binary path; backs up malformed `settings.json` before resetting rather than silently clobbering; only rewrites unscoped blocks (never touches user-scoped blocks) +- `register_mcp_json` (lines 1199–1289) — idempotent MCP entry manager; a usable existing entry (command resolves + args valid + env clean) is NEVER regenerated; on rebuild preserves the operator-owned `env` dict minus blocked keys (`_safe_mcp_env`); `_REJECTED_MCP_ENV_KEYS` + `_REJECTED_MCP_ENV_PREFIXES` scrub secrets and unsafe escape-hatch vars +- `install_posture` (lines 1367–1431) — posture-ledger genesis; key minted once, handed to custody sink BEFORE `GENESIS` is written (fail-closed: no fingerprint written if custody fails); idempotency guard at `current_epoch_fingerprint()` prevents second-mint; env-backend adopts `LEGIS_OPERATOR_KEY` rather than minting a throwaway (legis-1844bf8ac9) +- `_default_key_sink` (lines 1462–1502) — custody router: 
`env`=no-op, `age-file`=atomic blob write, `keychain`=loud failure (adapter not shipped); raises `OperatorKeyCustodyError` rather than dropping the key +- `_find_legis_command` (lines 522–562) — binary resolution; prefers `sys.argv[0]` (faithful to the running binary, legis-788a85fac1) over PATH; skips project-local hits to avoid pinning a venv shim that doctor would immediately flag as stale + +**Dependencies:** +- Inbound: `hooks.py` (imports `inject_instructions`, `install_skills`, `install_codex_skills`, `_get_skills_source_dir`, `_skill_tree_fingerprint`, `_instructions_block_is_current`, `_marker_token`, `INSTRUCTIONS_MARKER`, `SKILL_NAME`); `doctor.py` (imports `_install` module, calls `mcp_entry_is_current`, `inject_instructions`, `install_claude_code_hooks`, `install_skills`, `install_codex_skills`, `gitignore_rules_present`, `ensure_gitignore`, `ensure_legis_dir_gitignore`, `_has_unscoped_session_start_hook`, `_own_open_marker_tokens`, `_marker_token`, `SESSION_CONTEXT_COMMAND`, `SKILL_NAME`, `LEGIS_DIR_GITIGNORE_MARKER`); `posture/` (imports `install_posture` for genesis) +- Outbound: `legis.config` (posture DB URL); `legis.posture` (`PostureLedger`, `mint_key`, `key_fingerprint`, `wrap_key`, `select_backend`); `legis.clock`; `importlib.resources` / `importlib.metadata` (bundled instructions, version); stdlib (`hashlib`, `shutil`, `tempfile`, `os`, `json`, `re`, `stat`) + +**Patterns Observed:** +- Fail-closed on every operation: custody failure before GENESIS write, empty-content guard before any text write, `UnsafeInstallPathError` on symlink escape — the codebase never auto-accepts partial success +- Strict idempotency contract: every installer function checks current state before acting; a second `install` over a healthy project modifies nothing material (returns early with "already registered" / "already present") +- Operator-env preservation on `.mcp.json` updates: the existing `env` dict is carried forward (minus scrubbed secret keys) rather than 
wiped — named fix for legis-788a85fac1; `_safe_mcp_env` is the scrub gate +- Foreign-fence awareness in instruction injection: `_INSTR_FENCE_RE` detects any tool's namespace fence (case-insensitive); the injector never deletes inter-block content owned by wardline/filigree (C-4 multi-owner block contract) +- `_find_legis_command` avoids project-local path poisoning: prefers the running binary, skips `.venv/bin/legis` hits so the registered hook/MCP command doesn't bounce on freshness checks + +**Concerns:** +- `_keychain_available()` (lines 1348–1349) always returns `False` — the live OS-keychain backend is not yet implemented. This means `choose_install_backend` never selects `keychain` automatically. The comment documents it as deferred, but a caller providing `backend="keychain"` directly would route to the `_default_key_sink` `raise` path. The honesty posture is correct (fail-closed, not silent), but the gap means the most-secure custody path is unattainable without a custom `key_sink` injection. +- `install.py` is 1503 lines and owns six distinct responsibilities (instruction injection, skill pack, hook, gitignore, MCP registration, posture genesis). The functions are well-decomposed but the file is a god module by size; a future author adding a seventh install artifact has no forcing function to split it. +- The `settings.json` backup-on-corrupt path (lines 761–768, 804–817) writes `settings.json.bak` via `shutil.copy2`. It calls `reject_symlink(backup)` first (lines 763, 807), so a direct symlink at `.claude/settings.json.bak` is blocked. The residual gap is symlinked parent directories: that check does not cover them, so a symlinked parent could still cause the backup to land outside the project tree. + +**Confidence:** High — Read 100% of install.py (1503 lines). Every claim cites specific line ranges. 
Cross-verified the `hooks.py` import list against actual `from legis.install import (...)` at hooks.py:23–32. Verified `register_mcp_json` env-preservation logic at lines 1282–1284. Verified `install_posture` idempotency guard at lines 1411–1413. Verified `_keychain_available` stub at lines 1347–1349. + + +## doctor.py — Operator Health and Repair + +**Location:** `src/legis/doctor.py` + +**Responsibility:** Inspects and (with `--fix`) repairs legis's install and runtime artifacts, reporting each problem as `[auto-fixable]` or `[operator]`, sharing the machine-readable report surface (`doctor_payload`) with the MCP `doctor_get` tool so CLI and agent surfaces cannot drift. + +**Key Components:** +- `DoctorCheck` (lines 28–49) — frozen dataclass with `id`, `status` (`ok`/`warn`/`error`), `fixed`, `message`, `repairable`; the `repairable` field is the source of truth for the `[auto-fixable]` vs `[operator]` tag rendered in `render_text` +- `render_text` (lines 71–114) — renders `[fixed]`/`[auto-fixable]`/`[operator]` tags; a `repairable=False` check that is not fixed renders `[operator]`; a `repairable=True` check that is not fixed renders `[auto-fixable]`; confirmed honest: split-brain blocks set `repairable=False` (line 197) matching their "resolve it by hand" message +- `doctor_payload` (lines 56–64) — single source of the machine-readable schema shared by CLI `--format json` and MCP `doctor_get`; both render from this function (docstring confirmed; cross-validated against `mcp.py` usage by import path) +- `collect_checks` (lines 967–996) — runs 25 checks in order; repair branches live inside individual check functions (not here), so the orchestrator is pure composition +- `check_instruction_block` (lines 172–205) — distinguishes missing / drifted / split-brain; split-brain (`len(tokens) > 1`) returns `repairable=False` (line 197) because the injector cannot canonicalise across a sibling's block — honesty-correct; `--fix` re-runs `inject_instructions` and re-checks 
state before reporting success +- `check_mcp_json` (lines 117–137) — repair calls `register_mcp_json` and re-checks `mcp_entry_is_current` before returning `fixed=True`; never writes env secrets (delegates to `_safe_mcp_env` inside `register_mcp_json`) +- `check_audit_chain` (lines 461–487) — absent store → `ok` (never creates DB); tampered chain → `error`, `repairable=False`; never auto-repairs a hash-chain failure +- `check_posture_chain` / `check_posture_ledger` / `check_posture_key_reset` / `check_operator_key_accessible` (lines 561–810) — posture-ledger integrity checks; all report-only; `check_operator_key_accessible` probes key reachability without rendering the key value (lines 763–791); env escape hatch presence yields `warn`, not `ok` (honesty note at line 773) +- `check_weft_toml` (lines 339–358) — distinguishes absent (ok / defaults apply) from present-but-broken (error / config silently not applying), per C-9(b); NEVER writes `weft.toml` (confirmed — no write call anywhere in this function) +- `check_filigree_binding_scope` (lines 927–964) — triggered by unscoped filigree URL presence, NOT local install; `repairable=False` (operator-pinned URL); closes the false-green where doctor said "ok" while scans silently non-emitted +- `_store_dir_for` (lines 329–336) — anchored at `root`, not `cwd`, and explicitly ignores `weft.toml` (comment at line 332); custody rule correctly enforced in doctor's own store-path logic + +**Dependencies:** +- Inbound: CLI (`legis.cli` `doctor` subcommand); MCP tool surface (`legis.mcp` `doctor_get` tool uses `doctor_payload` / `collect_checks`); integration tests +- Outbound: `legis.install` (all repair operations delegate back to install functions); `legis.config` (`STORE_DB_SPECS`, `protected_policies`, `posture_db_url`, `operator_age_path`); `legis.store.audit_store` (`AuditStore`); `legis.posture.ledger` (`PostureLedger`); `legis.posture.signing` (`key_fingerprint`, `unwrap_key`); `legis.posture.records` (kind constants); 
`legis.enforcement.signing` (`verify`) + +**Patterns Observed:** +- Report-then-repair contract: every check function verifies current state first; repair branches are conditional on the `repair` flag; post-repair state is re-verified before claiming `fixed=True` (e.g. `check_hook` lines 259–261) +- `repairable` flag drives honest tagging: split-brain blocks, operator-key items, and audit-chain failures all set `repairable=False`; auto-fixable items all set `repairable=True`; the tag in `render_text` derives directly from this flag, not from ad-hoc conditions +- Doctor never writes `weft.toml` (C-9(b)): confirmed — no `weft.toml` write in any check function; `check_weft_toml` is read-only +- `_store_dir_for` ignores `weft.toml` and uses `root`-anchored path (line 332 comment); doctor's store resolution is independent of the runtime config's `LEGIS_*_DB` overrides for the store-path check itself (overrides are respected only by `_store_url` when building actual DB URLs for integrity checks) +- `STORE_DB_SPECS` imported from `config` (line 325) ensures doctor's override-env list can never silently drop a store when a 6th store is added — single-source enumeration closes that coverage gap + +**Concerns:** +- `check_wardline_artifact_key` (line 836) reports a `warn` when `LEGIS_WARDLINE_ARTIFACT_KEY` is absent, saying all scans govern as `artifact_status=unverified`. The honesty diagnosis (PDR-0023) is correct and the signal is present. However, the message does not say how CI treats a `warn`: operators who see one may not know whether CI fails on `warn` or only on `error`. `run_doctor` (lines 999–1002) returns non-zero only when some check is not `.ok`, and `DoctorCheck.ok` (line 38) returns `self.status != "error"`, not `status == "ok"`, so a `warn` still counts as `.ok`. This means `warn`s do NOT cause a non-zero exit. 
An unset `LEGIS_WARDLINE_ARTIFACT_KEY` therefore yields a `warn` that does NOT fail CI — which is documented as intentional ("deliberately a warn, not an error") but could mislead operators who expect CI-blocking behavior for unsigned verification. +- At 1002 lines `doctor.py` carries both the check domain logic and the rendering/orchestration logic. It is coherent but large; adding a new sibling or posture check requires editing a single growing file with no structural boundary. + +**Confidence:** High — Read 100% of doctor.py (1003 lines). Every claim cites specific line ranges. Verified `DoctorCheck.ok` property logic at line 38. Confirmed `repairable=False` for split-brain at line 197. Confirmed `check_weft_toml` has no write call. Confirmed `run_doctor` exit-code logic at lines 999–1002. + + +## hooks.py — SessionStart Hook and Refresh + +**Location:** `src/legis/hooks.py` + +**Responsibility:** Implements the `legis session-context` SessionStart hook, which refreshes drifted instruction blocks and skill packs in place and emits a one-line posture banner (instructions, skill, cells, posture floor) that is always non-empty to distinguish "nothing to report" from "broken." 
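The "always non-empty" banner contract can be illustrated with a short sketch. It assumes a refresh callable and a list of posture callables; `build_banner` and the exact banner strings are invented for illustration and are not the wording `hooks.py` emits.

```python
from typing import Callable, Iterable, List


def build_banner(
    refresh: Callable[[], None],
    posture_parts: Iterable[Callable[[], str]],
) -> str:
    """Compose a banner that is never empty, even when everything fails."""
    lines: List[str] = []
    try:
        refresh()
    except Exception as exc:
        # A refresh failure becomes a visible line instead of an exception.
        lines.append(f"legis: refresh failed ({type(exc).__name__}: {exc})")

    parts: List[str] = []
    for part in posture_parts:
        try:
            text = part()
        except Exception:
            text = ""
        # An empty or failing sub-check is reported, never silently dropped.
        parts.append(text or "unreadable")

    if parts:
        lines.append("legis: " + "; ".join(parts))
    else:
        lines.append("legis: no posture to report")
    return "\n".join(lines)
```

The point of the design is that a reader can tell "nothing to report" apart from "the hook itself broke" by looking at the banner alone.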
+ +**Key Components:** +- `refresh_instructions` (lines 38–93) — refreshes drifted instruction blocks (byte-exact check via `_instructions_block_is_current`) and stale skill packs (fingerprint check via `_skill_tree_fingerprint`); only touches marker-bearing files and already-installed skill dirs; never creates a block or dir that doesn't already exist (that is install's job) +- `generate_session_context` (lines 198–222) — the top-level entry point; always returns a non-empty string (dogfood N-1); composes four posture sub-strings: `_instructions_posture`, `_skill_pack_posture`, `_cells_posture`, `_posture_floor`; exceptions from `refresh_instructions` are caught and reported as a failure line, not raised +- `_posture_floor` (lines 173–195) — reads the posture ledger with `initialize=False` (never creates the DB); absent/empty ledger returns `"posture floor: none (fail-closed structured)"` not a false-green claim; unreadable ledger returns `"posture floor: unreadable"` (warn, not silent); imported lazily inside the function to avoid circular import at module load +- `_cells_posture` (lines 145–170) — mirrors `mcp._load_policy_cell_registry` file precedence (`LEGIS_POLICY_CELLS` > `policy/cells.toml`) but is explicitly documented as report-only at hook process scope, never claiming server runtime posture; unreadable cells → `"cells config: unreadable"`, not a false-green +- `_skill_pack_posture` (lines 122–142) — when bundled source is missing, returns `"skill pack unverifiable (bundled source missing)"` (line 138) rather than claiming currency — honesty-correct; only claims "current" when fingerprints compare equal + +**Dependencies:** +- Inbound: `legis.cli` (the `session-context` subcommand calls `generate_session_context`); `legis.mcp` (MCP startup calls `refresh_instructions` best-effort) +- Outbound: `legis.install` (substantial: `inject_instructions`, `install_skills`, `install_codex_skills`, `_get_skills_source_dir`, `_skill_tree_fingerprint`, 
`_instructions_block_is_current`, `_marker_token`, `INSTRUCTIONS_MARKER`, `SKILL_NAME`); `legis.policy.cells` (`load_policy_cells`); `legis.config` (`posture_db_url`); `legis.posture.ledger` (`PostureLedger`) + +**Patterns Observed:** +- Refresh-only-in-place invariant: `refresh_instructions` checks `if not md_path.exists(): continue` (line 50) and `if not target_root.is_dir(): continue` (line 83) — never creates absent install artifacts; install vs hooks boundary is structurally enforced +- All four posture sub-functions are fail-closed: each returns a distinct "unreadable" or "not installed" string rather than silently eliding the field or returning an empty string +- Lazy import of `legis.config` / `legis.posture.ledger` inside `_posture_floor` (lines 183–184) avoids circular import; consistent with the pattern used across enforcement modules + +**Concerns:** +- `refresh_instructions` warns via `logger.warning` (lines 69–71, 88–90) when drift re-injection fails, but the warning goes to the log, not to the session banner. An operator reading the banner without checking logs would see no signal about the failure. The comment at line 69 notes this is intentional ("Surface it for the operator (peer of the boot-log path)"), but in practice an agent running headlessly may have no log reader. The `_instructions_posture` post-refresh check (line 116) does catch still-drifted state and returns `"instructions stale (refresh failed; see logs)"` in the banner — so this is only a partial gap. +- `hooks.py` imports eight private symbols from `install.py` (prefixed `_`). This is a documented dependency (the module comment explains the two callers), but changes to private install helpers require cross-checking hooks.py. The coupling is inward-only (hooks does not re-export these) and exists because hooks is explicitly a "lighter-weight" refresh surface reusing install's logic. + +**Confidence:** High — Read 100% of hooks.py (223 lines). 
Verified `refresh_instructions` never creates absent paths (lines 50, 83). Verified `_posture_floor` uses `initialize=False` (line 189). Verified `_skill_pack_posture` unverifiable path (line 138). Confirmed private symbol imports from install at lines 23–32. + + +## config.py — Store Resolution and Env Configuration + +**Location:** `src/legis/config.py` + +**Responsibility:** Resolves all SQLite store URLs and composition-root configuration (protected policies, operator paths) from environment variables, with `LEGIS_*_DB` overrides as the sole relocation mechanism — `weft.toml` is explicitly and deliberately ignored for store paths. + +**Key Components:** +- `STORE_DB_SPECS` (lines 71–77) — stably-ordered tuple of `(env_var, db_filename)` for all five stores; the single source of store identity so doctor and any future consumer never re-list the env vars / filenames independently +- `_resolve_db_url` (lines 110–123) — the single resolution point for all stores: `env_var in os.environ` (membership check, not `.get()`) so a present-but-empty override returns verbatim rather than silently falling through to the default; a present-but-empty override is therefore a broken override, never a "use default" fallback +- `_store_dir` (lines 90–97) — ignores `weft.toml` by design (comment at line 92); builds `.weft/legis/` under the provided root or `Path(".")` (relative, resolved against cwd at call time) +- `protected_policies` (lines 171–184) — single parse point for `LEGIS_PROTECTED_POLICIES`: `frozenset` of comma-split, stripped, non-empty names; read at call time so the CLI can write the env var from `--protected-policies` before composition roots read it +- `ensure_sqlite_parent` (lines 187–203) — creates the parent directory lazily at store-open time, never at URL-compute time; importing `config` or computing a default URL never litters `.weft/` directories +- `operator_session_path` / `operator_age_path` (lines 151–168) — operator-elevation file paths, both under 
`.weft/legis/`; documented as holding references/encrypted blobs only, never key plaintext + +**Dependencies:** +- Inbound: every module that opens a store (`api/app.py`, `mcp.py`, `cli.py`, `store/`, `posture/`, `install.py`, `doctor.py`, `hooks.py`) +- Outbound: `sqlalchemy.engine.make_url` (URL parsing in `ensure_sqlite_parent`); `os`, `pathlib` + +**Patterns Observed:** +- `weft.toml`-is-enrich-only documented at module level (lines 18–23) and structurally enforced: `_store_dir` does not read `weft.toml` at all; no `tomllib` import in `config.py` +- Present-but-empty env var treated as verbatim override (line 121), not silent fallback — consistent with CLAUDE.md doctrine +- Call-time resolution (not module-load-time) for both DB URLs and `protected_policies`: env vars written late by the CLI (e.g. `--protected-policies` flag sets `os.environ` before the composition root reads it) always produce the correct value + +**Concerns:** +- None observed. Verified: no `weft.toml` read in any path; present-but-empty override is handled correctly (line 121); `ensure_sqlite_parent` defers directory creation to store-open time (not import time); `STORE_DB_SPECS` is the single enumeration consumed by doctor. The module is 204 lines with a single, well-bounded responsibility. + +**Confidence:** High — Read 100% of config.py (204 lines). Verified `env_var in os.environ` membership-check at line 121. Verified absence of `tomllib` import. Verified `_store_dir` comment at lines 91–93. Verified `STORE_DB_SPECS` structure at lines 71–77. Confirmed lazy `ensure_sqlite_parent` design at lines 193–203. + +--- + diff --git a/docs/arch-analysis-2026-06-28-2142/03-diagrams.md b/docs/arch-analysis-2026-06-28-2142/03-diagrams.md new file mode 100644 index 0000000..b740bef --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/03-diagrams.md @@ -0,0 +1,120 @@ +# 03 — Architecture Diagrams (C4 + dependency) + +> Mermaid sources. 
Edges reflect the measured cross-subsystem import matrix (`01-discovery-findings.md` §6); coupling concerns are validated in `temp/validation-catalog.md`. + +## C4 L1 — System Context + +```mermaid +graph TB + agent["Coding Agent
(primary customer)"] + operator["Human Operator
(signs off / posture / keys
— from OUTSIDE the loop)"] + legis["Legis
git/CI + governance layer
(governance-honesty)"] + forge["Forge
(git / CI / PR / checks)"] + loom["Loomweave
(SEI authority)"] + ward["Wardline
(trust/taint analysis)"] + fil["Filigree
(issue tracking / sign-off)"] + warp["Warpline
(preflight facts)"] + + agent -->|override / signoff / read attestations / route findings| legis + operator -->|sign-off, posture floor, key custody| legis + legis -->|reads branch/commit/PR/check context| forge + legis -->|resolve_sei / lineage (SEI opaque)| loom + legis -->|provides rename feed| loom + ward -->|findings ingested → routed into cells| legis + legis -->|SEI-keyed sign-off binding / closure gate| fil + warp -->|advisory preflight facts (never gates)| legis +``` + +## C4 L2 — Containers (process + persistence) + +```mermaid +graph TB + subgraph transports["Transports (thin adapters)"] + http["HTTP — api/app.py"] + mcp["MCP stdio — mcp.py (~23 tools)"] + cli["CLI — cli.py (legis ...)"] + end + svc["service/ — single source of governance truth
(ServiceError taxonomy; adapters translate)"] + subgraph domain["Domain subsystems"] + enf["enforcement/ (2x2 engine)"] + pol["policy/ (grammar, cells, boundary scanner)"] + idn["identity/ (SEI seam — consumer)"] + pos["posture/ (floor, operator key, elevation)"] + fed["federation: wardline/ filigree/ governance/ warpline_preflight/"] + gitci["git/ checks/ pulls/ (git-CI surfaces)"] + end + ops["Runtime/Ops: install.py · doctor.py · hooks.py · config.py"] + stores[("SQLite stores under .weft/legis/
append-only, HMAC-signed, v3 chain binding")] + + http --> svc + mcp --> svc + cli --> svc + http -.imports.-> mcp + svc --> enf + svc --> pol + svc --> idn + svc --> fed + enf --> stores + pos --> stores + gitci --> stores + ops -.resolves store paths (LEGIS_*_DB).-> stores +``` + +## C4 L3 — Component: hub-and-adapters with dependency direction + +```mermaid +graph LR + http["api/"]; mcp["mcp.py"]; cli["cli.py"] + svc["service/"] + enf["enforcement/"]; pol["policy/"]; idn["identity/"] + sto["store/"]; pos["posture/"]; gov["governance/"] + ward["wardline/"]; fil["filigree/"]; warp["warpline_preflight/"] + leaf["leaf: canonical · weft_signing · clock · provenance · records · config"] + inst["install.py"] + + http --> svc; mcp --> svc; cli --> svc + svc --> enf; svc --> pol; svc --> idn; svc --> gov; svc --> ward; svc --> warp + enf --> sto; enf --> idn; enf --> leaf + pol --> svc + sto --> enf + gov --> enf; gov --> fil; gov --> idn; gov --> sto + ward --> enf; ward --> idn + idn --> leaf + pos --> sto; pos --> enf; pos --> pol; pos --> inst + http --> mcp + + classDef concern stroke:#c0392b,stroke-width:3px; + class pol,sto,pos concern +``` + +> Red-outlined nodes participate in a validated coupling concern: `store ↔ enforcement` (bidirectional), `policy → service` (inversion; uses a deferred import to dodge a load cycle), `posture → install` (inverted direction). `api → mcp` is the transport-on-transport edge (Q-H2). 
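The coupling concerns named above can be cross-checked mechanically. As a rough sketch, the snippet below transcribes the edges drawn in the L3 diagram and lists every pair of nodes that depend on each other. The `EDGES` list and the `bidirectional_pairs` helper are illustrative only and are not part of Legis; the authoritative import matrix is the one measured in `01-discovery-findings.md` §6.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]

# Edges transcribed from the L3 diagram (node ids as used in the Mermaid source).
EDGES: List[Edge] = [
    ("http", "svc"), ("mcp", "svc"), ("cli", "svc"),
    ("svc", "enf"), ("svc", "pol"), ("svc", "idn"),
    ("svc", "gov"), ("svc", "ward"), ("svc", "warp"),
    ("enf", "sto"), ("enf", "idn"), ("enf", "leaf"),
    ("pol", "svc"),
    ("sto", "enf"),
    ("gov", "enf"), ("gov", "fil"), ("gov", "idn"), ("gov", "sto"),
    ("ward", "enf"), ("ward", "idn"),
    ("idn", "leaf"),
    ("pos", "sto"), ("pos", "enf"), ("pos", "pol"), ("pos", "inst"),
    ("http", "mcp"),
]


def bidirectional_pairs(edges: Iterable[Edge]) -> List[Edge]:
    """Return sorted, de-duplicated pairs (a, b) where both a->b and b->a exist."""
    edge_set = set(edges)
    pairs = {
        tuple(sorted((src, dst)))
        for src, dst in edge_set
        if src != dst and (dst, src) in edge_set
    }
    return sorted(pairs)


if __name__ == "__main__":
    for a, b in bidirectional_pairs(EDGES):
        print(f"{a} <-> {b}")
```

On the edges as drawn, this reports `enf`/`sto` and `pol`/`svc`: the diagram shows `svc --> pol` alongside the `pol --> svc` inversion, so both pairs are mutual at the diagram level. Direction-only concerns such as `pos --> inst` need a separate layering rule and are not flagged by this check.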
+ +## The governance 2×2 (enforcement engine) + +```mermaid +quadrantChart + title Enforcement cells — structure (x) x inline LLM judge (y) + x-axis "Simple structure" --> "Complex structure" + y-axis "Judge OFF" --> "Judge ON" + quadrant-1 "PROTECTED: HMAC verdicts, decay sweep, override-rate gate, operator sign-off" + quadrant-2 "COACHED: LLM judge gates the override (model-robustness wall, not crypto)" + quadrant-3 "CHILL: surface + recordable override (no LLM/crypto)" + quadrant-4 "STRUCTURED: block + escalate; human operator signs off" +``` + +## Sign-off binding sequence (fail-closed) + +```mermaid +sequenceDiagram + participant A as Agent/Operator + participant G as governance/signoff_binding + participant F as Filigree + participant L as BindingLedger + A->>G: bind(issue, entity_key, content_hash, signoff_seq) + G->>G: reject if entity_key NOT identity_stable (locator) — fail closed + G->>F: attach(... binding_signature) + F-->>G: ok (pointer held) + G->>L: record(binding_seq) + Note over G,L: if record() fails after attach():
split state, NO silent bind —
ledger.verify() surfaces the missing entry (fail closed) + G-->>A: {binding_seq, binding_signature} +``` diff --git a/docs/arch-analysis-2026-06-28-2142/04-final-report.md b/docs/arch-analysis-2026-06-28-2142/04-final-report.md new file mode 100644 index 0000000..fd11643 --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/04-final-report.md @@ -0,0 +1,32 @@ +# 04 — Final Report + +**Subject:** Legis — git/CI + governance layer of the Weft suite +**Checkout:** `fix/policy-boundary-containment` off `main` `25d64e2` (~16,585 LOC, ~78 files, 8 reviewer clusters) +**Deliverable tier:** Architect-Ready (C). Inputs: `01-discovery-findings.md`, `02-subsystem-catalog.md`, `temp/validation-catalog.md` (PASS_WITH_NOTES). + +## Executive summary +Legis is a **governance-honesty** tool whose architecture is unusually faithful to its stated design. The "single source of governance truth in `service/`, with three thin transport adapters over it" claim is **real and measured**: HTTP/MCP/CLI all import `service/` (15/6/2 references) and translate `ServiceError` subclasses into their own shapes rather than re-deciding governance. The fail-closed ethos is implemented at the seams that matter — the cell-routing composition root defaults to `structured` (escalate) when unconfigured, sign-off binding refuses locator (non-SEI) keys and surfaces split-state via `verify()`, and the audit chain is HMAC-signed with v3 chain-position binding. Across 8 independent reviewers, **no live false-green or fail-open path was found**; every honesty-relevant residual is documented and fails closed (detectable), or is already tracked. 
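To make the adapter claim concrete, here is a hedged sketch of one service error translated into two transport shapes without either adapter re-deciding anything. Only the names `ServiceError` and `UnresolvedInputError` come from the analysis; the `code` attribute, the status numbers, and both payload shapes are assumptions made for illustration.

```python
from typing import Any, Dict, Tuple


class ServiceError(Exception):
    """Stand-in for the service-layer base error (the real one lives in service/)."""

    code = "service_error"  # hypothetical attribute


class UnresolvedInputError(ServiceError):
    code = "unresolved_input"


# Hypothetical mapping; the real adapters choose their own status codes.
_HTTP_STATUS = {"unresolved_input": 422}


def to_http(err: ServiceError) -> Tuple[int, Dict[str, Any]]:
    """HTTP adapter: map the error to a status and body, nothing more."""
    return _HTTP_STATUS.get(err.code, 400), {"error": err.code, "detail": str(err)}


def to_mcp(err: ServiceError) -> Dict[str, Any]:
    """MCP adapter: the same error rendered as a tool-error result."""
    return {
        "isError": True,
        "content": [{"type": "text", "text": f"{err.code}: {err}"}],
    }
```

Both functions read the decision that the service layer already made; neither inspects governance state, which is the property the reviewers measured.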
+ +The architecture's weaknesses are **structural, not correctness**: two god-modules (`mcp.py` 2748 LOC, `install.py` 1503 LOC), a small set of coupling edges that invert the intended layering (`store↔enforcement`, `policy→service`, `posture→install`, `api→mcp`), and a few public-surface/robustness loose ends (incomplete `service` `__all__`, no sign-off reconciliation tool, keychain custody adapter still a stub). These are maintainability and evolvability risks, addressable incrementally without touching governance correctness. + +## What the system does (recap) +Records and enforces, at the git/CI boundary, *what changed and whether a human authorized it for the code as it stands now*. Governance is graded through a 2×2 of structure (simple/complex) × inline LLM judge (off/on) → **chill / coached / structured / protected**, with HMAC-signed verdicts, operator sign-off, override-rate gating, and an append-only audit chain in the top cell. SEI (from Loomweave) keys attestations so they survive rename/move; siblings (Wardline, Filigree, Warpline) are separate authorities consumed across explicit federation seams. + +## Architecture shape +- **Hub-and-adapters.** `service/` is the orchestration hub (→ enforcement, policy, identity, governance, wardline, warpline_preflight, canonical). Transports sit above; domain subsystems beside/below; a leaf layer (`canonical`, `clock`, `weft_signing`, `provenance`, `config`, `records`) underneath. +- **Persistence as evidence.** Five SQLite stores under `.weft/legis/`, append-only and HMAC-signed; relocation only via explicit `LEGIS_*_DB` (a repo `weft.toml` deliberately cannot redirect stores — a custody decision). +- **Federation as bounded authority.** Each sibling seam is read/advisory with a hard rule that an advisory consumer (Warpline preflight) never enters a verdict path; Wardline findings are routed but never re-adjudicated ("Wardline analyses, Legis governs"). + +## Strengths (evidence-backed) +1. 
**Adapter discipline holds.** No reviewer found a governance decision duplicated in an adapter; adapters only map `ServiceError` → shape. The one private cross-adapter import (`api`→`mcp._load_policy_cell_registry`) is a known helper-placement nit (Q-H2), not a decision leak. +2. **Fail-closed at the composition root.** Unconfigured cell routing → `structured`; chill requires explicit `LEGIS_DEV_DEFAULT_CELLS=1` (mcp.py:194-200). +3. **Honest residuals.** The raw-DB-write tamper class, the sign-off split-state window, and the keychain stub are all documented and fail closed / detectable — they do not masquerade as green. +4. **Self-hosted honesty gate.** The `@policy_boundary` decorator + boundary scanner enforce governance-honesty over Legis's own source in CI, and never report a vacuous PASS on zero scope. + +## Concerns (validated; full detail in `05-quality-assessment.md`) +After validation, **four high-severity-sounding reviewer concerns were reclassified** (none is a live gap): posture `read_floor` non-gating is a `main`-checkout artifact (fixed on the unmerged release line); the keychain stub is deliberate fail-closed future-work; the chill cell default is fenced behind a dev env var; the sign-off partial-write is a documented fail-closed trade-off. The surviving real themes are **god-module size**, **coupling inversions**, **public-surface completeness**, and **robustness/ops future-work**. + +## Confidence & limitations +- **High** on structure, dependency direction, adapter discipline, and the fail-closed seams (directly measured / read in full by reviewers). +- **Medium** on exhaustive per-tool behavior of `mcp.py` (sampled at structural seams, not all 23 tools line-by-line). +- **Scope:** `governance_read.v1` and `plainweave_preflight/` contents are release-only (PR #21) and were not analyzed from source; the posture `read_floor` gate is likewise release-only. A re-run against the post-PR-#21 `main` would close these. 
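The fail-closed composition-root default listed under Strengths can be sketched in a few lines. The function name `default_cell` and its parameters are hypothetical; only the `LEGIS_DEV_DEFAULT_CELLS=1` switch and the `structured` / `chill` cell names come from the findings above.

```python
import os
from typing import Mapping, Optional


def default_cell(
    configured: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the cell for an entity, escalating when nothing is configured."""
    env = os.environ if env is None else env
    if configured is not None:
        return configured
    # Only the exact opt-in value relaxes the default; anything else fails closed.
    if env.get("LEGIS_DEV_DEFAULT_CELLS") == "1":
        return "chill"
    return "structured"
```

Treating every value other than the exact opt-in as "not set" means a typo in the environment variable leaves the system in the stricter cell rather than the looser one.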
diff --git a/docs/arch-analysis-2026-06-28-2142/05-quality-assessment.md b/docs/arch-analysis-2026-06-28-2142/05-quality-assessment.md new file mode 100644 index 0000000..4e6d37b --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/05-quality-assessment.md @@ -0,0 +1,46 @@ +# 05 — Code Quality Assessment + +> Findings from the 8 reviewers, **after validation** (`temp/validation-catalog.md`). Severity is by Legis's own yardstick: a real high-severity issue is a **false-green / fail-open** risk; documented fail-closed residuals are not defects. Each item cites evidence and, where relevant, an existing tracker ID. + +## Governance-honesty posture — strong +No live false-green or fail-open path surfaced across 16.5K LOC / 8 independent reviews. The composition root fails closed (mcp.py:194-200), sign-off binding fails closed on non-SEI keys and on split-state (signoff_binding.py:46-50, 71-81), the audit chain is HMAC-signed with v3 chain binding, and the self-hosted policy-boundary gate prevents vacuous PASS. **This is the product's core promise and the code keeps it.** + +## A. Reclassified — NOT defects (validation corrected the reviewers) +| Item | Original framing | Corrected classification | +|------|------------------|--------------------------| +| Posture `read_floor` not gating `verify_integrity` (ledger.py:92) | fail-open floor read | **Scope artifact** — gate (`eb28e4b`, legis-476ab6f125) is unmerged on release/PR #21, not on `main`. Fixed on the release line. | +| `_keychain_available()` returns False (install.py:1349) | unimplemented custody | **Deliberate fail-closed stub** — falls back to age-file rather than claiming a keychain it can't write. Honest future-work. | +| `default_policy_cells()` defaults to chill (cells.py:64) | silent self-clear default | **Fail-closed in production** — composition root needs explicit `LEGIS_DEV_DEFAULT_CELLS=1` for chill; otherwise `structured`. Discipline note only. 
| +| Sign-off partial-write window (signoff_binding.py:71-81) | uncompensated split state | **Documented fail-closed trade-off** — detectable via `ledger.verify()`; a repair tool is future-work (§D). | + +## B. Real — Structural / maintainability (Important) +| # | Finding | Evidence | Why it matters | +|---|---------|----------|----------------| +| B1 | **`mcp.py` god-module (2748 LOC)** — 23 tool handlers + JSON-schema catalog + stdio loop + runtime + idempotency + helpers in one file | mcp.py whole | Every new tool edits 4 sites in one file; high change-coupling, no structural enforcer. Highest-churn surface in the repo. | +| B2 | **`install.py` god-module (1503 LOC)** — instruction block, skill pack, hooks, `.mcp.json`, posture genesis, key custody | install.py whole | Mixed responsibilities; `posture/` even reaches back into it (B5). | +| B3 | **`store ↔ enforcement` bidirectional coupling** | `store.head_anchor`→`enforcement.signing`; `enforcement.{engine,protected,signoff}`→`store.*` | Not a runtime cycle today (no decision logic crosses), but `enforcement.signing` is a shared crypto primitive mis-located in `enforcement/`; one upward import would break the non-circular property silently. | +| B4 | **`policy → service` layering inversion** | policy imports service; `boundary_scan.py` uses a *deferred* `service.errors` import to dodge a load-time cycle | A lower-level grammar/scanner depending on the hub; the deferred import is a smell marker that this edge is fragile. | +| B5 | **`posture → install` inverted dependency** | `posture/ledger.py:344` imports `OperatorKeyCustodyError` from `install.py` | Setup module imported by a runtime module; the error type belongs in a shared/posture errors module. | +| B6 | **`api → mcp` transport-on-transport** | `api/app.py:398` imports private `mcp._load_policy_cell_registry` (comment cites Q-H2) | Fragile change-coupling between two adapters; helper belongs in `config.py`/`policy/`. | + +## C. 
Real — Public surface / contract hygiene (Minor) +| # | Finding | Evidence | +|---|---------|----------| +| C1 | `service/__init__.py` `__all__` omits `UnresolvedInputError`, `WardlineRoutingError`, `ProtectedKeyRequiredError`, and `sign_off` — adapters must reach into submodules to catch/call them | service/__init__.py:37; governance.py:724-746 | +| C2 | Two structurally-unrelated PR types (`git/pull_request.PullRequestContext` vs `pulls/models.PullRequest`) with identical fields — intentional ("forge is source of truth") but no type-level guard against confusion | git/pull_request.py, pulls/models.py | + +## D. Real — Robustness / ops future-work (Minor, fail-closed today) +| # | Finding | Evidence | Note | +|---|---------|----------|------| +| D1 | No reconciliation/repair tool for a sign-off split-state | signoff_binding.py:71-81 | Detectable via `verify()`; healing is manual. A `doctor` repair path would close it. | +| D2 | Keychain custody adapter unimplemented | install.py:1344-1349 | Deliberate stub; age-file/env tiers work. Ship a live keychain adapter when ready. | +| D3 | `checks.check_report` accepts `commit_sha` with no proof-of-commit; `pulls.record()` does delete-then-insert with no staleness check | checks surface; pulls/surface.py | No false-green (records labeled `UNAUTHENTICATED`), but phantom/stale rows can mislead a downstream SHA join. | +| D4 | Advisory boundary enforced by discipline/comments, not a type wall (Warpline/Filigree clients return untyped dicts) | warpline_preflight/client.py | A newtype/sealed return would make accidental verdict-path use impossible rather than merely discouraged. | + +## E. Tracked / known (link, do not re-file) +- **F1** — `TrailVerifier._requires_verification` derives verification need from attacker-controllable in-record fields (modify-to-unsigned). = tracker **legis-e5e5b0b57f** + README conceded raw-DB residual. 
+- **Non-ASCII golden vector** — `canonical.py ensure_ascii=False` is correct (intentional HMAC contract); the missing non-ASCII pinned cross-tool golden is a Wardline-side follow-up. +- **Q-H2** — the `api→mcp` helper placement (B6) is the named decision. + +## Overall +**Correctness/honesty: A.** **Structure/maintainability: B–.** The debt is concentrated in two god-modules and a handful of coupling inversions — all incrementally fixable. Prioritized plan in `06-architect-handover.md`. diff --git a/docs/arch-analysis-2026-06-28-2142/06-architect-handover.md b/docs/arch-analysis-2026-06-28-2142/06-architect-handover.md new file mode 100644 index 0000000..035c13e --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/06-architect-handover.md @@ -0,0 +1,44 @@ +# 06 — Architect Handover + +> Transition document: from "what is" (`02`/`04`/`05`) to "what to change." For improvement-planning. Scope is **structure/maintainability** — the validation pass confirmed governance correctness is sound, so nothing here touches the fail-closed seams except to make them easier to keep correct. Items map to `05-quality-assessment.md` IDs. + +## Guardrails for any change here +1. **Preserve fail-closed.** Never let a refactor introduce a path where absent/empty input reads as a pass. Re-run the honesty gates after every change: `uv run legis policy-boundary-check --root src --repo-root .`, `uv run legis governance-gate`, `uv run pytest tests/conformance/test_sei_oracle.py`, full suite + coverage floors. +2. **Don't touch the byte-contracts.** `canonical.py ensure_ascii=False` and the HMAC field sets are cross-tool contracts — out of bounds for cleanup. +3. **SEI stays opaque.** No refactor may start parsing/deriving SEI. +4. **These are `main`-line refactors** — schedule them after PR #21 (release) and PR #22 (security fix) land, to avoid colliding with in-flight work. 
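Guardrail 1 can be illustrated with a minimal sketch of a gate that treats "nothing examined" as its own outcome instead of a pass. The names `GateOutcome`, `evaluate_gate`, and `exit_code` are hypothetical and are not taken from the Legis source; the only idea borrowed from the analysis is that a zero-scope run must never read as green.

```python
from enum import Enum


class GateOutcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    NO_SCOPE = "no_scope"  # nothing was examined, so nothing can be certified


def evaluate_gate(files_scanned: int, findings: list[str]) -> GateOutcome:
    """Hypothetical gate: absent or empty input is never reported as PASS."""
    if files_scanned <= 0:
        # Zero scope is distinguishable from "scanned and found nothing".
        return GateOutcome.NO_SCOPE
    if findings:
        return GateOutcome.FAIL
    return GateOutcome.PASS


def exit_code(outcome: GateOutcome) -> int:
    """Only an explicit PASS maps to success; every other outcome fails the run."""
    return 0 if outcome is GateOutcome.PASS else 1
```

A refactor that preserves this shape keeps the "empty list" and "nothing checked" cases apart, which is the property the honesty gates are re-run to confirm.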
+ +## Prioritized backlog + +### P1 — Decouple the layering inversions (low effort, high leverage) +These are small moves that remove the most fragile edges and unlock safer evolution. + +- **H-1 (B3): Extract a `crypto/` (or `signing/`) leaf.** Move `enforcement.signing` to a dependency-free leaf package that `store/` and `enforcement/` both import downward. Kills the `store↔enforcement` bidirectional edge and the "one upward import breaks it silently" risk. *Effort: S. Impact: H.* +- **H-2 (B5): Relocate `OperatorKeyCustodyError`.** Move it from `install.py` to a shared `posture/errors.py` (or a common errors module); `posture` stops importing the 1503-LOC setup module. *Effort: XS. Impact: M.* +- **H-3 (B6 / Q-H2): Move `_load_policy_cell_registry`** out of `mcp.py` into `policy/` or `config.py`; have both `api/` and `mcp.py` import it from there. Removes the transport-on-transport edge. *Effort: S. Impact: M.* +- **H-4 (B4): Document/contain the `policy→service` edge.** Confirm the only `policy→service` use is the deferred `service.errors` import; if so, consider moving the small error types `policy` needs into a leaf so the deferred import can become a normal one (or leave the deferred import with a clear comment as the sanctioned pattern). *Effort: S. Impact: M (clarity).* + +### P2 — God-module decomposition (medium effort, high long-term leverage) +- **H-5 (B1): Split `mcp.py` (2748 LOC).** Separate concerns: (a) tool **schemas/catalog**, (b) tool **handlers** (grouped by domain: governance, git/CI, federation, posture), (c) the **stdio loop + dispatch + error mapping**, (d) **runtime construction**. Keep `call_tool`/`tool_definitions`/`build_runtime` as the stable public surface. Target: no single file > ~600 LOC; adding a tool touches one handler module + one schema entry. *Effort: M–L. Impact: H. 
Risk: mechanical but broad — do behind the existing MCP conformance tests.* +- **H-6 (B2): Split `install.py` (1503 LOC).** Separate instruction-block injection, skill-pack install, hook registration, `.mcp.json` wiring, and posture/key custody into focused modules under an `install/` package. Lets H-2 land cleanly. *Effort: M. Impact: M–H.* + +### P3 — Robustness & contract hygiene (close the honest loose ends) +- **H-7 (D1): Sign-off reconciliation/repair.** Add a `doctor` check + repair that detects a Filigree-attached binding with no ledger entry (the documented split-state) and offers an `[operator]`-tagged heal. Turns a detectable-but-manual residual into a guided fix. *Effort: M. Impact: M.* +- **H-8 (C1): Complete `service/__init__.py` `__all__`.** Re-export `UnresolvedInputError`, `WardlineRoutingError`, `ProtectedKeyRequiredError`, and `sign_off` so the service layer's public surface is whole. *Effort: XS. Impact: M (contract clarity).* +- **H-9 (D3): Tighten the git/CI write surfaces.** Add an optional proof-of-commit gate to `check_report` (or document why `UNAUTHENTICATED`-labeled phantom rows are acceptable) and a staleness/conflict guard to `pulls.record()`. No false-green today, but removes downstream-join foot-guns. *Effort: M. Impact: M.* +- **H-10 (D4): Type-wall the advisory boundary.** Give the Warpline/Filigree advisory clients a sealed/newtype return so a future edit physically cannot route advisory data into a verdict path — upgrade the invariant from discipline to compiler-enforced. *Effort: M. Impact: M (defense-in-depth on the load-bearing advisory boundary).* + +### P4 — Future-work (already fenced, ship when ready) +- **H-11 (D2): Keychain custody adapter.** Implement `_keychain_available()` + a live keychain `key_sink`; the fail-closed fallback already protects users until then. 
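One possible shape for the H-10 type wall is sketched below. It assumes a hypothetical `Advisory` wrapper that the advisory clients would return in place of an untyped dict; none of these names exist in the analyzed source. Python has no compiler to enforce this, so the sketch relies on a static type checker for the type distinction plus a runtime guard against truthiness tests.

```python
from dataclasses import dataclass
from typing import Generic, Literal, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Advisory(Generic[T]):
    """Hypothetical sealed wrapper: marks a payload as advisory-only."""

    status: Literal["checked", "unavailable"]
    reason: Optional[str]
    _payload: Optional[T] = None

    def for_display(self) -> Optional[T]:
        # The only sanctioned way to read the payload; callers must opt in by name.
        return self._payload

    def __bool__(self) -> bool:
        # Guard against `if advisory:` style checks leaking into a verdict path.
        raise TypeError("advisory data must not be used as a verdict")
```

With a wrapper like this, a verdict function typed to accept only verdict inputs would be flagged by a type checker if handed an `Advisory`, and an accidental `if result:` fails loudly at runtime instead of silently deciding.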
+ +## Tracked items — link, don't duplicate +- **F1 / legis-e5e5b0b57f** (derive protected-record verification from config/identity, not in-record fields) — already on the tracker; the modify-to-unsigned residual. Not re-filed here. +- **Posture `read_floor` gate** — already fixed on the release line (`eb28e4b`, legis-476ab6f125); will land on `main` with PR #21. No action. +- **Non-ASCII cross-tool golden vector** — Wardline-side follow-up. + +## Sequencing suggestion +P1 (H-1..H-4) first — they're cheap and make P2 safer. Then H-5/H-6 (god-modules) behind the conformance + full test suite. P3 as capacity allows; P4 when the keychain adapter is prioritized. Hand the sequenced/effort-scored version to `/axiom-program-management` if this becomes a funded refactor track; per-module implementation plans to `/axiom-planning`. + +## Suggested next packs +- **Quality deep-dive:** `axiom-system-architect` (architecture critique / debt cataloging) on the god-module split. +- **Security/threat modeling:** `ordis-security-architect` for an adversarial pass on the federation seams + audit chain (complements the governance-honesty lens here). diff --git a/docs/arch-analysis-2026-06-28-2142/temp/catalog-1-service.md b/docs/arch-analysis-2026-06-28-2142/temp/catalog-1-service.md new file mode 100644 index 0000000..55818ae --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/temp/catalog-1-service.md @@ -0,0 +1,34 @@ +## Service layer + +**Location:** `src/legis/service/` + +**Responsibility:** Acts as the single, transport-agnostic governance decision authority that all three transports (HTTP `api/app.py`, MCP `mcp.py`, CLI `cli.py`) call into, raising typed `ServiceError` subclasses on failure so each adapter owns its own error-shape translation. + +**Key Components:** +- `__init__.py` — Public surface of the layer; re-exports 8 error types, 2 data classes, and 14 service functions as the defined contract adapters import from (`service/__init__.py:9–63`). 
+- `errors.py` — `ServiceError` taxonomy: 9 typed subclasses covering audit integrity (`AuditIntegrityError`), enablement gates (`NotEnabledError`), resource absence (`NotFoundError`, `NoSuchRequestError`), state conflicts (`NotClearedError`, `BindingUnavailableError`), bad input (`InvalidArgumentError`, `UnresolvedInputError`), routing failures (`WardlineRoutingError` with `SERVER_MISCONFIGURED`/`SERVER_OWNED`/`MALFORMED` kind discriminator), and key-absent protected reads (`ProtectedKeyRequiredError`). Adapters switch on type and `.kind`, never on message text (`errors.py:1–99`). +- `governance.py` — Core decision logic: `resolve_for_record`/`resolve_for_entry` (the single SEI-on-entry resolve boundary, failing closed when Loomweave absent or SEI mismatches, `governance.py:43–154`); `verified_records` (full O(N) trail verification on every call, `governance.py:157–206`); `compute_override_rate` / `evaluate_override_rate_gate` (threshold/window hardcoded from ADR-0002 constants, not caller input, `governance.py:209–409`); `submit_override`, `submit_protected_override`, `submit_operator_override`, `request_signoff`, `sign_off`, `bind_signoff_issue` (all wired through `resolve_for_entry` and gate-null checks, `governance.py:412–747`); `read_identity_gaps`, `read_lineage_integrity` (GOV-1/GOV-2 honesty reads — always `"unavailable"` vs `"checked"`, never an empty list that reads as all-clear, `governance.py:556–632`); `read_sei_attestations` (forge-proof discriminator for operator_override/signoff_cleared; asymmetric error rule: ambiguous → omit, never surface, `governance.py:223–350`); `evaluate_policy` (records UNKNOWN provenance gaps, `governance.py:749–767`). 
+- `explain.py` — `explain_policy`/`explain_cell` data types and logic: routes through `FlooredRegistry.cell_for` (not raw rule.cell) so posture floor is respected; `policy_known` boolean distinguishes configured policy from hallucinated/unconfigured name (`explain.py:87–108`); `explain_cell` is the single source of truth for per-cell `enabled`/`available_moves`, ensuring `policy_list` and `policy_explain` cannot disagree (`explain.py:111–175`). +- `wardline.py` — `resolve_scan_routing` (the single home for the server-owned-vs-request routing decision, raising `WardlineRoutingError` with kind for adapter mapping, `wardline.py:58–171`); `route_wardline_scan` (verifies artifact provenance, extracts active defects, resolves entity keys, routes findings through enforcement, returns `RoutedScan` with `artifact_status` + `artifact_status_reason` always present — no posture without provenance, `wardline.py:196–249`). +- `preflight.py` — `read_warpline_preflight`: advisory Warpline read; unconfigured/unreachable → `"unavailable"` with reason, never an empty affected-set that reads as "nothing impacted" (`preflight.py:16–38`). +- `source_binding.py` — `verify_current_source_binding` / `require_verified_source_binding`: fail-closed SHA-256 fingerprint check for protected submissions from Python source-path locators; non-path entities record honest `"unverified"` rather than being rejected; `source_binding_status` is folded into HMAC-signed fields so consumers can distinguish (`source_binding.py:31–107`). + +**Dependencies:** +- Inbound: `legis.api.app` (HTTP transport), `legis.mcp` (MCP transport), `legis.cli` (CLI transport) — all three import only from `service/` for governance decisions. 
+- Outbound: `legis.enforcement` (engine, lifecycle, protected gate, signoff gate, verdict), `legis.identity` (resolver, entity_key, loomweave_client), `legis.policy` (grammar, cells/FlooredRegistry), `legis.wardline` (governor, ingest, policy), `legis.warpline_preflight.client`, `legis.canonical` (content_hash), `legis.governance` (params, gaps, signoff_binding). + +**Patterns Observed:** +- Gate-null fail-closed: every function that requires a gate (`protected_gate`, `signoff_gate`, `filigree`) checks for `None` first and raises `NotEnabledError` naming the operator knob, before any computation (`governance.py:466–470`, `543–545`, `688–693`). +- Single resolve boundary: all identity resolution flows through `resolve_for_entry` (SEI-on-entry, L1/L2 paths) or `resolve_for_record` (record-side), never re-derived in gate/engine layers (`governance.py:43–154`). +- Asymmetric error rules: false-positive safety is the cheaper failure mode; `read_sei_attestations` omits any ambiguous record, `read_identity_gaps`/`read_lineage_integrity` always discriminate `"unavailable"` vs `"checked"`/`"verified"`, `read_warpline_preflight` always has a reason alongside `"unavailable"` (`governance.py:229–232`, `preflight.py:1–9`). +- Adapter isolation: `errors.py` docs map each error type to HTTP status codes and MCP error codes, but the service layer raises only `ServiceError` subclasses with structured attributes (`.kind`, `.cause`, `.fix`) — adapters switch on type, never text (`errors.py:1–6`). +- Policy constants hardcoded out of reach: override-rate threshold/window/floor sourced from `params` module constants, not caller input (`governance.py:214–220`). +- Full trail verification on every interactive read: deliberate O(N) cost; comment explicitly rejects incremental verification as a tamper window (`governance.py:180–193`). 
+- `UnresolvedInputError` in `errors.py` is NOT re-exported from `__init__.py` (lines 9–18): it is raised internally by `governance.py` but absent from the public surface, meaning adapters that import only from `service/` cannot `except UnresolvedInputError` by name without an additional import. `WardlineRoutingError` and `ProtectedKeyRequiredError` are similarly absent from `__all__`. + +**Concerns:** +- `UnresolvedInputError`, `WardlineRoutingError`, and `ProtectedKeyRequiredError` are defined in `errors.py` and raised from service functions but are NOT listed in `__init__.py`'s `__all__` or imports (`__init__.py:9–63`). Adapters relying only on `from legis.service import ...` cannot catch these by name without an extra `from legis.service.errors import ...`. This is a latent import-discipline gap: the MCP and HTTP adapters presumably do import them directly from `errors`, but the declared public surface does not mention them, so future callers risk missing them. +- `sign_off` (operator sign-off on a pending request, `governance.py:724–746`) is implemented in `governance.py` but is not exported from `__init__.py` (absent from `__all__` and the import list). This means it is not reachable via the declared public surface (`service/__init__.py:37–63`). If an adapter calls it via `from legis.service.governance import sign_off` it works, but the public-surface contract is violated. +- Verified: error handling is present throughout. No resource handles are opened at the service layer. No function returns `None` or `[]` on a failure path that an adapter could read as a governance pass; all failure paths raise. Warpline/identity-gap/lineage reads use an explicit `"unavailable"` status, never a silent empty result. Source binding is recorded honestly (`"unverified"`) for non-path entities rather than rejected, with the status folded into HMAC fields; consumer read-side discipline is noted in comments but not enforced at this layer. 
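The gate-null and adapter-isolation patterns described above can be sketched roughly as follows. The class names `ServiceError`, `NotEnabledError`, and `WardlineRoutingError`, and the three `kind` values, come from the catalog; the constructor signatures, the `LEGIS_PROTECTED_KEY` knob name, the `http_status_for` helper, and the status codes are assumptions made purely for illustration.

```python
from enum import Enum


class ServiceError(Exception):
    """Base type; adapters translate subclasses into their own error shape."""


class NotEnabledError(ServiceError):
    def __init__(self, knob: str) -> None:
        super().__init__(f"feature not enabled; configure {knob}")
        self.knob = knob


class RoutingKind(Enum):
    SERVER_MISCONFIGURED = "server_misconfigured"
    SERVER_OWNED = "server_owned"
    MALFORMED = "malformed"


class WardlineRoutingError(ServiceError):
    def __init__(self, kind: RoutingKind) -> None:
        super().__init__(kind.value)
        self.kind = kind


def submit_protected_override(protected_gate, request):
    # Gate-null check comes first: a missing gate raises, it never passes.
    if protected_gate is None:
        raise NotEnabledError("LEGIS_PROTECTED_KEY")  # hypothetical knob name
    return protected_gate.submit(request)


def http_status_for(err: ServiceError) -> int:
    """Hypothetical HTTP adapter mapping: switch on type and .kind, never on text."""
    if isinstance(err, NotEnabledError):
        return 503
    if isinstance(err, WardlineRoutingError):
        return 500 if err.kind is RoutingKind.SERVER_MISCONFIGURED else 400
    return 500
```

An MCP or CLI adapter would carry its own mapping function over the same exception types, which is why a complete `__all__` matters: each adapter needs every type it may have to catch.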
+ +**Confidence:** High — read 100% of all 7 files in `src/legis/service/` (errors.py, __init__.py, governance.py, explain.py, wardline.py, preflight.py, source_binding.py). Cross-validated export surface against `__init__.py:9–63` and function definitions. Dependency claims verified against import statements in each file. Honesty patterns confirmed by reading all decision paths and inline comments. diff --git a/docs/arch-analysis-2026-06-28-2142/temp/catalog-2-enforcement-policy.md b/docs/arch-analysis-2026-06-28-2142/temp/catalog-2-enforcement-policy.md new file mode 100644 index 0000000..14105ac --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/temp/catalog-2-enforcement-policy.md @@ -0,0 +1,72 @@ +## Enforcement + +**Location:** `src/legis/enforcement/` + +**Responsibility:** Implements the governance 2x2 enforcement engine: routes overrides through the appropriate cell (chill, coached, structured, or protected), appends every decision to an HMAC-signed append-only audit trail, and provides lifecycle gates (decay sweep, override-rate check) for the protected cell. + +**Key Components:** +- `engine.py` (120 lines) — Simple-tier engine (chill + coached cells). `EnforcementEngine.submit_override` is the single entry point; when `judge=None` the record appends unconditionally (chill), otherwise the `LLMJudge` evaluates before append (coached). Every submission appends exactly one record — no silent path. +- `judge.py` (186 lines) — Coached-cell judge. Defines `LLMJudge`, `parse_verdict` (fail-closed: anything not an explicit ACCEPTED is BLOCKED), and `_parse_structured_response` (JUDGE-3: rejects `OVERRIDDEN_BY_OPERATOR` from model output). Prompt injection defense: 8192-char serialized-request cap (JUDGE-1); JSON-serialized request prevents structural key injection (JUDGE-2). +- `protected.py` (422 lines) — Protected cell. 
`ProtectedGate.submit` routes through the LLM judge then requires a deterministic non-LLM `ProtectedValidator` to confirm ACCEPTED (JUDGE-3 / Q-H3); any validator exception is a veto. `_record_signed` computes an HMAC-SHA256 signature over a defined field set including `chain_seq` (v3 / AUD-1 position binding). `TrailVerifier` re-checks signatures on read and optionally checks `HeadAnchor` for tail-truncation detection. +- `signing.py` (62 lines) — HMAC-SHA256 signing primitive. v2 binds record content only; v3 additionally binds `chain_seq` to close delete-and-rechain forgery. `canonical_json` from `legis.canonical` is the serialization contract (ensure_ascii=False is intentional per the cross-tool HMAC contract with Wardline). +- `signoff.py` (194 lines) — Structured/protected sign-off gate. `SignoffGate.request` writes `PENDING_SIGNOFF` (does NOT clear the gate); `sign_off` writes `SIGNED_OFF` referencing the request by seq and payload hash. Protected sign-offs are HMAC-signed with v3 chain_seq binding. Anchor advance is batch-aware (Q-M5). +- `verdict.py` (51 lines) — Value types shared across the engine. `Verdict.model_emittable()` is the single source of truth for what an LLM may return (ACCEPTED or BLOCKED only); `Verdict.accepting()` is the single source of truth for what counts as cleared. Both are checked by name, not re-listed. +- `lifecycle.py` (135 lines) — Protected-cell lifecycle gates. `decay_sweep` re-judges each ACCEPTED suppression via the live judge (skips BLOCKED and OVERRIDDEN_BY_OPERATOR). `evaluate_override_rate` computes operator-override share over the most recent window; PASS_WITH_NOTICE when below `min_sample`. +- `judge_factory.py` (36 lines) — Runtime wiring. `FailClosedJudge` is the default when no LLM is configured (always returns BLOCKED). `build_judge_from_env` returns the real `LLMJudge` or `FailClosedJudge`, never None from the coached/protected wiring paths. +- `llm_client.py` (169 lines) — OpenRouter-backed LLM client. 
Validates HTTPS (loopback exception), blocks HTTP redirects, caps response size at 1 MB. API key read from `OPENROUTER_API_KEY` environment variable at config time. + +**Dependencies:** +- Inbound: `service/` (wires the engine/gates into governance decisions), `api/`, `mcp.py`, `cli.py` (thin adapters that consume `service/`) +- Outbound: `legis.canonical` (signing serialization), `legis.clock` (timestamp injection), `legis.identity.entity_key` (SEI-keyed records), `legis.records.override_record` (record schema), `legis.store.protocol` / `legis.store.head_anchor` (append-only store + tail-truncation anchor) + +**Patterns Observed:** +- Protocol-typed injection for all external seams (`AppendOnlyStore`, `Clock`, `Judge`, `LLMClient`, `ProtectedValidator`) — every dependency is testable without a network or real store. +- Every verdict path appends exactly one record (no silent governance path); the only way to not append is to raise, which is fail-closed. +- Single sources of truth: `Verdict.model_emittable()` for LLM-emittable verdicts, `Verdict.accepting()` for what counts as cleared, `signing_fields()` shared by write and verify paths so they cannot drift. +- v3 chain position binding: signature includes `chain_seq` from the database column (not a payload field), closing delete-and-rechain attacks. +- Validator exception handling in `ProtectedGate.submit` (protected.py:355-358): any exception from the user-supplied validator is caught and treated as a veto (fail-closed), preventing an unexpected record shape from surfacing as a fail-open 500. +- `FailClosedJudge` sentinel: the coached/protected paths never degrade to a nil judge on misconfiguration; absence of LLM config produces an always-BLOCKED fallback, not an open path. + +**Concerns:** +- `TrailVerifier._requires_verification` (protected.py:133-142) derives verification requirement from in-record fields (`protected_cell`, `file_fingerprint`, etc.). 
An actor with raw DB write access can strip these markers and downgrade a protected record to unsigned — the verifier then skips it. This is a documented residual (protected.py:100-113) of the raw-file-write threat tier, mitigated only by the opt-in `HeadAnchor`. The `HeadAnchor` itself has a documented anchor-replay caveat (re-writing the anchor to match a truncated trail). Both are known, stated residuals, not silent gaps. +- `SignoffGate.is_cleared` (signoff.py:163-170) performs a full linear scan of all governance records for each is-cleared query. Not a correctness issue, but a potential performance concern on long-lived trails. No pagination or index is used. +- The `ProtectedValidator` callable type alias (`protected.py:203`) has no interface contract beyond `Callable[[OverrideRecord], bool]`. There is no documented precondition about what fields of `OverrideRecord` the validator may trust. The exception-as-veto guard mitigates unexpected inputs, but validator authors have no formal contract to code against. +- `llm_client.py` API key (`OPENROUTER_API_KEY`) is read from the environment at `llm_client_config_from_env()` call time (llm_client.py:44), which is at server startup. No rotation/reload mechanism is visible in this module; a key rotation requires a server restart. + +**Confidence:** High — Read 100% of all 8 files in the subsystem. Cross-verified verdict-path claims by tracing `submit_override` (engine.py:52-97), `ProtectedGate.submit` (protected.py:303-387), and `TrailVerifier.verify` (protected.py:144-197). Cross-verified signing field set against both write path (`_record_signed`, protected.py:241-301) and verify path (`TrailVerifier.verify`, protected.py:144-197). Checked `Verdict.model_emittable()` usage in `_parse_structured_response` (judge.py:107) and in `TrailVerifier` comment context. 
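The v3 chain-position binding described for `signing.py` can be illustrated with a small standard-library sketch. This is not the Legis implementation: the `canonical_json` stand-in (sorted keys, compact separators), the signed field layout, and the function names `sign_v3`/`verify_v3` are assumptions. Only the use of HMAC-SHA256, `ensure_ascii=False`, and the inclusion of `chain_seq` in the signed material are taken from the catalog.

```python
import hashlib
import hmac
import json


def canonical_json(fields: dict) -> bytes:
    # Stand-in serializer (assumed shape); ensure_ascii=False keeps non-ASCII bytes as-is.
    text = json.dumps(fields, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sign_v3(key: bytes, record: dict, chain_seq: int) -> str:
    # Binding chain_seq means the signature is only valid at one trail position.
    payload = canonical_json({"record": record, "chain_seq": chain_seq})
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_v3(key: bytes, record: dict, chain_seq: int, signature: str) -> bool:
    expected = sign_v3(key, record, chain_seq)
    # Constant-time comparison avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, signature)
```

Under this scheme a record signed at position 5 no longer verifies if an earlier row is deleted and the record is re-chained at position 4, which is the delete-and-rechain forgery that content-only (v2) signing cannot detect.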
+ +--- + +## Policy + +**Location:** `src/legis/policy/` + +**Responsibility:** Provides the agent-programmable policy grammar (boundary type registry, evaluation, and fail-closed UNKNOWN semantics), the policy-to-cell routing registry (loaded from `policy/cells.toml`), and the `@policy_boundary` self-honesty gate (decorator + static scanner + test-evidence evaluator) that enforces Legis's own governance honesty over its source. + +**Key Components:** +- `grammar.py` (109 lines) — `PolicyGrammar` registry: `register` is conflict-safe (no shadowing), `evaluate` is fail-closed (unregistered policy, exception from boundary, or non-`PolicyResult` return all yield `UNKNOWN` with `provenance_gap=True` — never CLEAR). `AllowlistBoundary` is the builtin. `default_grammar()` preloads builtins. +- `cells.py` (138 lines) — `PolicyCellRegistry` for policy-to-cell routing. Glob-capable (`fnmatch`) with exact-pattern precedence. `default_policy_cells()` defaults to `chill` (dev/test only, explicitly documented as unsafe for production). `fail_closed_policy_cells()` defaults to `structured`. `load_policy_cells()` reads `policy/cells.toml` (committed default). Validation rejects unknown cell names at load time. +- `decorator.py` (254 lines) — `@policy_boundary` decorator (metadata-only passthrough), `fingerprint_source` (shared canonicalization for runtime and static scanner — Q-L5 parity), `check_policy_boundary` (runtime honesty gate: verifies citation, invariant, test_ref resolution, fingerprint match, and test evidence quality). `_stable_ast_repr` avoids `ast.dump` version instability across Python 3.12/3.13. +- `boundary_scan.py` (456 lines) — Static AST scanner. `scan_policy_boundaries` walks all `.py` files, parses each, runs `_BoundaryVisitor`. Fail-degraded-not-dead: parse errors and RecursionError/MemoryError on a per-file basis produce a finding and continue (per dogfood-4 A2). 
`count_source_files` is the single source of truth for scope — a gate must not report PASS on zero files scanned. `assert_within_boundary` blocks path-traversal attacks on caller-supplied scan roots (deferred import of `service.errors` to avoid load-time cycle). +- `evidence.py` (233 lines) — `evaluate_test_evidence`: shared logic for both the static scanner and runtime gate (Q-L5 parity). Checks in order: disabled marker detection (POLICY-1), exercise (excluding calls inside uninvoked nested helpers), shadowing, and policy co-occurrence inside the same `assert` condition (Q-M8). Documented honest residuals: module-level `pytestmark`, aliased skip markers, fixture-mediated skips. +- `__init__.py` (1 line) — Module docstring only; no public re-exports. + +**Dependencies:** +- Inbound: `service/` (wires policy grammar and cell registry into governance decisions, calls `scan_policy_boundaries` and `count_source_files` for the honesty gate), `enforcement/` (consumes `PolicyCellRegistry.cell_for` to select the enforcement gate per policy) +- Outbound: `legis.canonical` (`content_hash` used in `decorator.fingerprint_source`); `legis.service.errors.InvalidArgumentError` via a deferred call-time import in `boundary_scan.assert_within_boundary` (to avoid a load-time cycle — `service/__init__` imports `policy/`) + +**Patterns Observed:** +- Fail-closed by design at every ambiguous point: unregistered policy → UNKNOWN (not CLEAR), boundary exception → UNKNOWN, zero-file scan is explicitly distinguishable from a scan with zero findings. +- Deferred import pattern in `boundary_scan.assert_within_boundary` (boundary_scan.py:116-117) breaks a load-time cycle between `policy/` and `service/` without restructuring the dependency graph. +- Shared canonicalization (`fingerprint_source`) ensures the runtime gate and the static scanner compute identical fingerprints for the same source, preventing divergence (Q-L5). 
+- `evaluate_test_evidence` is the single evidence-judgement implementation used by both the static scanner and the runtime gate, so the two gates cannot have different evidence standards. +- `_stable_ast_repr` (decorator.py:104-126) is a forward-compatibility measure: `ast.dump` output changed between Python 3.12 and 3.13 (`show_empty` default), which would have invalidated pinned fingerprints on a Python bump. The stable serializer walks `_fields` explicitly. +- Cell routing precedence: exact patterns beat globs (cells.py:44-56); unlisted policies fall through to `default_cell`. The committed `policy/cells.toml` ships `default_cell = "structured"` (fail-closed for production). + +**Concerns:** +- `default_policy_cells()` (cells.py:64-71) defaults to `chill` and is documented as "NOT a safe production default". The code relies on composition roots to choose `fail_closed_policy_cells()` or `load_policy_cells()` instead. If a composition root omits the production selection, governance silently self-clears. The comment and docstring warn against this but there is no runtime guard preventing it. +- The `policy/` → `service/` layering edge (boundary_scan.py:116-117, deferred import) is a structural inversion: `policy/` reaches into `service/` for `InvalidArgumentError`. This is mitigated by the call-time import (no load-time cycle), but it means `policy/` is not independently deployable from `service/`. A dedicated `errors.py` module shared by both layers would close this without restructuring. +- `count_source_files` (boundary_scan.py:84-97) is a separate filesystem walk from `scan_policy_boundaries`. A race between the two (a file appearing or disappearing between the count and the scan) could produce a count mismatch. The gate compares count > 0 vs. findings, not exact file-set equality. In practice this is a non-issue for a local source scan, but is a gap for a hostile or rapidly changing filesystem. 
+- `evidence.py` documents three residual false-green classes (module-level `pytestmark`, aliased skip markers, fixture-mediated skips) that the gate structurally cannot detect while maintaining Q-L5 runtime/static parity. These are stated honestly, not silently absent. The live exposure is noted as nil at current decoration-site count. + +**Confidence:** High — Read 100% of all 6 files in the subsystem (grammar.py, cells.py, decorator.py, boundary_scan.py, evidence.py, __init__.py). Read the committed `policy/cells.toml`. Verified fail-closed path in `PolicyGrammar.evaluate` (grammar.py:62-85), the deferred import location (boundary_scan.py:116-117), the zero-scope guard (`count_source_files`, boundary_scan.py:84-97), and the shared `fingerprint_source` call in both decorator.py:144-162 and boundary_scan.py:237. diff --git a/docs/arch-analysis-2026-06-28-2142/temp/catalog-3-identity-persistence.md b/docs/arch-analysis-2026-06-28-2142/temp/catalog-3-identity-persistence.md new file mode 100644 index 0000000..eab2d16 --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/temp/catalog-3-identity-persistence.md @@ -0,0 +1,91 @@ +## Identity + +**Location:** `src/legis/identity/` + +**Responsibility:** Resolves git locators to Loomweave-minted Stable Entity Identities (SEIs), producing opaque `EntityKey` values that key governance attestations so they survive rename/move events. + +**Key Components:** +- `entity_key.py` (41 lines) — Frozen dataclass holding `value: str` and `identity_stable: bool`; two factory classmethods (`from_locator`, `from_sei`) are the only construction paths; `value` is never parsed downstream — the docstring explicitly states the opacity discipline at line 4. 
+- `resolver.py` (263 lines) — `IdentityResolver` drives the WP-5.1 upgrade path: returns a locator-keyed (`identity_stable=False`) `EntityKey` on any failure path and an SEI-keyed (`identity_stable=True`) key only on a confirmed-alive Loomweave response; `IdentityResolution` frozen dataclass has a `__post_init__` guard (lines 61–100) that makes it impossible to construct a self-contradictory record (e.g. `alive=True` + status `NOT_ALIVE`). +- `loomweave_client.py` (240 lines) — `HttpLoomweaveIdentity` thin transport wrapper over `urllib`; injectable `fetch` callable for offline tests; HMAC signed via `weft_signing.sign_weft_request` when a key is provisioned; `_validate_base_url` enforces HTTPS for non-loopback hosts (lines 139–159); `_decode_json_response` enforces a 1 MB response cap; SEI strings are URL-quoted but never parsed (lines 224, 232). +- `__init__.py` (1 line) — Module docstring only; no re-exports. + +**Dependencies:** +- Inbound: `enforcement/` (uses `IdentityResolver` and `EntityKey`), `governance/signoff_binding.py` (SEI-keyed sign-off), `records/override_record.py` (embeds `EntityKey`) +- Outbound: `canonical.content_hash` (lineage snapshot hashing, resolver.py line 168), `weft_signing` (HMAC transport signing, loomweave_client.py lines 33–38), external Loomweave HTTP service + +**Patterns Observed:** +- Fail-closed on every transport and parse error: capability probe failure clears the capability latch (`_capable = None`, resolver.py lines 148–151) so the next resolve retries rather than trusting a stale positive; locator resolve failures and malformed responses all return the `degraded` value (locator-keyed, `UNAVAILABLE`). +- Capability TTL re-probe (5 min window, resolver.py lines 27, 127–153) prevents the positive-latch-forever bug (Q-L6): a Loomweave that loses SEI capability mid-life is noticed within one TTL window. 
+- `alive is not True` strict identity check (resolver.py lines 194, 232, comment "ID-SEI-2") rejects non-bool truthy values from a buggy or hostile Loomweave — a string `"false"` or integer `1` cannot promote a dead entity to a stable identity. +- `resolve_supplied_sei` returns `None` (not a degraded locator key) when an agent-supplied SEI cannot be confirmed alive (resolver.py lines 171–209): silently demoting an L1 SEI bind to a locator-keyed record is explicitly refused. +- TLS custody warning is emitted but not enforced when `LEGIS_ALLOW_INSECURE_REMOTE_HTTP=1` is set for non-loopback hosts (loomweave_client.py lines 147–158, comment "ID-SEI-1") — the flag is the documented dev-only escape hatch. + +**Concerns:** +- The capability TTL and latch-clearing pattern is correct, but a capability probe failure at the START of `_capability()` (before a latch is ever set, `_capable is None`) logs a warning and returns `False` — which is the right degraded behavior. However, if the Loomweave host is reachable but systematically returns a non-`{"sei": {"supported": true}}` body, `_capable` is set to `False` (not `None`) and latched for the full TTL, suppressing re-probes even when the upstream recovers. This is acceptable but is not documented as a known limitation alongside the Q-L6 fix. +- The `content_hash` field on `IdentityResolution` comes verbatim from the Loomweave response (resolver.py lines 200–201, 252–254) and is not independently verified — Legis trusts Loomweave's assertion. This is architecturally correct (Loomweave is the authority), but an on-path attacker who can forge a response (e.g. via the `LEGIS_ALLOW_INSECURE_REMOTE_HTTP` escape) controls the content axis of governance records, which is not called out in the security model. + +**Confidence:** High — Read 100% of all four files; cross-verified imports (`resolver.py` line 19 imports `content_hash`, line 20 imports `LoomweaveIdentity`; `loomweave_client.py` lines 33–38 import `weft_signing`). 
Fail-closed and SEI-opacity behaviors verified line-by-line. + + +## Store + +**Location:** `src/legis/store/` + +**Responsibility:** Provides a record-agnostic, append-only, hash-chained SQLite audit store with DB-level UPDATE/DELETE triggers, contiguous-seq verification, and an optional out-of-band head anchor for tail-truncation detection. + +**Key Components:** +- `audit_store.py` (457 lines) — `AuditStore` is the implementation; SQLAlchemy with `NullPool` (no lingering locks); `synchronous=FULL` enforced unconditionally (lines 69–77, not configurable); `journal_mode=WAL`; `BEGIN IMMEDIATE` write locks on every append path; `append_signed` (lines 297–312) hands the signer its `(seq, prev_hash)` under the held lock so the v3 HMAC binds the exact position the row receives (AUD-1); `transaction()` (lines 179–212) provides batched appends with thread-local ambient connection; `_assert_no_batch_in_progress` (lines 222–240) turns mid-batch reads into explicit `RuntimeError`s. +- `head_anchor.py` (143 lines) — `HeadAnchor` sidecar file holding `(head_seq, head_chain_hash)` HMAC-signed with `enforcement.signing`; `update` uses temp-file + `os.replace` for atomic writes (line 93); `check` fails closed on missing file (`AnchorError`, lines 107–113) and on signature mismatch (lines 120–121); the REPLAY LIMITATION (a snapshotting attacker can restore a genuine older anchor) is explicitly documented in the module docstring (lines 34–47). +- `protocol.py` (68 lines) — `AppendOnlyStore` Protocol and `AuditRecordLike` Protocol; `transaction()` docstring (lines 57–68) prohibits reads inside batches and documents that `AuditStore` enforces this; `append_signed` contract (lines 27–37) documents the reserve-sign-insert atomicity guarantee. +- `__init__.py` (1 line) — Module docstring only; no re-exports. 
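The temp-file plus `os.replace` write and the fail-closed check described for `head_anchor.py` can be sketched as follows. This is a minimal illustration, not the module's actual code: `write_anchor`, `check_anchor`, `AnchorCheckError`, and the JSON record layout are hypothetical names chosen for the example, and the HMAC is computed with the standard library rather than `enforcement.signing`.

```python
import hashlib
import hmac
import json
import os
import tempfile


class AnchorCheckError(Exception):
    """Hypothetical stand-in for the AnchorError named in the catalog."""


def _mac(key: bytes, head_seq: int, head_chain_hash: str) -> str:
    # Sign the (seq, hash) pair together so neither can be swapped alone.
    body = json.dumps([head_seq, head_chain_hash], separators=(",", ":"))
    return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()


def write_anchor(path: str, key: bytes, head_seq: int, head_chain_hash: str) -> None:
    record = {
        "head_seq": head_seq,
        "head_chain_hash": head_chain_hash,
        "mac": _mac(key, head_seq, head_chain_hash),
    }
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file in the same directory, so the final rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".anchor-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(record, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # readers see the old file or the new one, never a partial write
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def check_anchor(path: str, key: bytes, head_seq: int, head_chain_hash: str) -> None:
    try:
        with open(path, encoding="utf-8") as fh:
            record = json.load(fh)
    except FileNotFoundError as exc:
        # Fail closed: a missing anchor is an error, not "nothing to check".
        raise AnchorCheckError("anchor file missing") from exc
    expected = _mac(key, record["head_seq"], record["head_chain_hash"])
    if not hmac.compare_digest(expected, record["mac"]):
        raise AnchorCheckError("anchor signature mismatch")
    if (record["head_seq"], record["head_chain_hash"]) != (head_seq, head_chain_hash):
        raise AnchorCheckError("store head does not match anchor")
```

As the catalog notes for the real module, a check of this shape cannot detect a replayed older anchor paired with a truncated database; it only detects a missing, forged, or mismatched anchor.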
+ +**Dependencies:** +- Inbound: `enforcement/engine.py` (imports `AppendOnlyStore` protocol), `enforcement/protected.py` (imports `AnchorError`, `HeadAnchor`, `AppendOnlyStore`), `enforcement/signoff.py` (imports `HeadAnchor`, `AppendOnlyStore`) +- Outbound: `canonical` (`canonical_json`, `content_hash`; audit_store.py line 40), `enforcement/signing` (`sign`, `verify`; head_anchor.py lines 56–57), `legis.config.ensure_sqlite_parent` (lazy import, audit_store.py line 127) + +**Patterns Observed:** +- Bidirectional coupling between `store/` and `enforcement/`: `enforcement/` imports `store.protocol` and `store.head_anchor`; `store/head_anchor.py` imports `enforcement.signing`. The dependency graph is: `store.head_anchor` → `enforcement.signing` → `canonical`; `enforcement.{engine,protected,signoff}` → `store.{protocol,head_anchor}`. This forms a cycle at the package level (`store ↔ enforcement`) but NOT at the module level (no circular imports in practice because `audit_store.py` does NOT import enforcement and `enforcement.signing` does not import store). +- `verify_integrity` (lines 362–442) is O(N) by design: checks seq contiguity, recomputes `content_hash` for every payload, walks the `prev_hash` chain, and recomputes `chain_hash`. The `allow_nan=False` in `canonical_json` catches a NaN/Infinity-injected payload that `json.loads` would silently accept (lines 403–415). +- DB-level triggers (lines 161–177) reject UPDATE/DELETE at the SQLite engine before any application logic runs; the application layer has no mutation methods — two independent enforcement layers. +- `append` through an uninitialized (`initialize=False`) handle is not safe against a DB with no log table: `_has_log_table` guards reads (lines 314–323), but `_insert` / `_write` would raise `OperationalError` on the missing table. This is by design — `initialize=False` is for read-only inspection handles. 
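The DB-level trigger pattern listed above can be illustrated with the standard-library `sqlite3` module. This is a sketch under assumptions: the table name `audit_log`, the trigger names, and the error message are invented for the example and are not taken from `audit_store.py`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE audit_log (
        seq     INTEGER PRIMARY KEY,
        payload TEXT NOT NULL
    );
    -- Reject in-place mutation at the engine, before any application code runs.
    CREATE TRIGGER audit_log_no_update BEFORE UPDATE ON audit_log
    BEGIN
        SELECT RAISE(ABORT, 'audit_log is append-only');
    END;
    CREATE TRIGGER audit_log_no_delete BEFORE DELETE ON audit_log
    BEGIN
        SELECT RAISE(ABORT, 'audit_log is append-only');
    END;
    """
)
conn.execute("INSERT INTO audit_log (seq, payload) VALUES (1, '{}')")


def try_mutation(sql: str) -> str:
    """Run a statement and report whether the engine allowed it."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError as exc:
        return str(exc)
    return "allowed"


update_result = try_mutation("UPDATE audit_log SET payload = '{\"x\": 1}' WHERE seq = 1")
delete_result = try_mutation("DELETE FROM audit_log WHERE seq = 1")
```

Because the rejection happens inside SQLite, it also applies to any other client that opens the same database file, which is what makes it independent of the application layer's lack of mutation methods.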
+ +**Concerns:** +- The store↔enforcement bidirectional coupling is real but is NOT a circular import at runtime: `store.head_anchor` → `enforcement.signing` → `canonical` is a clean downward dependency; `enforcement.{engine,protected,signoff}` → `store.protocol` / `store.head_anchor` is also clean downward. The cycle is only at the package-name level. The governance-honesty risk is low: neither direction pulls in logic that could silently change a governance decision. However, if `enforcement.signing` ever gains upward imports back into `store`, the current non-circular property breaks silently — a refactor to extract signing into a shared `crypto/` leaf would close this cleanly. +- The head anchor's REPLAY LIMITATION (a snapshotting attacker can restore a genuine earlier anchor and paired truncated DB, and the check passes) is honestly documented in `head_anchor.py` lines 34–47 and in the project README. No mitigating control exists for local-filesystem deployments; the documentation correctly names WORM/remote storage or an external monotonicity monitor as the only closure. This is a conceded residual threat, not a gap. +- `transaction()` nested re-entrance silently reuses the outer batch (lines 201–204), which is the correct behavior for avoiding double-commit, but it means a caller who believes it opened a fresh transaction boundary has not — if the outer batch rolls back, so does work the inner caller expected to commit. No warning is emitted on re-entrance. This is low-severity for an append-only store but could surprise future callers. 
+ +**Confidence:** High — Read 100% of all four files; cross-verified the bidirectional coupling by grepping all import statements across both packages (head_anchor.py:56–57, enforcement/engine.py:25, enforcement/protected.py:25–26, enforcement/signoff.py:21–22); read enforcement/signing.py in full to confirm it imports only `canonical` (no back-reference to store). 
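To make the Store section's chain walk concrete, here is a small in-memory sketch of an O(N) verification pass of the kind attributed to `verify_integrity`. The row layout, the all-zero genesis `prev_hash`, and the `chain_hash = sha256(prev_hash + content_hash)` formula are assumptions made for illustration; only the `json.dumps` parameters come from the catalog's description of `canonical_json`.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first row


def canonical_json(payload) -> str:
    # Parameters as listed in the catalog for canonical_json.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False)


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def append(rows: list, payload: dict) -> None:
    prev = rows[-1]["chain_hash"] if rows else GENESIS
    content = sha256_hex(canonical_json(payload))
    rows.append({
        "seq": len(rows) + 1,
        "payload": canonical_json(payload),  # stored as text, like a DB column
        "content_hash": content,
        "prev_hash": prev,
        "chain_hash": sha256_hex(prev + content),
    })


def verify(rows: list) -> list:
    """Return a list of problems; an empty list means the chain is intact."""
    problems = []
    prev = GENESIS
    for expected_seq, row in enumerate(rows, start=1):
        if row["seq"] != expected_seq:
            problems.append(f"seq gap at position {expected_seq}")
        try:
            # json.loads accepts NaN/Infinity; allow_nan=False rejects them here.
            content = sha256_hex(canonical_json(json.loads(row["payload"])))
        except ValueError:
            problems.append(f"non-canonical payload at seq {row['seq']}")
            prev = row["chain_hash"]
            continue
        if content != row["content_hash"]:
            problems.append(f"content_hash mismatch at seq {row['seq']}")
        if row["prev_hash"] != prev:
            problems.append(f"prev_hash break at seq {row['seq']}")
        if sha256_hex(row["prev_hash"] + content) != row["chain_hash"]:
            problems.append(f"chain_hash mismatch at seq {row['seq']}")
        prev = row["chain_hash"]
    return problems
```

Reporting the NaN case as a verification problem rather than letting the `ValueError` propagate mirrors the catalog's point that tampering should surface as a tamper finding, not a crash.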
+ + +## Crypto/Leaf Primitives + +**Location:** `src/legis/canonical.py`, `src/legis/weft_signing.py`, `src/legis/provenance.py`, `src/legis/records/` + +**Responsibility:** Provide the canonical JSON serialization, content hashing, transport-HMAC signing, provenance vocabulary, and governance record schemas that all upper layers share without creating inter-layer dependencies. + +**Key Components:** +- `canonical.py` (51 lines) — `canonical_json`: `json.dumps` with `sort_keys=True`, `separators=(",",":")`, `ensure_ascii=False`, `allow_nan=False`; `content_hash`: SHA-256 of the UTF-8-encoded canonical form. The `ensure_ascii=False` is the byte-for-byte HMAC contract with Wardline's Python signer; both sides use identical `json.dumps` params, so non-ASCII payloads round-trip without escape divergence. The module docstring (lines 1–34) documents the Q-L4 RFC-8785 deferral, the cross-repo golden vector status, and the condition that triggers the upgrade (a non-Python verifier). +- `weft_signing.py` (91 lines) — `sign_weft_request`: produces `X-Weft-Component`, `X-Weft-Timestamp`, `X-Weft-Nonce` headers; signs `METHOD\npath?query\nsha256(body)\ntimestamp\nnonce`; body canonicalization uses `ensure_ascii=True` (NOT `canonical_json`) to match the transport contract — deliberately different from the audit HMAC contract (lines 38–42, module docstring lines 9–15). `weft_hmac_key_from_env`: channel-specific env var falls back to `LEGIS_HMAC_KEY`. +- `enforcement/signing.py` (61 lines) — `sign`/`verify` for per-record HMACs; version-tagged prefixes (`v2` binds content, `v3` additionally binds `chain_seq`); `verify` accepts both v2 and v3 without ambiguity (lines 57–61); uses `canonical_json` (the `ensure_ascii=False` variant) for HMAC body. This file lives in `enforcement/` but functions as a shared crypto primitive used also by `store/head_anchor.py`. 
+- `provenance.py` (28 lines) — `Provenance` str-Enum; currently one member (`UNAUTHENTICATED`); shared by `checks/` and `pulls/` without either importing the other. +- `records/override_record.py` (40 lines) — `OverrideRecord` frozen dataclass; `extensions: dict` open field for coached- and protected-cell additions without schema migration; `to_payload()` flattens to dict for `AuditStore.append`. +- `records/__init__.py` (1 line) — Module docstring only. + +**Dependencies:** +- Inbound: `canonical` ← `store/audit_store.py`, `enforcement/signing.py`, `identity/resolver.py`, `governance/`, `wardline/` (cross-repo: Wardline's `core/legis.py` replicates the same `json.dumps` call); `weft_signing` ← `identity/loomweave_client.py`, `filigree/`; `enforcement/signing` ← `store/head_anchor.py`, `enforcement/protected.py`, `enforcement/verdict.py`; `records/` ← `enforcement/engine.py`, `governance/` +- Outbound: `canonical.py` — stdlib only (`hashlib`, `json`); `weft_signing.py` — stdlib only (`hashlib`, `hmac`, `json`, `os`, `urllib.parse`); `enforcement/signing.py` — `canonical` only; `provenance.py` — stdlib only; `records/override_record.py` — `identity.entity_key` only + +**Patterns Observed:** +- Two distinct canonicalization contracts coexist intentionally: `canonical_json` (`ensure_ascii=False`) for audit HMACs and content hashes; `weft_body_bytes` (`ensure_ascii=True`) for transport HMACs. Both are documented; the module docstrings explicitly cross-reference each other to prevent accidental unification (canonical.py lines 1–34, weft_signing.py lines 1–27). +- `allow_nan=False` in `canonical_json` is a tamper-detection aid: a payload injected with `Infinity` or `NaN` survives `json.loads` but raises on re-canonicalization, which `verify_integrity` catches as tamper rather than a crash (audit_store.py lines 403–415). 
+- The v2/v3 signature version tag allows the signing primitive to evolve the field set without ambiguity in stored records; `verify` accepts both prefixes, so a store with mixed v2/v3 records verifies correctly. +- `Provenance` as a str-Enum means `json.dumps` / `canonical_json` emit the bare string value without any enum wrapper, keeping wire payloads stable across Python versions and avoiding coercion on read-back (provenance.py lines 14–16). +- `OverrideRecord.extensions` (records/override_record.py line 24) is the deliberate extension point for coached- and protected-cell fields; no schema migration is required to add judge output or HMAC binding to an override record. + +**Concerns:** +- `enforcement/signing.py` is located inside `enforcement/` but is consumed as a shared primitive by `store/head_anchor.py` — it is not co-located with the other leaf primitives (`canonical.py`, `weft_signing.py`) it logically belongs with. This is a naming/location inconsistency rather than a governance-honesty risk, but it means a reader of `store/` must know to look in `enforcement/` for the signing primitive, which the `head_anchor.py` import makes visible but surprising. +- `Provenance` has exactly one member (`UNAUTHENTICATED`) and the docstring states an authenticated path "would add a stronger value here." The enum is not yet used in any decision path — it is recorded into check/pull payloads but nothing currently gates on its value. If policy logic is added that trusts a higher-provenance value, the gap between the recorded `unauthenticated` claim and any actual authentication verification must be explicitly re-examined. +- The cross-repo non-ASCII golden vector is not yet pinned (canonical.py lines 22–27): both Python signers use identical `json.dumps` params, so they agree by construction, but a Wardline-side drift (e.g. 
switching to `ensure_ascii=True` in a refactor) would break cross-repo HMAC verification without a failing test on either side until a non-ASCII payload hits production. The fix (a shared golden HMAC vector with a non-ASCII payload in Wardline's repo) is documented as a Wardline-side follow-up. + +**Confidence:** High — Read 100% of `canonical.py` (51 lines), `weft_signing.py` (91 lines), `enforcement/signing.py` (61 lines), `provenance.py` (28 lines), `records/override_record.py` (40 lines), `records/__init__.py` (1 line). Cross-verified that `canonical.py` has no legis imports (stdlib only); verified `weft_signing.py` has no legis imports (stdlib only); verified `enforcement/signing.py` imports only `canonical` (line 30); verified `records/override_record.py` imports only `identity.entity_key` (line 13). diff --git a/docs/arch-analysis-2026-06-28-2142/temp/catalog-4-transports.md b/docs/arch-analysis-2026-06-28-2142/temp/catalog-4-transports.md new file mode 100644 index 0000000..e90b287 --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/temp/catalog-4-transports.md @@ -0,0 +1,108 @@ +## MCP Stdio Adapter + +**Location:** `src/legis/mcp.py` + +**Responsibility:** Implements the MCP-over-stdio JSON-RPC transport that exposes 23 agent-callable tools, translating tool calls into `service/` calls and mapping all `ServiceError` subclasses to typed `isError` error envelopes without duplicating any governance decision. + +**Key Components:** +- `McpRuntime` dataclass (lines 154–182) — holds all wired dependencies (engine, gates, ledger handles, identity, filigree, warpline) per launch; `posture_ledger` is stored as a handle only (never a cached floor value, D2 discipline) +- `build_runtime(agent_id)` (lines 203–309) — composition root; constructs gates conditionally on `LEGIS_HMAC_KEY`; wires identity, filigree, warpline, posture ledger; fail-closed defaults throughout (e.g. 
missing policy cells → `fail_closed_policy_cells()`) +- `tool_definitions()` (lines 369–1316) — emits JSON Schema for all 23 tools including `outputSchema` for every tool (enforced after dogfood-4 A6 incident where an omitted top-level `"type": "object"` caused Claude Code's zod validator to drop all 21 tools) +- `_TOOL_HANDLERS` dict (lines 2574–2599) — dispatch table mapping tool names to 23 `_tool_*` functions; `call_tool()` (lines 2602–2610) wraps every dispatch in `_service_error()` catch-all +- `_service_error(exc)` (lines 1432–1484) — ServiceError→error_code mapping table; covers 12 typed cases including `NoSuchRequestError` before `NotFoundError` (subclass ordering), `WardlineRoutingError` before generic `ServiceError`, and a logging fall-through to `INTERNAL_ERROR` for unhandled exceptions +- `_recovery_for(code)` (lines 1326–1388) — maps each error code to `{recoverable, next_action}` text; `AUDIT_INTEGRITY_FAILURE` and `INTERNAL_ERROR` are marked `recoverable=False` +- `ERROR_ENVELOPE_SCHEMA` (lines 348–366) — shared schema for all `isError:true` responses; `additionalProperties: False` with required `[error_code, message, recoverable, next_action]` +- `run_jsonrpc()` / `main()` (lines 2705–2748) — stdlib-only stdio loop with `_read_bounded_line()` enforcing a 16 MiB per-request cap (overridable via `LEGIS_MCP_MAX_REQUEST_BYTES`) +- `_load_policy_cell_registry()` (lines 184–200) — resolution order: env var → `policy/cells.toml` → fail-closed (unless `LEGIS_DEV_DEFAULT_CELLS=1`) +- `_floored_registry(runtime)` (lines 1588–1596) — called fresh at every cell-resolution site; missing ledger maps to `_NoLedger` → structured floor, never chill + +**Dependencies:** +- Inbound: `cli.py` (`legis mcp` command bootstraps `build_runtime` and calls `main()`); `api/app.py` (imports `_load_policy_cell_registry` for shared config loading) +- Outbound: `service/governance`, `service/wardline`, `service/explain`, `service/preflight`; `enforcement/` (engine, protected 
gate, signoff gate, trail verifier); `store/audit_store`; `policy/cells`, `policy/grammar`; `identity/resolver`; `filigree/client`; `governance/binding_ledger`; `posture/floor`, `posture/ledger`; `git/surface`; `checks/surface`; `pulls/surface`; `wardline/ingest`; `warpline_preflight/client`; `doctor`, `install`, `hooks` (via `cli.py` best-effort refresh on boot) + +**Patterns Observed:** +- Strict thin-adapter discipline: every tool handler calls a `service/` function and maps the result to the MCP envelope; no governance decision logic is in `mcp.py` itself +- Launch-bound agent identity: `agent_id` is set at startup and propagated to every record; no tool schema accepts an actor argument (enforced by `_validate_argument_keys` against `_allowed_tool_arguments`) +- Idempotency via request-hash scan: `_existing_idempotent_record()` walks the full verified trail (O(N) HMAC cost deliberate — optimization that would skip verification was explicitly declined in rc4 review) +- Posture floor always read fresh per request via `_floored_registry()`; floor-raising during an idempotent replay emits `floor_warning` rather than silently grandfathering past the new floor (D4 discipline) +- `warpline` field annotated `# advisory sibling; NEVER read by a verdict path` (line 181); `warpline_preflight_get` tool description explicitly says "Purely advisory" +- `_one_of()` helper (lines 321–332) injects `"type": "object"` at every discriminated outputSchema to prevent zod validator rejection of entire tools/list + +**Concerns:** +- God-module size (2748 LOC): all 23 tool handlers, the full JSON Schema catalogue, the stdio loop, runtime construction, and multiple utility classes live in one file. No honesty violation, but high change-coupling — adding a tool requires edits across tool_definitions(), _TOOL_HANDLERS, _allowed_tool_arguments(), and the handler itself, all in one file with no enforced co-location. 
+- `api/app.py` imports `_load_policy_cell_registry` directly from `mcp.py` (line 398): a transport module's private helper (`_` prefix) is shared by the HTTP adapter. This is a transport-on-transport dependency; the function belongs in `config.py` or `policy/` to break the coupling (a comment at the site acknowledges it as Q-H2 / "store-location resolvers live in the transport-agnostic config module"). +- `_service_error` fall-through (line 1483): any unhandled exception reaches `INTERNAL_ERROR` with `logger.error`; operators/Sentry see it, but the agent receives only `str(exc)` — no structured payload — which may be less actionable than the typed cases above it. Not a false-green (error is surfaced), but observability gap for novel exception types. +- `WardlineRoutingError` has three HTTP-distinct kinds (`SERVER_MISCONFIGURED` → 500, `SERVER_OWNED` → 403, `MALFORMED` → 422) but MCP collapses all three to `INVALID_CELL_SPEC` (line 1470); this is intentional and documented in `service/errors.py`, but an agent cannot distinguish a server misconfiguration (operator action needed) from a caller error (argument fix needed) by error code alone. + +**Confidence:** High — read 100% of `mcp.py` structurally (build_runtime, McpRuntime, _TOOL_HANDLERS dispatch, _service_error mapping table, _recovery_for, ERROR_ENVELOPE_SCHEMA, run_jsonrpc, all 23 handler function signatures); sampled 8 handler bodies in depth; read `service/errors.py` fully. Cross-verified: `_AGENT_TOOLS` frozenset (line 81) has 23 members matching `_TOOL_HANDLERS` (line 2574) entry count; `NoSuchRequestError` subclass ordering confirmed correct in both the error hierarchy and the mapping. + +--- + +## HTTP API Adapter + +**Location:** `src/legis/api/app.py` + +**Responsibility:** Implements the FastAPI HTTP adapter that exposes Legis governance, git/CI, and wardline surfaces as REST endpoints, translating `ServiceError` subclasses to HTTP status codes and delegating all governance decisions to `service/`. 
+ +**Key Components:** +- `create_app()` factory (lines 314–954) — single application factory injecting all dependencies (engine, gates, identity, filigree, binding ledger, posture ledger) with lazy fallbacks from environment; returns a `FastAPI` instance +- Auth layer: `_verify_secret()` / `_token_actor_from_mapping()` (lines 94–189) — scope-gated bearer token auth; single-secret mode defaults to `writer`-only, requiring explicit scope grant for `operator`; `LEGIS_UNSAFE_DEV_AUTH=1` escape hatch; `_authenticated_actor_configured()` guards whether body-supplied actor is trusted +- `verify_writer` / `verify_operator` dependencies (lines 211–216) — FastAPI `Depends` guards that enforce writer/operator scope split on all write routes; operator routes use a separate `verify_operator` dependency +- `_WARDLINE_ROUTING_STATUS` map (lines 295–299) — three-way HTTP status dispatch for `WardlineRoutingError` kinds (500/403/422), complementing MCP's single-code collapse +- `POST /overrides` unified route (lines 586–715) — cell-dispatched override submission; reads `floored_registry()` per request (D2); branches on `chill`/`coached`/`structured`/`protected` cells; `202` for structured (never `201` — "an old '201 == accepted' reader must not misread it"); `need_inputs` discriminant returns `422` (not a generic error) for protected-cell missing inputs +- `_unresolved_input_http()` (lines 192–208) — structured `422` for non-resolving inline SEI; carries `weft_reason` dict matching MCP's `UNRESOLVED_INPUT` envelope +- `POST /signoff/{seq}/bind-issue` (lines 766–802) — maps 6 service exception types to distinct status codes including `502` for `FiligreeError` (typed, recoverable, not 500) +- `POST /wardline/scan-results` (lines 892–952) — `WardlineDirtyTreeError` → `JSONResponse(409)` (not 2xx); `outcome: ScanOutcome.ROUTED` plus `artifact_status_reason` honesty field + +**Dependencies:** +- Inbound: `cli.py` (`legis serve` sets env vars and calls 
`uvicorn.run("legis.api.app:create_app", factory=True)`) +- Outbound: `service/governance`, `service/explain`, `service/wardline`; `enforcement/` (engine, protected gate, signoff gate, trail verifier); `store/audit_store`; `policy/cells`, `policy/grammar`; `identity/resolver`; `filigree/client`; `governance/binding_ledger`, `governance/filigree_gate`; `posture/floor`, `posture/ledger`; `git/surface`, `git/rename_feed`, `git/pull_request`; `checks/surface`; `pulls/surface`; `wardline/ingest`, `wardline/governor`; `config` (db URL resolvers); `mcp._load_policy_cell_registry` (cross-transport import, line 398) + +**Patterns Observed:** +- Cell dispatch inside the unified `POST /overrides` route mirrors the MCP `_tool_override_submit` dispatch, both delegating to the same `service/` functions; no governance logic duplicated between adapters +- Status codes carry semantic weight: `201` (accepted/recorded), `202` (pending escalation), `409` (blocked/conflict/dirty-tree), `422` (input error/need_inputs), `500` (integrity failure), `502` (upstream unavailable) +- `floored_registry()` called fresh at each request via the `floored()` closure (lines 415–423) with `PostureLedger` handle shared but floor re-read each time (D2 compliance) +- Auth scope model: `writer` for all mutation routes, `operator` exclusively for `POST /protected/operator-override` and `POST /signoff/{seq}/sign`; unscoped tokens rejected by default (AUTH-1 comment, line 118) +- `LEGIS_ALLOW_UNSCOPED_API_TOKENS=1` flag comment (line 123) explicitly notes it grants unscoped tokens full operator authority — intentionally blunt warning in code + +**Concerns:** +- `create_app()` imports `_load_policy_cell_registry` from `legis.mcp` (line 398): this is the transport-on-transport coupling noted under the MCP entry. 
The comment at that line attributes it to Q-H2 ("store-location resolvers live in the transport-agnostic config module") but the function still lives in `mcp.py` with a `_` prefix rather than the identified right home. +- `assert simple_engine is not None` (line 607) inside `post_override` after `simple_engine_for(cell)` returns `None` for coached when no judge is configured: this path would panic with `AssertionError` rather than a clean `ServiceError`. The `assert` is a correctness assumption that the upstream `not explanation.enabled` guard (line 603) would have caught the unwired case — but `simple_engine_for` returns `None` for coached without a judge, and `explanation.enabled` may still be `True` in some edge configurations. This is a potential unguarded assertion that would produce a 500 with stack trace rather than a structured error. (Medium confidence — the `explanation.enabled` path is the primary guard, but the assertion is a backup, not a primary defense.) +- No top-level exception handler is registered on the FastAPI app for unhandled `ServiceError` subclasses: any `ServiceError` that escapes route-level `except` blocks becomes an untyped 500. The individual routes cover their expected exception shapes, but a new `ServiceError` subclass not yet added to a route's except list would surface as a 500 without a structured payload. + +**Confidence:** High — read 100% of `app.py` (954 lines). Cross-verified: ServiceError imports at lines 51–60 match handler `except` clauses in routes; `_WARDLINE_ROUTING_STATUS` keys match `WardlineRoutingError` kind constants in `service/errors.py`; `floored()` closure construction confirmed at lines 415–423. 
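One way to picture a structured fallback for the escaped-`ServiceError` concern in this section is a single mapping function that resolves the most specific known class and still returns a typed payload for unknown subclasses. This is a hedged, framework-free sketch: the class bodies, the status codes in `STATUS_BY_TYPE`, and the `to_response` helper are hypothetical and do not reproduce `service/errors.py` or any handler in `app.py`.

```python
class ServiceError(Exception):
    """Hypothetical base class mirroring the name used in the catalog."""


class NotFoundError(ServiceError):
    pass


class NoSuchRequestError(NotFoundError):
    pass


class InvalidArgumentError(ServiceError):
    pass


# Assumed status assignments, for illustration only.
STATUS_BY_TYPE = {
    NotFoundError: 404,
    InvalidArgumentError: 422,
}


def to_response(exc: Exception) -> tuple:
    """Map an exception to (status, structured body)."""
    status = 500
    # Walking the MRO picks the most specific registered class first, so a
    # subclass never has to be listed before its parent by hand.
    for cls in type(exc).__mro__:
        if cls in STATUS_BY_TYPE:
            status = STATUS_BY_TYPE[cls]
            break
    body = {"error_code": type(exc).__name__, "message": str(exc)}
    return status, body
```

In a FastAPI application a function of this shape could be registered once as an exception handler for the base class, so that a newly added subclass degrades to a structured 500 instead of an untyped one; whether that fits the existing route-level handling is a design decision for the authors.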
+ +--- + +## CLI Adapter + +**Location:** `src/legis/cli.py` + +**Responsibility:** Implements the `legis` command-line interface, providing subcommands for serving, MCP boot, governance gates, install/doctor/posture/operator management, and policy tooling, delegating all governance logic to `service/` and translating outcomes to exit codes. + +**Key Components:** +- `build_parser()` (lines 36–260) — argparse definition for all subcommands: `serve`, `mcp`, `check-override-rate`/`governance-gate`, `sei-backfill`, `policy-boundary-check`, `install`, `session-context`, `doctor`, `posture {show,set,rekey}`, `operator {enable,disable}` +- `main()` (lines 705–844) — top-level dispatch; `serve` sets env vars and calls `uvicorn.run`; `mcp` sets env vars and calls `legis.mcp.main(args.agent_id)`; operator/posture subcommands use full governance paths +- `_check_override_rate()` (lines 287–335) — override-rate gate: reads store → `evaluate_override_rate_gate()` in `service/`; fail-closed in CI (`LEGIS_ALLOW_MISSING_GOVERNANCE_DB` guard); integrity check before scoring; exit 1 on FAIL/missing-in-CI +- `_run_install()` (lines 609–689) — step-runner for `legis install`; catches per-step exceptions broadly (BLE001) to avoid half-applied installs, counts failures, returns 1 if any step failed +- `_run_posture()` / `_run_operator()` (lines 401–606) — operator elevation and posture floor management; `posture set` requires open session and matching key fingerprint; `posture rekey` chains KEY_RESET with no old key needed; env backend refused for rekey (cannot persist from child process) +- `_build_operator_signer()` (lines 365–398) — custody dispatch: `env` → `EnvSigner`, `age-file` → `AgeFileSigner` with passphrase from `LEGIS_OPERATOR_KEY_AGE_PASSPHRASE`; `keychain` raises LOUD (not shipped) +- `_parse_ttl()` (lines 344–362) — fail-closed: empty or non-positive TTL raises `ValueError`; no silent zero-length session windows +- `policy-boundary-check` handler (lines 779–832) — 
zero-scope guard: exits with code 2 (`NO_ROOT`) if scan root is missing or contains no Python files; explicitly blocks a vacuous green PASS on empty input + +**Dependencies:** +- Inbound: process entry point (`legis` console script); no other Legis module imports `cli.py` +- Outbound: `mcp.main()` (for `legis mcp`); `api.app.create_app` (via uvicorn, for `legis serve`); `service/governance.evaluate_override_rate_gate`; `governance/sei_backfill`; `policy/boundary_scan`; `store/audit_store`; `identity/loomweave_client`; `doctor.run_doctor`; `install.*`; `hooks.generate_session_context`, `hooks.refresh_instructions`; `posture.*`; `config` (db URLs) + +**Patterns Observed:** +- Subcommand handlers are private `_run_*()` functions called from `main()`; thin: I/O + env var forwarding + exit code mapping, no governance logic +- Override-rate gate delegates fully to `service.evaluate_override_rate_gate()`; CLI only adds I/O and exit code (the comment at line 323 explicitly says "The detect → require-key → verify → score decision lives in the service layer") +- `policy-boundary-check` has explicit no-scope false-green guard (zero-file = exit 2, not exit 0) mirroring the honesty stance; comment references `weft-ef2e898642 silent-clean-on-zero-scope` +- Operator elevation (D3): `posture set` cannot bypass session — requires a prior `operator enable`; fingerprint mismatch between signer and current epoch is caught before any session opens +- `_refresh_instructions_best_effort()` (lines 692–702) wraps boot-time instruction refresh in broad except with `logger.warning`; explicit comment: "Best-effort: never break the server, but don't vanish silently either" + +**Concerns:** +- `_run_install()` catches all exceptions per step with `except Exception` (line 677, BLE001 suppressed); this prevents tracebacks from aborting a partial install, but means a step can fail silently if the failure string is ambiguous. 
Acceptable tradeoff for install resilience, but could mask custody errors that should be fatal (e.g., a failed posture step printing `[FAIL]` still lets the overall install return 0 if it was a deferred step). +- None observed for governance-honesty discipline: the CLI makes no governance decisions; all decision paths delegate to `service/` or to the modules that own the logic (`install`, `doctor`, `posture`). Error paths uniformly use non-zero exit codes (1 for failures, 2 for usage/vacuous-scan). The operator elevation path is fail-closed at every guard (signer verification, fingerprint match, session persistence before audit append). + +**Confidence:** High — read 100% of `cli.py` (844 lines). Cross-verified: subcommand names in `build_parser()` match dispatch cases in `main()`; `_check_override_rate()` delegates confirmed by reading the service call at line 326 (`evaluate_override_rate_gate`); zero-scope guard at line 795 (`count_source_files`) confirmed to precede `scan_policy_boundaries`. diff --git a/docs/arch-analysis-2026-06-28-2142/temp/catalog-5-posture.md b/docs/arch-analysis-2026-06-28-2142/temp/catalog-5-posture.md new file mode 100644 index 0000000..bc75897 --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/temp/catalog-5-posture.md @@ -0,0 +1,34 @@ +## Posture + +**Location:** `src/legis/posture/` + +**Responsibility:** Maintains a signed, append-only posture floor that sets the minimum governance enforcement cell across all surfaces, enforces that an absent or empty ledger fails closed to `structured` (never `chill`), and gates floor transitions behind a short-lived sudo-style operator elevation session backed by an OS-keychain, age-encrypted file, or (explicitly opted-in) env-var key custody backend. 
+ +**Key Components:** +- `floor.py` (84 lines) — `FlooredRegistry` subclasses `PolicyCellRegistry`; `cell_for` and `default_cell` are raised to the floor via `_max_tier`; `floored_registry()` calls `ledger.read_floor()` at call time (never cached, D2) and maps `None` to `"structured"` (fail-closed). This is the cross-surface chokepoint consumed by all three transports. +- `ledger.py` (506 lines) — `PostureLedger` wraps `AuditStore`; exposes `read_floor()` (single descending SQL scan skipping metadata records so an `OPERATOR_SESSION_OPENED` tail cannot lower the floor), `genesis()`, `transition()` (fingerprint-checked, v3 HMAC signed, fail-closed via `append_signed`), `session_opened()`, `rekey()`, and the `set_floor()` change gate. The change gate (lines 400–506) enforces: open session required, epoch fingerprint must match LEDGER (not session field), session audit record must be present, signer must prove custody; any fault refuses with zero records written. +- `session.py` (259 lines) — Persisted elevation session at `.weft/legis/operator_session.json`; atomic write via temp-file + `os.replace`; `load_session()` deletes stale files and returns `None` (fail-closed expiry); session file never holds key plaintext, passphrase, or raw blob — only `unlock_ref` (keychain item id, or `None` for age/env). +- `signing.py` (353 lines) — Three custody backends: `KeychainSigner` (key fetched per call, discarded), `AgeFileSigner` (blob + passphrase callback, key unwrapped per call), `EnvSigner` (plaintext env escape hatch; requires explicit `insecure_env=True` and emits `InsecureEnvKeyWarning`). `PostureSigner` / `PostureVerifier` protocols. `verify_signer_signature()` requires fingerprint match AND HMAC verification — self-attested fingerprint alone is not sufficient. +- `records.py` (54 lines) — Frozen `PostureRecord` dataclass; `to_payload()` deliberately excludes `seq/prev_hash/chain_hash` (added by `AuditStore`) to avoid breaking `verify_integrity`. 
+- `__init__.py` (75 lines) — Public re-exports of all five modules; constitutes the full public surface of the package. + +**Dependencies:** +- Inbound: `api/app.py` (imports `floored_registry`, `PostureLedger`); `mcp.py` (imports `FlooredRegistry`, `_max_tier`, `floored_registry`, `PostureLedger`); `cli.py` (imports signers, `set_floor`, session functions); `install.py` (imports `select_backend`, `mint_key`, `key_fingerprint`, `wrap_key`); `doctor.py` (imports `PostureLedger`, `signing`, `records`); `hooks.py` (imports `PostureLedger`). +- Outbound: `legis.policy.cells` (`PolicyCellRegistry`, `CELL_TIER_ORDER`, `_validate_cell`); `legis.store.audit_store` (`AuditStore`); `legis.enforcement.signing` (`sign`, `verify`); `legis.config` (`operator_session_path`); `legis.clock` (`Clock`, `SystemClock`); `legis.install` (`OperatorKeyCustodyError` — one deferred import in `ledger.py:344`, inside `rekey()` only). + +**Patterns Observed:** +- Fail-closed at every boundary: `None` floor maps to `"structured"` in `floored_registry()` (floor.py:79–83); `read_floor()` returns `None` for absent file, absent table, or empty store (ledger.py:101–118); `load_session()` returns `None` for absent, malformed, or lapsed file (session.py:199–221); `verify_signer_signature()` returns `False` on any exception (signing.py:333). +- Key-never-resident: all three non-env backends fetch the key into a local variable per `sign` call and discard it; no backend exposes a `key` attribute; `__slots__` used on `_RawKeySigner`, `AgeFileSigner`, `KeychainSigner` to prevent attribute injection. +- Append-only chain with position binding: `transition()` uses `append_signed` with a build callback that folds `chain_seq=seq` into signed fields (v3 HMAC); a raise in the callback leaves no half-write (ledger.py:207–262). +- GENESIS idempotent guard: `genesis()` checks `current_epoch_fingerprint()` and is a no-op if any epoch-opening record exists — prevents double-genesis on reinstall (ledger.py:185–197). 
+- Read-fresh floor (D2): `FlooredRegistry` is constructed per-request in all three transports; the floor is never cached at runtime. +- `rekey()` hands key to custody sink BEFORE appending the `KEY_RESET` record — a custody failure leaves no fingerprint the ledger cannot later sign against (ledger.py:356–358). +- Metadata records (`OPERATOR_SESSION_OPENED`) are explicitly excluded from the `_FLOOR_RECORD_KINDS` set and skipped by the descending `read_floor()` scan, so a session-open tail cannot lower or freeze the effective floor (ledger.py:82, 116). + +**Concerns:** +- `read_floor()` does NOT call `verify_integrity()` before returning the floor value. The task description states "read_floor() gates on verify_integrity()" but the actual implementation (ledger.py:92–118) performs only a descending SQL payload scan with no chain-hash verification. A silently corrupted ledger (raw DB write that keeps SQL rows intact but alters payload bytes) could cause `read_floor()` to return an attacker-lowered floor value. `verify_integrity()` is called separately by `doctor.py:480` during health checks, not inline on floor reads. This is a documented residual threat ("raw-DB-write deletion/truncation are conceded residual threats" per CLAUDE.md), but the gap is worth recording explicitly — the floor-read path trusts payload content without chain verification. +- The `posture -> install` coupling (`ledger.py:344`: `from legis.install import OperatorKeyCustodyError`) is a deferred import inside `rekey()`. `install.py` is a large module (owns CLI install flows, doctor probe logic, config/runtime setup) with the opposite dependency direction expected: install calls posture during setup. The current coupling is narrow (one exception class) but the direction is logically inverted — `OperatorKeyCustodyError` belongs in a shared errors or posture module, not in install. 
This creates a latent risk: changes to `install.py`'s imports or structure can inadvertently affect the posture/rekey path. +- `epoch_reset_unacknowledged()` and `current_epoch_fingerprint()` both call `self.store.read_all()` (full table scans, ledger.py:138, 165). As the ledger grows with session-open and transition records, these become increasingly expensive. `read_floor()` correctly uses a descending early-exit scan; these two read paths do not benefit from the same optimization. +- No logging or observability in the posture package itself. A refused `set_floor()` call (wrong session, fingerprint mismatch, signer fault) returns a structured `PostureSetResult` but nothing is written to an audit trail at the time of refusal — only accepted transitions appear in the ledger. + +**Confidence:** High — Read 100% of all six source files in the package (`__init__.py`, `floor.py`, `ledger.py`, `records.py`, `session.py`, `signing.py`; 1,331 lines total). Verified inbound dependency graph by grepping all legis source for `from legis.posture`. Cross-validated the `read_floor()` fail-closed path against `floored_registry()` (floor.py:78–84) and the `set_floor()` gate (ledger.py:400–506). Confirmed the `install` import is a single deferred call site at ledger.py:344. Verified `verify_integrity()` is absent from the posture read path by direct code inspection. 
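The fail-closed floor behaviour described in this catalog can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `floor.py` or `ledger.py`: the two-entry `TIER_ORDER`, the `GENESIS`/`TRANSITION` kind names, and the dict-shaped records are hypothetical, and the real `CELL_TIER_ORDER` is not shown in the source.

```python
from typing import Iterable, Optional

# Assumed ordering, lowest to highest. The real CELL_TIER_ORDER lives in
# legis.policy.cells and is not reproduced here.
TIER_ORDER = ("chill", "structured")

# Hypothetical names for the record kinds that carry a floor value.
# OPERATOR_SESSION_OPENED is deliberately absent: it is metadata only.
FLOOR_RECORD_KINDS = frozenset({"GENESIS", "TRANSITION"})


def read_floor(records_newest_first: Iterable[dict]) -> Optional[str]:
    """Return the newest floor, skipping metadata records; None if absent."""
    for record in records_newest_first:
        if record.get("kind") in FLOOR_RECORD_KINDS:
            return record["floor"]  # early exit on the first floor record
    return None


def effective_floor(floor: Optional[str]) -> str:
    """Map an absent floor to 'structured' so an empty ledger fails closed."""
    return "structured" if floor is None else floor


def max_tier(cell: str, floor: str) -> str:
    """Raise a requested cell to the floor when the floor ranks higher."""
    return max(cell, floor, key=TIER_ORDER.index)
```

With these definitions, a ledger whose newest row is a session-open record still yields the floor from the most recent floor-carrying record, and an empty ledger resolves to `structured` rather than `chill`.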
diff --git a/docs/arch-analysis-2026-06-28-2142/temp/catalog-6-federation.md b/docs/arch-analysis-2026-06-28-2142/temp/catalog-6-federation.md new file mode 100644 index 0000000..b356d1b --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/temp/catalog-6-federation.md @@ -0,0 +1,97 @@ +## Wardline Findings Ingestion and Routing + +**Location:** `src/legis/wardline/` + +**Responsibility:** Ingest Wardline scan payloads (agent-supplied, not pulled via HTTP), validate the wire contract and artifact provenance, and route active defect findings into the configured enforcement cell — all without re-adjudicating Wardline's trust/taint verdicts. + +**Key Components:** +- `ingest.py` (534 lines) — Wire validation: `active_defects()` extracts the gate population, `verify_wardline_artifact()` authenticates provenance (HMAC via `LEGIS_WARDLINE_ARTIFACT_KEY` or records `key_absent`). Defines `TRUST_TIERS`, `KNOWN_KINDS`, `DEFECT_KIND`, `FINDINGS_KEY`, `WardlineFinding`, `Suppressed` enum, `ArtifactStatus`/`ArtifactStatusReason` enums, canonical-reason carrier, `WardlineDirtyTreeError` (typed amber, not a generic red), and `ScanOutcome`. +- `governor.py` (178 lines) — Routing engine: `route_findings()` maps `WardlineFinding` list to `WardlineCellPolicy` members (`SURFACE_OVERRIDE`, `BLOCK_ESCALATE`, `SURFACE_ONLY`), resolves entity SEI before opening writes, wraps the batch in a single-store transaction (engine or signoff), and delegates to `EnforcementEngine.submit_override`, `SignoffGate.request`, or `EnforcementEngine.record_event`. +- `policy.py` (18 lines) — Thin helper: `resolve_cell()` maps a finding's severity rank against a configured `fail_on` threshold to produce either the gate cell or `SURFACE_ONLY`. +- `__init__.py` (1 line) — Empty aside from module docstring. 
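As a rough illustration of the `resolve_cell()` helper described above, the sketch below compares a severity rank with a `fail_on` threshold. The severity ladder, the enum member values, and the function signature are assumptions made for the example; only the three member names come from the catalog.

```python
from enum import Enum


class WardlineCellPolicy(Enum):
    # Member names are from the catalog; the string values are assumed.
    SURFACE_OVERRIDE = "surface_override"
    BLOCK_ESCALATE = "block_escalate"
    SURFACE_ONLY = "surface_only"


# Hypothetical severity ladder; the real ranks are not shown in the source.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def resolve_cell(
    severity: str, fail_on: str, gate_cell: WardlineCellPolicy
) -> WardlineCellPolicy:
    """Route to the gate cell at or above the threshold, else surface only.

    An unknown severity raises KeyError rather than defaulting to a
    lower tier, so a drifted token cannot quietly weaken the gate.
    """
    if SEVERITY_RANK[severity] >= SEVERITY_RANK[fail_on]:
        return gate_cell
    return WardlineCellPolicy.SURFACE_ONLY
```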
+ +**Dependencies:** +- Inbound: `service/` (scan_route handler), `api/` and `mcp.py` (adapters supply the scan payload and call service) +- Outbound: `enforcement/engine.py` (`EnforcementEngine`), `enforcement/signoff.py` (`SignoffGate`), `enforcement/signing.py` (`verify`), `identity/entity_key.py` (`EntityKey`) + +**Patterns Observed:** +- "Wardline analyses, Legis governs" enforced structurally: `ingest.py` module docstring (line 5) states "legis never re-analyzes — it reads findings and governs"; `TRUST_TIERS` and `KNOWN_KINDS` are explicitly labelled "carried, never re-derived" (ingest.py:16, ingest.py:43). +- Fail-closed on `FINDINGS_KEY` absence: `active_defects()` raises `WardlinePayloadError` rather than defaulting to empty (ingest.py:488–493), guarding against the G1 false-green where a producer key rename produces zero defects under a green status. +- Fail-closed on unknown `kind`: any kind outside `KNOWN_KINDS` is rejected loudly (ingest.py:511–517), closing the G1 twin (value-axis) where a drifted `kind` token silently removes a defect from the gate population. +- Agent-suppressed findings require proof: `waived`/`suppressed` findings without `suppression_proof`, `suppression_ticket`, or `suppression_reason` raise `WardlinePayloadError` (ingest.py:521–527). +- Dirty-tree amber is type-distinct from malformed-or-tampered: `WardlineDirtyTreeError` is intentionally not a subclass of `WardlinePayloadError` (ingest.py:191); its `to_payload()` produces `SKIPPED_DIRTY_TREE` so harnesses distinguish "commit first" from "scan is broken". +- Entity resolution before write: `governor.py` resolves all SEIs in `prepared` before opening any write transaction (governor.py:108–111), so identity network calls never run inside a SQLite transaction. +- Cross-store mixing rejected before any write: the guard at governor.py:94–97 rejects a batch that would span engine and signoff stores simultaneously. 
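The two fail-closed checks in the patterns above (a missing findings key and an unknown `kind`) might look like the following sketch. The literal values of `FINDINGS_KEY`, `DEFECT_KIND`, and `KNOWN_KINDS` are placeholders, since the source names the constants but not their contents; `MAX_FINDINGS = 500` is taken from the catalog. Suppression-proof checks are omitted for brevity.

```python
class WardlinePayloadError(ValueError):
    """Raised when a scan payload violates the wire contract."""


FINDINGS_KEY = "findings"                    # assumed literal value
DEFECT_KIND = "defect"                       # assumed literal value
KNOWN_KINDS = frozenset({"defect", "note"})  # assumed contents
MAX_FINDINGS = 500                           # limit stated in the catalog


def active_defects(payload: dict) -> list:
    """Return the defect findings, rejecting malformed payloads loudly."""
    # A missing key must not read as "zero defects" (the G1 false-green).
    if FINDINGS_KEY not in payload:
        raise WardlinePayloadError(f"payload has no {FINDINGS_KEY!r} key")
    findings = payload[FINDINGS_KEY]
    if not isinstance(findings, list):
        raise WardlinePayloadError("findings must be a list")
    if len(findings) > MAX_FINDINGS:
        raise WardlinePayloadError("findings batch exceeds MAX_FINDINGS")
    for finding in findings:
        # An unrecognised kind is rejected, never silently dropped.
        if not isinstance(finding, dict) or finding.get("kind") not in KNOWN_KINDS:
            raise WardlinePayloadError(f"unknown finding kind: {finding!r}")
    return [f for f in findings if f["kind"] == DEFECT_KIND]
```

The design choice worth noting is that both failure modes raise the same typed error, so a caller cannot confuse a contract violation with a clean scan.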
+ +**Concerns:** +- Transaction atomicity is partial: the pre-write guard (governor.py:65–66 comment) explicitly acknowledges that a mid-loop runtime failure after some findings persist leaves those writes permanent. This is accepted but undocumented at the call-site level for callers who may not read the comment. +- The `cell_map` dependency check (governor.py:80–84) is deliberately conservative — it validates all mapped cells, not only those triggered by present findings. The inline comment (governor.py:74–79) flags that narrowing this requires recomputing from present findings, leaving it as acknowledged future work. +- No rate-limit or per-agent throttle on the findings batch beyond `MAX_FINDINGS = 500` (ingest.py:26). A batch at exactly the maximum is accepted without an audit-trail event marking it as a large batch. + +**Confidence:** High — Read all three implementation files (ingest.py 534 lines, governor.py 178 lines, policy.py 18 lines) fully. Cross-verified dependency claims against imports and governance-honesty invariants confirmed in code at cited lines. + + +## Filigree Sign-off Binding + +**Location:** `src/legis/filigree/` and `src/legis/governance/` + +**Responsibility:** Bind a cleared, governed sign-off to a Filigree issue via an SEI-keyed entity-association, record the binding in a local HMAC-signed append-only ledger, and gate issue closure on verified ledger evidence — without touching Filigree's issue lifecycle (locked decision 5). + +**Key Components:** +- `filigree/client.py` (185 lines) — `HttpFiligreeClient` and `FiligreeClient` Protocol: HTTP transport to Filigree using stdlib `urllib`, with injectable `fetch` for offline testing. Explicitly omits `X-Weft-*` transport HMAC headers (client.py:6–13, 44–45); the app-level `binding_signature` in the JSON body is the governance evidence. 
+- `governance/signoff_binding.py` (83 lines) — `bind_signoff_to_issue()`: validates `entity_key.identity_stable` (raises `ValueError` if false — locator-keyed bind is rejected), optionally HMAC-signs the binding payload, calls `filigree.attach()`, then calls `ledger.record()` in validate→attach→record order. +- `governance/binding_ledger.py` (94 lines) — `BindingLedger`: append-only tamper-bound store of `issue_binding` records, each signed with the LEGIS HMAC key. `get()` and `get_by_issue_id()` call `verify()` before returning data (fail-closed: tampered ledger raises `BindingError`, never returns data). +- `governance/filigree_gate.py` (33 lines) — Pure decision: `evaluate_issue_closure()` calls `ledger.get_by_issue_id()` (which verifies the chain) and returns `allowed: False` if no verified binding record exists. +- `governance/gaps.py` (121 lines) — `find_orphan_gaps()` and `find_lineage_integrity()`: scans the governance audit trail for SEI-keyed records, resolves current liveness/lineage from Loomweave, surfaces orphaned attestations and lineage divergences. Prefix-check semantics: lineage appends are legitimate; a removed or mutated prior event is divergence (gaps.py:6–9). +- `governance/sei_backfill.py` (269 lines) — Append-only migration: upgrades legacy locator-keyed audit rows to SEI-keyed `SEI_BACKFILL` events without rewriting history. Checks integrity before running; dry-run default. +- `governance/params.py` (10 lines) — Policy constants: `OVERRIDE_RATE_THRESHOLD = 0.2`, `OVERRIDE_RATE_WINDOW = 100`, `OVERRIDE_RATE_MIN_SAMPLE = 20`. Explicitly marked as ADR-0002 policy — not tuneable via request parameters. +- `governance/__init__.py` (1 line) — Empty aside from module docstring. 
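To make the verify-before-read behaviour of the ledger and the closure gate concrete, here is a small in-memory sketch. It is not the real `BindingLedger`: the row layout, the chaining scheme, the `signoff_id` field, and the reason strings are assumptions, and the real store is an append-only audit store rather than a Python list.

```python
import hashlib
import hmac
import json
from typing import Optional


class BindingError(Exception):
    """Raised when the ledger fails verification."""


class BindingLedger:
    """In-memory stand-in: each row's HMAC also covers the previous signature."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._rows = []

    def _sign(self, body: dict, prev_sig: str) -> str:
        message = json.dumps(body, sort_keys=True).encode() + prev_sig.encode()
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def record(self, issue_id: str, signoff_id: str) -> None:
        prev_sig = self._rows[-1]["sig"] if self._rows else ""
        body = {"issue_id": issue_id, "signoff_id": signoff_id}
        self._rows.append({"body": body, "sig": self._sign(body, prev_sig)})

    def verify(self) -> None:
        prev_sig = ""
        for row in self._rows:
            expected = self._sign(row["body"], prev_sig)
            if not hmac.compare_digest(row["sig"], expected):
                raise BindingError("binding ledger failed verification")
            prev_sig = row["sig"]

    def get_by_issue_id(self, issue_id: str) -> Optional[dict]:
        self.verify()  # fail closed: no data is returned from a tampered ledger
        matches = [r["body"] for r in self._rows if r["body"]["issue_id"] == issue_id]
        return matches[-1] if matches else None


def evaluate_issue_closure(ledger: BindingLedger, issue_id: str) -> dict:
    """Pure read decision: closure is allowed only with a verified binding."""
    binding = ledger.get_by_issue_id(issue_id)
    if binding is None:
        return {"allowed": False, "reason": "no verified binding", "evidence": None}
    return {"allowed": True, "reason": "verified binding", "evidence": binding}
```

Because `get_by_issue_id()` verifies first, the gate never has to decide what to do with unverified data; tampering surfaces as an exception instead of a verdict.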
+ +**Dependencies:** +- Inbound: `service/` (`bind_signoff_issue`, `read_filigree_closure_gate`), `mcp.py` (`signoff_bind_issue`, `filigree_closure_gate_get`) +- Outbound: `filigree/client.py` → `weft_signing.weft_body_bytes`; `signoff_binding.py` → `enforcement/signing.sign`, `filigree/client.FiligreeClient`, `governance/binding_ledger.BindingLedger`, `identity/entity_key.EntityKey`; `binding_ledger.py` → `clock.Clock`, `enforcement/signing.sign`+`verify`, `identity/entity_key.EntityKey`, `store/protocol.AppendOnlyStore`; `gaps.py` → `canonical.content_hash`, `identity/loomweave_client.LoomweaveIdentity`, `store/protocol.AuditRecordLike`; `sei_backfill.py` → `canonical.content_hash`, `clock.Clock`, `identity/loomweave_client.LoomweaveIdentity`, `identity/entity_key.EntityKey`, `identity/resolver.*`, `store/protocol.*` + +**Patterns Observed:** +- SEI-stability gate is the first check in `bind_signoff_to_issue()` (signoff_binding.py:46–49): a locator key raises `ValueError` before any network call, enforcing ADR-0003 fail-closed semantics. +- App-level `binding_signature` is the governance evidence, not transport HMAC: client.py explicitly omits `X-Weft-*` headers (client.py:9–13) to avoid dead handshake with a non-verifying Filigree route; binding integrity lives in the local ledger's HMAC chain. +- Validate→attach→record order in `bind_signoff_to_issue()` (signoff_binding.py:71–82): if `ledger.record()` raises after `filigree.attach()` succeeds, the code comment (signoff_binding.py:72–76) honestly documents the accepted trade-off — a binding with no ledger entry is surfaced by `verify()`, not silently lost. +- Ledger always verifies before reading: both `get()` and `get_by_issue_id()` call `self.verify()` as the first operation (binding_ledger.py:79, 87), so a tampered ledger raises `BindingError` and returns no data. 
+- Filigree does not own lifecycle: `evaluate_issue_closure()` is a pure read-decision that returns structured `allowed/reason/evidence` without writing to Filigree or mutating issue state (filigree_gate.py:14–32). +- Lineage integrity uses prefix semantics, not whole-list equality: `find_lineage_integrity()` computes `content_hash(current[:n])` against the stored snapshot (gaps.py:110), so appended rename events do not trigger false divergences. +- `params.py` constants cannot be tuned by request (params.py:7–9 comment): the override-rate threshold reads from this file, not from request parameters. + +**Concerns:** +- Uncompensated partial-write window in `bind_signoff_to_issue()`: if `ledger.record()` raises after `filigree.attach()` succeeds, Filigree holds the association pointer but legis has no tamper-bound record of it. The code correctly documents this (signoff_binding.py:72–76), but there is no reconciliation path or operator repair tool identified — a `BindingLedger.verify()` call surfaces the mismatch but cannot heal it. +- `filigree/client.py` response integrity depends on TLS only: the inline comment (client.py:127) acknowledges that `LEGIS_ALLOW_INSECURE_REMOTE_HTTP=1` with a non-loopback Filigree host makes responses forgeable on-path. The escape hatch is guarded by the env flag and log warning, but there is no posture/doctor check that flags a non-loopback HTTP Filigree URL in production. +- `get_by_issue_id()` returns the *last* verified binding record for an issue (binding_ledger.py:88–93); if multiple bindings exist for the same `issue_id` (re-bind after a re-sign-off), earlier records are silently shadowed. No audit event marks a supersession. + +**Confidence:** High — Read all 7 files fully (signoff_binding.py 83 lines, binding_ledger.py 94 lines, filigree_gate.py 33 lines, gaps.py 121 lines, sei_backfill.py 269 lines, params.py 10 lines, filigree/client.py 185 lines). 
Cross-verified transport-boundary claims and SEI-stability guard at signoff_binding.py:46–49. + + +## Warpline Preflight Advisory Consumer + +**Location:** `src/legis/warpline_preflight/` + +**Responsibility:** Provide read-only advisory access to Warpline's impact-radius and reverify-worklist data, surfaced as a sibling informational tool that is structurally isolated from every governance verdict path. + +**Key Components:** +- `client.py` (144 lines) — `HttpWarplineClient` and `WarplineClient` Protocol: two read-only GETs (`impact_radius`, `reverify_worklist`) via stdlib `urllib`, with injectable `fetch`. HTTPS-required for non-loopback, redirect-blocked, response size-capped at 1 MB. No signing — Warpline's advisory responses are not HMAC-authenticated. +- `__init__.py` (1 line) — Empty. +- `service/preflight.py` (39 lines, not in this directory but the sole consumption point) — `read_warpline_preflight()`: returns `{"status": "unavailable", ...}` when client is `None` or raises `WarplineError`; returns `{"status": "checked", ...}` on success. Transport failures are contained as `unavailable` and never propagate as `INTERNAL_ERROR`. + +**Dependencies:** +- Inbound: `service/preflight.read_warpline_preflight` (consumed only by `mcp.py:_tool_warpline_preflight_get` — a dedicated MCP tool, not embedded in any governance tool handler) +- Outbound: stdlib only (`urllib`, `json`, `ipaddress`, `os`, `logging`) + +**Patterns Observed:** +- Advisory boundary is enforced by structural isolation: `client.py` module docstring (line 7) states "nothing it returns may reach a governance verdict path"; grep of `service/policy.py`, `enforcement/engine.py`, and all governance tool handlers confirms zero cross-references between warpline preflight and any verdict, gate, sign-off, or honesty-read path. 
+- Fail-unavailable, not fail-empty: an unconfigured or unreachable Warpline returns `{"status": "unavailable", "unavailable": [{"reason": ...}]}` (preflight.py:21–33), never an empty affected-set that could read as "nothing impacted". +- No request signing: `_transport_fetch` passes an empty headers dict (client.py:129), consistent with the advisory posture — no HMAC contract with Warpline exists. +- `mcp.py` constructs `HttpWarplineClient` lazily at runtime (mcp.py:234–238) and stores it on `McpRuntime.warpline`; a `WarplineError` during construction makes it `None`, triggering the unavailable path. + +**Concerns:** +- Advisory boundary relies on discipline, not a type wall: the `WarplineClient` Protocol and its response dicts are untyped at the governance layer — there is no newtype or sealed return type that would prevent a future developer from accidentally plumbing a Warpline response into a verdict path. The contract is documented in comments but not machine-enforced. +- Response integrity depends on TLS only (same structural gap as the Filigree client): a non-loopback HTTP Warpline URL under `LEGIS_ALLOW_INSECURE_REMOTE_HTTP=1` yields forgeable advisory data (client.py:106–114). Since advisory data is never supposed to reach governance verdicts, this is lower risk than the Filigree case, but the doctor check gap is the same. +- None observed for advisory-boundary enforcement in the governance verdict paths — verified by grepping `service/policy.py` and `enforcement/engine.py` for any warpline reference (both returned empty). + +**Confidence:** High — Read `client.py` (144 lines) fully and `service/preflight.py` (39 lines) fully. Advisory boundary verified by negative grep across enforcement and policy service modules. MCP wiring verified at mcp.py:2277–2286 and mcp.py:234–238. 
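The fail-unavailable contract described for `read_warpline_preflight()` can be sketched as below. The response keys `status` and `unavailable` follow the shapes quoted in the catalog; the `entity` argument, the client method signatures, and the success-payload keys are assumptions for illustration only.

```python
from typing import Any, Optional


class WarplineError(Exception):
    """Transport or protocol failure when talking to Warpline."""


def read_warpline_preflight(client: Optional[Any], entity: str) -> dict:
    """Return advisory data, or an explicit 'unavailable' status.

    An unconfigured or failing client never produces an empty
    affected-set, which could be misread as "nothing impacted".
    """
    if client is None:
        return {
            "status": "unavailable",
            "unavailable": [{"reason": "warpline client not configured"}],
        }
    try:
        impact = client.impact_radius(entity)        # assumed signature
        worklist = client.reverify_worklist(entity)  # assumed signature
    except WarplineError as exc:
        return {"status": "unavailable", "unavailable": [{"reason": str(exc)}]}
    return {"status": "checked", "impact_radius": impact, "reverify_worklist": worklist}
```

Only `WarplineError` is contained here; any other exception still propagates, which keeps genuine programming errors visible.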
diff --git a/docs/arch-analysis-2026-06-28-2142/temp/catalog-7-git-ci.md b/docs/arch-analysis-2026-06-28-2142/temp/catalog-7-git-ci.md new file mode 100644 index 0000000..1182dcc --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/temp/catalog-7-git-ci.md @@ -0,0 +1,94 @@ +## Git/Change Surface + +**Location:** `src/legis/git/` + +**Responsibility:** Provides stateless, read-only access to the local git repository (branches, commits, rename evidence) and defines the injectable forge seam used by adapters that need PR context from an external forge. + +**Key Components:** +- `surface.py` (208 lines) — `GitSurface`: shells out to `git -C ` for all reads; exposes `branches()`, `commit()`, `commits()`, `merge_base()`, `renames()`, and `working_tree_renames()`; validates every ref/SHA against a strict allowlist regex before passing to the shell (`surface.py:81,117,124,127,137`) to prevent injection; raises `GitError` on bad exit codes or timeouts (10 s ceiling) +- `rename_feed.py` (48 lines) — `build_rename_feed()`: composes committed and optional worktree renames into the dict structure consumed by `GET /git/rename-feed` (HTTP) and `git_rename_feed_get` (MCP); emits `worktree_checked` flag to distinguish "checked and clean" from "not checked" (`rename_feed.py:14-16`) +- `models.py` (46 lines) — frozen dataclasses `BranchInfo`, `CommitInfo`, `RenameEvidence`; `RenameEvidence` docstring explicitly scopes the claim to path-level git detection only and defers symbol-level resolution to Loomweave (`models.py:33-38`) +- `pull_request.py` (28 lines) — `PullRequestContext` dataclass and `PullRequestSource` runtime-checkable Protocol; a deployment wires a provider (e.g. 
`gh`-backed); legis bakes in no forge HTTP (`pull_request.py:1-8`) +- `__init__.py` (1 line) — module docstring only; exports nothing (consumers import from submodules directly) + +**Dependencies:** +- Inbound: `api/app.py` (imports `GitSurface`, `build_rename_feed`, `PullRequestSource`); `mcp.py` (imports `GitSurface`, `build_rename_feed`, `GitError`) +- Outbound: stdlib only (`subprocess`, `pathlib`, `re`); no Legis sibling packages + +**Patterns Observed:** +- Strict ref/SHA allowlist validation at every entry point before shell invocation (`surface.py:81,117,124,127,137,178`) — injection defense is in the surface, not in each caller +- Stateless design: every method call reads directly from the real repository; no in-memory cache +- `_run_raw()` vs `_run()` split: error-tolerant reads (e.g. upstream ahead/behind, blob lookups) use `_run_raw` and check returncode; mandatory reads use `_run` and raise `GitError` +- Working-tree renames use the literal sentinel `"WORKTREE"` as `commit_sha` (`surface.py:194`), communicating uncommitted provenance to the consumer +- Forge seam (PR context) uses a `Protocol` injection pattern matching `identity/` and `filigree/` client seams + +**Concerns:** +- None observed for governance honesty: rename evidence carries an explicit docstring boundary (path-level only, `models.py:33-38`); `PullRequestSource` is read-only and injects no forge writes; `working_tree_renames` emits `commit_sha="WORKTREE"` rather than a hash, preventing misinterpretation as a committed ref; `merge_base()` returns `None` (not an empty string that could collide with a ref) when there is no common ancestor (`surface.py:131-132`) +- `_blob()` silently returns `""` when a rev/path cannot be resolved (`surface.py:207`); a consumer receiving `old_blob=""` cannot distinguish "object missing" from "lookup failed" — documented as intentional for Loomweave's matcher, but the emptiness semantics are not explicit in `RenameEvidence` +- Test coverage: 
`test_git_surface.py` (156 lines), `test_rename_feed.py` (47 lines), `test_git_rename_feed_contract.py` (103 lines); ref-validation injection tests verified in `test_git_surface.py` + +**Confidence:** High — read 100% of all 5 source files (surface.py 208 lines, rename_feed.py 48 lines, models.py 46 lines, pull_request.py 28 lines, __init__.py 1 line); cross-verified inbound callers in api/app.py and mcp.py; verified test coverage exists for rename feed and surface + + +--- + +## CI/Check Surface + +**Location:** `src/legis/checks/` + +**Responsibility:** Records CI check runs supplied by writers (agents, CI adapters) into an indexed SQLite store and serves them back queryable by commit SHA, branch name, or PR number, always tagging recorded runs as writer-supplied and unauthenticated. + +**Key Components:** +- `surface.py` (133 lines) — `CheckSurface`: SQLAlchemy Core over SQLite (NullPool); `record()`, `for_commit()`, `for_branch()`, `for_pr()`, `latest_state()`; additive migration via `_ensure_schema()` for `recorded_by` and `provenance` columns (`surface.py:57-66`); `_to_run()` defaults missing `provenance` to `Provenance.UNAUTHENTICATED` for rows written before the column existed (`surface.py:115`) +- `models.py` (43 lines) — `CheckRun` frozen dataclass and `CheckOutcome` str-Enum; comment at `models.py:36-42` explicitly names the governance limit: "a recorded check is a writer-supplied claim, not a forge-verified fact"; `provenance` defaults to `Provenance.UNAUTHENTICATED` +- `__init__.py` (1 line) — module docstring only + +**Dependencies:** +- Inbound: `api/app.py` (imports `CheckSurface`, `CheckRun`, `CheckOutcome`); `mcp.py` (imports `CheckSurface`, `CheckRun`, `CheckOutcome`); `pulls/surface.py` imports `Provenance` (via shared `legis.provenance`, not this package) +- Outbound: `legis.provenance` (shared vocabulary); `legis.config.ensure_sqlite_parent` (parent-dir creation for DB path); SQLAlchemy (storage) + +**Patterns Observed:** +- Provenance honesty 
by construction: the `provenance` field is set to `UNAUTHENTICATED` at the model default (`models.py:42`) and is re-applied on read for pre-migration rows (`surface.py:115`), so no code path can return a check run without an explicit provenance claim +- `check_report` MCP tool echoes `recorded_by` and `provenance` back to the caller in its result (`mcp.py:2433-2440`), explicitly preventing a caller from believing its own report became forge-attested +- `latest_state()` uses last-write-wins by insert order (`surface.py:129-131`), matching a CI model where newer runs supersede older ones for the same check name +- Table is indexed (not append-only) to support dimensional queries, distinct from the HMAC-chained governance audit log (`surface.py:3-7`) +- Additive `ALTER TABLE` migration rather than versioned migrations — suitable for the single-writer, file-local SQLite model + +**Concerns:** +- No deduplication guard: recording the same `(run_id, commit_sha, check_name)` twice produces two rows; `latest_state()` will return the second by insert order, but `for_commit()` / `for_branch()` / `for_pr()` return all rows. An agent that double-reports does not cause a false-green (both rows are unauthenticated claims), but check counts in API/MCP responses may mislead +- `check_report` (MCP write tool) accepts `commit_sha` as a free string with no validation against the actual repo — an agent can record a check against a SHA that does not exist in the repository; there is no proof-of-commit gate +- `provenance` column is `Text` in the DB and is never validated on read — a raw-DB write of an arbitrary string would survive round-trip; the `_to_run()` fallback only guards `NULL`, not arbitrary values (`surface.py:115`) +- No forge-verification path exists today (only `UNAUTHENTICATED`); the field is wired for a future authenticated path (e.g. 
signed webhook), but the extension point is in `provenance.py` only — there is no corresponding routing or validation logic yet + +**Confidence:** High — read 100% of all 3 source files (surface.py 133 lines, models.py 43 lines, __init__.py 1 line); read the MCP check_report and check_list tool implementations (mcp.py:2396-2440); cross-verified provenance defaults in surface._to_run() and models.CheckRun; test coverage in tests/checks/test_check_surface.py (84 lines) + + +--- + +## Pull-Request Surface + +**Location:** `src/legis/pulls/` + +**Responsibility:** Records forge-reported pull-request metadata (writer-supplied, unauthenticated) into a per-PR upsert SQLite store and serves it back, always preserving a provenance label that prevents a consumer from treating a writer-asserted PR state as forge-authoritative. + +**Key Components:** +- `surface.py` (78 lines) — `PullSurface`: SQLAlchemy Core over SQLite (NullPool); `record()` upserts via delete-then-insert keyed on PR number (`surface.py:46-58`); `get()` returns `None` for unknown PRs; `_ensure_schema()` adds `recorded_by` and `provenance` columns additively (`surface.py:34-43`); `get()` defaults `provenance` to `UNAUTHENTICATED` for pre-migration rows (`surface.py:76`) +- `models.py` (30 lines) — `PullRequest` frozen dataclass and `PullRequestState` str-Enum; comment at `models.py:26-29` mirrors the checks provenance honesty contract +- `__init__.py` (3 lines) — explicit `__all__` re-export of `PullRequest`, `PullRequestState`, `PullSurface` + +**Dependencies:** +- Inbound: `api/app.py` (imports `PullRequest`, `PullRequestState`, `PullSurface`); `mcp.py` (imports `PullRequestState`, `PullSurface`); also `git/pull_request.py` defines a parallel `PullRequestContext` / `PullRequestSource` Protocol used by the live-forge injection path (the two are not merged) +- Outbound: `legis.provenance` (shared vocabulary); `legis.config.ensure_sqlite_parent`; SQLAlchemy + +**Patterns Observed:** +- Upsert-by-number 
semantics: each `record()` call replaces any prior row for that PR number atomically within a transaction (`surface.py:46-48`), so the store always reflects the last-known state rather than accumulating history +- `pull_request_get` MCP tool lazily initialises `_checks()` unconditionally to prevent call-order-dependent gaps where a fresh runtime might report no checks for a PR (`mcp.py:2354-2359`) — an explicit governance-honesty fix noted in the comment +- Two parallel PR seams in the HTTP adapter: `GET /git/pull-requests/{number}` uses the injected `PullRequestSource` (live forge); `POST /git/pulls` + `GET /git/pulls/{number}` use `PullSurface` (recorded cache) — clearly separated and documented +- `pull_request_record` intentionally absent from MCP tool surface (per CLAUDE.md: "forge is source of truth, pinned in test") — the MCP surface is read-only for PRs; write is HTTP-only + +**Concerns:** +- The two PR representations (`PullRequestContext` in `git/pull_request.py` and `PullRequest` in `pulls/models.py`) carry overlapping fields (`number`, `title`, `base`, `head`, `state`) but are structurally separate types with no shared base or adapter — a consumer receiving one cannot easily convert to the other without a manual mapping; this is intentional (live-forge vs recorded seam) but the distinction is undocumented at the type level +- `record()` silently overwrites an existing PR's data with whatever the writer provides; if a writer submits a stale state (e.g. `open` after a PR merged), the store will reflect the stale value with no conflict detection or warning +- No test exercises the `provenance` field being set to a non-default value (e.g. 
a hypothetical `"webhook_signed"`); the `UNAUTHENTICATED` default is tested implicitly but the upgrade path is only named in comments, not in any test fixture + +**Confidence:** High — read 100% of all 3 source files (surface.py 78 lines, models.py 30 lines, __init__.py 3 lines); cross-verified both PR seams in api/app.py (lines 523-551); verified MCP pull_request_get implementation (mcp.py:2347-2360); test coverage verified via tests/pulls/test_pull_surface.py (30 lines) and tests/git/test_pull_request_api.py diff --git a/docs/arch-analysis-2026-06-28-2142/temp/catalog-8-runtime-ops.md b/docs/arch-analysis-2026-06-28-2142/temp/catalog-8-runtime-ops.md new file mode 100644 index 0000000..3a88aed --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/temp/catalog-8-runtime-ops.md @@ -0,0 +1,129 @@ +## install.py — Project Installer + +**Location:** `src/legis/install.py` + +**Responsibility:** Stands up legis in a project by injecting a versioned instruction block into CLAUDE.md/AGENTS.md, installing the legis-workflow skill pack, registering a Claude Code SessionStart hook in .claude/settings.json, writing .gitignore rules, minting the posture-ledger GENESIS and operator key, and registering the MCP server entry in .mcp.json — all idempotent, all symlink-escape-guarded. 
+ +**Key Components:** +- `project_path` / `ensure_project_dir` / `reject_symlink` (lines 132–160) — symlink-escape guards applied to every installer write; raise `UnsafeInstallPathError` on traversal outside project root +- `inject_instructions` (lines 311–387) — foreign-fence-aware instruction-block injector; uses `_first_own_open_fence_pos` + `_first_foreign_fence_pos` to never delete a co-resident sibling block (wardline/filigree); writes via `_atomic_write_text` (temp + `os.replace`) +- `_atomic_write_text` (lines 278–308) — empty-content guard + mode-preserving atomic write (refuses empty payload, rejects symlink direct target); used for all text writes in install +- `_install_skill_to` (lines 416–483) — concurrent-safe skill-pack copytree with rename race: stages to a temp dir, renames target aside, atomically swaps in new tree; on failure restores the prior pack rather than silently dropping it +- `install_claude_code_hooks` (lines 733–837) — registers `legis session-context` as a SessionStart hook; upgrades stale/bare commands to the resolved binary path; backs up malformed `settings.json` before resetting rather than silently clobbering; only rewrites unscoped blocks (never touches user-scoped blocks) +- `register_mcp_json` (lines 1199–1289) — idempotent MCP entry manager; a usable existing entry (command resolves + args valid + env clean) is NEVER regenerated; on rebuild preserves the operator-owned `env` dict minus blocked keys (`_safe_mcp_env`); `_REJECTED_MCP_ENV_KEYS` + `_REJECTED_MCP_ENV_PREFIXES` scrub secrets and unsafe escape-hatch vars +- `install_posture` (lines 1367–1431) — posture-ledger genesis; key minted once, handed to custody sink BEFORE `GENESIS` is written (fail-closed: no fingerprint written if custody fails); idempotency guard at `current_epoch_fingerprint()` prevents second-mint; env-backend adopts `LEGIS_OPERATOR_KEY` rather than minting a throwaway (legis-1844bf8ac9) +- `_default_key_sink` (lines 1462–1502) — custody router: 
`env`=no-op, `age-file`=atomic blob write, `keychain`=loud failure (adapter not shipped); raises `OperatorKeyCustodyError` rather than dropping the key +- `_find_legis_command` (lines 522–562) — binary resolution; prefers `sys.argv[0]` (faithful to the running binary, legis-788a85fac1) over PATH; skips project-local hits to avoid pinning a venv shim that doctor would immediately flag as stale + +**Dependencies:** +- Inbound: `hooks.py` (imports `inject_instructions`, `install_skills`, `install_codex_skills`, `_get_skills_source_dir`, `_skill_tree_fingerprint`, `_instructions_block_is_current`, `_marker_token`, `INSTRUCTIONS_MARKER`, `SKILL_NAME`); `doctor.py` (imports `_install` module, calls `mcp_entry_is_current`, `inject_instructions`, `install_claude_code_hooks`, `install_skills`, `install_codex_skills`, `gitignore_rules_present`, `ensure_gitignore`, `ensure_legis_dir_gitignore`, `_has_unscoped_session_start_hook`, `_own_open_marker_tokens`, `_marker_token`, `SESSION_CONTEXT_COMMAND`, `SKILL_NAME`, `LEGIS_DIR_GITIGNORE_MARKER`); `posture/` (imports `install_posture` for genesis) +- Outbound: `legis.config` (posture DB URL); `legis.posture` (`PostureLedger`, `mint_key`, `key_fingerprint`, `wrap_key`, `select_backend`); `legis.clock`; `importlib.resources` / `importlib.metadata` (bundled instructions, version); stdlib (`hashlib`, `shutil`, `tempfile`, `os`, `json`, `re`, `stat`) + +**Patterns Observed:** +- Fail-closed on every operation: custody failure before GENESIS write, empty-content guard before any text write, `UnsafeInstallPathError` on symlink escape — the codebase never auto-accepts partial success +- Strict idempotency contract: every installer function checks current state before acting; a second `install` over a healthy project modifies nothing material (returns early with "already registered" / "already present") +- Operator-env preservation on `.mcp.json` updates: the existing `env` dict is carried forward (minus scrubbed secret keys) rather than 
wiped — named fix for legis-788a85fac1; `_safe_mcp_env` is the scrub gate +- Foreign-fence awareness in instruction injection: `_INSTR_FENCE_RE` detects any tool's namespace fence (case-insensitive); the injector never deletes inter-block content owned by wardline/filigree (C-4 multi-owner block contract) +- `_find_legis_command` avoids project-local path poisoning: prefers the running binary, skips `.venv/bin/legis` hits so the registered hook/MCP command doesn't bounce on freshness checks + +**Concerns:** +- `_keychain_available()` (lines 1348–1349) always returns `False` — the live OS-keychain backend is not yet implemented. This means `choose_install_backend` never selects `keychain` automatically. The comment documents it as deferred, but a caller providing `backend="keychain"` directly would route to the `_default_key_sink` `raise` path. The honesty posture is correct (fail-closed, not silent), but the gap means the most-secure custody path is unattainable without a custom `key_sink` injection. +- `install.py` is 1503 lines and owns six distinct responsibilities (instruction injection, skill pack, hook, gitignore, MCP registration, posture genesis). The functions are well-decomposed but the file is a god module by size; a future author adding a seventh install artifact has no forcing function to split it. +- The `settings.json` backup-on-corrupt path (lines 761–768, 804–817) writes `settings.json.bak` via `shutil.copy2` before resetting the file. It calls `reject_symlink(backup)` ahead of the copy (lines 763, 807), so a direct symlink at `.claude/settings.json.bak` cannot redirect the backup outside the project tree. The guard covers only the direct-symlink case, not symlinked parent directories, so that residual remains. + +**Confidence:** High — Read 100% of install.py (1503 lines). Every claim cites specific line ranges. 
Cross-verified the `hooks.py` import list against actual `from legis.install import (...)` at hooks.py:23–32. Verified `register_mcp_json` env-preservation logic at lines 1282–1284. Verified `install_posture` idempotency guard at lines 1411–1413. Verified `_keychain_available` stub at lines 1347–1349. + + +## doctor.py — Operator Health and Repair + +**Location:** `src/legis/doctor.py` + +**Responsibility:** Inspects and (with `--fix`) repairs legis's install and runtime artifacts, reporting each problem as `[auto-fixable]` or `[operator]`, sharing the machine-readable report surface (`doctor_payload`) with the MCP `doctor_get` tool so CLI and agent surfaces cannot drift. + +**Key Components:** +- `DoctorCheck` (lines 28–49) — frozen dataclass with `id`, `status` (`ok`/`warn`/`error`), `fixed`, `message`, `repairable`; the `repairable` field is the source of truth for the `[auto-fixable]` vs `[operator]` tag rendered in `render_text` +- `render_text` (lines 71–114) — renders `[fixed]`/`[auto-fixable]`/`[operator]` tags; a `repairable=False` check that is not fixed renders `[operator]`; a `repairable=True` check that is not fixed renders `[auto-fixable]`; confirmed honest: split-brain blocks set `repairable=False` (line 197) matching their "resolve it by hand" message +- `doctor_payload` (lines 56–64) — single source of the machine-readable schema shared by CLI `--format json` and MCP `doctor_get`; both render from this function (docstring confirmed; cross-validated against `mcp.py` usage by import path) +- `collect_checks` (lines 967–996) — runs 25 checks in order; repair branches live inside individual check functions (not here), so the orchestrator is pure composition +- `check_instruction_block` (lines 172–205) — distinguishes missing / drifted / split-brain; split-brain (`len(tokens) > 1`) returns `repairable=False` (line 197) because the injector cannot canonicalise across a sibling's block — honesty-correct; `--fix` re-runs `inject_instructions` and re-checks 
state before reporting success +- `check_mcp_json` (lines 117–137) — repair calls `register_mcp_json` and re-checks `mcp_entry_is_current` before returning `fixed=True`; never writes env secrets (delegates to `_safe_mcp_env` inside `register_mcp_json`) +- `check_audit_chain` (lines 461–487) — absent store → `ok` (never creates DB); tampered chain → `error`, `repairable=False`; never auto-repairs a hash-chain failure +- `check_posture_chain` / `check_posture_ledger` / `check_posture_key_reset` / `check_operator_key_accessible` (lines 561–810) — posture-ledger integrity checks; all report-only; `check_operator_key_accessible` probes key reachability without rendering the key value (lines 763–791); env escape hatch presence yields `warn`, not `ok` (honesty note at line 773) +- `check_weft_toml` (lines 339–358) — distinguishes absent (ok / defaults apply) from present-but-broken (error / config silently not applying), per C-9(b); NEVER writes `weft.toml` (confirmed — no write call anywhere in this function) +- `check_filigree_binding_scope` (lines 927–964) — triggered by unscoped filigree URL presence, NOT local install; `repairable=False` (operator-pinned URL); closes the false-green where doctor said "ok" while scans silently non-emitted +- `_store_dir_for` (lines 329–336) — anchored at `root`, not `cwd`, and explicitly ignores `weft.toml` (comment at line 332); custody rule correctly enforced in doctor's own store-path logic + +**Dependencies:** +- Inbound: CLI (`legis.cli` `doctor` subcommand); MCP tool surface (`legis.mcp` `doctor_get` tool uses `doctor_payload` / `collect_checks`); integration tests +- Outbound: `legis.install` (all repair operations delegate back to install functions); `legis.config` (`STORE_DB_SPECS`, `protected_policies`, `posture_db_url`, `operator_age_path`); `legis.store.audit_store` (`AuditStore`); `legis.posture.ledger` (`PostureLedger`); `legis.posture.signing` (`key_fingerprint`, `unwrap_key`); `legis.posture.records` (kind constants); 
`legis.enforcement.signing` (`verify`) + +**Patterns Observed:** +- Report-then-repair contract: every check function verifies current state first; repair branches are conditional on the `repair` flag; post-repair state is re-verified before claiming `fixed=True` (e.g. `check_hook` lines 259–261) +- `repairable` flag drives honest tagging: split-brain blocks, operator-key items, and audit-chain failures all set `repairable=False`; auto-fixable items all set `repairable=True`; the tag in `render_text` derives directly from this flag, not from ad-hoc conditions +- Doctor never writes `weft.toml` (C-9(b)): confirmed — no `weft.toml` write in any check function; `check_weft_toml` is read-only +- `_store_dir_for` ignores `weft.toml` and uses `root`-anchored path (line 332 comment); doctor's store resolution is independent of the runtime config's `LEGIS_*_DB` overrides for the store-path check itself (overrides are respected only by `_store_url` when building actual DB URLs for integrity checks) +- `STORE_DB_SPECS` imported from `config` (line 325) ensures doctor's override-env list can never silently drop a store when a 6th store is added — single-source enumeration closes that coverage gap + +**Concerns:** +- `check_wardline_artifact_key` (line 836) reports a `warn` when `LEGIS_WARDLINE_ARTIFACT_KEY` is absent, saying all scans govern as `artifact_status=unverified`. The honesty diagnosis (PDR-0023) is correct and the signal is present. However, the message does not say how a `warn` affects the CI exit code: operators who see a `warn` may not know whether CI fails on `warn` or only on `error`. `run_doctor` (lines 999–1002) returns non-zero only when some check is not `.ok`, and a `warn` still counts as `.ok`: the property is `status != "error"`, not `status == "ok"` (confirmed: `DoctorCheck.ok` at line 38 returns `self.status != "error"`). This means `warn`s do NOT cause a non-zero exit. 
An unset `LEGIS_WARDLINE_ARTIFACT_KEY` therefore yields a `warn` that does NOT fail CI — which is documented as intentional ("deliberately a warn, not an error") but could mislead operators who expect CI-blocking behavior for unsigned verification. +- At 1002 lines `doctor.py` carries both the check domain logic and the rendering/orchestration logic. It is coherent but large; adding a new sibling or posture check requires editing a single growing file with no structural boundary. + +**Confidence:** High — Read 100% of doctor.py (1003 lines). Every claim cites specific line ranges. Verified `DoctorCheck.ok` property logic at line 38. Confirmed `repairable=False` for split-brain at line 197. Confirmed `check_weft_toml` has no write call. Confirmed `run_doctor` exit-code logic at lines 999–1002. + + +## hooks.py — SessionStart Hook and Refresh + +**Location:** `src/legis/hooks.py` + +**Responsibility:** Implements the `legis session-context` SessionStart hook, which refreshes drifted instruction blocks and skill packs in place and emits a one-line posture banner (instructions, skill, cells, posture floor) that is always non-empty to distinguish "nothing to report" from "broken." 
+ +**Key Components:** +- `refresh_instructions` (lines 38–93) — refreshes drifted instruction blocks (byte-exact check via `_instructions_block_is_current`) and stale skill packs (fingerprint check via `_skill_tree_fingerprint`); only touches marker-bearing files and already-installed skill dirs; never creates a block or dir that doesn't already exist (that is install's job) +- `generate_session_context` (lines 198–222) — the top-level entry point; always returns a non-empty string (dogfood N-1); composes four posture sub-strings: `_instructions_posture`, `_skill_pack_posture`, `_cells_posture`, `_posture_floor`; exceptions from `refresh_instructions` are caught and reported as a failure line, not raised +- `_posture_floor` (lines 173–195) — reads the posture ledger with `initialize=False` (never creates the DB); absent/empty ledger returns `"posture floor: none (fail-closed structured)"` not a false-green claim; unreadable ledger returns `"posture floor: unreadable"` (warn, not silent); imported lazily inside the function to avoid circular import at module load +- `_cells_posture` (lines 145–170) — mirrors `mcp._load_policy_cell_registry` file precedence (`LEGIS_POLICY_CELLS` > `policy/cells.toml`) but is explicitly documented as report-only at hook process scope, never claiming server runtime posture; unreadable cells → `"cells config: unreadable"`, not a false-green +- `_skill_pack_posture` (lines 122–142) — when bundled source is missing, returns `"skill pack unverifiable (bundled source missing)"` (line 138) rather than claiming currency — honesty-correct; only claims "current" when fingerprints compare equal + +**Dependencies:** +- Inbound: `legis.cli` (the `session-context` subcommand calls `generate_session_context`); `legis.mcp` (MCP startup calls `refresh_instructions` best-effort) +- Outbound: `legis.install` (substantial: `inject_instructions`, `install_skills`, `install_codex_skills`, `_get_skills_source_dir`, `_skill_tree_fingerprint`, 
`_instructions_block_is_current`, `_marker_token`, `INSTRUCTIONS_MARKER`, `SKILL_NAME`); `legis.policy.cells` (`load_policy_cells`); `legis.config` (`posture_db_url`); `legis.posture.ledger` (`PostureLedger`) + +**Patterns Observed:** +- Refresh-only-in-place invariant: `refresh_instructions` checks `if not md_path.exists(): continue` (line 50) and `if not target_root.is_dir(): continue` (line 83) — never creates absent install artifacts; install vs hooks boundary is structurally enforced +- All four posture sub-functions are fail-closed: each returns a distinct "unreadable" or "not installed" string rather than silently eliding the field or returning an empty string +- Lazy import of `legis.config` / `legis.posture.ledger` inside `_posture_floor` (lines 183–184) avoids circular import; consistent with the pattern used across enforcement modules + +**Concerns:** +- `refresh_instructions` warns via `logger.warning` (lines 69–71, 88–90) when drift re-injection fails, but the warning itself goes to the log, not to the session banner. An operator reading the banner without checking logs would not see the cause of the failure. The comment at line 69 notes this is intentional ("Surface it for the operator (peer of the boot-log path)"), but in practice an agent running headlessly may have no log reader. The `_instructions_posture` post-refresh check (line 116) does catch still-drifted state and returns `"instructions stale (refresh failed; see logs)"` in the banner — so this is only a partial gap. +- `hooks.py` imports four private symbols from `install.py` (the `_`-prefixed names in the Outbound list). This is a documented dependency (the module comment explains the two callers), but changes to private install helpers require cross-checking hooks.py. The coupling is inward-only (hooks does not re-export these) and exists because hooks is explicitly a "lighter-weight" refresh surface reusing install's logic. + +**Confidence:** High — Read 100% of hooks.py (223 lines). 
Verified `refresh_instructions` never creates absent paths (lines 50, 83). Verified `_posture_floor` uses `initialize=False` (line 189). Verified `_skill_pack_posture` unverifiable path (line 138). Confirmed private symbol imports from install at lines 23–32. + + +## config.py — Store Resolution and Env Configuration + +**Location:** `src/legis/config.py` + +**Responsibility:** Resolves all SQLite store URLs and composition-root configuration (protected policies, operator paths) from environment variables, with `LEGIS_*_DB` overrides as the sole relocation mechanism — `weft.toml` is explicitly and deliberately ignored for store paths. + +**Key Components:** +- `STORE_DB_SPECS` (lines 71–77) — stably-ordered tuple of `(env_var, db_filename)` for all five stores; the single source of store identity so doctor and any future consumer never re-list the env vars / filenames independently +- `_resolve_db_url` (lines 110–123) — the single resolution point for all stores: `env_var in os.environ` (membership check, not `.get()`) so a present-but-empty override returns verbatim rather than silently falling through to the default; a present-but-empty override is therefore a broken override, never a "use default" fallback +- `_store_dir` (lines 90–97) — ignores `weft.toml` by design (comment at line 92); builds `.weft/legis/` under the provided root or `Path(".")` (relative, resolved against cwd at call time) +- `protected_policies` (lines 171–184) — single parse point for `LEGIS_PROTECTED_POLICIES`: `frozenset` of comma-split, stripped, non-empty names; read at call time so the CLI can write the env var from `--protected-policies` before composition roots read it +- `ensure_sqlite_parent` (lines 187–203) — creates the parent directory lazily at store-open time, never at URL-compute time; importing `config` or computing a default URL never litters `.weft/` directories +- `operator_session_path` / `operator_age_path` (lines 151–168) — operator-elevation file paths, both under 
`.weft/legis/`; documented as holding references/encrypted blobs only, never key plaintext + +**Dependencies:** +- Inbound: every module that opens a store (`api/app.py`, `mcp.py`, `cli.py`, `store/`, `posture/`, `install.py`, `doctor.py`, `hooks.py`) +- Outbound: `sqlalchemy.engine.make_url` (URL parsing in `ensure_sqlite_parent`); `os`, `pathlib` + +**Patterns Observed:** +- `weft.toml`-is-enrich-only documented at module level (lines 18–23) and structurally enforced: `_store_dir` does not read `weft.toml` at all; no `tomllib` import in `config.py` +- Present-but-empty env var treated as verbatim override (line 121), not silent fallback — consistent with CLAUDE.md doctrine +- Call-time resolution (not module-load-time) for both DB URLs and `protected_policies`: env vars written late by the CLI (e.g. `--protected-policies` flag sets `os.environ` before the composition root reads it) always produce the correct value + +**Concerns:** +- None observed. Verified: no `weft.toml` read in any path; present-but-empty override is handled correctly (line 121); `ensure_sqlite_parent` defers directory creation to store-open time (not import time); `STORE_DB_SPECS` is the single enumeration consumed by doctor. The module is 204 lines with a single, well-bounded responsibility. + +**Confidence:** High — Read 100% of config.py (204 lines). Verified `env_var in os.environ` membership-check at line 121. Verified absence of `tomllib` import. Verified `_store_dir` comment at lines 91–93. Verified `STORE_DB_SPECS` structure at lines 71–77. Confirmed lazy `ensure_sqlite_parent` design at lines 193–203. 
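The present-but-empty override rule described for `_resolve_db_url` can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name `resolve_db_url`, its signature, the `sqlite:///` URL shape, and the `LEGIS_EXAMPLE_DB` variable are all assumptions made for the example.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_db_url(env_var: str, db_filename: str, root: Optional[Path] = None) -> str:
    """Hypothetical stand-in for the resolution rule described in the catalog."""
    # Membership check rather than os.environ.get(): an override that is
    # present but empty is returned verbatim, never treated as "use default".
    if env_var in os.environ:
        return os.environ[env_var]
    # Default location under .weft/legis/ (assumed layout); computing the URL
    # touches no files, so directory creation stays with the store-open step.
    store_dir = (root or Path(".")) / ".weft" / "legis"
    return f"sqlite:///{store_dir / db_filename}"
```

With `os.environ.get(env_var) or default`, an empty override would silently fall back to the default path; the membership check keeps a broken override visible to the caller instead.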
diff --git a/docs/arch-analysis-2026-06-28-2142/temp/validation-catalog.md b/docs/arch-analysis-2026-06-28-2142/temp/validation-catalog.md new file mode 100644 index 0000000..cb9c300 --- /dev/null +++ b/docs/arch-analysis-2026-06-28-2142/temp/validation-catalog.md @@ -0,0 +1,23 @@ +# Validation Report — Subsystem Catalog + +**Method note (honest):** The dispatched `analysis-validator` subagent stalled on an API error mid-stream (infra failure, 63 tool-uses, no report written) — not a BLOCK verdict. Validation was completed by the controller with documented file:line evidence per claim (below). This is more rigorous than a rubber-stamp but deviates from the "independent subagent" gate; recorded transparently. + +**Verdict: PASS_WITH_NOTES** — the catalog is accurate and evidence-backed; **4 high-severity-sounding "Concerns" require reclassification** (none is a live governance-honesty/false-green gap). No load-bearing claim is false in a way that would mislead the architect handover once the reclassifications below are applied. + +## Claims verified (controller, with evidence) + +| # | Catalog claim | Verdict | Evidence / correction | +|---|---------------|---------|-----------------------| +| 1 | Posture `read_floor()` does NOT gate on `verify_integrity()` (fail-open floor read) | **RECLASSIFY → scope artifact** | True on this `main` checkout (`posture/ledger.py:92`, verify_integrity only in docstring l.332), but the gating fix `eb28e4b` (legis-476ab6f125) is **not an ancestor of `main` 25d64e2** — it is unmerged on `release/1.3.0-federation-reads` (PR #21). On the release line `read_floor` DOES gate. **Not a live gap** — an artifact of analyzing `main`. 
| +| 2 | `install._keychain_available()` always returns `False` (keychain custody unimplemented) | **RECLASSIFY → deliberate, honest** | Confirmed `install.py:1349 return False`, but the docstring (l.1344-1348) shows it is an intentional fail-closed stub: returns False so install falls back to age-file "rather than claiming a keychain it cannot actually write." Honest future-work, **not a defect/false-green**. | +| 3 | Enforcement `default_policy_cells()` defaults to `chill`; "nothing enforces production selects fail-closed" | **RECLASSIFY → overstated; wiring is fail-closed** | `policy/cells.py:64-71` factory does default chill, BUT the production composition root `mcp._load_policy_cell_registry()` (`mcp.py:184-200`) is fail-closed: no config → `fail_closed_policy_cells()` (structured); chill requires explicit `LEGIS_DEV_DEFAULT_CELLS=1` (comment cites Q-M7/audit H6). Routing chokepoint `mcp.py:1574` also defaults `fail_closed`. **Residual is a discipline note** (the factory is a fenced dev helper a future composition root must not call directly), not a current fail-open. | +| 4 | `governance/signoff_binding` has an uncompensated partial-write window (attach succeeds, ledger.record fails → split state, no repair tool) | **RECLASSIFY → documented fail-closed trade-off + future-work** | Confirmed `signoff_binding.py:71-81`. Fails CLOSED: "a binding with no verifiable ledger entry is exactly what the ledger's verify() surfaces" — detectable, not a silent false-green. Legitimate **future-work** (a reconciliation/repair path), Minor severity, not a governance-honesty defect. | +| 5 | Service public surface: `UnresolvedInputError`/`WardlineRoutingError`/`ProtectedKeyRequiredError` and `sign_off` not re-exported from `service/__init__.py` | **CONFIRMED** | Those names do not appear in `service/__init__.py` (only `__all__` at l.37); callers must import from submodules. Real but Minor public-surface gap. 
| +| 6 | F1: `TrailVerifier._requires_verification` derives from in-record fields (modify-to-unsigned) | **CONFIRMED — known/tracked** | Matches tracker **legis-e5e5b0b57f** (open P3) and the README conceded raw-DB-write residual. Label "conceded residual / tracked," not new. | +| 7 | Transports: `api/app.py` imports private `_load_policy_cell_registry` from `mcp.py`; mcp.py god-module (2748 LOC) | **CONFIRMED — known (Q-H2)** | Code comment already cites Q-H2; the shared helper belongs in `config.py`/`policy/`. Structural, evidenced. | +| 8 | `store/ ↔ enforcement/` bidirectional coupling | **CONFIRMED — structural, not a runtime cycle** | `store.head_anchor` imports `enforcement.signing`; `enforcement.{engine,protected,signoff}` import `store.*`. No decision logic crosses; a shared `crypto/` leaf would close it. | +| 9 | `canonical.py ensure_ascii=False` | **Correctly NOT flagged** | Intentional cross-tool HMAC contract with Wardline (per CLAUDE.md). The non-ASCII golden-vector gap is a real Wardline-side follow-up, not a legis defect. | +| 10 | Checks: `check_report` accepts unvalidated `commit_sha` | **CONFIRMED + correctly self-qualified** | Records labeled `UNAUTHENTICATED` → no false-green; the residual is a downstream-join/phantom-record robustness note, not a clearance gap. | + +## Net effect on the deliverables +The architect handover must present #1–#4 with the corrected framing above (NOT as open governance-honesty defects). The genuine, correctly-classified improvement themes that survive validation: **god-module decomposition** (mcp.py, install.py), **coupling cleanup** (store↔enforcement shared crypto leaf; posture→install error relocation; api→mcp helper relocation/Q-H2), **public-surface completeness** (service `__all__`), and **future-work robustness** (signoff reconciliation tool; keychain custody adapter; check `commit_sha` proof-of-commit). The tracked items (F1/legis-e5e5b0b57f, Q-H2) should be linked, not re-filed.