
feat: rework Priors as a Claude Code plugin (memory always on, writing gated by mode) #13

Closed
claudialnathan wants to merge 27 commits into main from feat/plugin-rework

@claudialnathan (Owner)

Summary

Reworks Priors from CLI-first into a Claude Code plugin-first product, while keeping the safety substrate (MCP server, deterministic brief, quote-or-refuse, append-only audit, idempotency, project-as-subject) intact.

Users used to type `priors stage --source-content @… --candidates @…` and manage a `staged/` queue by id. With this PR, the day-to-day flow moves into Claude Code:

/plugin marketplace add https://github.com/claudialnathan/priors
/plugin install priors@priors

Then /recall, /log, /rules, /why, /impact, /reflect, etc.

What's new

  • Plugin scaffold at repo root: .claude-plugin/plugin.json, .claude-plugin/marketplace.json, commands/ (10 slash commands), agents/priors-steward.md, hooks/hooks.json + bounded shell scripts, .mcp.json.
  • Two modes (auto | manual) in .priors/config.json. Memory use is always on; memory writing toggles between modes. Auto mode is bounded by a significance gate.
  • New entry kind rule. User-authored rules write directly via priors rule add (no quote-or-refuse — the user typed the claim). Agent-inferred rules still pass through the review queue.
  • Readable IDs (D-001, F-004, R-002) for human-facing flows. Canonical slug ids preserved in metadata, exports, and --json output.
  • New surfaces: /why, /impact, /reflect, /log, /rules, /rule-add, /status. Pushback formatter, natural-language log-intent detector, and significance gate live as pure modules under src/intent/ and src/session/.
  • Cursor compatibility: .cursor/rules/priors.mdc (always-apply rule) + .cursor/mcp.json.
  • Lifecycle hooks (SessionStart, UserPromptSubmit, PreCompact, Stop) — all bounded, none call an LLM.
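
As an illustration of the readable-ID scheme listed above, here is a minimal sketch of formatting and parsing IDs such as D-001. The mapping from entry kind to prefix letter and the names `formatReadableId` and `parseReadableId` are assumptions for illustration; the PR only shows the D-/F-/R- shapes, not how they are produced.

```typescript
// Hypothetical mapping from entry kind to readable-ID prefix. Only the
// D-/F-/R- shapes appear in the PR; the kind names behind them are assumed.
const PREFIX = { decision: "D", failure: "F", rule: "R" } as const;
type Kind = keyof typeof PREFIX;

// Format a 1-based sequence number as a zero-padded readable ID.
function formatReadableId(kind: Kind, seq: number): string {
  if (!Number.isInteger(seq) || seq < 1) {
    throw new RangeError(`sequence must be a positive integer, got ${seq}`);
  }
  return `${PREFIX[kind]}-${String(seq).padStart(3, "0")}`;
}

// Parse a readable ID back into its parts; returns null when the shape is wrong.
function parseReadableId(id: string): { prefix: string; seq: number } | null {
  const m = /^([A-Z])-(\d{3,})$/.exec(id);
  if (m === null) return null;
  return { prefix: String(m[1]), seq: Number(m[2]) };
}
```

In this sketch the canonical slug id would remain the storage key, and the readable form would be a display alias only, which matches the note that slugs are preserved in metadata, exports, and `--json` output.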

What's preserved

  • MCP server and full tool surface (recall, get_entry, stage_learning, edit_staged, discard_staged, commit_learning, mark_stale, link_entries, propose_edge / commit_edge / discard_edge).
  • Deterministic priors://brief (no LLM in the assembler — same store in, same bytes out).
  • Quote-or-refuse on agent-proposed candidates (substring + Dice-coefficient grounding floor).
  • Append-only audit logs.
  • Idempotency keys on every MCP write tool.
  • The .priors/ store layout. The new entries/rules/ directory is additive.
  • Seven-task regression suite. Every existing CLI subcommand.
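
To make the idempotency guarantee concrete, the following is a rough sketch of a request-id cache in front of a write. The class name `IdempotencyCache` and the `WriteResult` shape are hypothetical; the commit history only states that every write tool accepts `client_request_id`, not how the server stores results.

```typescript
// Hypothetical result shape for a write tool.
type WriteResult = { id: string };

// Replays the first result for a given client_request_id instead of
// performing the write again. In-memory only; a real server might persist this.
class IdempotencyCache {
  private readonly seen = new Map<string, WriteResult>();

  run(clientRequestId: string, write: () => WriteResult): WriteResult {
    const cached = this.seen.get(clientRequestId);
    if (cached !== undefined) return cached; // retry: no second write
    const result = write();
    this.seen.set(clientRequestId, result);
    return result;
  }
}
```

A retried call with the same key returns the original result, so a client that times out and resends cannot create a duplicate entry.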

Backward compatibility: existing entries parse unchanged (all new frontmatter fields are optional). Existing CLI users keep their full surface.

Files / scope

  • 57 files changed (+3988 / -606).
  • 9 new source files; 7 new test files.
  • 10 new CLI subcommands: mode, status, log, rules, rule add, why, impact, reflect, resolve, hook.
  • Docs: README.md, AGENTS.md, CLAUDE.md rewritten; new docs/plugin-architecture.md, docs/maintainer-guide.md, CHANGELOG.md.
  • Historical drafts preserved at AGENTS.md.old and CLAUDE.md.old.

Test plan

  • npm test — 232/232 passing (was 186 before this PR)
  • CLI smoke test in /tmp/priors-smoke: init, mode auto/manual, rule add, log, rules, status, resolve, recall, hook session-start/user-prompt, impact, reflect, why — all working
  • Plugin install via /plugin marketplace add — verify hooks fire and slash commands appear in a fresh Claude Code session
  • Verify .priors/ lands in the user project (not the plugin dir) when installed

After merge

  1. Bump package.json version to 1.1.0-rc.1 (additive, still rc).
  2. Tag v1.1.0-rc.1.
  3. npm publish for the CLI / MCP-only consumers.

🤖 Generated with Claude Code

…eplace with inference-first (Flow A) / three-question (Flow B) dispatch. All six deliverables landed and contract test passes 32/32.

  Next: review the new commands/init.md and docs/onboarding-design.md.
* "Claude PR Assistant workflow"

* "Claude Code Review workflow"
- add standalone stdio MCP server, CLI wrapper, and package metadata
- add AGENTS.md, MCP architecture docs, license, security policy, and CI
- move legacy Claude plugin, old workflows, fixtures, and hook tests to ignored reference storage
- add neutral ~/.priors store handling, staged distillation, audit logging, and safe emissions
- update tests to cover MCP protocol, schemas, structured results, config generation, and security boundaries
Added branch protection, and the repo itself now documents and reinforces a consistent GitHub workflow for priors development. This reduces accidental process drift and makes AI-assisted contributions safer and more predictable.
* chore(priors): remove v0.3 implementation

The pre-rejig MCP server (with a global `~/.priors` store, decay scoring,
`priors.reinforce`, `priors.emitConstraint`, and the dependency-free SDK
shim) is preserved at the `legacy/v0.3.0` tag. v1 replaces all of these
with a project-scoped, file-based, curation-first contract.

Removed:
- bin/priors-mcp.js — old executable wrapper
- src/priors-mcp.ts — single-file v0.3 server
- tests/mcp/run-tests.mjs — old MCP harness

Recover with: `git checkout legacy/v0.3.0`

Made-with: Cursor

* feat(priors): v1 mcp server and cli

Single binary at bin/priors.js exposes a stdio MCP server and a
mirroring CLI. Both call the same store/logic; whatever changes in one
must change in the other in lockstep.

MCP surface (resources): priors://brief, priors://index,
priors://entry/{id}, priors://audit/{id}.

MCP surface (tools): recall, get_entry, stage_learning,
commit_learning, mark_stale, link_entries.

CLI adds: init, init-config, health, evals, export-pack, import-pack.

Architecture:
- TypeScript on Node 25, native .ts type stripping, zero runtime deps.
- Modules under src/<area>/: brief, recall, distill, curation, export,
  health, evals, clients, store, schema, util, mcp, cli.
- Deterministic brief assembly (no LLM in the assembler).
- Quote-or-refuse for staged learnings (verbatim substring check).
- Idempotency keys on every write tool.
- JSONL audit log on every write/link/mark-stale/import.
- All I/O confined to <project-root>/.priors/ with path-traversal
  rejection at the schema layer.
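
The path-traversal rule in the last bullet can be sketched as a pure string check applied before any file access. This is an assumed shape, not the project's validator: the function name `isSafeRelativePath` and the exact rejection rules are illustrative.

```typescript
// Accepts only relative paths whose segments stay inside the store root.
// Assumption: callers join the accepted path onto <project-root>/.priors/.
function isSafeRelativePath(p: string): boolean {
  if (p.length === 0 || p.indexOf("\0") !== -1) return false;
  // Reject POSIX absolute paths, UNC/backslash roots, and Windows drive letters.
  if (p.startsWith("/") || p.startsWith("\\") || /^[A-Za-z]:/.test(p)) return false;
  // Reject empty, current-directory, and parent-directory segments.
  return p
    .split(/[\\/]/)
    .every((seg) => seg !== "" && seg !== "." && seg !== "..");
}
```

Rejecting at the schema layer means a bad path fails validation before it reaches any I/O code, so no later module has to re-check it.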

Tooling:
- package.json scripts: test (unit + snapshots + regression), brief, mcp.
- Makefile delegates to npm.
- tsconfig.json provides editor type-checking only; no build step.

Made-with: Cursor

* test(priors): v1 unit, regression, and snapshot suites

160 tests covering every public surface and every non-negotiable.

tests/unit/
- util/{atomic-write,clock,json-rpc,safe-path,tokens,uuid,yaml}.test.ts
- schema/{entry,mcp}.test.ts
- store/{audit-query,entries,index,paths,project}.test.ts
- brief.test.ts — token ceiling, ranking, sectioning
- recall.test.ts — input validation, scoring, all filters
- distill.test.ts — quote-or-refuse, forbidden kinds, max 5/pass,
  staged → committed promotion
- curation.test.ts — supersedes/contradicts links, cycle rejection,
  mark_stale, contested status
- export.test.ts — pack/unpack, dry-run, schema-version rejection
- mcp.test.ts — initialize, list, read, call, idempotency cache
- health.test.ts — broken links, stale-without-reason, --fix
- clients.test.ts — claude/cursor/codex/raw config rendering
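
The cycle rejection covered by curation.test.ts can be illustrated with a small reachability check. This is a sketch under assumptions: links are modelled as an adjacency map from an entry id to the ids it supersedes, and `wouldCreateCycle` is a hypothetical name, not the project's function.

```typescript
// Adjacency map: entry id -> ids of entries it supersedes.
type Links = Map<string, string[]>;

// Adding from -> to creates a cycle exactly when `to` can already reach `from`.
function wouldCreateCycle(links: Links, from: string, to: string): boolean {
  if (from === to) return true; // self-link
  const stack: string[] = [to];
  const visited = new Set<string>();
  while (stack.length > 0) {
    const node = stack.pop() as string;
    if (node === from) return true;
    if (visited.has(node)) continue;
    visited.add(node);
    for (const next of links.get(node) ?? []) stack.push(next);
  }
  return false;
}
```

With `a` superseding `b` and `b` superseding `c`, a proposed link from `c` to `a` is rejected because `a` already reaches `c`, while a link from `a` to `c` is allowed.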

tests/regression/
- seven-scenarios.test.ts wraps `priors evals` so the regression suite
  runs as part of `npm test`. The suite is the contract: fresh-handoff,
  dead-end recall, mark-stale flow, conflict/contested, distillation
  safety, emission deferred, cross-client.

tests/snapshots/
- brief-determinism.test.ts asserts byte-identical output across three
  runs of the same fixture, with the project UUID normalised. This
  enforces "the brief is deterministic" mechanically.
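
The determinism check described above can be sketched roughly as follows. The helper names `normaliseUuids` and `isDeterministic` are hypothetical, and this version compares strings rather than raw bytes, which is an assumption that holds only if the output encoding is fixed.

```typescript
// Matches canonical 8-4-4-4-12 hex UUIDs.
const UUID_RE = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;

// Replace every UUID with a fixed placeholder so per-project ids do not
// count as differences between runs.
function normaliseUuids(text: string): string {
  return text.replace(UUID_RE, "<uuid>");
}

// Render `runs` times and require identical normalised output every time.
function isDeterministic(render: () => string, runs: number = 3): boolean {
  const first = normaliseUuids(render());
  for (let i = 1; i < runs; i++) {
    if (normaliseUuids(render()) !== first) return false;
  }
  return true;
}
```

Any hidden source of variation in the renderer, such as a timestamp or unordered map iteration, makes the second or third run differ and fails the check.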

tests/helpers/
- temp-store.ts gives every test an isolated .priors/ root with a fixed
  clock, so failures are reproducible.

Made-with: Cursor

* docs(priors): v1 contracts, specs, integrations, and evals

Promotes the operating contract and locked specs into the public docs
tree. Internal handover notes remain in internal/ (gitignored) as the
authoritative private copy.

Operating contract:
- AGENTS.md — short, framing-first contract every agent reads first.
  Project-as-subject, quote-or-refuse, stage-only distillation,
  deterministic brief, idempotency, failures-as-data.
- CLAUDE.md — Claude Code specific operational notes that defer to
  AGENTS.md and the specs.
- .cursorrules — Cursor specific notes, mirroring CLAUDE.md.

Public docs:
- docs/project-brief.md — what Priors is and is not. "Why this is a
  different category" is the framing test.
- docs/specs/brief-resource.md — locked spec for priors://brief.
  Token ceiling, section budgets, ranking, edge cases.
- docs/specs/staged-distillation.md — locked spec for stage_learning.
  Quote-or-refuse rule, forbidden kinds, max 5 candidates per pass.
- docs/integrations.md — wiring Priors into Claude Code, Cursor,
  Codex CLI, and raw stdio. Per-project vs per-machine setup.
- docs/evals.md — the seven-task regression suite, why each task
  exists, and how to add new tasks.
- docs/mcp-architecture.md — refreshed for v1 surface.
- docs/github-workflow.md — branch-first flow, conventional commits,
  PR template, release tags.

Top-level:
- README.md — install, quickstart, the surface in one page, link to
  the contract and specs.
- SECURITY.md — path-traversal rejection, no shell interpolation,
  zero runtime deps, audit log on every write.

tests/README.md — points contributors at the four test buckets and
explains what each one enforces.

Made-with: Cursor

* chore(priors): dogfood v1 store with seed entries

Removes .priors/ from .gitignore so this repo's own project record is
tracked alongside the code that produces it. Adds cln-w.md to the
ignore list (local scratch file, not part of the project record).

Seeds 14 active entries that capture the v0.3 → v1 design:

decisions/
- typescript-node25 — language and runtime choice, zero deps
- in-repo-store — store lives at <project-root>/.priors/, not ~/.priors
- deterministic-brief — priors://brief is mechanically assembled
- stage-only-distillation — stage_learning never auto-commits
- mcp-cli-mirror — CLI mirrors MCP one-to-one

constraints/
- zero-runtime-deps — package.json dependencies block stays empty
- quote-or-refuse — verbatim substring check on every staged claim
- brief-token-ceiling — priors://brief stays under ~2000 tokens
- idempotent-writes — every write tool accepts client_request_id
- no-path-traversal — all I/O confined to .priors/, schema-rejected

questions/
- contested-resolution-ux — right CLI/MCP UX for resolving contested?
- long-source-chunking — how to handle stage sources > ~32K tokens?
- rename-migration — handling external refs when project moves

failures/
- decay-and-emit-overreach — v0.3 active decay, helpful/harmful
  counters, and emit_constraint were rejected. Curation, not retrieval
  gymnastics, is the product.

The seed audit log records each write so health checks pass and the
brief assembler has real data to work with. `node bin/priors.js brief`
generates a real orientation brief from this store.

Made-with: Cursor

* ci(priors): point CI at v1 binary and add regression evals step

The previous CI workflow smoke-tested `bin/priors-mcp.js --version`,
which the v0.3 → v1 rejig deleted. v1 ships a single binary at
`bin/priors.js` with a `--help` flag that exits 0 cleanly, so use that
as the smoke check instead.

Adds a separate `Run regression evals` step that executes the
seven-task suite. Per AGENTS.md, the suite is the v1 contract — failure
on any one task is a merge blocker, so it gets its own line in the CI
log rather than hiding inside `npm test`.

Made-with: Cursor

* docs: align README and rituals with v1 CLI and MCP surface

- Fix npm package name (priors), install/MCP flow, recall filters, export/import
- Correct dead-ends ritual to recall(kind: failure); remove nonexistent review-staged
- Point doc links at docs/; set clone URL to claudialnathan/priors; clarify MIT
- Add package.json author; brief footer text when >50 old staged entries

Made-with: Cursor
Document local npm install usage with npm exec/npx and add a global-install hint to reduce first-run command-not-found errors.

Made-with: Cursor
Implements the four-task product brief end-to-end. Tests 160 → 186.

Task 1 (v1) — Curation event log
- New append-only .priors/audit/curation.log with 6 typed event kinds
  (propose/stage/edit/accept/reject/discard); captures source model,
  original proposal payload, and human edit deltas as research data
- New discard_staged / edit_staged MCP tools + CLI; new
  `priors audit curation` filter command
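
As a rough illustration of the six typed event kinds, the sketch below models a curation event and serialises it as one JSONL line. The field names (`entryId`, `at`, `sourceModel`) are assumptions for illustration; only the six kind names come from the commit message.

```typescript
// The six event kinds named in the commit message.
const EVENT_KINDS = ["propose", "stage", "edit", "accept", "reject", "discard"] as const;
type EventKind = (typeof EVENT_KINDS)[number];

// Hypothetical event shape; the real log captures more (payloads, edit deltas).
interface CurationEvent {
  kind: EventKind;
  entryId: string;
  at: string; // ISO-8601 timestamp
  sourceModel?: string;
}

// Type guard for untrusted input such as a CLI filter argument.
function isEventKind(x: string): x is EventKind {
  return (EVENT_KINDS as readonly string[]).indexOf(x) !== -1;
}

// One event per line; appending lines never rewrites earlier history.
function toJsonl(event: CurationEvent): string {
  return JSON.stringify(event) + "\n";
}
```

Because `JSON.stringify` escapes newlines inside string values, each event occupies exactly one line, which keeps the log append-only and easy to filter line by line.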

Task 2 (v1) — Quote-or-refuse verification
- New project-local .priors/config.json (groundingMode + commitThreshold)
- Deterministic claim↔evidence dice-coefficient floor (0.15) atop the
  existing verbatim-substring check; new ungrounded_claim reason code
- groundingMode "strict" rejects; "warn" stages with grounding_warning
  flag and details on the stage event
- 10 adversarial fixtures + strict/warn harness in tests/grounding/
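
The grounding floor can be sketched as follows. This is not the project's implementation: it assumes the Dice coefficient is computed over lowercase word-token sets (the commit message does not say whether tokens or character n-grams are used), and the names `dice` and `checkGrounding` are hypothetical. The 0.15 floor, the two modes, and the `ungrounded_claim` reason code come from the text above.

```typescript
// Lowercase alphanumeric word tokens, de-duplicated.
function tokens(text: string): Set<string> {
  return new Set<string>(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Dice coefficient: 2 * |A ∩ B| / (|A| + |B|), in [0, 1].
function dice(a: string, b: string): number {
  const ta = tokens(a);
  const tb = tokens(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let overlap = 0;
  ta.forEach((t) => { if (tb.has(t)) overlap += 1; });
  return (2 * overlap) / (ta.size + tb.size);
}

type GroundingMode = "strict" | "warn";
type Verdict =
  | { staged: true; groundingWarning: boolean }
  | { staged: false; reason: "ungrounded_claim" };

// Below the floor: strict refuses, warn stages with a warning flag.
function checkGrounding(
  claim: string,
  evidence: string,
  mode: GroundingMode,
  floor: number = 0.15,
): Verdict {
  if (dice(claim, evidence) >= floor) return { staged: true, groundingWarning: false };
  return mode === "strict"
    ? { staged: false, reason: "ungrounded_claim" }
    : { staged: true, groundingWarning: true };
}
```

The floor is deliberately low: it is meant to catch claims that share almost nothing with their quoted evidence, while the verbatim-substring check remains the primary guard.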

Task 3 (v2) — Typed causal edges
- Vocabulary 4 → 8: supersedes, contradiction_of, derived_from,
  reinforces, caused_by, blocks, depends_on, refutes (hard cap;
  ninth kind requires removing one)
- Renamed contradicts → contradiction_of; kept reinforces; dropped
  validates per the brief
- New propose_edge / commit_edge / discard_edge MCP tools mirroring the
  entry staging pipeline; inbound edges surfaced on get_entry and
  `priors get`
- One-shot `priors migrate-relations` for legacy stores (raw-YAML rewrite
  bypasses schema validation)
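
The eight-kind vocabulary and the `contradicts` rename can be sketched as a closed list plus a small migration helper. The helper name `migrateEdgeKind` is hypothetical, and returning `null` for dropped kinds such as `validates` is an assumption about how a migration might surface them.

```typescript
// The eight edge kinds listed above; the list is the hard cap.
const EDGE_KINDS = [
  "supersedes",
  "contradiction_of",
  "derived_from",
  "reinforces",
  "caused_by",
  "blocks",
  "depends_on",
  "refutes",
] as const;
type EdgeKind = (typeof EDGE_KINDS)[number];

// Legacy names that map onto a current kind.
const LEGACY_RENAMES = new Map<string, string>([["contradicts", "contradiction_of"]]);

// Returns the current kind for a raw legacy value, or null when the kind
// no longer exists (for example the dropped `validates`).
function migrateEdgeKind(raw: string): EdgeKind | null {
  const renamed = LEGACY_RENAMES.get(raw) ?? raw;
  return (EDGE_KINDS as readonly string[]).indexOf(renamed) !== -1
    ? (renamed as EdgeKind)
    : null;
}
```

Deriving the `EdgeKind` type from the array keeps the cap in one place: adding a ninth kind means editing the list, which is where a reviewer would enforce the remove-one rule.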

Task 4 (v2) — Composite quality score
- Six deterministic sub-scores in src/distill/score.ts (no LLM scoring),
  min-composite documented inline
- commitThreshold gates entries that pass hard checks but score low;
  default 0.0 preserves current behaviour
- Sub-scores + composite logged on every propose and reject event
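
A minimal sketch of the threshold gate follows, assuming "min-composite" means the composite is the minimum of the sub-scores. That reading, the sub-score names used in the usage note, and the function names are all assumptions; the actual definition is documented inline in src/distill/score.ts.

```typescript
// Sub-scores keyed by name, each assumed to lie in [0, 1].
type SubScores = Record<string, number>;

// Assumed reading of "min-composite": the weakest sub-score sets the composite.
function composite(sub: SubScores): number {
  const values = Object.keys(sub).map((k) => sub[k] as number);
  if (values.length === 0) throw new Error("at least one sub-score is required");
  return Math.min(...values);
}

// Entries pass when the composite meets the threshold. The default of 0.0
// lets every entry through, which preserves pre-threshold behaviour.
function passesGate(sub: SubScores, commitThreshold: number = 0.0): boolean {
  return composite(sub) >= commitThreshold;
}
```

For example, with hypothetical sub-scores `{ grounding: 0.9, specificity: 0.4, novelty: 0.7 }` the composite is 0.4, so the entry passes at the default threshold and fails at a threshold of 0.5.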
…g gated by mode)

Priors used to be CLI-first: users typed `priors stage --source-content @… --candidates @…` and managed a `staged/` queue by id. This rework moves the human surface into Claude Code (and Cursor where practical) while keeping the safety substrate — MCP server, deterministic brief, append-only audit, quote-or-refuse, idempotency, project-as-subject — untouched.

What's new

- Plugin scaffold at repo root using the modern Claude Code layout per https://code.claude.com/docs/en/plugins:
    .claude-plugin/plugin.json
    .claude-plugin/marketplace.json   # single-plugin marketplace catalog
    skills/<name>/SKILL.md            # one folder per slash command
    agents/priors-steward.md
    hooks/hooks.json + hooks/scripts/
    .mcp.json
- Slash commands are auto-namespaced as /priors:<name>: /priors:status, /priors:brief, /priors:recall, /priors:why, /priors:impact, /priors:reflect, /priors:log, /priors:rules, /priors:rule-add, /priors:export.
- Single-plugin marketplace install:
    /plugin marketplace add https://github.com/claudialnathan/priors
    /plugin install priors@priors
- Cursor compatibility: .cursor/rules/priors.mdc (always-apply rule) + .cursor/mcp.json. Legacy single-file .cursorrules removed in favour of the modern .cursor/rules/*.mdc directory format.
- Two modes (auto | manual) in .priors/config.json. Memory use is always on; memory writing toggles between modes. CLI: `priors mode [auto|manual]`.
- New entry kind `rule`. User-authored rules write directly via `priors rule add` (or /priors:rule-add); agent-inferred rules still flow through the quote-or-refuse review queue.
- Readable IDs (D-001, F-004, R-002) for human-facing flows; canonical slug ids preserved in metadata, exports, and --json output.
- New surfaces: /priors:why, /priors:impact, /priors:reflect, /priors:log, /priors:rules, /priors:rule-add, /priors:status. Pushback formatter, natural-language log-intent detector, and significance gate as pure modules under src/intent/ and src/session/.
- Lifecycle hooks (SessionStart, UserPromptSubmit, PreCompact, Stop) all bounded; none call an LLM.
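
The split between always-on memory use and mode-gated memory writing can be sketched as a pair of pure predicates. This is a guess at the shape rather than the shipped logic: the `WriteRequest` fields, the 0.5 gate value, and the rule that explicit user requests always write are assumptions for illustration.

```typescript
type WriteMode = "auto" | "manual";

// Hypothetical description of a pending memory write.
interface WriteRequest {
  userRequested: boolean; // e.g. the user explicitly asked to log something
  significance: number;   // assumed score in [0, 1] from a significance gate
}

// Memory use (recall, brief) is always on, regardless of mode.
function mayRead(_mode: WriteMode): boolean {
  return true;
}

// Assumed policy: explicit requests always write; manual mode blocks
// unprompted writes; auto mode writes only above the significance gate.
function mayWrite(mode: WriteMode, req: WriteRequest, gate: number = 0.5): boolean {
  if (req.userRequested) return true;
  if (mode === "manual") return false;
  return req.significance >= gate;
}
```

Keeping the decision in pure functions fits the note that the significance gate lives as a pure module, and it lets the hooks stay bounded because no LLM call is needed to decide.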

What's preserved

- MCP server, tool surface, and resource paths.
- Deterministic priors://brief (no LLM in the assembler).
- Quote-or-refuse on agent-proposed candidates (substring + Dice-coefficient grounding floor).
- Append-only audit logs.
- Idempotency keys on every MCP write tool.
- The .priors/ store layout. New entries/rules/ directory is additive.
- Seven-task regression suite.
- All existing CLI subcommands.

Cleanup and conventions

- Removed previously-tracked personal Claude Code dev-tool installs (.agents/, .claude/skills/, .cursorrules legacy single-file format) from the public tree. Convention: anything at .claude/agents/, .claude/skills/, CLAUDE.local.md is gitignored personal dev tooling; the plugin's own published assets live at agents/, skills/, hooks/, .claude-plugin/, .mcp.json (no leading dot for the plural-noun directories).
- CLAUDE.local.md template added (gitignored) so maintainers can keep personal notes without leaking to plugin users.

Tests: 186 → 232 (+46), all green.

Docs: README, AGENTS.md, CLAUDE.md fully rewritten. New docs/plugin-architecture.md and docs/maintainer-guide.md. New repo-level CHANGELOG.md (excluded from npm package).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>