
Cluster A: directory-mcp pattern adoption (S197, S199-S209)#32

Merged
spaceshipmike merged 18 commits into
mainfrom
feature/cluster-a-interactions
Jun 7, 2026


@spaceshipmike

Owner

Adopts the nine confirmed patterns from ePaint/directory-mcp covered by
spec 0.34's directory-mcp pattern wave (S197–S209): proactive-use server
instructions, open-vocabulary normalization with the vocab tool, the
append-only interactions log with derived recency/frequency and a
configurable retention policy, and the ambiguous-envelope response on
identity-shaped queries.

What ships

| Chunk | Scenarios | Highlights |
|---|---|---|
| A1 | S207 (cascade) | Schema v17: interactions table + FK cascade + indexes. logInteraction() / lastTouchedAt() / touchCountSince() / pruneInteractions() / DEFAULT_RETENTION_POLICY exported from @setlist/core. |
| A2 | S199, S201, S202 (write) | packages/core/src/vocab.ts: canonical slugs, alias reverse-map, single-pass normalization for tech_stack, patterns, topics, capability_type. Wired into every write path (register_project, enrich_project, write_fields, register_capabilities). |
| A3 | S200 | New MCP tool vocab (tool #58) returns {field, canonical, in_use: [{slug, count}], aliases}. Read-only: does not write to interactions. |
| A4 | S204, S205, S206, S207 | logInteraction calls on every read surface. cross_query produces N rows for N matches. portfolio_brief joins through MAX(at) and COUNT(*) over 7 days, with no mutable counters. configure_memory accepts interactions_retention_days (default 90) and interactions_max_rows_per_project (default 10000); reflect() runs the prune and reports it. |
| A5 | S208, S209 | packages/core/src/ambiguity.ts: detectAmbiguity() with the 15% relative-gap literal and 4-alternative cap. Wired into search_projects (envelope when ambiguous, bare array when clean), recall (project-scope ambiguity), and cross_query (top-two score gap). |
| A6 [GATE] | S197 | ONBOARDING_INSTRUCTIONS rewritten from descriptive overview to imperative directive: opens USE PROACTIVELY, names LOOK UP FIRST / CAPTURE OPPORTUNISTICALLY / STAY CONSISTENT, workflow + capability shape underneath. 103 words. |

Verification

  • 698 tests pass across all packages
  • npm run typecheck clean across all four packages
  • Schema v17 migration is idempotent on re-run
  • Pre-0.34 rows are not retroactively normalized — they read as stored until the next write rewrites them (S199 explicit contract)
  • vocab is the documented read-only exception — no interactions row written
  • Ambiguous envelope is additive: callers reading only result see identical output

Release versioning

Five patch tags created locally, held back until merge:

  • `v0.6.1-beta.14` — Chunk A2
  • `v0.6.1-beta.15` — Chunk A3
  • `v0.6.1-beta.16` — Chunk A4
  • `v0.6.1-beta.17` — Chunk A5
  • `v0.6.1-beta.18` — Chunk A6 (gate)

Plus the two predecessor tags from earlier in the branch:

  • `v0.6.1-beta.12` — Chunks 0a + 0b (spec evolve + config reconcile)
  • `v0.6.1-beta.13` — Chunk A1 (schema v17)

What's next

  • Cluster B (bundled skills + installer + CLAUDE.md sentinel injection — S198, S210, S211, S220, S221) — B1 is the next chunk.
  • Cluster C (schema v18, decision_history, is_builtin, self_test, bootstrap-config projection — S215–S219, S224) sits behind a gate so the v17-only releases above can stabilize independently.

Test plan

  • `npm test` passes (698 tests)
  • `npm run typecheck` clean
  • Schema migration idempotent — re-run does not error
  • `vocab` tool returns canonical + in_use + aliases per field
  • Open-vocab values normalize at write boundary; unknown terms pass through
  • Ambiguous envelope appears when 2nd-place within 15% gap, absent otherwise
  • `reflect()` reports `interactions_pruned`
  • MCP initialize response carries the proactive-use directive

🤖 Generated with Claude Code

Lands the dirty working-tree spec mutations into git so the build has a
clean starting point. Reconciles versions.external.current from beta.4
(stale) to beta.11 (package.json truth).

- .fctry/spec.md: 0.33 → 0.34 (227 scenarios, two pattern-adoption waves)
- .fctry/scenarios.md: S197-S227 added (directory-mcp + app-it patterns)
- .fctry/changelog.md: ref entries for directory-mcp and app-it evolves
- .fctry/config.json: versions.external.current 0.6.1-beta.4 → beta.11

No code touched — spec catches up to disk; subsequent chunks bring code
up to spec.
Adds the schema v17 migration introducing the `interactions` table — the
append-only log backing derived recency (MAX(at)) and frequency (COUNT(*))
signals for every registry read surface. Rows are immutable; recency and
frequency are computed by query, never stored as mutable counters.

- packages/core/src/db.ts: SCHEMA_VERSION 16 → 17, CREATE TABLE interactions
  with FK ON DELETE CASCADE to projects (S207), two indexes for recency-
  per-project and global pruning, v16→v17 migration block
- packages/core/src/interactions.ts (new): logInteraction (best-effort
  append; swallows errors so reads never fail because the log did),
  lastTouchedAt, touchCountSince, pruneInteractions, DEFAULT_RETENTION_POLICY
  (90 days, 10000 rows per project per the spec)
- packages/core/src/index.ts: export the new surfaces
- packages/core/src/introspect-exports.ts: declare the new exports for the
  library manifest parity check
- packages/core/tests/interactions.test.ts (new): 12 tests covering table
  shape, indexes, append behavior, error-swallowing, derived recency /
  frequency, age-based pruning, per-project-cap pruning, and FK cascade
- Bump schema-version assertions in db.test.ts, recipes-store.test.ts,
  compatibility.test.ts (16 → 17)
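
The append-only contract above can be sketched in memory. This is an illustration only, not the code in interactions.ts: the `InteractionLog` class, its field names, and the epoch-millisecond timestamps are assumptions, and the real module works against the SQLite `interactions` table. The point it shows is that recency and frequency are derived on read, never stored as counters.

```typescript
// In-memory stand-in for the interactions table (hypothetical shape).
interface Interaction {
  projectId: number | null; // null models a read that matched no project
  surface: string;          // e.g. "get_project", "recall"
  at: number;               // epoch milliseconds (assumed unit)
}

class InteractionLog {
  private rows: Interaction[] = [];

  // Best-effort append: any failure is swallowed so a read never fails
  // because the log did.
  log(row: Interaction): void {
    try {
      this.rows.push(Object.freeze({ ...row }));
    } catch {
      /* intentionally ignored */
    }
  }

  // Derived recency, the equivalent of MAX(at) for one project.
  lastTouchedAt(projectId: number): number | null {
    let max: number | null = null;
    for (const r of this.rows) {
      if (r.projectId === projectId && (max === null || r.at > max)) max = r.at;
    }
    return max;
  }

  // Derived frequency, the equivalent of COUNT(*) WHERE at >= since.
  touchCountSince(projectId: number, since: number): number {
    return this.rows.filter((r) => r.projectId === projectId && r.at >= since).length;
  }
}
```

Because nothing is updated in place, two readers can never disagree about a counter; they only ever disagree about which rows exist yet.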

Partially satisfies: S204, S205, S206, S207 (the table contract is in
place; logging on read surfaces lands in chunk A4). Spec sections:
#schema (5.2), #entities, #portfolio-memory (2.12).

All 502 core tests pass; typecheck clean across core / cli / mcp.
…, S202)

Spec 0.34 #capability-declarations 2.11 — write-side of open-vocab fields.

New module packages/core/src/vocab.ts:
- VOCAB_FIELDS: the four normalized fields (tech_stack, patterns, topics,
  capability_type)
- CANONICAL_VOCAB: editorially-curated canonical slugs per field
- VOCAB_ALIASES: canonical slug → variant list for reverse lookup
- normalize/normalizeList/normalizeFieldValue/normalizeRecord: the pipeline
  (lowercase → hyphenate → strip non-[a-z0-9-] → alias lookup → pass-through
  for unknown terms)
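
A minimal sketch of that pipeline, assuming a simple reverse alias map. The `ALIAS_TO_CANONICAL` table, the `slugify` helper, and the exact hyphenation rules are hypothetical; only the step order and the `cli-command` to `command` alias come from this PR.

```typescript
type VocabField = "tech_stack" | "patterns" | "topics" | "capability_type";

// Hypothetical reverse map (variant -> canonical). The real VOCAB_ALIASES
// is keyed by canonical slug; this sketch stores the lookup direction directly.
const ALIAS_TO_CANONICAL: Record<VocabField, Map<string, string>> = {
  tech_stack: new Map(),
  patterns: new Map(),
  topics: new Map(),
  capability_type: new Map([["cli-command", "command"]]),
};

// lowercase -> hyphenate -> strip non-[a-z0-9-]
function slugify(raw: string): string {
  return raw
    .trim()
    .toLowerCase()
    .replace(/[\s_]+/g, "-")    // assumed: whitespace and underscores become hyphens
    .replace(/[^a-z0-9-]/g, "") // drop everything outside [a-z0-9-]
    .replace(/-+/g, "-")        // collapse repeated hyphens
    .replace(/^-|-$/g, "");     // trim a leading or trailing hyphen
}

// Alias lookup, then pass-through: unknown terms are never rejected.
function normalize(field: VocabField, raw: string): string {
  const slug = slugify(raw);
  return ALIAS_TO_CANONICAL[field].get(slug) ?? slug;
}

// List form: normalize each value and drop duplicates, keeping first-seen order.
function normalizeList(field: VocabField, values: string[]): string[] {
  const seen = new Set<string>();
  for (const v of values) {
    const slug = normalize(field, v);
    if (slug !== "") seen.add(slug);
  }
  return Array.from(seen);
}
```

Under these assumptions `normalize("capability_type", "MCP Server")` yields `mcp-server`, and an unknown term such as `"Brand New Term!"` passes through as `brand-new-term`.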

Wired into registry.ts write paths:
- register() → normalizeRecord(fields)
- updateFields() → normalizeRecord(fields)
- enrichProject() → normalizeList('topics', ...) (topics is the open-vocab
  field on the legacy column path)
- registerCapabilities() / registerCapabilitiesForType() → normalize
  capability_type per row
- queryCapabilities() → normalize the capability_type filter param so agents
  can pass aliases on the read side too

Tests:
- packages/core/tests/vocab.test.ts: 23 new tests covering normalization
  semantics + registry integration for S199/S201
- Existing test expectations updated to canonical slugs (TypeScript →
  typescript, cli-command → command, MCP Server → mcp-server) — this is
  the spec contract, not a regression

Notes:
- Pre-0.34 rows are not migrated — they read as stored until the next
  write rewrites them through the normalizer (S199 explicit)
- Unknown terms pass through normalized but never rejected (open-vocab)
- The 'cli-command' surface label in self-register.ts is internal and
  kept for log clarity; storage normalizes to canonical 'command'

525 core tests pass. 667 total pass.
Spec 0.34 #capability-declarations 2.11 — read-side of open-vocab fields.

New MCP tool 'vocab' returns the canonical set, in-use values with counts,
and alias reverse-map for one of the four normalized fields (tech_stack,
patterns, topics, capability_type). Read-only by contract — does NOT
write a row to interactions (S204 documented exception).

Packages/core:
- vocab.ts: assembleVocabResponse() builds the {field, canonical, in_use,
  aliases} envelope from a caller-supplied count map. Pure function — no
  DB access.
- registry.ts: countVocabInUse() queries project_fields (tech_stack,
  patterns), projects.topics, or project_capabilities (capability_type)
  to produce slug→count tallies. Tolerates legacy comma-string storage
  and empty rows quietly.
- CANONICAL_VOCAB.capability_type now includes 'library' so the seeded
  introspect-library output appears in the canonical set.
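
A sketch of how a pure envelope builder of this kind might look. The parameter list and the sort order of `in_use` are assumptions for illustration; the source only states that the function takes a caller-supplied count map and touches no database.

```typescript
interface VocabResponse {
  field: string;
  canonical: string[];
  in_use: { slug: string; count: number }[];
  aliases: Record<string, string[]>;
}

// Pure function: everything it needs arrives as arguments (assumed signature).
function assembleVocabResponse(
  field: string,
  canonical: string[],
  aliases: Record<string, string[]>,
  counts: Map<string, number>,
): VocabResponse {
  const in_use = Array.from(counts.entries())
    .filter(([, count]) => count > 0)
    .map(([slug, count]) => ({ slug, count }))
    // Assumed ordering: most used first, ties broken alphabetically.
    .sort((a, b) => b.count - a.count || a.slug.localeCompare(b.slug));
  return { field, canonical, in_use, aliases };
}
```

Keeping the tally query (countVocabInUse) separate from the assembly step means the response shape can be unit-tested without a database fixture.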

Packages/mcp:
- server.ts: 'vocab' added between query_capabilities and Memory Agent.
  Dispatcher rejects unknown field with a helpful error naming the four
  supported fields.
- Tool count: 57 → 58 (vocab is tool #58)
- Test assertions for the per-type tool count updated to 58 across
  introspect-tools, self-register, and server tests.

Tests:
- server.test.ts: 3 new tests covering canonical+in_use+aliases shape,
  cross-project capability_type tallies, and unknown-field error
- introspect-exports manifest extended with assembleVocabResponse

670 total tests pass.
…e (S204-S207)

Spec 0.34 #portfolio-memory 2.12 — every registry read produces one
append-only row in the interactions table; derived recency and frequency
are computed by query (never stored counters); retention pruning runs in
the reflection cycle.

@setlist/core wiring:
- registry.ts:
  - getProject() and getProjectOrThrow() log surface='get_project',
    project_id=NULL on miss
  - searchProjects() logs one row per call, pinned to top hit (NULL when
    no matches)
  - queryCapabilities() logs one row per call with the type or keyword
    filter as the query
- memory-retrieval.ts: recall() logs surface='recall' for both bootstrap
  and search modes, resolving project_id string scope to row id
- cross-query.ts:
  - logCrossQueryInteractions() helper writes N rows for N matched
    projects (S204 explicit) and one NULL row for zero matches
  - portfolioBrief() now joins through MAX(at) and COUNT(*) over -7 days
    to add last_touched and recent_activity_count per project (S205)

Retention (S206):
- memory.ts:
  - configureMemory() accepts interactions_retention_days and
    interactions_max_rows_per_project knobs (defaults 90 / 10000)
  - getInteractionsRetentionPolicy() reads schema_meta with safe fallback
- memory-reflection.ts: reflect() invokes pruneInteractions() with the
  configured policy and reports interactions_pruned in its result

@setlist/mcp:
- server.ts configure_memory schema documents the new retention knobs
  and the dispatcher passes them through

Archive cascade (S207):
- Archive remains soft (interactions retained); ON DELETE CASCADE on
  the foreign key clears them only on hard project delete (admin path).
  Behavior was already correct from schema v17 — tests now lock it in.

vocab is the documented read-only exception — no interactions row.

Tests:
- packages/core/tests/interactions-logging.test.ts — 13 new tests
  covering per-surface logging, derived recency/frequency, schema-no-
  mutable-counter, retention configurability, prune-in-reflect, and
  the soft-archive vs hard-delete cascade contract

683 total tests pass.
Spec 0.34 #cross-project 2.9 — search_projects, recall, and cross_query
now surface {ambiguous, alternatives} when the second-place candidate's
score is within 15% of the top.

@setlist/core:
- ambiguity.ts (new): detectAmbiguity() — pure function over a sorted
  candidate list. Returns {ambiguous, alternatives} with up to 4 entries.
  Spec literals AMBIGUITY_GAP_THRESHOLD=0.15 and MAX_ALTERNATIVES=4
  exported as constants.
- registry.ts:
  - searchProjectsAmbiguous() — additive wrapper around searchProjects()
    that scores results and attaches the envelope. The scorer is simple
    (exact-name=10, name-contains=5, name-prefix=3, description=2) —
    blunt by design per S209 (registry raises the flag; LLM decides).
  - scoreSearchCandidates() — private helper; not on the public read
    surface.
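
The relative-gap check can be sketched as follows. The two constants come from the commit message; the `Candidate` shape, the treatment of a gap of exactly 15% as ambiguous, the choice of runner-ups as alternatives, and the guard for non-positive top scores are assumptions, not confirmed details of ambiguity.ts.

```typescript
const AMBIGUITY_GAP_THRESHOLD = 0.15;
const MAX_ALTERNATIVES = 4;

interface Candidate {
  name: string;
  score: number;
}

interface AmbiguityResult {
  ambiguous: boolean;
  alternatives: Candidate[];
}

// Expects candidates sorted by descending score.
function detectAmbiguity(sorted: Candidate[]): AmbiguityResult {
  const clean: AmbiguityResult = { ambiguous: false, alternatives: [] };
  if (sorted.length < 2) return clean;

  const top = sorted[0].score;
  const second = sorted[1].score;
  if (top <= 0) return clean; // assumed guard: no meaningful gap without a positive top score

  // Relative gap: how far second place trails the top, as a fraction of the top.
  const relativeGap = (top - second) / top;
  if (relativeGap > AMBIGUITY_GAP_THRESHOLD) return clean;

  // Assumed: alternatives are the runner-ups after the top hit, capped at four.
  return { ambiguous: true, alternatives: sorted.slice(1, 1 + MAX_ALTERNATIVES) };
}
```

A relative gap scales with the scorer, so scores of 10 versus 9 and 100 versus 90 are treated alike, which an absolute floor would not do.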

@setlist/mcp dispatcher:
- search_projects: returns the bare result array when unambiguous (spec:
  'envelope does not bloat clean queries'), the full {result, ambiguous,
  alternatives} envelope when ambiguous.
- recall: detects project-scope ambiguity by running the supplied scope
  through searchProjectsAmbiguous. Wraps the memory result in the
  envelope when ambiguous.
- cross_query: attaches the envelope to its existing {results, summary}
  shape when the top two scores are within 15%.

Tests:
- packages/core/tests/ambiguity.test.ts — 10 tests covering: empty/single
  candidate, 15% threshold semantics, relative-gap vs absolute-floor,
  MAX_ALTERNATIVES cap, additive envelope contract, exact-match
  dominance, and the two-fragment ambiguous case.

693 total tests pass.
Spec 0.34 #capability-declarations 2.11 — rewrites ONBOARDING_INSTRUCTIONS
from descriptive overview to imperative proactive-use directive. The
paragraph now opens with USE PROACTIVELY and names all three rules:

  - LOOK UP FIRST (call get_project / search_projects before asking the
    user what project this is)
  - CAPTURE OPPORTUNISTICALLY (write back identity, capability, and
    memory changes as they happen)
  - STAY CONSISTENT (call vocab(field) before writing tech_stack,
    patterns, topics, or capability_type)

The four-step workflow (register → enrich → write_fields → refresh) and
the capability item shape sit underneath the directive rather than at
the top. The resource pointer to setlist://docs/onboarding still lands
last. Word count: 103 — comfortably under the 150-word ceiling.

Tests:
- packages/mcp/tests/onboarding.test.ts adds five S197 assertions:
  - imperative voice with USE PROACTIVELY in the first 50 chars
  - all three rules named
  - LOOK UP FIRST sentence references get_project or search_projects
  - STAY CONSISTENT sentence references vocab
  - directive sits ABOVE the four-step workflow
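
Some of these assertions can be expressed as a small checker. This is a sketch, not the contents of onboarding.test.ts: the `checkDirective` helper is hypothetical, and the sample string in the usage note is a placeholder rather than the shipped ONBOARDING_INSTRUCTIONS text. It covers the first-50-characters rule, the three named rules, and the 150-word ceiling.

```typescript
const RULES = ["LOOK UP FIRST", "CAPTURE OPPORTUNISTICALLY", "STAY CONSISTENT"];
const WORD_CEILING = 150;

// Returns a list of human-readable failures; an empty list means the text passes.
function checkDirective(text: string): string[] {
  const failures: string[] = [];

  if (!text.slice(0, 50).includes("USE PROACTIVELY")) {
    failures.push("USE PROACTIVELY must appear in the first 50 characters");
  }
  for (const rule of RULES) {
    if (!text.includes(rule)) failures.push(`missing rule: ${rule}`);
  }
  const wordCount = text.trim().split(/\s+/).filter((w) => w.length > 0).length;
  if (wordCount > WORD_CEILING) {
    failures.push(`word count ${wordCount} exceeds ${WORD_CEILING}`);
  }
  return failures;
}

// Placeholder text for illustration only; NOT the real instructions.
const PLACEHOLDER_INSTRUCTIONS =
  "USE PROACTIVELY. LOOK UP FIRST: call get_project or search_projects. " +
  "CAPTURE OPPORTUNISTICALLY: write changes back. " +
  "STAY CONSISTENT: call vocab(field) before writing.";
```

Calling `checkDirective(PLACEHOLDER_INSTRUCTIONS)` returns an empty array, while a purely descriptive overview would fail the first check.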

This is the Cluster A goal gate. Chunks A2 through A6 (A2 vocab
write, A3 vocab tool, A4 interactions logging, A5 ambiguous envelope,
A6 instructions rewrite) ship together as the directory-mcp pattern
adoption.

698 total tests pass.
The S204 interactions table was being polluted by phantom rows from
internal callers, undermining its value as a per-surface signal.

#1: The MCP recall handler probed search_projects for project-scope
ambiguity, generating a phantom search_projects row alongside the real
recall row. searchProjects/searchProjectsAmbiguous now accept an
optional logInteraction flag (default true); the recall handler passes
false. One recall MCP call → exactly one recall row, zero search_projects.

#4: getProject/getProjectOrThrow now take an optional logInteraction
flag (default true preserves the S204 contract for direct MCP calls).
Internal callers — update_project return, bootstrap pre-flight checks,
registerExistingWorkspace, pinned-menu refresh, copyBriefCommand,
openPath, digest refresh, and the self-register probe — pass false.

#7: query_capabilities now resolves project_name to a project_id and
attributes the interactions row to that project. Unknown project names
and unscoped queries continue to log with project_id=NULL (the failure
is still the signal).
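
The opt-out flag can be illustrated with an in-memory registry. This is a sketch under assumptions: the options-object form of the flag, the `SketchRegistry` class, and the substring matcher are all hypothetical; the source only says the flag is optional and defaults to true.

```typescript
interface LoggedRow {
  surface: string;
  projectId: number | null;
}

interface Project {
  id: number;
  name: string;
}

class SketchRegistry {
  readonly interactions: LoggedRow[] = [];
  private readonly projects: Project[];

  constructor(projects: Project[]) {
    this.projects = projects;
  }

  // Direct calls log by default; internal callers pass logInteraction: false.
  searchProjects(query: string, opts: { logInteraction?: boolean } = {}): Project[] {
    const { logInteraction = true } = opts;
    const q = query.toLowerCase();
    const hits = this.projects.filter((p) => p.name.toLowerCase().includes(q));
    if (logInteraction) {
      // One row per call, pinned to the top hit (null when nothing matched).
      this.interactions.push({
        surface: "search_projects",
        projectId: hits.length > 0 ? hits[0].id : null,
      });
    }
    return hits;
  }

  // recall probes search internally but attributes the row to itself only.
  recall(scope: string): Project[] {
    const hits = this.searchProjects(scope, { logInteraction: false });
    this.interactions.push({
      surface: "recall",
      projectId: hits.length > 0 ? hits[0].id : null,
    });
    return hits;
  }
}
```

With the flag in place, one `recall` call leaves exactly one row with surface `recall` and no `search_projects` row.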

Tests in interactions-logging.test.ts cover all three paths with
assertions on row counts, surface attribution, and project_id pinning.
#3: The v16 → v17 migration now backfills pre-0.34 vocabulary slugs so
the vocab tool stops surfacing legacy aliases (e.g. 'cli-command' next
to 'command') as in_use forever. The new backfillVocabNormalization
helper rewrites three places at upgrade time:
  - project_capabilities.capability_type (single-slug column)
  - projects.topics (JSON array column)
  - project_fields.field_value for tech_stack and patterns (JSON arrays
    or scalars, via the same normalizeList/normalize helpers vocab.ts
    exposes)

The pass runs inside a single transaction so a malformed row aborts the
whole backfill instead of leaving a half-rewritten state. Idempotent on
re-run — already-canonical rows produce zero UPDATEs.

#9: enrichProject's four list fields now share consistent casing
semantics. Previously, topics ran through normalizeList (alias-resolved)
while goals/entities/concerns used ad-hoc lowercase+Set dedup, so the
same input string canonicalized differently per-field. Added
normalizeFreeList(values) — a lowercase+trim+dedupe helper that
preserves first-occurrence order — and routed goals/entities/concerns
through it. topics still gets full alias resolution; the three free-form
fields now consistently lowercase-trim. Same input → same canonical
form per-field.
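
A minimal sketch of a helper with the behavior described for normalizeFreeList (lowercase, trim, dedupe, first-occurrence order). Dropping empty strings is an assumption; the actual implementation in vocab.ts may differ in detail.

```typescript
// Lowercase + trim + dedupe, preserving the order in which values first appear.
function normalizeFreeList(values: string[]): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  for (const raw of values) {
    const v = raw.trim().toLowerCase();
    if (v === "" || seen.has(v)) continue; // assumed: blank entries are skipped
    seen.add(v);
    out.push(v);
  }
  return out;
}
```

Unlike the alias-resolving path used for topics, this helper never rewrites one term into another; `"CLI Command"` stays `"cli command"`.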

Tests:
  - db.test.ts: v16 → v17 migration backfill, with planted legacy slugs
    in capability_type, project_fields.tech_stack, and projects.topics.
    Asserts all three normalize, idempotency on re-run.
  - vocab.test.ts: normalizeFreeList unit tests (case/trim/dedupe/order),
    enrichProject consistency across goals/entities/concerns, asymmetry
    documented (topics canonicalizes, entities lowercase-trims the same
    input).
…findings #5, #6 + A5)

#5: The configure_memory MCP handler now validates interactions_retention_days
and interactions_max_rows_per_project at the boundary. Each knob must be a
positive finite integer; non-numeric, NaN, negative, zero, and fractional
values throw InvalidInputError with a specific message naming the bad field
and what shape was expected. Previously these silently coerced to NaN via
String(value) in memory.ts, then fell back to the 90-day default during the
next prune — a confusing UX where "interactions_retention_days: 'forever'"
appeared to be accepted but did nothing.
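
The boundary check might look like the following sketch. `InvalidInputError` is named in the commit message, but this local class, the `requirePositiveInteger` helper, and the message wording are assumptions for illustration.

```typescript
// Local stand-in for the project's InvalidInputError (assumed shape).
class InvalidInputError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "InvalidInputError";
  }
}

// Accepts only positive finite integers; names the bad field in the error.
function requirePositiveInteger(field: string, value: unknown): number {
  if (typeof value !== "number" || !Number.isInteger(value) || value <= 0) {
    throw new InvalidInputError(
      `${field} must be a positive integer, got ${String(value)}`,
    );
  }
  return value;
}
```

`Number.isInteger` already rejects NaN, Infinity, and fractional values, so the single condition covers every case the commit lists; a value such as `'forever'` now fails loudly instead of falling back to the default at prune time.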

#6: The MCP tool description, interactions.ts module docstring, and spec
§2.12 (Portfolio Memory) Retention policy paragraph said the bounds compose
permissively ("whichever permits more wins"). The implementation has always
composed them restrictively (bounded growth) — a row is pruned when EITHER
older than retention_days OR the project's row count exceeds the cap. The
implementation is the safer behavior; this aligns the docs with reality
rather than changing behavior. Updated tool description, interactions.ts
header + pruneInteractions docstring, and the two spec sections that
described the old "permissive" wording.
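
The restrictive composition can be sketched as a pure function over rows. This is an illustration under assumptions: the real pruneInteractions issues DELETEs against SQLite, and the `Row` and `RetentionPolicy` shapes, the millisecond timestamps, and the "keep the newest rows under the cap" rule are hypothetical details.

```typescript
interface Row {
  id: number;
  projectId: number;
  at: number; // epoch milliseconds (assumed unit)
}

interface RetentionPolicy {
  retentionDays: number;
  maxRowsPerProject: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the surviving rows. A row is dropped when EITHER bound says so.
function pruneInteractions(rows: Row[], policy: RetentionPolicy, now: number): Row[] {
  // Phase 1, age: drop rows older than the retention window.
  const cutoff = now - policy.retentionDays * DAY_MS;
  const fresh = rows.filter((r) => r.at >= cutoff);

  // Phase 2, cap: per project, keep only the newest maxRowsPerProject rows.
  const newestFirst = fresh.slice().sort((a, b) => b.at - a.at);
  const keptPerProject = new Map<number, number>();
  const survivors = new Set<number>();
  for (const r of newestFirst) {
    const kept = keptPerProject.get(r.projectId) ?? 0;
    if (kept < policy.maxRowsPerProject) {
      keptPerProject.set(r.projectId, kept + 1);
      survivors.add(r.id);
    }
  }
  return fresh.filter((r) => survivors.has(r.id));
}
```

Under the permissive reading, a row would survive if either bound allowed it, so a busy project could grow without limit inside the retention window; the restrictive form bounds growth on both axes.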

A5: pruneInteractions now runs both age-delete and cap-delete inside a
single db.transaction(). A mid-prune failure rolls back the whole pass,
so we never strand the table in a half-aged / half-capped state. Test
covers this by monkey-patching db.prepare to throw on the cap-phase DELETE
and asserting the age-phase rows are still present afterward.

Tests:
  - server.test.ts: 7 configure_memory boundary cases (string, NaN, negative,
    fractional, zero, max_rows_per_project, valid positive).
  - interactions-logging.test.ts: transactional rollback test.
… findings #2, #8, #10)

#2: search_projects, recall, and cross_query now ALWAYS return the
envelope shape {result, ambiguous, alternatives} regardless of whether
ambiguity triggered. Before this fix the shape switched between a bare array
(unambiguous) and the envelope (ambiguous), so callers iterating
`for (const p of result)` crashed on the envelope shape whenever the
fuzzy matcher decided to flag the query. The shape switch was an
unpredictable foot-gun.

The trade-off favors shape consistency: every MCP caller reads `response.result`
on every call. Tests that asserted a bare array were updated to use the
envelope; the desktop app's IPC handler still calls `Registry.searchProjects`
directly (which returns Project[]) and is unaffected.
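
A sketch of the always-envelope contract from the caller's side. The three field names come from the commit message; the `Envelope` generic, the `toEnvelope` helper, and typing alternatives as project names are assumptions.

```typescript
interface Envelope<T> {
  result: T;
  ambiguous: boolean;
  alternatives: string[]; // assumed: alternative project names
}

// Every response goes through the same wrapper, ambiguous or not.
function toEnvelope<T>(result: T, alternatives: string[] = []): Envelope<T> {
  return { result, ambiguous: alternatives.length > 0, alternatives };
}

// Caller code never branches on shape: it always reads response.result.
function projectNames(response: Envelope<{ name: string }[]>): string[] {
  return response.result.map((p) => p.name);
}
```

The clean case costs two extra fields (`ambiguous: false`, `alternatives: []`), in exchange for callers that cannot crash when a query happens to be flagged.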

#8: cross_query previously sliced `cqResult.results` into alternatives
without filtering by source — memory/cc_memory hits carrying
project='global' or memory project_ids surfaced as registered-project
alternatives. The handler now filters to source='registry' rows BEFORE
building alternatives, and replaces the inline 15% / slice(1,5) logic
with the shared `detectAmbiguity` helper + AMBIGUITY_GAP_THRESHOLD +
MAX_ALTERNATIVES constants from ambiguity.ts. Search, recall, and
cross_query can no longer drift apart on the threshold.

#10: scoreSearchCandidates now differentiates match quality across
tiers (exact-name=100, name-prefix=50, word-boundary=15, substring=5,
description=+2). Pre-fix every name-substring match scored 5 → all
short-query results tied at 5 → ambiguity fired with random alternatives.
The dead `score === 0` guard on the prefix branch is gone (no longer
needed with proper scoring).

Also added a minimum-query-length gate: queries shorter than 3 chars
skip ambiguity detection entirely. Single-letter queries (`'e'`) and
two-char queries (`'al'`) no longer flag the entire portfolio as
ambiguous alternatives.
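
The tiers and the length gate can be sketched like this. The tier values and the three-character minimum come from the commit message; the `ProjectLike` shape, the word-boundary definition (a word inside the name starts with the query), and the helper names are assumptions rather than the actual scoreSearchCandidates code.

```typescript
const MIN_AMBIGUITY_QUERY_LENGTH = 3;

interface ProjectLike {
  name: string;
  description: string;
}

// Name tiers are mutually exclusive; a description hit adds a flat +2.
function scoreCandidate(query: string, project: ProjectLike): number {
  const q = query.trim().toLowerCase();
  if (q === "") return 0;

  const name = project.name.toLowerCase();
  let score = 0;
  if (name === q) score = 100;                      // exact-name
  else if (name.startsWith(q)) score = 50;          // name-prefix
  else if (name.split(/[^a-z0-9]+/).some((w) => w.startsWith(q))) score = 15; // word-boundary (assumed definition)
  else if (name.includes(q)) score = 5;             // substring

  if (project.description.toLowerCase().includes(q)) score += 2;
  return score;
}

// Queries shorter than three characters skip ambiguity detection entirely.
function shouldCheckAmbiguity(query: string): boolean {
  return query.trim().length >= MIN_AMBIGUITY_QUERY_LENGTH;
}
```

With wide spacing between tiers, an exact match at 100 against a prefix match at 50 is a 50% relative gap, well outside the 15% threshold, so a precise query no longer reads as ambiguous.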

Spec §3.3 (Rules) and ambiguity.ts header updated to reflect the new
shape contract. Tests cover the differentiated-scoring tiers, the
min-length gate, envelope shape consistency on zero-match + bootstrap-mode
recall + cross_query, and the registry-source filter for cross_query
alternatives.
@spaceshipmike spaceshipmike merged commit 5f14453 into main Jun 7, 2026
2 checks passed
@spaceshipmike spaceshipmike deleted the feature/cluster-a-interactions branch June 7, 2026 20:02
