
feat(genome): demand-aligned-recall PR-3f: MustIncludeCandidateSource (#1382)

Merged
joelteply merged 1 commit into canary from feat/demand-aligned-recall-must-include-pr3f
May 17, 2026


Summary

PR-3f of demand-aligned-recall. Resolves CapabilityQuery::must_include hard pins as candidates per GENOME-FOUNDRY-SENTINEL Part 7: "Hard pins — recall MUST include these in the RankedPool even if their score is low."

Plays through the composite seam (PR-3e, #1380). When this source is wired AFTER WorkingSetCandidateSource with ByArtifactId dedup, must-include items that ARE resident keep the working set's Hot residency and factor data, while items that are NOT resident get this source's NotResident placeholder (still ranked, just with a lower combined score).

What lands

  • MustIncludeCandidateSource: a zero-state unit struct (no Arc state needed; the source is a pure function of the query).
  • CandidateSource::fetch impl that:
    • reads query.must_include (a Vec<ArtifactRef>)
    • maps each variant (LoRALayer / MoEExpert / Engram) to a CandidateArtifact with the appropriate PageKind
    • marks every must-include candidate as ResidencyHint::NotResident { acquirable_from: SentinelRefinement }
    • uses NEUTRAL_FACTOR_STUB (0.5) for the three non-tier factors — same convention as PR-3d
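
The mapping described in the list above could look roughly like the sketch below. It is not the merged implementation: every type here is a hypothetical stand-in (the field names `artifact_id`, `page_kind`, `residency`, `non_tier_factors`, the `AcquireFrom` enum, and the String payloads are assumptions), and `fetch` is written as an inherent method rather than the real `CandidateSource` trait, whose signature is not shown in this PR description.

```rust
// Hypothetical stand-ins for the PR's types; the real definitions are not shown here.
#[derive(Debug, Clone, PartialEq)]
enum ArtifactRef {
    LoRALayer(String),
    MoEExpert(String),
    Engram(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum PageKind {
    LoRALayer,
    MoEExpert,
    Engram,
}

// Assumed shape of the `acquirable_from` payload.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AcquireFrom {
    SentinelRefinement,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ResidencyHint {
    NotResident { acquirable_from: AcquireFrom },
}

// Value taken from the PR description.
const NEUTRAL_FACTOR_STUB: f32 = 0.5;

#[derive(Debug, Clone, PartialEq)]
struct CandidateArtifact {
    artifact_id: String,
    page_kind: PageKind,
    residency: ResidencyHint,
    // The three non-tier factors; their real names are not given in the PR.
    non_tier_factors: [f32; 3],
}

struct CapabilityQuery {
    must_include: Vec<ArtifactRef>,
}

// Zero-state unit struct: all output is derived from the query.
struct MustIncludeCandidateSource;

impl MustIncludeCandidateSource {
    fn fetch(&self, query: &CapabilityQuery) -> Vec<CandidateArtifact> {
        query
            .must_include
            .iter()
            .map(|pin| {
                // Each ArtifactRef variant maps to the matching PageKind.
                let (id, kind) = match pin {
                    ArtifactRef::LoRALayer(id) => (id, PageKind::LoRALayer),
                    ArtifactRef::MoEExpert(id) => (id, PageKind::MoEExpert),
                    ArtifactRef::Engram(id) => (id, PageKind::Engram),
                };
                CandidateArtifact {
                    artifact_id: id.clone(),
                    page_kind: kind,
                    // Every pin is reported as not resident; a resident source
                    // wired earlier in the composite can override this.
                    residency: ResidencyHint::NotResident {
                        acquirable_from: AcquireFrom::SentinelRefinement,
                    },
                    non_tier_factors: [NEUTRAL_FACTOR_STUB; 3],
                }
            })
            .collect()
    }
}
```

An empty `must_include` yields an empty Vec rather than an error, which matches the empty-contract test named in the test plan.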

Recommended composite wiring

let composite = CompositeCandidateSource::with_default_dedup(vec![
    Arc::new(WorkingSetCandidateSource::new(mgr)),     // Hot wins
    Arc::new(MustIncludeCandidateSource::new()),       // Pins
    // future: catalog walker, federation source
]);

Spec contract met: every hard-pinned artifact surfaces in the RankedPool. If it is resident, it gets the full residency-aware score; if not, it still appears at a lower combined score, so composition can see "pinned but not here: schedule the foundry."
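
To illustrate why source order matters, here is a minimal sketch of a first-wins merge keyed on artifact id. It assumes that ByArtifactId dedup keeps the first candidate seen for each id, which is inferred from the "wired AFTER" wording and the "resident wins" test name rather than from the composite's code; `Candidate`, `Residency`, and `merge_by_artifact_id` are hypothetical names.

```rust
use std::collections::HashSet;

// Simplified stand-ins; the real candidate type carries more fields.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Residency {
    Hot,
    NotResident,
}

#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    artifact_id: String,
    residency: Residency,
}

/// Visits sources in wiring order and keeps the first candidate seen for
/// each artifact id, so earlier sources take priority over later ones.
fn merge_by_artifact_id(sources: Vec<Vec<Candidate>>) -> Vec<Candidate> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for batch in sources {
        for candidate in batch {
            // `insert` returns false when the id was already present.
            if seen.insert(candidate.artifact_id.clone()) {
                merged.push(candidate);
            }
        }
    }
    merged
}
```

With the working-set batch first and the must-include batch second, a pin that is already resident keeps its Hot entry, and a pin that is not resident still appears once with NotResident.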

Test plan

  • cargo test --lib --features metal,accelerate genome::recall_source_must_include — 6/6 pass:
    • empty_must_include_returns_empty_candidates
    • variant_mapping_preserves_page_kind
    • must_include_marks_candidates_as_not_resident
    • factors_use_neutral_stubs_consistent_with_working_set_source
    • source_is_object_safe_for_dyn_dispatch
    • composite_with_dedup_resident_wins_must_include_for_pinned_hot_artifact — the architectural payoff: resident pin keeps Hot, non-resident pin gets NotResident, both appear in merged Vec
  • No regressions across the other 2873 lib tests
  • Pre-push gate clean


Resolves CapabilityQuery.must_include hard pins as candidates per
GENOME-FOUNDRY-SENTINEL Part 7: "Hard pins — recall MUST include
these in the RankedPool even if their score is low. Used for
persona-private LoRA layers and sticky engrams."

Plays through the composite seam shipped in PR-3e: wired AFTER a
resident source like WorkingSetCandidateSource with ByArtifactId
dedup, must-include items that ARE resident get the resident
source's Hot residency + factor data; must-include items NOT
resident get this source's NotResident placeholder (still ranked,
just lower combined score).

What lands

- MustIncludeCandidateSource — zero-state unit struct (no Arc state
  needed; the source is pure-function over the query)
- CandidateSource::fetch impl that:
  - reads query.must_include Vec<ArtifactRef>
  - maps each variant (LoRALayer / MoEExpert / Engram) to a
    CandidateArtifact with the appropriate PageKind
  - marks every must-include candidate as
    ResidencyHint::NotResident { acquirable_from: SentinelRefinement }
  - uses NEUTRAL_FACTOR_STUB (0.5) for the three non-tier factors,
    same convention as WorkingSetCandidateSource (PR-3d)

Recommended composite wiring

  let composite = CompositeCandidateSource::with_default_dedup(vec![
      Arc::new(WorkingSetCandidateSource::new(mgr)),     // Hot first
      Arc::new(MustIncludeCandidateSource::new()),       // Pins
      // future: catalog walker, federation source
  ]);

Spec contract met: every hard-pinned artifact surfaces in the
RankedPool; if it's resident, it gets the full residency-aware score;
if not, it still appears (at a lower combined score) so composition
can see "this was pinned but isn't here yet: schedule the foundry."

Tests

6 new tests:
- empty_must_include_returns_empty_candidates (no-error empty
  contract)
- variant_mapping_preserves_page_kind (LoRALayer/MoEExpert/Engram
  variants → PageKind mapping)
- must_include_marks_candidates_as_not_resident
- factors_use_neutral_stubs_consistent_with_working_set_source
- source_is_object_safe_for_dyn_dispatch
- composite_with_dedup_resident_wins_must_include_for_pinned_hot_artifact
  (the architectural payoff: a resident pin keeps Hot, a non-resident
  pin gets NotResident, and both appear in the merged Vec)

6/6 pass. No regressions across the other 2873 lib tests.

Stack

- #1346 / #1353 / #1355 / #1358 / #1362 — my working-set-manager
- #1366 — DAR PR-1: pure types
- #1367 + #1370 — DAR PR-2: trait + composite types
- #1371 — DAR PR-3a: scoring function + per-factor curves
- #1372 — DAR PR-3b: LocalDemandAlignedRecall ranking engine
- #1374 — DAR PR-3c: trait impl + CandidateSource seam
- #1378 — DAR PR-3d: WorkingSetCandidateSource (working-set source)
- #1380 — DAR PR-3e: CompositeCandidateSource (extensibility seam)
- THIS PR — DAR PR-3f: MustIncludeCandidateSource (hard-pin source)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply joelteply merged commit c8b11d9 into canary May 17, 2026
3 checks passed
@joelteply joelteply deleted the feat/demand-aligned-recall-must-include-pr3f branch May 17, 2026 04:58
joelteply added a commit that referenced this pull request May 18, 2026
…ALOG §II) (#1387)

PR-1 of inference-llm. Pure typed event surface for the local-LLM
generation module. The module itself (composition → tokenizer →
llama.cpp invoke → token stream) lands in PR-2/PR-3; PR-1 ships
the wire so producers + consumers can build against it today.

Unblocked by my just-shipped Lane H + recall + working-set stacks.

What lands

- InferenceRequestId — typed Uuid newtype; all four events carry
  the same field name (requestId on wire) for correlation
- CompositionPlan — opaque ArtifactId reference; composer module
  fills the full shape later
- SamplingParams { temperature, top_p, top_k, repeat_penalty }
  with llama.cpp-baseline defaults (0.8 / 0.95 / 40 / 1.1)
- GenerationBudget { max_tokens, max_duration_ms } — both honored
- FinishReason enum: Stop / MaxTokens / MaxDuration / StopSequence
  { matched } / Error { reason } — typed per Joel's never-swallow
- InferenceRequest — [InferenceRequest] subscription event
- InferenceComplete — emission with completion + finish + timing
- FirstTokenEmitted — emission for TTFT observability
  (microsecond precision; sub-ms achievable on warm models)
- ResidencyFault — emission when inference would need a not-
  resident page; sentinel learns + upgrades tier policy
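
As a rough illustration of the typed surface listed above, the sketch below shows the sampling defaults and the finish-reason enum. The default values and variant names come from this message; the field types, the derive lists, and the `budget_finish` helper are assumptions added for illustration and are not part of the PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct SamplingParams {
    temperature: f32,
    top_p: f32,
    top_k: u32,
    repeat_penalty: f32,
}

impl Default for SamplingParams {
    // Defaults as stated in the commit message (0.8 / 0.95 / 40 / 1.1).
    fn default() -> Self {
        Self { temperature: 0.8, top_p: 0.95, top_k: 40, repeat_penalty: 1.1 }
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct GenerationBudget {
    max_tokens: u32,
    max_duration_ms: u64,
}

#[derive(Debug, Clone, PartialEq)]
enum FinishReason {
    Stop,
    MaxTokens,
    MaxDuration,
    StopSequence { matched: String },
    Error { reason: String },
}

/// Hypothetical helper: reports which budget limit, if any, has been hit.
/// The token limit is checked first; that ordering is an assumption.
fn budget_finish(
    budget: &GenerationBudget,
    tokens_emitted: u32,
    elapsed_ms: u64,
) -> Option<FinishReason> {
    if tokens_emitted >= budget.max_tokens {
        Some(FinishReason::MaxTokens)
    } else if elapsed_ms >= budget.max_duration_ms {
        Some(FinishReason::MaxDuration)
    } else {
        None
    }
}
```

Carrying `matched` and `reason` as payloads keeps the cause of a stop visible to consumers instead of collapsing every outcome into a bare status.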

Tests

13 behavioral tests + 9 ts-rs export_bindings = 22 total. 22/22 pass.
No regressions across the other 2883 lib tests.

Clippy baseline bump 154→156 — drift from recent canary merges.
Fixed two doc-list warnings in this file (reworded "* 1000" math
to avoid being parsed as a markdown list item).

Stack

- Lane H end-to-end (codex's #1331 through #1373)
- Working-set-manager + DAR end-to-end (mine, #1346 through #1382)
- THIS PR — inference-llm PR-1: typed event surface
- NEXT — PR-2: InferenceLlmModule ServiceModule impl wired to
  the artifact dispatch
- THEN — PR-3: tokenizer + llama.cpp invoke + token stream

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>