Draft: SPIRE materialization prune handoff from Task 122 #43

Draft
kreneskyp wants to merge 9 commits into main from task-122-spire-handoff

@kreneskyp

Collaborator

Summary

Draft handoff PR for the SPIRE work that was intentionally split out of Task 122 before the TurboQuant-only closeout merged.

This PR exists so the SPIRE-focused coder can inspect or reuse the work without mixing it into the TurboQuant Task 122 landing.

Included work:

  • SPIRE candidate materialization pruning in src/am/ec_spire/*
  • Experimental ec_spire.pre_materialization_prune gate
  • Historical evidence packets restored for visibility:
    • reviews/task-122/002-spire-bounded-materialization-prune/
    • reviews/task-122/003-spire-batched-materialization-prune/
    • reviews/task-122/004-spire-prune-ab-suite/
    • reviews/task-122/005-spire-prune-release-suite/
    • reviews/task-122/006-spire-recall-width-sweep/
    • reviews/task-122/007-spire-latency-storage-width25/

Status

Draft / handoff only. Not reviewed as a merge-ready SPIRE task and not part of Task 122's TurboQuant closeout.

Task 122 already merged as TQ-only in PR #42. This branch is based on the updated main after that merge.

Validation Carried Over

The packet-local artifacts include prior focused tests and 10k/50k/100k SPIRE A/B evidence. A SPIRE owner should decide whether to refresh, retask, or fold this into their current SPIRE branch before merge.

kreneskyp pushed a commit that referenced this pull request Jun 27, 2026
@kreneskyp

Collaborator Author

Reviewed as the SPIRE reviewer from the Task 121/123 line. Short version: the code is clean and this is a useful lever for Task 123's latency path, but the A/B undersells it because it was measured on a single instance.

Code (pre_materialization_prune, 49 lines): skips row materialization once a candidate's cheap quantized ip falls below the running top-k keep threshold. The guards are sound: pruning runs only when replica-dedupe is disabled and the limit is bounded, it is skipped when deletions are present, and truncations are recorded. Default-on.
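For readers who have not opened the diff, the pruning rule and guards described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/am/ec_spire/*: the function name, the argument names, the `materialize` callback, and the choice of a min-heap for the running keep threshold are all assumptions made for the example.

```python
import heapq
from typing import Callable, Iterable, List, Optional, Tuple


def prune_before_materialization(
    candidates: Iterable[Tuple[int, float]],
    limit: Optional[int],
    materialize: Callable[[int], object],
    dedupe_enabled: bool = False,
    has_deletions: bool = False,
) -> Tuple[List[Tuple[int, float, object]], int]:
    """Hypothetical sketch: return (materialized rows, skipped candidate count).

    `candidates` yields (row_id, cheap_score) pairs, where a higher score is
    better. `materialize` stands in for the expensive row fetch.
    """
    # Guards mirroring the review text: prune only when replica-dedupe is off,
    # the limit is bounded, and no deletions are present.
    prune = (not dedupe_enabled) and (limit is not None and limit > 0) and (not has_deletions)

    keep: List[float] = []  # min-heap of the best `limit` cheap scores seen so far
    rows: List[Tuple[int, float, object]] = []
    skipped = 0  # recorded so the caller can report truncations

    for row_id, score in candidates:
        # Once `limit` scores are held, keep[0] is the running keep threshold.
        if prune and len(keep) == limit and score < keep[0]:
            skipped += 1
            continue
        rows.append((row_id, score, materialize(row_id)))
        if prune:
            if len(keep) < limit:
                heapq.heappush(keep, score)
            else:
                heapq.heappushpop(keep, score)
    return rows, skipped
```

With any guard tripped (dedupe enabled, unbounded limit, or deletions present), the sketch falls back to materializing every candidate and reports zero skips, which is the behavior the guard list implies.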

A/B (packet 004) is recall-neutral but single-instance: recall 1.0000 on/off, materialized candidates 251,555 → 8,495 (~30×), latency ~flat (92.4 vs 93.8 ms). On a single instance, materialization is a cheap local row read, so the ~30× cut barely moves latency.
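As a quick sanity check on the figures quoted above, the reduction factor and the latency gap can be recomputed directly. The variable names are made up for this example, and the two latency values are kept unlabeled because the comment does not say which one is the pruned run.

```python
# Figures quoted from the packet 004 A/B in the comment above.
materialized_off = 251_555
materialized_on = 8_495
latency_a_ms = 92.4
latency_b_ms = 93.8

# Materialization reduction factor (the "~30x" in the review).
reduction = materialized_off / materialized_on

# Relative latency gap between the two runs, as a percentage of the slower run.
latency_gap_pct = abs(latency_b_ms - latency_a_ms) / max(latency_a_ms, latency_b_ms) * 100

print(f"reduction: {reduction:.1f}x, latency gap: {latency_gap_pct:.2f}%")
```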

Why that's the wrong substrate: on the contained multi-instance lane (Task 123 packet 009/011), each materialized candidate becomes a tuple shipped across libpq, which is the dominant 'communications' cost. A ~30× materialization cut should be a large latency win there, not ~1.5%. This is likely one of the biggest levers for the route-efficient n1024 b2 candidate.
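The argument above can be made concrete with a toy linear cost model: latency is a fixed part plus a per-materialized-candidate part. Everything numeric below is hypothetical and chosen only to show the shape of the effect (a 30:1 candidate ratio, a 90 ms fixed cost, and two made-up per-candidate costs); none of it is measured data from either lane.

```python
def modeled_latency_ms(fixed_ms: float, per_candidate_ms: float, materialized: int) -> float:
    """Toy model: fixed cost plus a linear cost per materialized candidate."""
    return fixed_ms + per_candidate_ms * materialized


def saving_fraction(fixed_ms: float, per_candidate_ms: float, before: int, after: int) -> float:
    """Fraction of modeled latency removed by cutting candidates from before to after."""
    base = modeled_latency_ms(fixed_ms, per_candidate_ms, before)
    pruned = modeled_latency_ms(fixed_ms, per_candidate_ms, after)
    return (base - pruned) / base


# Hypothetical inputs: same 30:1 candidate cut, two different per-candidate costs.
BEFORE, AFTER = 30_000, 1_000
local_read = saving_fraction(fixed_ms=90.0, per_candidate_ms=0.0001, before=BEFORE, after=AFTER)
shipped_tuple = saving_fraction(fixed_ms=90.0, per_candidate_ms=0.01, before=BEFORE, after=AFTER)

print(f"cheap local read: {local_read:.1%} saved")
print(f"shipped tuple:    {shipped_tuple:.1%} saved")
```

Under these assumed costs the same cut is a few percent when each candidate is cheap and a large share of latency when each candidate is expensive, which is why a multi-instance A/B is needed to find out where the real per-tuple cost sits.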

Recommendation: re-home this commit into the Task 123 SPIRE branch and A/B it on the multi-instance lane (≥200q, per-worker object bytes + per-stage timeline). Also confirm that it actually engages for the b2/b4 configs, since the dedupe-mode guard means it may only fire for b0. Full coordination note: reviews/task-123/009-multi-instance-phase-a-baseline/feedback/2026-06-27-02-reviewer.md on the task-121 branch. This is not a merge-readiness review; the PR is a draft handoff.
