
Tasks 121/123: SPIRE coarse-routing recall DOE + multi-instance closeout (no-promote)#39

Merged
kreneskyp merged 173 commits into main from task-121-spire-coarse-routing-recall-doe on Jul 1, 2026


kreneskyp (Collaborator) commented Jun 22, 2026

Closes the reopened multi-instance scope for Tasks 121 and 123 as no-promote / re-scope, implementing the packet 020 reviewer acceptance.

Result

  • Recall stable (1.0000) on the contained multi-instance executor.
  • Communications payload bytes are not the dominant local latency driver (packet 017, accepted; ~1540x payload delta with flat latency).
  • The dedupe-aware pre-materialization prune (d2ffbdaa9) is recall-safe and latency-neutral but not a demonstrated latency win; its leaf-side engagement (rows pruned) was never captured (packet 019).
  • No SPIRE default promoted.

Shipped state

  • ec_spire.pre_materialization_prune GUC default flipped true -> false: the feature merges as opt-in plumbing, so main's default read behavior is unchanged. Unit tests are unaffected (a cfg(test) override returns true).
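The default flip and its test-only override can be pictured with a minimal Rust sketch. The function names below are hypothetical and the real GUC registration is not shown; the sketch only assumes that the default resolves to true under `cfg(test)` and to false otherwise, as the description states.

```rust
/// Hypothetical helper: the built-in default for the
/// `ec_spire.pre_materialization_prune` setting.
///
/// Assumption: the real code registers a GUC through its extension
/// framework; this sketch models only how the default is chosen.
fn pre_materialization_prune_default() -> bool {
    // `cfg!(test)` evaluates to true only when the crate is compiled for
    // `cargo test`, so unit tests keep exercising the prune path while
    // regular builds default to the opt-in (false) behavior.
    cfg!(test)
}

/// Hypothetical helper: an explicitly configured value wins over the default.
fn effective_prune(explicit: Option<bool>) -> bool {
    explicit.unwrap_or_else(pre_materialization_prune_default)
}
```

Under these assumptions, `effective_prune(Some(true))` models a session that opts in, while `effective_prune(None)` follows the build-dependent default.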

Records

  • Task 123 status sync: reviews/task-123/021-post-ab-closeout/ (reviewer confirm .../feedback/2026-06-30-01-reviewer.md).
  • Task 121 status sync: reviews/task-121/030-multi-instance-closeout/.
  • Closeout acceptance: reviews/task-123/020-post-ab-closeout-request/feedback/2026-06-30-01-reviewer.md.
  • Both task files + plan/tasks/README.md index flipped to closed.

Follow-up -> Task 131

Engagement-instrumented prune, off-disk clean-latency rerun, and recall-safety where the prune actually engages move to plan/tasks/131-spire-streaming-global-topk-pruning.md.

🤖 Generated with Claude Code

Agent IX and others added 30 commits June 21, 2026 07:13
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reviews 003-target-candidate-rank, 004-stage-containment-help, and
005-target-candidate-rank-output (all LGTM, Phase 1 diagnostic plumbing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
LGTM for packet 006-spire-pipeline-artifact-templates (benchmark
template plumbing; Phase 1 measurement run still owed for AC1).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
needs-evidence: run/provenance are clean and all recall/latency/
candidate-rank numbers trace, but funnel stages 1-3 derive from the
broken target-block-rank snapshot (0%/routing_miss=2000 everywhere),
contradicted by stage 4 showing 1841/2000 truth rows reaching the
candidate frontier. Route/leaf/block attribution is not yet
decision-grade; AC1 only partially satisfied.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
changes-requested: benchmark gate is SATISFIED (full vs l2 A/B at
10k/50k/100k via ecaz bench suite, all numbers trace; per-leaf cap=2
correctly shown recall-unsafe and not promoted). Blocker is packet
hygiene: ~30 tracked raw per-query rank JSONL files (~100MB) committed,
violating the no-operational-exhaust ban; coder must git rm the uncited
dumps and add a .gitignore rule. Minor: latency was measured on a debug build.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
changes-requested (hygiene only): the 007 attribution bug is RESOLVED --
fix 4617b0f makes all six funnel stages internally consistent and tie
to recall (100k/32: 1841/2000=0.9205), provenance/runner/scale all clean.
Blocker: packet 009 recommits the same ~54MB raw per-query exhaust that
commit 1f4c06a just pruned from packet 008; coder must git rm the 15
pipeline-*.jsonl files. Also add small per-scale summary .txt files.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
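The funnel consistency that this review relies on can be illustrated with a short Rust sketch. It assumes that the funnel stages are successive filters over the same truth rows, so that no stage can report more hits than the stage before it; the function names are hypothetical and are not taken from the repository.

```rust
/// Fraction of truth rows that reach a stage; 0.0 when there are no truth rows.
fn hit_rate(hits: u64, total: u64) -> f64 {
    if total == 0 {
        0.0
    } else {
        hits as f64 / total as f64
    }
}

/// Returns the index of the first stage that reports more hits than the
/// stage before it (or than `total` for the first stage), or `None` when
/// the counts are non-increasing.
///
/// Assumption: each stage filters the survivors of the previous one.
fn first_inconsistent_stage(total: u64, stage_hits: &[u64]) -> Option<usize> {
    let mut upper_bound = total;
    for (index, &hits) in stage_hits.iter().enumerate() {
        if hits > upper_bound {
            return Some(index);
        }
        upper_bound = hits;
    }
    None
}
```

With the figures quoted above, `hit_rate(1841, 2000)` is approximately 0.9205. A funnel shaped like the one flagged in the earlier needs-evidence review, `first_inconsistent_stage(2000, &[0, 0, 0, 1841])`, returns `Some(3)`, because the fourth stage reports hits that the first three stages say never arrived.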
LGTM: measurement-only negative result, benchmark gate satisfied.
All 10k/50k/100k x 6 variants x nprobe cells trace to suite-results.jsonl
(completed=22 failed=0); recall@10 byte-identical across candidate-cap /
rerank-width variants (recall-neutral is measured, not asserted), while
heap_rerank_sum confirms the width knob did work. Release build, A/B
isolation correct, no promotion claimed. Packet hygiene clean -- first
120 packet without the committed per-query exhaust flagged in 008/009.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
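For readers unfamiliar with the metric, recall@10 as used in this review can be sketched in Rust as below. This is the textbook definition, not the benchmark suite's implementation; the function name and the use of `u64` row identifiers are assumptions.

```rust
use std::collections::HashSet;

/// recall@k: the share of the first `k` ground-truth ids that also appear
/// among the first `k` retrieved ids. Returns 0.0 when there is no truth.
///
/// Assumption: ids are plain `u64` row identifiers and both slices are
/// ordered best-first.
fn recall_at_k(truth: &[u64], retrieved: &[u64], k: usize) -> f64 {
    let truth_k: HashSet<u64> = truth.iter().take(k).copied().collect();
    if truth_k.is_empty() {
        return 0.0;
    }
    // Collecting into a set first means duplicate retrieved ids are
    // counted once.
    let retrieved_k: HashSet<u64> = retrieved.iter().take(k).copied().collect();
    let found = truth_k.intersection(&retrieved_k).count();
    found as f64 / truth_k.len() as f64
}
```

Two variants are recall-neutral in the measured sense when this value is identical for every query set and scale, which is a stronger statement than the values merely being close.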
kreneskyp changed the title from "Tasks 121/123 revised core-algorithm closeout" to "Tasks 121/123 closeout decline response" on Jun 28, 2026
Agent IX and others added 25 commits June 28, 2026 08:23
…call-doe' into task-121-spire-coarse-routing-recall-doe
…call-doe' into task-121-spire-coarse-routing-recall-doe
…-wording correction

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t off

Reopened multi-instance core-algorithm scope closes as no-promote / re-scope,
implementing the packet 020 reviewer acceptance:
- recall stable (1.0000) on the contained multi-instance executor;
- communications payload bytes are not the dominant local latency driver (017);
- dedupe-aware pre-materialization prune (d2ffbda) is recall-safe and
  latency-neutral but not a demonstrated latency win; its leaf-side engagement
  (rows pruned) was never captured (019).

Flip ec_spire.pre_materialization_prune GUC default true -> false so the feature
ships as opt-in plumbing rather than a promoted default; main default read
behavior is unchanged. Unit tests unaffected (cfg(test) override returns true).

Status packets: reviews/task-123/021-post-ab-closeout,
reviews/task-121/030-multi-instance-closeout. Both task files flipped to closed.
Follow-up optimization routed to Task 131.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-routing-recall-doe

# Conflicts:
#	.gitignore
#	plan/tasks/README.md
kreneskyp changed the title from "Tasks 121/123 closeout decline response" to "Tasks 121/123: SPIRE coarse-routing recall DOE + multi-instance closeout (no-promote)" on Jul 1, 2026
kreneskyp merged commit c6a8639 into main on Jul 1, 2026