(diversity) Integrate determinant diversity in tokp full precision search and disk search by narendatha · Pull Request #1011 · microsoft/DiskANN

narendatha · 2026-05-04T12:30:38Z

Integrate determinant-diversity into topk full-precision search and disk search

Summary

Wires the determinant-diversity post-processing algorithm as an optional SearchPlugin for two search paths:

Async graph-index (full-precision topk via inmem::FullAccessor)
Disk-index (via DiskIndexSearcher post-processor)

Determinant-diversity reranks search results to increase geometric diversity among the top-k, at a configurable quality/speed tradeoff.

Motivation

Standard nearest neighbor search often returns highly similar/redundant results. For some scenarios like RAG search, it's more valuable to retrieve diverse, complementary documents that cover different aspects of the query. This PR implements a principled approach based on determinant maximization of the Gram matrix of query-scaled vectors.

Algorithm

The algorithm scales each candidate vector by (similarity_to_query)^power and then greedily selects vectors to maximize the determinant of the Gram matrix (equivalent to maximizing the volume of the parallelepiped spanned by the vectors).

Two variants are implemented:

eta = 0: Pure greedy orthogonalization (pivoted QR decomposition)
eta > 0: Ridge-regularized version that balances diversity with numerical stability

Both variants run in O(n k d) time where n = candidates, k = results to return, d = dimension.

Performance Characteristics

Tested on OpenAI embeddings dataset (3072 dimensions, 10K sample):

Metric	Normal Search	Det_Div Search	Notes
QPS	~150	~130	~13% overhead for diversity optimization
Recall@10	98-99%	87-88%	Expected trade-off for diversity

The recall reduction is expected - Det_Div search intentionally selects diverse vectors rather than the absolute nearest neighbors.

Diversity Improvement

We measure diversity using log-determinant of the Gram matrix of scaled vectors (higher = vectors span more volume = more diverse).

RAG search consistently improves diversity (all improvements >= 0)
Average improvement of +2.9 to +3.3 in log_det corresponds to 10-27x more volume spanned by selected vectors

Parameter Tuning Guide

`power` (default: 2.0)

Controls the trade-off between relevance and diversity:

Higher values (e.g., 3.0+): Strongly prefer relevance - vectors closer to the query get much higher weight, resulting in less diversity but higher recall
Lower values (e.g., 1.0): Prefer diversity over relevance - similarity weighting is more uniform, allowing more diverse vectors to be selected
Value of 0: Pure diversity maximization with no relevance weighting

`eta` (default: 0.01)

Ridge regularization parameter that controls numerical robustness and relevance preference:

Higher values (e.g., 0.1+): More robust numerically, but biases toward selecting high-relevance vectors over diverse ones
Lower values (e.g., 0.001): Closer to pure determinant maximization, maximizes diversity but may be less stable
Value of 0: Falls back to pure greedy orthogonalization (no ridge regularization)

Recommended Settings

Use Case	`power`	`eta`
Balanced (default)	2.0	0.01
High diversity	1.0	0.001
High relevance	3.0	0.1

Changes

Core algorithm (`diskann-providers`)

Added determinant_diversity_post_process() in diskann-providers/src/model/graph/provider/async_/determinant_diversity_post_process.rs
Two selection strategies controlled by eta:
- eta == 0: greedy orthogonalization
- eta > 0: eta-weighted residual selection

Async graph-index path (`diskann-benchmark`)

Added FullPrecisionVectorAccessor shim trait + impl for inmem::FullAccessor<f32, ...> so the post-processor can fetch real candidate vectors
Implemented DeterminantDiversity plugin that satisfies glue::SearchPostProcess using for<'a, 'b> SearchStrategy HRTB bound (credit: @hildebrandmw)
Wired run_topk_timed helper to expose QPS/latency/recall metrics in the benchmark output

Disk-index path (`diskann-disk`, `diskann-benchmark`)

Added DeterminantDiversityAndFilter processor to DiskSearchPostProcessor enum
Extended searcher.search() to accept an optional SearchPostProcessorKind::DeterminantDiversity { power, eta }
Extended DiskSearchPhase JSON input to accept post_processor
Result-counting uses result_count from search stats to correctly bound post-processed results

Input/config

New TopkPostProcessor serde enum in diskann-benchmark/src/inputs/post_processor.rs with validation (power > 0, eta >= 0)
Two new example configs: async-determinant-diversity.json, disk-index-determinant-diversity.json

Benchmark JSON examples

// async graph-index
"post_processor": { "type": "determinant-diversity", "power": 1.0, "eta": 0.0 }

// disk-index
"post_processor": { "type": "determinant-diversity", "power": 1.0, "eta": 0.5 }

Notes

Determinant-diversity is only wired for full-precision topk on the async path; quantized (PQ/scalar) paths are intentionally excluded
Spherical, range, and filtered-search plugin variants are unchanged; this post-processor is topk-only

…ugins

Replace kind()-based string equality checks with explicit is_match() and get() phase-shape helpers on plugin structs. This avoids fragile ordering assumptions and makes each plugin responsible for recognising its own phase shape.

Co-authored-by: Copilot <copilot@github.com>

…plugins # Conflicts: # diskann-benchmark/src/backend/index/benchmarks.rs # diskann-benchmark/src/backend/index/product.rs # diskann-benchmark/src/backend/index/scalar.rs # diskann-benchmark/src/backend/index/search/plugins.rs # diskann-benchmark/src/backend/index/spherical.rs # diskann-benchmark/src/inputs/graph_index.rs # diskann-benchmark/src/inputs/mod.rs

Co-authored-by: Copilot <copilot@github.com>

…ojection issue This commit explores approaches to wire real candidate vectors into async determinant-diversity post-processing. Current state: IN COMPILATION ERROR (intentional for analysis) Attempted approaches: 1. Initial shim-trait FullPrecisionVectorAccessor with async get_full_precision_vector() - Resulted in 'implementation not general enough' at search_with() call 2. Removed explicit for<'a> post_processor::DeterminantDiversity bound - Still fails - the constraint is inherent in search_with() signature itself Root cause analysis: - search_with() requires: PP: for<'a> SearchPostProcess<S::SearchAccessor<'a>, T, O> - This means post-processor must work for ANY accessor lifetime 'a - But query = queries.row(query_idx) is borrowed for specific loop iteration lifetime - These are fundamentally incompatible - a borrowed value can't satisfy for<'a> generically Compiler errors (3 total): - 'not general enough': implementation needed for or<'a> but found specific '0 - 'does not live long enough': queries lifetime too short for 'static requirement Files modified: - diskann-benchmark/src/backend/index/benchmarks.rs: * Removed explicit for<'a> post_processor::DeterminantDiversity constraint * Narrowed plugin impl to FullPrecisionProvider<f32> - diskann-benchmark/src/backend/index/post_processor/determinant_diversity.rs: * Added shim trait FullPrecisionVectorAccessor * Async method get_full_precision_vector(&mut self, id) -> impl Future<...> Next steps to investigate: - Move determinant-diversity outside search_with() as post-processing reranking - This avoids HRTB entirely by applying after candidates are returned - Benchmark impact: measure recall/QPS with external reranking vs baseline Related context: - Disk index determinant-diversity works correctly (uses real vectors, shows 51-53% QPS cost) - Shared algorithm fixed (distance-to-similarity scoring direction) - Branch already merged with origin/main

Co-authored-by: Copilot <copilot@github.com>

…educe duplication - Use for<'a, 'b> SearchStrategy bound (user-provided fix) to break HRTB lifetime projection issue in the search_with post-processor constraint - Wire FullPrecisionVectorAccessor shim trait so async det-div post-processor fetches real candidate vectors instead of placeholder distances - Populate QPS/latency metrics in async det-div benchmark path (previously all 'missing') - Extract run_topk_timed helper to eliminate ~100 lines of duplicated loop/timing/recall machinery from DeterminantDiversity::run - Update async-determinant-diversity.json example tag (async-index-build -> graph-index-build) - Fix clippy::manual_async_fn in FullPrecisionVectorAccessor shim trait

codecov-commenter · 2026-05-06T05:22:20Z

Codecov Report

❌ Patch coverage is 51.04670% with 304 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.25%. Comparing base (be804aa) to head (468b5d2).
⚠️ Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
diskann-benchmark/src/backend/index/benchmarks.rs	5.51%	120 Missing ⚠️
diskann-disk/src/search/provider/disk_provider.rs	18.18%	90 Missing ⚠️
...kend/index/post_processor/determinant_diversity.rs	0.00%	36 Missing ⚠️
diskann-benchmark/src/inputs/graph_index.rs	55.26%	17 Missing ⚠️
diskann-benchmark/src/inputs/disk.rs	0.00%	11 Missing ⚠️
diskann-benchmark/src/inputs/post_processor.rs	0.00%	11 Missing ⚠️
...kann-benchmark/src/backend/index/search/plugins.rs	62.96%	10 Missing ⚠️
...vider/async_/determinant_diversity_post_process.rs	96.91%	8 Missing ⚠️
diskann-tools/src/utils/search_disk_index.rs	0.00%	1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (51.04%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1011      +/-   ##
==========================================
- Coverage   90.63%   89.25%   -1.39%     
==========================================
  Files         460      463       +3     
  Lines       85424    86021     +597     
==========================================
- Hits        77427    76781     -646     
- Misses       7997     9240    +1243

Flag	Coverage Δ
miri	`89.25% <51.04%> (-1.39%)`	⬇️
unittests	`89.10% <51.04%> (-1.50%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
diskann-benchmark/src/backend/index/mod.rs	`100.00% <ø> (ø)`
diskann-benchmark/src/backend/index/spherical.rs	`100.00% <ø> (ø)`
diskann-benchmark/src/inputs/mod.rs	`78.26% <ø> (ø)`
diskann-disk/src/build/builder/core.rs	`95.24% <100.00%> (+<0.01%)`	⬆️
diskann-tools/src/utils/search_disk_index.rs	`0.00% <0.00%> (ø)`
...vider/async_/determinant_diversity_post_process.rs	`96.91% <96.91%> (ø)`
...kann-benchmark/src/backend/index/search/plugins.rs	`62.31% <62.96%> (-3.69%)`	⬇️
diskann-benchmark/src/inputs/disk.rs	`4.24% <0.00%> (-0.24%)`	⬇️
diskann-benchmark/src/inputs/post_processor.rs	`0.00% <0.00%> (ø)`
diskann-benchmark/src/inputs/graph_index.rs	`39.93% <55.26%> (+2.91%)`	⬆️
... and 3 more

... and 45 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

- Use for<'a, 'b> SearchStrategy bound to resolve HRTB lifetime mismatch - Wire FullPrecisionVectorAccessor shim trait for async det-div to fetch real vectors - Implement DeterminantDiversity post-processor for async graph-index path - Extract run_topk_timed helper to eliminate ~100 lines of code duplication - Wire post_processor parameter to disk-index search pipeline - Update search parameter handling and result counting for post-processed results - Add TopkPostProcessor input type and necessary imports - Populate QPS/latency metrics in async det-div benchmark path

…rosoft/DiskANN into u/narendatha/det_div_plugins

Co-authored-by: Copilot <copilot@github.com>

hildebrandmw

Thanks Naren! My biggest concern is diskann-benchmark. I would really prefer if we generalized diskann-benchmark-core to handle post-processing, which would reduce bugs and be more broadly applicable.

On the implementation front, there's a bit of work to do regarding being more opinionated on the nature of inputs, establishing invariants of the diversity search with a bespoke struct (which could help with the previous point), reducing code duplication, testing the implementation, and removing allocations with a Matrix.

I'll leave reviewing the diskann-disk changes to others.

hildebrandmw · 2026-05-07T23:28:22Z

+            .is_some_and(|pp| matches!(pp, TopkPostProcessor::DeterminantDiversity { .. }))
+    }
+
+    pub(crate) fn get(phase: &SearchPhase) -> anyhow::Result<(&TopkSearchPhase, f32, f32)> {


Can we return a richer struct rather than to f32? Ideally - the parsed code would validate that a determinant diversity post-processor can be constructed from the parameters (using the actual constructor).

hildebrandmw · 2026-05-07T23:33:29Z

+/// Encapsulates the outer thread/run/L loops, the per-rep timing harness, per-query latency
+/// collection, and [`SearchResults`] construction. The caller supplies only the actual
+/// per-query search as a closure `search_fn(knn_params, query, output) -> Result<()>`.
+fn run_topk_timed(


The code in diskann-benchmark-core abstracting the loop nest exists for several reasons:

It keeps the behavior more or less the same for all users.

More importantly, it very carefully minimizes monomorphizations to help keep compile times under control.

Introducing bespoke implementations like this can very quickly get us back to a place where benchmarks take forever to compile and the set of metrics gathered by different search kinds diverges.

I understand that the KNN runner in diskann-benchmark-core predates parametrizable post-process routines, but we should update it with an additional defaulted generic. I understand that's more work, but it improves the core abstraction and will pay dividends when it comes to exploring additional post-processing routines.

hildebrandmw · 2026-05-07T23:36:43Z

+        ))?);
+        let groundtruth = datafiles::load_groundtruth(datafiles::BinFile(&topk.groundtruth))?;
+
+        let results = run_topk_timed(topk, &queries, &groundtruth, |params, query, output| {


Case and point as to why we should use diskann-benchmark-core: This runs all the searches single-threaded, but the loop nest loops over the number of threads. Not only will this make the benchmark take forever to run, it's also misleading and means that we're measuring overhead related to tokio runtime construction.

hildebrandmw · 2026-05-07T23:37:24Z


    fn kind(&self) -> &'static str {
-        Self::kind().as_str()
+        SearchPhaseKind::Topk.as_str()


We should probably have the plugin types implement an as_str() method instead to make sure they stay up-to-date properly and completely avoid SearchPhaseKind here.

hildebrandmw · 2026-05-07T23:41:34Z

+impl CheckDeserialization for TopkPostProcessor {
+    fn check_deserialization(&mut self, _checker: &mut Checker) -> Result<(), anyhow::Error> {
+        match self {
+            TopkPostProcessor::DeterminantDiversity { power, eta } => {


Shouldn't we have one struct that validates the inputs in the lower level crate and always use that for input validation? That way, there's no risk of these ad-hoc validations ever being wrong.

hildebrandmw · 2026-05-07T23:51:28Z

+ * Licensed under the MIT license.
+ */
+
+use diskann_vector::{MathematicalValue, PureDistanceFunction, distance::InnerProduct};


A module level doc explaining what this is and how it works would be really helpful.

hildebrandmw · 2026-05-07T23:55:53Z

+}
+
+#[cfg(test)]
+mod tests {


Should we have tests that check the algorithmic behavior?

hildebrandmw · 2026-05-07T23:56:15Z

+        .collect()
+}
+
+fn post_process_greedy_orthogonalization_f32<Id: Copy>(


These two routines are very similar. Is it possible to merge them?

hildebrandmw · 2026-05-07T23:56:54Z

+    for (_, distance_to_query, v) in &candidates {
+        let scale =
+            distance_to_similarity(*distance_to_query, distance_range).powf(power) * inv_sqrt_eta;
+        let residual: Vec<f32> = v.iter().map(|&x| x * scale).collect();


Maybe use a Matrix to store the residuals? This will provide better locality in the processing loop and cut down on the number of allocations.

hildebrandmw · 2026-05-07T23:57:51Z


+mod determinant_diversity_post_process;
 pub(crate) mod postprocess;
+pub use determinant_diversity_post_process::determinant_diversity_post_process;


Since this is shared with the disk index, maybe move it up instead of down in the async_ abyss? I know diskann-providers is kind of a nightmare at the moment 😢

Mark Hildebrand and others added 12 commits April 28, 2026 14:47

Make benchmarks stateful.

dcdc377

Checkpoint.

d60ac07

Checkopint.

6cc92bd

Oh boy.

f050e3d

Merge remote-tracking branch 'origin/main' into mhildebr/benchmark-pl…

2d03d11

…ugins

Fix delimiters.

aa78836

Revert formatting.

ccf891a

Make plugins slightly more flexible.

2e060ea

Merge branch 'main' into mhildebr/benchmark-plugins

7a85df8

Add determinant-diversity plugin support on search-plugin architecture

fc8008a

Integrate determinant-diversity via disk search_with post-processor

298d1b8

Restrict determinant-diversity to async full-precision topk

fbd34fd

narendatha changed the base branch from main to mhildebr/benchmark-plugins May 4, 2026 12:41

Keep single async determinant-diversity example JSON

851174e

Base automatically changed from mhildebr/benchmark-plugins to main May 5, 2026 00:45

narendatha and others added 10 commits May 5, 2026 12:41

Use SearchPhaseKind::as_str in benchmark plugin kinds

3a6aa1a

remove serde defaults

ccfe4d7

Co-authored-by: Copilot <copilot@github.com>

minor merge fix

86883b7

hook up actual algorithm, replace placeholder.

554bc7f

Co-authored-by: Copilot <copilot@github.com>

apply mark's beautiful fix for lifetime issue

b73abc8

Co-authored-by: Copilot <copilot@github.com>

Fix CI clippy-features spherical plugin errors and apply formatting

701ce8e

narendatha and others added 4 commits May 6, 2026 16:57

Merge branch 'u/narendatha/det_div_plugins' of https://github.com/mic…

6f47ba3

…rosoft/DiskANN into u/narendatha/det_div_plugins

imrpove code coverage

8acbba2

minor fix

a48e255

Co-authored-by: Copilot <copilot@github.com>

narendatha changed the title ~~(draft) integrate determinant diversity in tokp full precision search and disk search~~ (diversity) Integrate determinant diversity in tokp full precision search and disk search May 6, 2026

narendatha marked this pull request as ready for review May 6, 2026 15:03

narendatha requested review from a team and Copilot May 6, 2026 15:03

Copilot started reviewing on behalf of narendatha May 6, 2026 15:04 View session

cargo fmt

468b5d2

hildebrandmw reviewed May 8, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

(diversity) Integrate determinant diversity in tokp full precision search and disk search#1011

(diversity) Integrate determinant diversity in tokp full precision search and disk search#1011
narendatha wants to merge 28 commits intomainfrom
u/narendatha/det_div_plugins

narendatha commented May 4, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented May 6, 2026 •

edited

Loading

Uh oh!

hildebrandmw left a comment

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

hildebrandmw May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

narendatha commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Integrate determinant-diversity into topk full-precision search and disk search

Summary

Motivation

Algorithm

Performance Characteristics

Diversity Improvement

Parameter Tuning Guide

power (default: 2.0)

eta (default: 0.01)

Recommended Settings

Changes

Core algorithm (diskann-providers)

Async graph-index path (diskann-benchmark)

Disk-index path (diskann-disk, diskann-benchmark)

Input/config

Benchmark JSON examples

Notes

Uh oh!

codecov-commenter commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

hildebrandmw left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

narendatha commented May 4, 2026 •

edited

Loading

`power` (default: 2.0)

`eta` (default: 0.01)

Core algorithm (`diskann-providers`)

Async graph-index path (`diskann-benchmark`)

Disk-index path (`diskann-disk`, `diskann-benchmark`)

codecov-commenter commented May 6, 2026 •

edited

Loading