Replace AlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches#960
Replace AlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches#960
AlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches#960Conversation
- Remove dead field `aligned_query_float` (written in `set()` but never read). - Replace `rotated_query: AlignedBoxWithSlice<f32>` with `Vec<f32>`: the downstream SIMD consumers (`populate_chunk_distances` via `diskann-vector`, `TransposedTable::process_into` via `diskann-quantization`) use `_mm256_loadu_ps` / `read_unaligned`, so the 32-byte AVX-aligned-load alignment is not load-bearing. - Rename `PQScratch::new`'s `aligned_dim` parameter to `dim`: the production caller passes `graph_header.metadata().dims` (raw dim), no padding applied. - Drop the unused `norm: f32` parameter from `PQScratch::set`. The only production caller passed `1.0_f32`, so the `query_val / norm` branch was never taken. Simplify the body to a single `copy_from_slice` after `T::as_f32` bulk conversion. The remaining three `AlignedBoxWithSlice` buffers (128-byte alignment for the L2 Adjacent Cache Line Prefetcher) are left untouched — that claim needs benchmarking before being removed.
There was a problem hiding this comment.
Pull request overview
This PR simplifies PQ query scratch handling in diskann-disk by removing unused scratch fields and alignment assumptions, and by tightening the PQScratch API to match actual production usage.
Changes:
- Remove the unused
aligned_query_floatbuffer and drop the unusednormargument fromPQScratch::set. - Replace
rotated_queryfromAlignedBoxWithSlice<f32>toVec<f32>and update callers/tests accordingly. - Rename
PQScratch::newparameter fromaligned_dimtodimand adjust call sites.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
diskann-disk/src/search/provider/disk_provider.rs |
Updates the PQScratch::set call site to match the new signature (drops norm). |
diskann-disk/src/search/pq/pq_scratch.rs |
Removes dead scratch field, switches rotated_query storage to Vec<f32>, and simplifies set() implementation and tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The refactor in the previous commit used `self.rotated_query[..dim]` as the destination for `copy_from_slice`, assuming `dim` equaled the decompressed f32 length. That holds for `T = f32/f16/i8/u8` (per-element conversion) but breaks for `T = MinMaxElement`, where `T::as_f32` decompresses each element into multiple f32s — the source length differs from `dim`. Size the destination slice by the actual f32 length returned by `as_f32`. Matches the original loop semantics (write indices 0..query.len()). Caught by test_disk_minmax_index_builder::use_sharded_build_2_true.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #960 +/- ##
==========================================
- Coverage 89.31% 89.31% -0.01%
==========================================
Files 447 447
Lines 83260 83250 -10
==========================================
- Hits 74367 74354 -13
- Misses 8893 8896 +3
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Return DimensionMismatchError (matching the function's ANNResult signature) when dim > query.len() or when the decompressed f32 vector does not fit in rotated_query. Previously both conditions would panic via slice indexing and copy_from_slice.
DiskVertexProvider::aligned_vector_buf and Cache::vectors used AlignedBoxWithSlice<T> with FP_VECTOR_MEM_ALIGN (32 bytes) as a leftover from AVX aligned loads (_mm256_load_ps). diskann-vector now uses unaligned loads (_mm256_loadu_ps), so the alignment no longer carries weight. Neither buffer participates in concurrent insert+search on the same buffer (aligned_vector_buf is per-scratch-exclusive, Cache::vectors is read-only after warmup), so the 64-byte cache-line rationale used by FastMemoryVectorProviderAsync does not apply either. Replace both with plain Vec<T> and remove the now-unused FP_VECTOR_MEM_ALIGN constant.
memory_aligned_dimension = dims.next_multiple_of(8) was paired with the now-removed FP_VECTOR_MEM_ALIGN to feed an 8-wide AVX aligned-load SIMD kernel (_mm256_load_ps, 8 f32 at a time). SIMD has since moved to _mm256_loadu_ps and the buffer-start alignment has been removed, so the padding no longer serves any hardware purpose — it only makes get_vector return a slice longer than the raw dim, which the distance kernel silently tolerates because the padding is zero. Replace memory_aligned_dimension with raw dim on DiskVertexProvider and on the Cache it feeds via disk_vertex_provider_factory. Rename the field aligned_vector_buf -> vector_buf, drop the unused public memory_aligned_dimension() accessor, and size vector_buf and Cache::vectors to max_batch_size * dim / capacity * dim respectively.
set() in diskann-diskAlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches
Bumping to 0.50.1 to propagate changes to consumers. Changes since previous bump: ## What's Changed * Add more agentic guard rails by @hildebrandmw in #871 * Cleanup `diskann-benchmark-runner` and friends. by @hildebrandmw in #865 * Use `--all-targets` for the no-default-features CI run. by @hildebrandmw in #874 * Remove unused `normalizing_util.rs` from `diskann-providers` by @Copilot in #902 * Benchmark Support for A/B Tests by @hildebrandmw in #900 * [diskann-garnet] Bump diskann-garnet to 1.0.26 by @tiagonapoli in #925 * Remove the `AdjacencyList` from `diskann-providers` by @hildebrandmw in #915 * [PQ cleanup] Part 1: Move pq_scratch, quantizer_preprocess and pq_dataset to `diskann-disk` by @arkrishn94 in #930 * Forbid Debug in diskann-benchmark by @arrayka in #914 * Remove DebugProvider by @JordanMaples in #923 * [diskann-garnet] Create workflow to publish to nuget by @tiagonapoli in #926 * Move k-means implementation from diskann-providers to diskann-disk by @Copilot in #933 * Inline minmax distance evaluations by @arkrishn94 in #935 * Use `rust-toolchain.toml` in CI by @hildebrandmw in #934 * Add a globally blocking CI gate. by @hildebrandmw in #932 * Remove `utils/math_util.rs` from `diskann-providers` by @Copilot in #921 * Bump rand from 0.9.2 to 0.9.3 by @dependabot[bot] in #945 * Remove OPQ and friends by @arkrishn94 in #947 * Migrate test_flaky_consolidate from diskann_providers to diskann by @JordanMaples in #942 * Remove GraphDataType from diskann-providers by @wuw92 in #950 * Remove unused method extract_best_l_candidates in NeighborPriorityQueue by @doliawu in #951 * Add `Debug` bounds to `VectorRepr`'s distance GATs. by @hildebrandmw in #948 * Add benchmark pipeline with Rust-native A/B validation by @YuanyuanTian-hh in #912 * Remove unnecessary `Default` bound from `Neighbor`'s `VectorIdType` by @doliawu in #956 * Replace `AlignedBoxWithSlice` with plain `Vec` / `Matrix` where alignment is unused by @wuw92 in #955 * [minmax] 8-bit benchmark by @arkrishn94 in #959 * Add `MultiInsertStrategy` implementations for `BfTreeProvider` by @hildebrandmw in #949 * Replace `AlignedBoxWithSlice` with `Vec` in PQScratch and disk fp vector caches by @wuw92 in #960 * Adding unit tests for paged_search by @JordanMaples in #962 * Remove AlignedBoxWithSlice wrapper and add alias to Poly<[T], AlignedAllocator> by @JordanMaples in #965 * Remove synthetic/structured data generation from diskann-providers by @JordanMaples in #963 * added tests and some baselines for range_search by @JordanMaples in #961 ## New Contributors * @JordanMaples made their first contribution in #923 * @wuw92 made their first contribution in #950 * @doliawu made their first contribution in #951 * @YuanyuanTian-hh made their first contribution in #912 **Full Changelog**: v0.50.0...v0.50.1
@wuw92 - Note that this tolerance of |
Hi @hildebrandmw Thanks for flagging — followed up in #1044. |
Summary
Replaces
AlignedBoxWithSlicewith plainVecfor three disk-side buffers whose consumers don't require alignment.1.
PQScratch.rotated_query(diskann-disk/src/search/pq/pq_scratch.rs)AlignedBoxWithSlice<f32>withVec<f32>. Consumers (populate_chunk_distancesviadiskann-vectorSIMD,TransposedTable::process_intoviadiskann-quantizationscalar PQ table lookups) handle unaligned / arbitrary-length data, so the alignment is not load-bearing.PQScratch::setnow toleratesquery.len() >= dim, following the inmemTableL2::populateconvention, instead of requiring strict equality.2.
DiskVertexProvider::aligned_vector_bufandCache::vectors(FP_VECTOR_MEM_ALIGN)AlignedBoxWithSlice<T>(32-byteFP_VECTOR_MEM_ALIGN) with plainVec<T>. The fp distance kernels reached viaget_vectorgo throughdiskann-vectorSIMD, which handles unaligned data, so the alignment is not load-bearing.FastMemoryVectorProviderAsync-style 64-byte cache-line alignment:aligned_vector_bufis per-scratch-exclusive;Cache::vectorsis built once via BFS then shared read-only throughArc<Cache>.memory_aligned_dimension = dims.next_multiple_of(8)padding — it was paired with the 32-byte alignment to feed 8-wide AVX tiles and is no longer justified.DiskVertexProviderandCachenow size slots to rawdim;get_vectorreturns a slice of lengthdiminstead of a padded length.