Remove AlignedBoxWithSlice wrapper and add alias to Poly<[T], AlignedAllocator>#965
Conversation
…907) Introduce AlignedSlice<T> type alias and aligned_slice() constructor in diskann-quantization::alloc, replacing the AlignedBoxWithSlice wrapper struct from diskann-providers. This eliminates an unnecessary abstraction layer and moves aligned allocation to its proper tier. - Add AlignedSlice<T> = Poly<[T], AlignedAllocator> type alias - Add aligned_slice<T>(capacity, alignment) constructor - Migrate all 9 consumer files in diskann-disk - Replace split_into_nonoverlapping_mut_slices with chunks_mut - Replace as_slice/as_mut_slice with Deref/DerefMut - Delete aligned_allocator.rs from diskann-providers - Add 5 tests for aligned_slice Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
This replaces #908 |
There was a problem hiding this comment.
Pull request overview
This PR removes the AlignedBoxWithSlice wrapper from diskann-providers and standardizes aligned slice allocation in diskann-quantization::alloc via a new AlignedSlice<T> alias and aligned_slice() constructor, updating disk-index readers/search code to use the new API.
Changes:
- Add
AlignedSlice<T> = Poly<[T], AlignedAllocator>andaligned_slice<T>(capacity, alignment: PowerOfTwo)indiskann-quantization. - Migrate aligned buffer usage across
diskann-disk(tests/benches/search) to the new constructor and to slice deref semantics. - Remove
diskann-providers::common::AlignedBoxWithSliceand replace custom splitting withchunks_mut.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann-quantization/src/alloc/mod.rs | Exposes new aligned-slice module + re-exports AlignedSlice/aligned_slice. |
| diskann-quantization/src/alloc/aligned_slice.rs | Introduces AlignedSlice<T> alias, constructor, and unit tests. |
| diskann-providers/src/common/mod.rs | Drops AlignedBoxWithSlice re-export. |
| diskann-providers/src/common/aligned_allocator.rs | Deletes the wrapper implementation (and its tests). |
| diskann-disk/src/utils/aligned_file_reader/windows_aligned_file_reader.rs | Updates tests to allocate aligned buffers via aligned_slice and split via chunks_mut. |
| diskann-disk/src/utils/aligned_file_reader/storage_provider_aligned_file_reader.rs | Same migration as above for storage-provider reader tests. |
| diskann-disk/src/utils/aligned_file_reader/linux_aligned_file_reader.rs | Same migration as above for Linux reader tests. |
| diskann-disk/src/utils/aligned_file_reader/aligned_read.rs | Updates/shortens doc comment reflecting removal of wrapper type. |
| diskann-disk/src/search/provider/disk_vertex_provider_factory.rs | Uses aligned_slice buffer for header reads and relies on deref slicing. |
| diskann-disk/src/search/provider/disk_sector_graph.rs | Switches internal storage to AlignedSlice<u8> and replaces custom splitting with chunks_mut. |
| diskann-disk/src/search/provider/disk_provider.rs | Removes .as_slice() calls; relies on deref to slice for .to_vec(). |
| diskann-disk/src/search/pq/quantizer_preprocess.rs | Replaces .as_mut_slice() with deref-based mutable slicing/borrows. |
| diskann-disk/src/search/pq/pq_scratch.rs | Replaces wrapper fields with AlignedSlice allocations and updates alignment tests. |
| diskann-disk/benches/benchmarks_iai/aligned_file_reader_bench_iai.rs | Updates benchmark to use aligned_slice + chunks_mut. |
| diskann-disk/benches/benchmarks/aligned_file_reader_bench.rs | Updates benchmark to use aligned_slice + chunks_mut. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #965 +/- ##
==========================================
- Coverage 90.47% 89.31% -1.17%
==========================================
Files 448 448
Lines 83447 83329 -118
==========================================
- Hits 75497 74422 -1075
- Misses 7950 8907 +957
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Poly::from_iter special-cases len == 0 and returns an empty slice rather than an AllocatorError. Update the test to assert Ok + empty slice instead of discarding the result. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Return ANNError instead of panicking on division by zero if num_sectors_per_node or block_size is zero. The constructor currently prevents this, but this guards against corrupt metadata defensively. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bumping to 0.50.1 to propagate changes to consumers. Changes since previous bump: ## What's Changed * Add more agentic guard rails by @hildebrandmw in #871 * Cleanup `diskann-benchmark-runner` and friends. by @hildebrandmw in #865 * Use `--all-targets` for the no-default-features CI run. by @hildebrandmw in #874 * Remove unused `normalizing_util.rs` from `diskann-providers` by @Copilot in #902 * Benchmark Support for A/B Tests by @hildebrandmw in #900 * [diskann-garnet] Bump diskann-garnet to 1.0.26 by @tiagonapoli in #925 * Remove the `AdjacencyList` from `diskann-providers` by @hildebrandmw in #915 * [PQ cleanup] Part 1: Move pq_scratch, quantizer_preprocess and pq_dataset to `diskann-disk` by @arkrishn94 in #930 * Forbid Debug in diskann-benchmark by @arrayka in #914 * Remove DebugProvider by @JordanMaples in #923 * [diskann-garnet] Create workflow to publish to nuget by @tiagonapoli in #926 * Move k-means implementation from diskann-providers to diskann-disk by @Copilot in #933 * Inline minmax distance evaluations by @arkrishn94 in #935 * Use `rust-toolchain.toml` in CI by @hildebrandmw in #934 * Add a globally blocking CI gate. by @hildebrandmw in #932 * Remove `utils/math_util.rs` from `diskann-providers` by @Copilot in #921 * Bump rand from 0.9.2 to 0.9.3 by @dependabot[bot] in #945 * Remove OPQ and friends by @arkrishn94 in #947 * Migrate test_flaky_consolidate from diskann_providers to diskann by @JordanMaples in #942 * Remove GraphDataType from diskann-providers by @wuw92 in #950 * Remove unused method extract_best_l_candidates in NeighborPriorityQueue by @doliawu in #951 * Add `Debug` bounds to `VectorRepr`'s distance GATs. by @hildebrandmw in #948 * Add benchmark pipeline with Rust-native A/B validation by @YuanyuanTian-hh in #912 * Remove unnecessary `Default` bound from `Neighbor`'s `VectorIdType` by @doliawu in #956 * Replace `AlignedBoxWithSlice` with plain `Vec` / `Matrix` where alignment is unused by @wuw92 in #955 * [minmax] 8-bit benchmark by @arkrishn94 in #959 * Add `MultiInsertStrategy` implementations for `BfTreeProvider` by @hildebrandmw in #949 * Replace `AlignedBoxWithSlice` with `Vec` in PQScratch and disk fp vector caches by @wuw92 in #960 * Adding unit tests for paged_search by @JordanMaples in #962 * Remove AlignedBoxWithSlice wrapper and add alias to Poly<[T], AlignedAllocator> by @JordanMaples in #965 * Remove synthetic/structured data generation from diskann-providers by @JordanMaples in #963 * added tests and some baselines for range_search by @JordanMaples in #961 ## New Contributors * @JordanMaples made their first contribution in #923 * @wuw92 made their first contribution in #950 * @doliawu made their first contribution in #951 * @YuanyuanTian-hh made their first contribution in #912 **Full Changelog**: v0.50.0...v0.50.1
hildebrandmw
left a comment
There was a problem hiding this comment.
Thanks for the cleanup @JordanMaples! This is a step in the right direction, but I have some concrete requests for a follow-up. While this PR removes AlignedBoxWithSlice to the letter, it doesn't remove it in spirit as the AlignedSlice alias and aligned_slice constructor still exist. We should strive for construction of concrete types to not require a helper free function.
Instead, I would like call sites to explicitly compose the container and allocator instead of using the thin helper. We can make this more ergonomic by introducing const aliases for all valid PowerOfTwo values:
impl PowerOfTwo {
// Powers
const P0: Self = unsafe { Self::new_unchecked(NonZeroUsize::new_unchecked(1)) };
const P1: Self = unsafe { Self::new_unchecked(NonZeroUsize::new_unchecked(2)) };
// ...
const P63: Self = unsafe { Self::new_unchecked(NonZeroUsize::new_unchecked(1 << 63)) };
// Values
const V1: Self = Self::P0;
const V2: Self = Self::P1;
// ...
const V9223372036854775808: Self = Self::P63;
}Then in situations where the alignment is known and static, we can replace
// Before
let buf = aligned_slice::<u8>(sector_len, PowerOfTwo::new(512)?)?;
// After
let buf = Poly::broadcast(
0u8,
sector_len,
AlignedAllocator::new(PowerOfTwo::V512)
)?;We can also add const aliases for AlignedAllocator for common alignments to simplify even further:
let buf = Poly::broadcast(0u8, sector_len, AlignedAllocator::A512)?;This way, both the initialization of the allocator and the data are visible at the callsite.
My concrete ask is this:
- Add
PowerOfTwo/AlignedAllocatorconst aliases. - Delete
aligned_slice.rsand update everything to usePolydirectly.
| let mut mem_slices = aligned_mem | ||
| .split_into_nonoverlapping_mut_slices(0..aligned_mem.len(), read_length) | ||
| .unwrap(); | ||
| let mut mem_slices: Vec<&mut [u8]> = aligned_mem.chunks_mut(read_length).collect(); |
There was a problem hiding this comment.
Note: we probably never need to ever materializes the slices as a Vec<&mut [u8]>. In the context of the benchmark, a chunks_mut iterator would have worked just fine. In other situations, par_chunks directly would work as well.
I synced up with @hildebrandmw after he returned from vacation and learned that I misunderstood the brief of #907 with my #965 pr. Rather than removing the struct and replace it with a type-alias for slightly less overhead, the ask was to just replace all calls. So, this follow-up PR does just that and removes AlignedSlice & aligned_slice and replaces all sites with Poly::broadcast calls. This also introduces PowerOfTwo consts for 0..64 P and value mappings and AlignedAllocator consts for 4...4096
Closes #907
This change introduces
AlignedSlice<T>type alias and thealigned_slice()constructor indiskann-quantization::alloc, replacing theAlignedBoxWithSlicewrapper struct fromdiskann-providers. This eliminates an unnecessary abstraction layer and moves aligned allocation to its proper tier.AlignedSlice<T> = Poly<[T], AlignedAllocator>type aliasaligned_slice<T>(capacity, alignment)constructordiskann-disksplit_into_nonoverlapping_mut_sliceswithchunks_mutas_slice/as_mut_slicewith Deref/DerefMutdiskann-providersaligned_slice