[v26.1.x] Port chunked_vector, chunked_hash_map, and use it in prometheus-impl.hh #288
Open
WillemKauf wants to merge 5 commits into
Pull request overview
This PR ports Redpanda’s chunked_vector and chunked_hash_map into Seastar, wires the new hash map into the Prometheus label aggregation path, and updates the build/dependency plumbing (CMake, cooking ingredients, pkg-config, and install scripts) to support the new containers.
Changes:
- Add new core containers: include/seastar/core/chunked_vector.hh and include/seastar/core/chunked_hash_map.hh.
- Switch src/core/prometheus-impl.hh label aggregation map from std::unordered_map to chunked_hash_map.
- Add unit tests for both containers and add Abseil + unordered_dense as dependencies (CMake/cooking/pkg-config/install scripts).
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| tests/unit/CMakeLists.txt | Registers new unit tests for chunked containers. |
| tests/unit/chunked_vector_test.cc | Adds extensive iterator/behavior tests for chunked_vector. |
| tests/unit/chunked_hash_map_test.cc | Adds basic compile/behavior tests for chunked_hash_map and from_range. |
| src/core/prometheus-impl.hh | Uses chunked_hash_map for metric aggregation by labels. |
| pkgconfig/seastar.pc.in | Adds cflags for unordered_dense and Requires for Abseil hash. |
| install-dependencies.sh | Adds distro packages for Abseil. |
| include/seastar/core/chunked_vector.hh | Introduces chunked_vector implementation and iterators. |
| include/seastar/core/chunked_hash_map.hh | Introduces chunked_hash_map/set, helpers, and fmt integration. |
| cooking_recipe.cmake | Adds cooking ingredients for unordered_dense and Abseil. |
| CMakeLists.txt | Installs new headers and links against unordered_dense and Abseil hash target. |
| cmake/SeastarDependencies.cmake | Finds unordered_dense and Abseil via CMake. |
    }
    ~chunked_vector() noexcept = default;

    chunked_vector copy() const noexcept { return *this; }
Comment on lines +297 to +317
    void reserve(size_t new_cap) {
        static constexpr size_t elems_per_frag = calc_elems_per_frag();
        if (new_cap > _capacity) {
            if (_frags.empty()) {
                auto& frag = _frags.emplace_back();
                frag.reserve(std::min(elems_per_frag, new_cap));
                _capacity = frag.capacity();
            } else if (_frags.size() == 1) {
                auto& frag = _frags.front();
                frag.reserve(std::min(elems_per_frag, new_cap));
                _capacity = frag.capacity();
            }
            // We only reserve the first fragment as all fragments after the
            // first are allocated at the maximum size, so we don't save
            // anything in terms of reallocs after fully allocating the
            // first fragment. In addition, due to cache locality, it's
            // better to delay the allocations of those other fragments
            // until they're going to be used.
        }
        update_generation();
    }
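The clamp in the quoted reserve() can be illustrated with a little arithmetic. This is a minimal sketch, not code from the PR: both helper names are hypothetical, and the fragment size of 1024 elements used in the usage note is an assumed value, not the one calc_elems_per_frag() returns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the std::min clamp in the quoted reserve():
// a reservation never asks the first fragment for more than one fragment's
// worth of elements.
inline std::size_t first_fragment_reservation(
  std::size_t elems_per_frag, std::size_t new_cap) {
    return std::min(elems_per_frag, new_cap);
}

// Hypothetical helper: number of fragments that n elements span once every
// fragment holds elems_per_frag elements.
inline std::size_t fragments_needed(std::size_t elems_per_frag, std::size_t n) {
    assert(elems_per_frag > 0);
    return (n + elems_per_frag - 1) / elems_per_frag;
}
```

With an assumed 1024 elements per fragment, a request for 10 elements reserves 10, while a request for 1,000,000 reserves only 1024; the remaining fragments are allocated lazily as elements arrive.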
Comment on lines +397 to +407
    iter& operator+=(ssize_t n) {
        check_generation();
        _index += n;
        return *this;
    }

    iter& operator-=(ssize_t n) {
        check_generation();
        _index -= n;
        return *this;
    }
Comment on lines +469 to +471
    friend ssize_t operator-(const iter& a, const iter& b) {
        return a._index - b._index;
    }
Comment on lines +392 to +395
    pointer operator->() const {
        check_generation();
        return &_vec->operator[](_index);
    }
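The check_generation() calls in the iterator hunks suggest a generation-counter scheme for catching stale iterators. The following is a minimal sketch of that general technique, not the PR's implementation: the class and member names are hypothetical, and throwing an exception is an assumption (the ported header may assert instead, or check only in debug builds).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Illustrative container whose cursors remember the generation they were
// created in and refuse to operate after the container has been modified.
class checked_ints {
public:
    class cursor {
    public:
        int& operator*() const {
            check_generation();
            assert(_index >= 0);
            assert(static_cast<std::size_t>(_index) < _owner->_data.size());
            return _owner->_data[static_cast<std::size_t>(_index)];
        }

        cursor& operator+=(std::ptrdiff_t n) {
            check_generation();
            _index += n;
            return *this;
        }

    private:
        friend class checked_ints;

        cursor(checked_ints* owner, std::ptrdiff_t index)
          : _owner(owner)
          , _index(index)
          , _generation(owner->_generation) {}

        // Compare the generation captured at creation with the current one.
        void check_generation() const {
            if (_generation != _owner->_generation) {
                throw std::logic_error("cursor used after modification");
            }
        }

        checked_ints* _owner;
        std::ptrdiff_t _index;
        std::size_t _generation;
    };

    void push_back(int v) {
        _data.push_back(v);
        ++_generation; // every mutation invalidates outstanding cursors
    }

    cursor begin() { return cursor(this, 0); }

private:
    std::vector<int> _data;
    std::size_t _generation = 0;
};
```

In this sketch a cursor obtained before a push_back throws on its next dereference or advance, which turns a silent use-after-invalidation into an immediate, diagnosable failure.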
Comment on lines +271 to +273
    // Calling shrink to fix then modifying the container could result
    // in allocations that overshoot our max_frag_bytes, except when
    // we're managing the dynamic size of the first fragment.
Comment on lines +87 to +90
    if (f.capacity() > std::decay_t<decltype(v)>::max_frag_bytes()) {
        return fmt::format(
          "fragment {} capacity over max_frag_bytes ({})", i, calc_cap);
    }
Comment on lines +135 to +137
    /**
     * Proxy that applies a consistency check before deference
     */
Comment on lines +145 to +147
    return m.bucket_count()
             * sizeof(typename chunked_hash_map<K, V>::bucket_type)
           + m.values().capacity() * sizeof(m.values()[0]);
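The quoted expression estimates memory as bucket storage plus value storage. A minimal sketch of the same arithmetic on a stand-in layout follows; the struct names and the two 32-bit bucket fields are assumptions for illustration, not the types defined in chunked_hash_map.hh.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical bucket entry: a small fixed-size record that points into a
// separate, densely packed array of key/value pairs.
struct bucket_sketch {
    std::uint32_t dist_and_fingerprint;
    std::uint32_t value_index;
};

// Hypothetical stand-in for a dense hash map's two storage areas.
template <typename K, typename V>
struct dense_map_layout_sketch {
    std::vector<bucket_sketch> buckets;
    std::vector<std::pair<K, V>> values;
};

// Same shape as the quoted expression: bucket count times bucket size, plus
// value capacity times value size. The value size is named by type here;
// sizeof on values()[0] is unevaluated, so both forms compute the same number.
template <typename K, typename V>
std::size_t approx_memory_usage(const dense_map_layout_sketch<K, V>& m) {
    return m.buckets.size() * sizeof(bucket_sketch)
           + m.values.capacity() * sizeof(std::pair<K, V>);
}
```

The estimate counts reserved capacity rather than live elements, so it reflects what the allocator has handed out for values, not how many entries are currently stored.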
Force-pushed from 7219787 to 744ae73
Wire ankerl::unordered_dense into the build as a required dependency: find_package in SeastarDependencies, link unordered_dense::unordered_dense PUBLIC, a cooking recipe pinned to the commit Redpanda consumes (v4.4.0), and propagate its include flags to consumers via the pkg-config file. This is a prerequisite for the upcoming chunked_hash_map container.
Wire abseil into the build as a required dependency for chunked_hash_map's absl::Hash support: find_package(absl CONFIG), link absl::hash PUBLIC, a cooking recipe pinned to the same LTS release Redpanda consumes (20250814.1), the absl_hash pkg-config requirement for consumers, and the abseil dev packages for the distros that reliably carry them. This is a prerequisite for the upcoming chunked_hash_map container.
Force-pushed from 744ae73 to cd64153
Port chunked_vector from Redpanda: a random-access vector whose storage is split across fixed-size fragments instead of one contiguous block, so it grows without large reallocations and keeps element addresses stable across growth. Adapted to Seastar conventions (seastar namespace, SEASTAR_ASSERT, .hh header) with a Boost.Test port of the test suite.
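To make the address-stability claim concrete, here is a minimal sketch of a fragmented vector. It is not the ported chunked_vector: the class name is hypothetical, the fragment size of 4 is arbitrary, and it assumes every fragment is reserved at full size up front (the quoted reserve() hunk indicates the real first fragment grows dynamically).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative stand-in: storage is a list of fixed-capacity fragments.
template <typename T, std::size_t ElemsPerFrag = 4>
class fragmented_vector_sketch {
public:
    void push_back(const T& value) {
        if (_frags.empty() || _frags.back().size() == ElemsPerFrag) {
            // Growing the outer vector only moves the inner vector handles;
            // the element buffers they own stay where they are.
            _frags.emplace_back();
            _frags.back().reserve(ElemsPerFrag);
        }
        _frags.back().push_back(value);
        ++_size;
    }

    T& operator[](std::size_t i) {
        assert(i < _size);
        return _frags[i / ElemsPerFrag][i % ElemsPerFrag];
    }

    std::size_t size() const { return _size; }
    std::size_t fragment_count() const { return _frags.size(); }

private:
    std::vector<std::vector<T>> _frags;
    std::size_t _size = 0;
};

// Returns true if element 0 keeps its address while the container grows
// across many fragments.
inline bool first_element_address_is_stable() {
    fragmented_vector_sketch<int> v;
    v.push_back(42);
    const int* before = &v[0];
    for (int i = 1; i < 100; ++i) {
        v.push_back(i);
    }
    return before == &v[0] && v[0] == 42 && v.size() == 100;
}
```

Random access costs one division and one modulo to locate the fragment and offset, and no single allocation ever exceeds one fragment, which is the property the commit message is after.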
Port chunked_hash_map/chunked_hash_set from Redpanda: ankerl's segmented hash map/set backed by chunked_vector storage, suited to maps that grow large (scaling with partitions or topics) without large contiguous allocations. Hashing dispatches to absl::Hash when a type provides AbslHashValue, otherwise to unordered_dense's hash. Adapted to the seastar namespace, with a Boost.Test port of the test suite.
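The hash dispatch described above can be sketched with a detection trait. This is an illustration of the technique only: sketch_hash_value is a hypothetical customization point standing in for AbslHashValue, and std::hash stands in for unordered_dense's hash, so the block builds without either dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>
#include <utility>

// Detects whether a free function sketch_hash_value(const T&) is reachable
// through argument-dependent lookup. (Hypothetical hook name.)
template <typename T, typename = void>
struct has_custom_hash : std::false_type {};

template <typename T>
struct has_custom_hash<
  T,
  std::void_t<decltype(sketch_hash_value(std::declval<const T&>()))>>
  : std::true_type {};

// Hasher that prefers the type's own hook and otherwise falls back to
// std::hash (the fallback library is an assumption of this sketch).
template <typename T>
struct dispatching_hash {
    std::size_t operator()(const T& v) const {
        if constexpr (has_custom_hash<T>::value) {
            return sketch_hash_value(v);
        } else {
            return std::hash<T>{}(v);
        }
    }
};

namespace demo {
// A type that opts in by defining the hook as a hidden friend.
struct topic_id {
    int value;
    friend std::size_t sketch_hash_value(const topic_id& t) {
        return static_cast<std::size_t>(t.value) * 31u + 7u;
    }
};
} // namespace demo

// Exercises both paths of the trait.
inline void dispatch_self_check() {
    assert(has_custom_hash<demo::topic_id>::value);
    assert(!has_custom_hash<std::string>::value);
}
```

Selecting the hash at compile time means types that already define a hook need no extra specialization to be used as keys, while plain types keep working through the fallback.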
Replace the std::unordered_map backing metric_aggregate_by_labels with chunked_hash_map so per-label aggregation storage grows without large contiguous allocations when exporting many metric series.
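For readers unfamiliar with per-label aggregation, the sketch below shows the general idea under loud assumptions: the types and the function are hypothetical and much simpler than metric_aggregate_by_labels, and std::map stands in for the hash map so the block needs no extra dependencies.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified shapes; not the types in prometheus-impl.hh.
using labels = std::map<std::string, std::string>;

struct sample {
    labels lbls;
    double value;
};

// Sums samples that become identical once the named labels are dropped.
// One map entry exists per distinct remaining label set, so the map grows
// with the number of exported series.
inline std::map<labels, double> aggregate_by_labels(
  const std::vector<sample>& samples,
  const std::vector<std::string>& labels_to_drop) {
    std::map<labels, double> totals;
    for (const auto& s : samples) {
        labels key = s.lbls;
        for (const auto& name : labels_to_drop) {
            key.erase(name);
        }
        totals[key] += s.value;
    }
    return totals;
}
```

Because the number of entries scales with the number of series, a backing store that avoids one large contiguous allocation is the motivation stated in the commit message.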
Force-pushed from cd64153 to 4205322
Author
@StephanDollberg are you happy with how these deps are fetched? vendoring was an option (for unordered_dense at least) but the CMake changes feel fine
- ankerl::unordered_dense dep
- absl dep
- chunked_vector.h
- chunked_hash_map.h
- chunked_hash_map in prometheus-impl.hh to avoid oversized allocations in the metrics aggregation path

For v26.2.x patch, see: #289