[v26.1.x] Port chunked_vector, chunked_hash_map, and use it in prometheus-impl.hh #288

Open
WillemKauf wants to merge 5 commits into redpanda-data:v26.1.x from WillemKauf:chunked-hash-map-v26.1.x

WillemKauf commented Jun 23, 2026

  • adds ankerl::unordered_dense dep
  • adds absl dep
  • copy paste chunked_vector.h
  • copy paste chunked_hash_map.h
  • use chunked_hash_map in prometheus-impl.hh to avoid oversized allocations in the metrics aggregation path

For v26.2.x patch, see: #289

Copilot AI review requested due to automatic review settings June 23, 2026 15:30

Copilot AI left a comment

Pull request overview

This PR ports Redpanda’s chunked_vector and chunked_hash_map into Seastar, wires the new hash map into the Prometheus label aggregation path, and updates the build/dependency plumbing (CMake, cooking ingredients, pkg-config, and install scripts) to support the new containers.

Changes:

  • Add new core containers: include/seastar/core/chunked_vector.hh and include/seastar/core/chunked_hash_map.hh.
  • Switch src/core/prometheus-impl.hh label aggregation map from std::unordered_map to chunked_hash_map.
  • Add unit tests for both containers and add Abseil + unordered_dense as dependencies (CMake/cooking/pkg-config/install scripts).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| tests/unit/CMakeLists.txt | Registers new unit tests for chunked containers. |
| tests/unit/chunked_vector_test.cc | Adds extensive iterator/behavior tests for chunked_vector. |
| tests/unit/chunked_hash_map_test.cc | Adds basic compile/behavior tests for chunked_hash_map and from_range. |
| src/core/prometheus-impl.hh | Uses chunked_hash_map for metric aggregation by labels. |
| pkgconfig/seastar.pc.in | Adds cflags for unordered_dense and Requires for Abseil hash. |
| install-dependencies.sh | Adds distro packages for Abseil. |
| include/seastar/core/chunked_vector.hh | Introduces chunked_vector implementation and iterators. |
| include/seastar/core/chunked_hash_map.hh | Introduces chunked_hash_map/set, helpers, and fmt integration. |
| cooking_recipe.cmake | Adds cooking ingredients for unordered_dense and Abseil. |
| CMakeLists.txt | Installs new headers and links against unordered_dense and Abseil hash target. |
| cmake/SeastarDependencies.cmake | Finds unordered_dense and Abseil via CMake. |

    }
    ~chunked_vector() noexcept = default;

    chunked_vector copy() const noexcept { return *this; }
Comment on lines +297 to +317
    void reserve(size_t new_cap) {
        static constexpr size_t elems_per_frag = calc_elems_per_frag();
        if (new_cap > _capacity) {
            if (_frags.empty()) {
                auto& frag = _frags.emplace_back();
                frag.reserve(std::min(elems_per_frag, new_cap));
                _capacity = frag.capacity();
            } else if (_frags.size() == 1) {
                auto& frag = _frags.front();
                frag.reserve(std::min(elems_per_frag, new_cap));
                _capacity = frag.capacity();
            }
            // We only reserve the first fragment as all fragments after the
            // first are allocated at the maximum size, so we don't save
            // anything in terms of reallocs after fully allocating the
            // first fragment. In addition, due to cache locality, it's
            // better to delay the allocations of those other fragments
            // until they're going to be used.
        }
        update_generation();
    }
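The comment inside `reserve()` describes a policy in which only the first fragment is ever pre-sized, and every later fragment is allocated at full size when it is first needed. The following is a minimal sketch of that policy, not the PR's implementation: `toy_chunked_vector` and its fixed `ElemsPerFrag` of 4 are hypothetical names chosen for illustration, and the sketch tracks a logical capacity itself instead of reading it back from `frag.capacity()`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical toy container, not the PR's chunked_vector. It mirrors only
// the policy from the comment above: reserve() sizes the first fragment, and
// every later fragment is allocated at full size when it is first needed.
template <typename T, std::size_t ElemsPerFrag = 4>
class toy_chunked_vector {
public:
    void reserve(std::size_t new_cap) {
        if (new_cap <= _capacity || _frags.size() > 1) {
            return; // nothing to reserve once a second fragment exists
        }
        if (_frags.empty()) {
            _frags.emplace_back();
        }
        // Cap the request at one fragment's worth of elements.
        _capacity = std::min(ElemsPerFrag, new_cap);
        _frags.front().reserve(_capacity);
    }

    void push_back(const T& v) {
        if (_size == _capacity) {
            if (_capacity < ElemsPerFrag) {
                // Still growing the first fragment: double it, up to the cap.
                reserve(std::max<std::size_t>(1, _capacity * 2));
            } else {
                // The first fragment is full-sized: append a new fragment.
                _frags.emplace_back();
                _frags.back().reserve(ElemsPerFrag);
                _capacity += ElemsPerFrag;
            }
        }
        _frags.back().push_back(v);
        ++_size;
    }

    const T& operator[](std::size_t i) const {
        return _frags[i / ElemsPerFrag][i % ElemsPerFrag];
    }
    std::size_t size() const { return _size; }
    std::size_t capacity() const { return _capacity; }
    std::size_t fragment_count() const { return _frags.size(); }

private:
    std::vector<std::vector<T>> _frags;
    std::size_t _size = 0;
    std::size_t _capacity = 0;
};
```

In this sketch, `reserve(100)` on an empty container yields a capacity of 4 (one fragment), and a later `reserve(1000)` is a no-op once more than one fragment exists, which matches the behaviour the comment argues for.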
Comment on lines +397 to +407
    iter& operator+=(ssize_t n) {
        check_generation();
        _index += n;
        return *this;
    }

    iter& operator-=(ssize_t n) {
        check_generation();
        _index -= n;
        return *this;
    }
Comment on lines +469 to +471
    friend ssize_t operator-(const iter& a, const iter& b) {
        return a._index - b._index;
    }
Comment on lines +392 to +395
    pointer operator->() const {
        check_generation();
        return &_vec->operator[](_index);
    }
Comment on lines +271 to +273
// Calling shrink to fix then modifying the container could result
// in allocations that overshoot our max_frag_bytes, except when
// we're managing the dynamic size of the first fragment.
Comment on lines +87 to +90
    if (f.capacity() > std::decay_t<decltype(v)>::max_frag_bytes()) {
        return fmt::format(
          "fragment {} capacity over max_frag_bytes ({})", i, calc_cap);
    }
Comment on lines +135 to +137
/**
* Proxy that applies a consistency check before deference
*/
Comment on lines +145 to +147
    return m.bucket_count()
             * sizeof(typename chunked_hash_map<K, V>::bucket_type)
           + m.values().capacity() * sizeof(m.values()[0]);
WillemKauf force-pushed the chunked-hash-map-v26.1.x branch from 7219787 to 744ae73 on June 23, 2026 15:53
Wire ankerl::unordered_dense into the build as a required dependency:
find_package in SeastarDependencies, link unordered_dense::unordered_dense
PUBLIC, a cooking recipe pinned to the commit Redpanda consumes (v4.4.0),
and propagate its include flags to consumers via the pkg-config file.

This is a prerequisite for the upcoming chunked_hash_map container.

Wire abseil into the build as a required dependency for chunked_hash_map's
absl::Hash support: find_package(absl CONFIG), link absl::hash PUBLIC, a
cooking recipe pinned to the same LTS release Redpanda consumes
(20250814.1), the absl_hash pkg-config requirement for consumers, and
the abseil dev packages for the distros that reliably carry them.

This is a prerequisite for the upcoming chunked_hash_map container.
WillemKauf force-pushed the chunked-hash-map-v26.1.x branch from 744ae73 to cd64153 on June 23, 2026 16:15
Port chunked_vector from Redpanda: a random-access vector whose storage
is split across fixed-size fragments instead of one contiguous block, so
it grows without large reallocations and keeps element addresses stable
across growth. Adapted to Seastar conventions (seastar namespace,
SEASTAR_ASSERT, .hh header) with a Boost.Test port of the test suite.
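
The address-stability property described above can be illustrated with a small sketch. This is a simplification under stated assumptions, not the ported container: `toy_fragmented_ints` and the `elems_per_frag` value of 4 are hypothetical, and every fragment here is reserved at full size up front, whereas the PR's `chunked_vector` manages the size of its first fragment dynamically.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fragment size for the sketch; the real container derives its
// fragment size from a byte budget and the element type.
constexpr std::size_t elems_per_frag = 4;

// Toy fragmented storage: a list of fixed-capacity fragments.
struct toy_fragmented_ints {
    std::vector<std::vector<int>> frags;
    std::size_t count = 0;

    // Appends a value and returns a reference to the stored element.
    int& push_back(int v) {
        if (count % elems_per_frag == 0) {
            // Every existing fragment is full: start a new one. Reserving
            // the full fragment means pushes into it never reallocate.
            frags.emplace_back();
            frags.back().reserve(elems_per_frag);
        }
        frags.back().push_back(v);
        ++count;
        return frags.back().back();
    }

    // Random access: pick the fragment, then the slot inside it.
    int& operator[](std::size_t i) {
        return frags[i / elems_per_frag][i % elems_per_frag];
    }
};
```

Growing the outer list only moves the fragment handles, not the element buffers they own, so a pointer to element 0 taken after the first `push_back` still refers to the same element after many more insertions.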
Port chunked_hash_map/chunked_hash_set from Redpanda: ankerl's segmented
hash map/set backed by chunked_vector storage, suited to maps that grow
large (scaling with partitions or topics) without large contiguous
allocations. Hashing dispatches to absl::Hash when a type provides
AbslHashValue, otherwise to unordered_dense's hash. Adapted to the
seastar namespace, with a Boost.Test port of the test suite.
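
The hash dispatch described above can be sketched with a detection trait. Everything in this block is illustrative rather than taken from the PR: `toy_hash_state` is a hypothetical stand-in for Abseil's hash state, `std::hash` stands in for unordered_dense's hash as the fallback, and `has_absl_hash_value`, `dispatch_hash`, and `label_key` are made-up names.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a hash state object; it mixes one value at a time.
struct toy_hash_state {
    std::size_t value = 0;

    template <typename T>
    static toy_hash_state combine(toy_hash_state h, const T& v) {
        h.value ^= std::hash<T>{}(v) + 0x9e3779b9 + (h.value << 6)
                   + (h.value >> 2);
        return h;
    }
};

// Detects whether an AbslHashValue overload is reachable for T through
// argument-dependent lookup.
template <typename T, typename = void>
struct has_absl_hash_value : std::false_type {};

template <typename T>
struct has_absl_hash_value<
  T,
  std::void_t<decltype(AbslHashValue(
    std::declval<toy_hash_state>(), std::declval<const T&>()))>>
  : std::true_type {};

// Uses the type's AbslHashValue hook when present, otherwise falls back to
// std::hash (standing in for the library hash in this sketch).
template <typename T>
std::size_t dispatch_hash(const T& v) {
    if constexpr (has_absl_hash_value<T>::value) {
        return AbslHashValue(toy_hash_state{}, v).value;
    } else {
        return std::hash<T>{}(v);
    }
}

// Example key type that opts in through a friend AbslHashValue overload.
struct label_key {
    std::string name;
    int shard;

    template <typename H>
    friend H AbslHashValue(H h, const label_key& k) {
        h = H::combine(std::move(h), k.name);
        return H::combine(std::move(h), k.shard);
    }
};
```

With this in place, `dispatch_hash(label_key{"cpu", 1})` goes through the friend overload, while `dispatch_hash(42)` takes the fallback path.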
Replace the std::unordered_map backing metric_aggregate_by_labels with
chunked_hash_map so per-label aggregation storage grows without large
contiguous allocations when exporting many metric series.
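
To make the container swap concrete, here is a minimal sketch of per-label aggregation in which the backing map is a template parameter. It is an assumption-laden illustration, not the code in prometheus-impl.hh: `toy_label_aggregator`, `labels`, and `labels_hash` are hypothetical names, the value type is simplified to `double`, and `std::unordered_map` is used as the default so the block compiles without the new header.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical label set: sorted name/value pairs.
using labels = std::map<std::string, std::string>;

// Simple order-dependent hash over all label names and values.
struct labels_hash {
    std::size_t operator()(const labels& l) const {
        std::size_t h = 0;
        for (const auto& [name, value] : l) {
            h ^= std::hash<std::string>{}(name) + 0x9e3779b9 + (h << 6)
                 + (h >> 2);
            h ^= std::hash<std::string>{}(value) + 0x9e3779b9 + (h << 6)
                 + (h >> 2);
        }
        return h;
    }
};

// Aggregates values per label set. The map template is a parameter, so a
// map with the same interface could be substituted without touching the
// aggregation logic.
template <template <typename...> class Map = std::unordered_map>
class toy_label_aggregator {
public:
    void add(const labels& l, double v) { _values[l] += v; }

    double get(const labels& l) const {
        auto it = _values.find(l);
        return it == _values.end() ? 0.0 : it->second;
    }

    std::size_t series() const { return _values.size(); }

private:
    Map<labels, double, labels_hash> _values;
};
```

The aggregation logic only relies on `operator[]`, `find`, `end`, and `size`, which is why swapping the backing container can be a small change at the declaration site.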
WillemKauf force-pushed the chunked-hash-map-v26.1.x branch from cd64153 to 4205322 on June 23, 2026 16:25
@WillemKauf (Author)

@StephanDollberg are you happy with how these deps are fetched? vendoring was an option (for unordered_dense at least) but the CMake changes feel fine
