refactor: stop allocating on tuple and list hashing#14542
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors Stdune tuple and list hash helpers to avoid building intermediate tuples/lists before hashing, replacing polymorphic hashing with arithmetic combiners.
Changes:
- Replaced
Tuple.T2.hashandTuple.T3.hashtuple allocation with multiplicative integer combination. - Replaced
List.hash’s mapped-list allocation with a left fold.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
otherlibs/stdune/src/tuple.ml |
Updates tuple hash implementations to combine component hashes directly. |
otherlibs/stdune/src/list.ml |
Updates list hashing to fold over element hashes without constructing an intermediate list. |
cc6bccb to
e193cc5
Compare
e193cc5 to
c566d44
Compare
Allocation measurementTo check the magnitude of the effect on dune itself, ran a paired dune-on-dune null build experiment. Setup. Results.
Mean Δ: −868,427 words ≈ −7.0 MB per null build (−0.115%). All 5 paired deltas have the same sign; range of deltas (~235k) is well below the magnitude (~870k). Treating this as a defensive/clarity change rather than a perf win — the saving is real but unlikely to be observable in wall time on a realistic workload. |
## Summary `Option.hash f None` and `Option.hash f (Some x)` previously both went through `Stdlib.Hashtbl.hash`, which gave equal results whenever `f x = 0` because `None` shares OCaml's tag-0 runtime representation with the integer 0. So under stdune's existing hashers: - `Option.hash Bool.hash (Some false)` collided with `Option.hash Bool.hash None` (`Bool.hash false = 0`). - `Option.hash Unit.hash (Some ())` collided with `Option.hash Unit.hash None` (`Unit.hash () = 0`). This PR replaces the structural hash with a tagged recurrence — `None -> 0`, `Some s -> (f s * 31) + 1` — so the two cases land in disjoint buckets. Added an expect test in `otherlibs/stdune/test/option_tests.ml` covering the `Bool` and `Unit` cases that previously collided. Discovered while surveying allocation-heavy hash sites for a follow-up to #14542. Signed-off-by: Robin Bate Boerop <me@robinbb.com>
|
It's ok to deviate from Poly.hash. We do not rely on hashes for correctness, only for performance. However, I'd rather not introduce code like this unless we can measure the impact on dune itself. |
I think I have done that, at least for dune-on-dune builds. It has an almost-imperceptible performance increase of 0.1%, and a reduction in memory allocations of 0.1%, also. An admittedly minor improvement. Simply close the PR if you don't want the change, @rgrinberg. I'm not concerned either way. Just offering a small improvement in performance; I understand it comes with risks related to hash bits entropy. |
|
Here's a related for PR that allows us to replace poorly written hash functions with a For this PR, I'm ok with the approach in principle, but I would prefer the underlying implementation to live in a separate module. E.g. |
71fff16 to
7f6dc75
Compare
Extract the multiplicative `(acc * 31) + x` accumulator into a new
`Stdune.Hash` module with the API requested in review:
type t
val create : unit -> t
val feed : t -> int -> t
val hash : t -> int
Route the structural hash combinators through it:
- `Stdune.Tuple.T2.hash`, `Stdune.Tuple.T3.hash`, `Stdune.List.hash`
drop the allocations (a fresh inner tuple / a fresh `int list`)
that they previously paid before handing off to `Hashtbl.hash`.
- `Stdune.Result.hash` likewise stops allocating and gains explicit
discriminator bytes. This is a drive-by bugfix: the previous
implementation collided `Ok x` with `Error y` whenever the inner
hashes were equal (e.g. `Ok 0` and `Error 0`), the same shape as
the bug `Stdune.Option.hash` had pre-ocaml#14543.
- `Resolve.error_hash` (`src/dune_rules/resolve.ml`) drops its 2-tuple
allocation and its per-call `List.map ~f:Lazy.force` (which builds
a fresh `int list` of length |stack_frames|), folding the forced
frames into the accumulator instead.
`Stdune.Option.hash` is already optimal on main as of ocaml#14543 and is
left unchanged.
Direct consumption of ocaml#14605: `src/source/opam_switch.ml` already
included `Repr.Poly`, but a redundant manual `let hash` shadowed the
inherited one. That definition is now dropped. This is a single
demonstration site for the elimination-by-Repr.Poly direction
referenced in the review; a broader sweep is out of scope here
because `Repr.Poly`'s `validate_compare` rejects `Repr.view`, which
pervades dune's wrapped-string newtypes (paths, lib-names, opam
URLs, checksums, package names, ...).
Same `'a -> 'a -> int` callsite API; callers don't change. Hash
values shift, but these functions feed `Hashtbl` / `Memo` bucket
selectors only — no persistent serialization, no cross-process
invariants, so the shift is invisible to consumers.
Signed-off-by: Robin Bate Boerop <me@robinbb.com>
7f6dc75 to
0bf3b93
Compare
|
@rgrinberg Is what's in this PR now what you had in mind? (Forgive my slowness getting back to this, but I think it's low priority for both of us.) |
Summary
Extract the
(acc * 31) + xhash accumulator into a newStdune.Hashmodule per #14542 review, and routeTuple.T2.hash,Tuple.T3.hash,List.hash,Result.hash, andResolve.error_hashthrough it. None of these allocate per call any more. (Option.hashis already optimal on main as of #14543 and is left untouched.)Drive-by fix:
Stdune.Result.hashno longer collidesOk xwithError ywheneverh1 x = h2 y(e.g.Ok 0vsError 0) — same bug shapeStdune.Option.hashhad before #14543.Also consumes #14605 directly at one site:
src/source/opam_switch.mlalready includedRepr.Poly, but a redundant manuallet hashshadowed the inherited one. Now dropped.A broader sweep of "eliminate manual hash via
Repr.Poly" is out of scope here.Repr.Poly.validate_comparerejectsRepr.view, which is pervasive in dune's wrapped-string newtypes (paths, lib-names, opam URLs, checksums, package names, ...). The set of types whoseRepr.tis record/variant + primitives all the way down is small enough that the broader cleanup is best done as a separate, deliberate PR with its own decisions about whether to dropRepr.viewfrom the affected newtypes.Related: #14605 (
Repr.Poly.hash).