
feat(pz2): frozen shared match-finder — dict-tier encode at 3x spike speed, same ratio#147

Merged: ChrisLundquist merged 1 commit into master from claude/pz2-frozen-finder on Jun 10, 2026

@ChrisLundquist

Owner

Summary

Builds the encode-side prerequisite identified in #146: a frozen shared match-finder so the dict tier doesn't re-parse the dictionary per block.

  • lz77::FrozenDict — immutable hash-chain tables built once over the dict (insert-only, no walks), Arc-shared across workers. Coordinates are dict-relative; workers hold a dict‖block arena so compares read one contiguous buffer and no two-region logic exists anywhere.
  • find_best grows one extra chain walk over the frozen tables after the block-local walk (shared budget, recency first), with a pos >= dict.len guard making every read provably in-bounds.
  • lzseq::tokenize_with_dict(input, start, dict, config) — the parse starts at the dict boundary: no dict re-parse, and the spike's token-skipping/straddle handling disappears.
  • pz2::encode_with_frozen_dict — wire-compatible with the existing decode_with_prefix.
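The first two bullets can be illustrated with a minimal sketch. It is not the PR's code: `FrozenDictSketch`, `hash4`, `MIN_MATCH`, `HASH_BITS`, the `head`/`prev` field layout, and the `find` signature are all assumptions made for illustration. Only the ideas come from the description above: tables built once with one insert per position, `Arc` sharing, dict-relative coordinates, a contiguous dict‖block arena, and the `pos >= dict.len` guard.

```rust
use std::sync::Arc;

const MIN_MATCH: usize = 4; // assumed minimum match length
const HASH_BITS: u32 = 15; // assumed table size (2^15 buckets)
const NIL: u32 = u32::MAX; // end-of-chain marker

/// Hash the 4 bytes at `data[pos..pos + 4]` into a bucket index.
fn hash4(data: &[u8], pos: usize) -> usize {
    let v = u32::from_le_bytes([data[pos], data[pos + 1], data[pos + 2], data[pos + 3]]);
    (v.wrapping_mul(2654435761) >> (32 - HASH_BITS)) as usize
}

/// Immutable hash-chain tables over a dictionary (hypothetical layout).
/// Positions are dict-relative and assumed to fit in a u32.
pub struct FrozenDictSketch {
    dict_len: usize,
    head: Vec<u32>, // most recent dict position per hash bucket
    prev: Vec<u32>, // previous dict position in the same bucket
}

impl FrozenDictSketch {
    /// Build once: one insert per position, no chain walks.
    pub fn build(dict: &[u8]) -> Arc<Self> {
        let mut head = vec![NIL; 1 << HASH_BITS];
        let mut prev = vec![NIL; dict.len()];
        if dict.len() >= MIN_MATCH {
            for pos in 0..=dict.len() - MIN_MATCH {
                let h = hash4(dict, pos);
                prev[pos] = head[h];
                head[h] = pos as u32;
            }
        }
        Arc::new(Self { dict_len: dict.len(), head, prev })
    }

    /// Walk the frozen chain for the bytes at `arena[pos..]`, where `arena`
    /// is dict‖block. Returns the best `(distance, length)` found within
    /// `budget` chain steps, most recent dict position first.
    pub fn find(&self, arena: &[u8], pos: usize, budget: usize) -> Option<(usize, usize)> {
        // Guard: only block positions may search the dict tables, and the
        // hash needs MIN_MATCH readable bytes at `pos`.
        if pos < self.dict_len || pos + MIN_MATCH > arena.len() {
            return None;
        }
        let mut cand = self.head[hash4(arena, pos)];
        let mut best: Option<(usize, usize)> = None;
        let mut left = budget;
        while cand != NIL && left > 0 {
            let c = cand as usize;
            // c < dict_len <= pos, so the compare reads one contiguous
            // buffer and the distance `pos - c` is always positive.
            let len = arena[c..]
                .iter()
                .zip(&arena[pos..])
                .take_while(|(a, b)| a == b)
                .count();
            if len >= MIN_MATCH && best.map_or(true, |(_, l)| len > l) {
                best = Some((pos - c, len));
            }
            cand = self.prev[c];
            left -= 1;
        }
        best
    }
}
```

In this sketch each worker would clone the `Arc`, copy the dictionary into the front of its own arena, append its block, and call `find` only after its block-local walk, passing whatever chain budget is left.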

Measurements (per-32 MiB-segment head dict, blob)

| Configuration | Ratio @ 16 MiB | Encode (ST) |
|---|---|---|
| spike (re-parse per block) | 30.475% | 208.8 s |
| frozen finder | 30.475% | 70.5 s |
| no dict | 31.043% | 13.3 s |

The ratio is identical at roughly 3× the speed (208.8 s vs 70.5 s). The remaining 5.3× over the no-dict baseline (70.5 s vs 13.3 s) is the dict chain walks themselves. A 32-byte weak-local-match gate was measured and rejected: it cut encode time by only 8% for +0.018pp of ratio, because most text positions have weak local matches, so the walk is inherent. Integration-time tuning levers: dict-specific chain caps, sampled insertion, and 4 MiB dicts (−0.31pp at 33.7 s).
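The headline figures follow directly from the measured values. The helpers below are only a worked check of that arithmetic; `slowdown` and `delta_pp` are hypothetical names, not part of the codebase.

```rust
/// How many times slower `slower_s` is than `faster_s` (both in seconds).
pub fn slowdown(slower_s: f64, faster_s: f64) -> f64 {
    slower_s / faster_s
}

/// Difference between two compression ratios, in percentage points.
pub fn delta_pp(a_pct: f64, b_pct: f64) -> f64 {
    a_pct - b_pct
}
```

With the measured values, `slowdown(208.8, 70.5)` is about 2.96 (the "3×"), `slowdown(70.5, 13.3)` is about 5.30, and `delta_pp(31.043, 30.475)` is about 0.57pp gained from the dictionary.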

What remains for Pz2d: container segment framing, 2-wave parallel decode (Arc dict + worker arenas), and the tuning pass — all documented in clean-slate-codec.md §11.

Test plan

  • New test_frozen_dict_round_trip (wire-compat with decode_with_prefix, dict-reach assertion, empty-dict path)
  • --frozen probe round-trip-verifies every block
  • 594 + 741 tests, fmt, clippy clean

🤖 Generated with Claude Code

…speed, same ratio

The encode-side prerequisite for the Pz2d dict tier (#146 findings):

- lz77::FrozenDict: immutable hash-chain tables built ONCE over a dict
  prefix (one insert per position, no chain walks), Arc-shared by all
  workers. Dict-relative coordinates; the parse input must carry the dict
  as its prefix (worker arena = dict||block) so compares read one buffer.
- HashChainFinder::set_frozen_dict + a second chain walk in find_best
  over the frozen tables after the block-local walk (shared chain budget,
  recency first). The pos >= dict.len guard makes every read in-bounds
  regardless of caller behavior.
- lzseq::tokenize_with_dict(input, start, dict, config): parse starts at
  the dict boundary — no dict re-parse, no token-skip/straddle handling
  (tokenize_with_config delegates with (0, None)).
- pz2::encode_with_frozen_dict (wire-writing extracted into a shared
  encode_sequences; output decodes with the existing decode_with_prefix).
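The "parse starts at the dict boundary" idea and the `(0, None)` delegation can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: the `Token` type, the brute-force `longest_match`, and every `_sketch` function are hypothetical, and the real `tokenize_with_dict` also takes `dict` and `config` arguments that are omitted here.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Token {
    Literal(u8),
    Match { dist: usize, len: usize },
}

const MIN_MATCH: usize = 4; // assumed minimum match length

/// Brute-force longest match for `input[pos..]` among all earlier
/// positions (a stand-in for the real hash-chain finder).
fn longest_match(input: &[u8], pos: usize) -> Option<(usize, usize)> {
    let mut best = None;
    let mut best_len = MIN_MATCH - 1;
    for cand in 0..pos {
        let len = input[cand..]
            .iter()
            .zip(&input[pos..])
            .take_while(|(a, b)| a == b)
            .count();
        if len > best_len {
            best_len = len;
            best = Some((pos - cand, len));
        }
    }
    best
}

/// Greedy parse that emits tokens only for `input[start..]`. Matches may
/// reach back into `input[..start]`, which plays the role of the dict.
pub fn tokenize_from_sketch(input: &[u8], start: usize) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut pos = start;
    while pos < input.len() {
        match longest_match(input, pos) {
            Some((dist, len)) => {
                tokens.push(Token::Match { dist, len });
                pos += len;
            }
            None => {
                tokens.push(Token::Literal(input[pos]));
                pos += 1;
            }
        }
    }
    tokens
}

/// The dictionary-free entry point simply delegates with `start = 0`.
pub fn tokenize_sketch(input: &[u8]) -> Vec<Token> {
    tokenize_from_sketch(input, 0)
}

/// Decode tokens with `prefix` already in the window; returns the block only.
pub fn decode_with_prefix_sketch(prefix: &[u8], tokens: &[Token]) -> Vec<u8> {
    let mut out = prefix.to_vec();
    for t in tokens {
        match *t {
            Token::Literal(b) => out.push(b),
            Token::Match { dist, len } => {
                // Byte-by-byte copy so overlapping matches decode correctly.
                for _ in 0..len {
                    let b = out[out.len() - dist];
                    out.push(b);
                }
            }
        }
    }
    out.split_off(prefix.len())
}
```

Because no token is ever emitted for the prefix, nothing has to be skipped or split at the boundary, and the same decoder serves both the dictionary and the dictionary-free path (with an empty prefix).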

Measured (pz2_dict_probe --frozen, per-32MiB-segment, blob):
- ratio IDENTICAL to the re-parse spike: 30.475% at 16 MiB dict
  (-0.57pp vs no dict, ~0.9pp under pzstd-3)
- encode 70.5s vs the spike's 208.8s ST (3x); remaining cost is the dict
  chain walks themselves (5.3x no-dict baseline)
- a 32-byte weak-local-match gate was measured and REJECTED (-8% time,
  +0.018pp: most text positions have weak local matches — the walk is
  inherent; tune via dict chain caps at integration time)

Remaining for shipping Pz2d (documented in section 11): container
segment framing, 2-wave parallel decode with Arc dict + worker arenas,
encode-cost tuning pass.

594 + 741 tests, fmt, clippy clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
