Summary
The lossless invariant requires the encoder, decoder, and SIMD paths to agree bit-for-bit on every format-defining formula. Today those formulas are reimplemented independently in several files instead of living in one shared module, so a one-sided edit can silently break round-trip losslessness with no compiler help.
Affected areas
src/codec/block/encoder.cpp
src/codec/block/decoder.cpp
src/codec/simd/neon.cpp
src/codec/lpc/lpc.cpp
src/codec/block/constants.hpp
Problem details
The same wire-format / arithmetic contract is written two or three times:
- Fixed predictor taps
{1} / {2,-1} / {3,-3,1} / {4,-6,4,-1}: encoder compute_fixed_residual (encoder.cpp:265-292) vs decoder restore_fixed_in_place (decoder.cpp:320-341).
- FIR predictor taps/shift
{3,-1}, >> 2: encoder uses named constants; decoder hardcodes 3LL*pcm[i-1] - pcm[i-2] >> 2 literally and does not reference the encoder constants (decoder.cpp:347-349).
- Zigzag mapping:
unsigned_from_residual / zigzag_encode in the encoder vs to_unsigned / zigzag_decode lambdas in the decoder (decoder.cpp:85-88,120-124). A correct shared copy also exists in Rice::signed_to_unsigned (rice.cpp:7-15) — three implementations of one mapping.
- Stateless
adapt_k: encoder adapt_k_stateless vs decoder inline lambda (decoder.cpp:90-96) — same (sum + count/2)/count / bit_width(mean-1) math twice.
- Partition-size math: encoder
partition_sizes_for_block / max_partition_order_for_block (encoder.cpp:93-119) vs decoder partition_size_at (decoder.cpp:98-102).
- Q15 scale/shift (
15 / 32768): the binding quantization contract is a bare literal on the decode side (decoder.cpp:43,50,388,398) and implicit on the encode side (via lpc.compute_residual_q15); it is not a named shared constant.
- Wire-format tag/mode vocabulary:
kBinTag* / kResidualMode* / kPredictor* (encoder.cpp:46-56) and zero-run sub-tags kTagNormal/Run/Escape (encoder.cpp:670-672) are redefined in the decoder (decoder.cpp:68-72,139-141). None live in constants.hpp.
constants.hpp currently centralizes only block/partition limits and the control-byte bit layout — exactly the values that are not the lock-step hazard. The values that must stay identical are the ones duplicated.
Note: this is distinct from #8 (which targets UB/narrowing in those paths) and from the closed #19 (which documented the spec). This issue is about a single in-code source of truth.
Acceptance criteria
- Predictor formulas (fixed/FIR), zigzag mapping, stateless
adapt_k, partition-size math, the Q15 shift/scale, and the tag/mode/predictor constants are each defined once and consumed by encoder, decoder, and SIMD paths.
- Removing or changing a formula in one place forces a build break or test failure if a consumer drifts.
- A round-trip test asserts encoder output and decoder input use the shared definitions (e.g. via a small shared header consumed by both).
- No behavior change to the emitted bitstream; existing fixtures still round-trip bit-perfectly.
Summary
The lossless invariant requires the encoder, decoder, and SIMD paths to agree bit-for-bit on every format-defining formula. Today those formulas are reimplemented independently in several files instead of living in one shared module, so a one-sided edit can silently break round-trip losslessness with no compiler help.
Affected areas
src/codec/block/encoder.cppsrc/codec/block/decoder.cppsrc/codec/simd/neon.cppsrc/codec/lpc/lpc.cppsrc/codec/block/constants.hppProblem details
The same wire-format / arithmetic contract is written two or three times:
{1} / {2,-1} / {3,-3,1} / {4,-6,4,-1}: encodercompute_fixed_residual(encoder.cpp:265-292) vs decoderrestore_fixed_in_place(decoder.cpp:320-341).{3,-1},>> 2: encoder uses named constants; decoder hardcodes3LL*pcm[i-1] - pcm[i-2] >> 2literally and does not reference the encoder constants (decoder.cpp:347-349).unsigned_from_residual/zigzag_encodein the encoder vsto_unsigned/zigzag_decodelambdas in the decoder (decoder.cpp:85-88,120-124). A correct shared copy also exists inRice::signed_to_unsigned(rice.cpp:7-15) — three implementations of one mapping.adapt_k: encoderadapt_k_statelessvs decoder inline lambda (decoder.cpp:90-96) — same(sum + count/2)/count/bit_width(mean-1)math twice.partition_sizes_for_block/max_partition_order_for_block(encoder.cpp:93-119) vs decoderpartition_size_at(decoder.cpp:98-102).15/32768): the binding quantization contract is a bare literal on the decode side (decoder.cpp:43,50,388,398) and implicit on the encode side (vialpc.compute_residual_q15); it is not a named shared constant.kBinTag*/kResidualMode*/kPredictor*(encoder.cpp:46-56) and zero-run sub-tagskTagNormal/Run/Escape(encoder.cpp:670-672) are redefined in the decoder (decoder.cpp:68-72,139-141). None live inconstants.hpp.constants.hppcurrently centralizes only block/partition limits and the control-byte bit layout — exactly the values that are not the lock-step hazard. The values that must stay identical are the ones duplicated.Note: this is distinct from #8 (which targets UB/narrowing in those paths) and from the closed #19 (which documented the spec). This issue is about a single in-code source of truth.
Acceptance criteria
adapt_k, partition-size math, the Q15 shift/scale, and the tag/mode/predictor constants are each defined once and consumed by encoder, decoder, and SIMD paths.