Skip to content

Define a single source of truth for the codec format contract #23

Description

@audexdev

Summary

The lossless invariant requires the encoder, decoder, and SIMD paths to agree bit-for-bit on every format-defining formula. Today those formulas are reimplemented independently in several files instead of living in one shared module, so a one-sided edit can silently break round-trip losslessness with no compiler help.

Affected areas

  • src/codec/block/encoder.cpp
  • src/codec/block/decoder.cpp
  • src/codec/simd/neon.cpp
  • src/codec/lpc/lpc.cpp
  • src/codec/block/constants.hpp

Problem details

The same wire-format / arithmetic contract is written two or three times:

  • Fixed predictor taps {1} / {2,-1} / {3,-3,1} / {4,-6,4,-1}: encoder compute_fixed_residual (encoder.cpp:265-292) vs decoder restore_fixed_in_place (decoder.cpp:320-341).
  • FIR predictor taps/shift {3,-1}, >> 2: encoder uses named constants; decoder hardcodes 3LL*pcm[i-1] - pcm[i-2] >> 2 literally and does not reference the encoder constants (decoder.cpp:347-349).
  • Zigzag mapping: unsigned_from_residual / zigzag_encode in the encoder vs to_unsigned / zigzag_decode lambdas in the decoder (decoder.cpp:85-88,120-124). A correct shared copy also exists in Rice::signed_to_unsigned (rice.cpp:7-15) — three implementations of one mapping.
  • Stateless adapt_k: encoder adapt_k_stateless vs decoder inline lambda (decoder.cpp:90-96) — same (sum + count/2)/count / bit_width(mean-1) math twice.
  • Partition-size math: encoder partition_sizes_for_block / max_partition_order_for_block (encoder.cpp:93-119) vs decoder partition_size_at (decoder.cpp:98-102).
  • Q15 scale/shift (15 / 32768): the binding quantization contract is a bare literal on the decode side (decoder.cpp:43,50,388,398) and implicit on the encode side (via lpc.compute_residual_q15); it is not a named shared constant.
  • Wire-format tag/mode vocabulary: kBinTag* / kResidualMode* / kPredictor* (encoder.cpp:46-56) and zero-run sub-tags kTagNormal/Run/Escape (encoder.cpp:670-672) are redefined in the decoder (decoder.cpp:68-72,139-141). None live in constants.hpp.

constants.hpp currently centralizes only block/partition limits and the control-byte bit layout — exactly the values that are not the lock-step hazard. The values that must stay identical are the ones duplicated.

Note: this is distinct from #8 (which targets UB/narrowing in those paths) and from the closed #19 (which documented the spec). This issue is about a single in-code source of truth.

Acceptance criteria

  • Predictor formulas (fixed/FIR), zigzag mapping, stateless adapt_k, partition-size math, the Q15 shift/scale, and the tag/mode/predictor constants are each defined once and consumed by encoder, decoder, and SIMD paths.
  • Removing or changing a formula in one place forces a build break or test failure if a consumer drifts.
  • A round-trip test asserts encoder output and decoder input use the shared definitions (e.g. via a small shared header consumed by both).
  • No behavior change to the emitted bitstream; existing fixtures still round-trip bit-perfectly.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requestformatLAC bitstream format and compatibilityhardeningInput validation and robustness hardening

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions