Summary
The codec has three independent LPC-residual implementations (NEON kernel, scalar span, and the lpc.cpp fallback) that must be numerically bit-identical for losslessness, but that identity is currently guaranteed only by hand-matching, not by a shared kernel or an enforced test. The NEON path also relies on a separate scalar range-check pass rather than checking in the path that writes output.
Affected areas
src/codec/simd/neon.cpp
src/codec/lpc/lpc.cpp
Problem details
- Three residual engines.
neon_lpc_residual_safe_order (NEON), lpc_residual_scalar_unchecked_span (scalar), and LPC::compute_open_loop_residual_for_order (lpc.cpp) each re-derive the predictor acc >> 15 and warmup/tap handling. They produce identical bytes only because they were matched by hand; nothing in the build or tests fails if they diverge.
- Range-safety trust gap.
lpc_residual_with_fallback runs lpc_residual_scalar_check_range over the whole block (neon.cpp:230), then runs the unchecked NEON kernel (neon.cpp:237) which casts diff straight to int32_t. The guarantee depends on the check pass and the compute pass computing identical pred/diff. If they drift (e.g. a tap-unroll change), the NEON path silently truncates out-of-range residuals and corrupts losslessness with no detection. The decode-side restore does range-check (lpc.cpp:250,264), so the trust boundary is asymmetric.
- Duplicated fallback control logic.
kResidualFallbackOrders, append_unique_order, and build_residual_attempt_orders are duplicated between lpc.cpp and neon.cpp (neon.cpp:199-258). The encoder may pick either engine and they must select the same fallback order for the stream to round-trip.
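One way to remove the duplication in the last item is a single header-only definition included by both lpc.cpp and neon.cpp. The sketch below reuses the three names from the issue, but the signatures, the table values, and the "preferred order first, then smaller fallbacks" policy are all assumptions made for illustration. They are not taken from the codebase.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical shared header (C++17 for inline variables).
// The values in this table are placeholders, not the codec's real orders.
inline constexpr int kResidualFallbackOrders[] = {8, 4, 2, 1};

// Appends `order` unless it is non-positive or already present.
inline void append_unique_order(std::vector<int>& orders, int order) {
    if (order > 0 &&
        std::find(orders.begin(), orders.end(), order) == orders.end()) {
        orders.push_back(order);
    }
}

// Assumed policy: try the preferred order first, then each fallback
// order that does not exceed it, without repeats.
inline std::vector<int> build_residual_attempt_orders(int preferred) {
    std::vector<int> orders;
    append_unique_order(orders, preferred);
    for (int fallback : kResidualFallbackOrders) {
        if (fallback <= preferred) {
            append_unique_order(orders, fallback);
        }
    }
    return orders;
}
```

With one definition, the scalar and SIMD entry points cannot disagree on which order they try next, whichever engine the encoder picks.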
This overlaps #8 on the per-line narrowing/UB, but the design ask here is different: a single shared kernel plus a test-enforced "SIMD output == scalar output" invariant, and range-checking inside the writing path.
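As a rough illustration of "range-checking inside the writing path", a shared scalar core might look like the sketch below. The function name, the signature, the Q15 int16_t coefficient type, and the verbatim warmup handling are assumptions. Only the acc >> 15 predictor and the int32_t residual width come from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical shared core. Every engine would either call this or be a
// SIMD specialization that is tested against it.
// Returns false (possibly after writing part of `out`) if any residual
// does not fit in int32_t, so the caller can fall back to another order.
inline bool lpc_residual_checked(const int32_t* samples, std::size_t n,
                                 const int16_t* coefs, std::size_t order,
                                 int32_t* out) {
    if (order > n) return false;
    // Assumed warmup handling: the first `order` samples pass through.
    for (std::size_t i = 0; i < order; ++i) out[i] = samples[i];
    for (std::size_t i = order; i < n; ++i) {
        int64_t acc = 0;
        for (std::size_t k = 0; k < order; ++k) {
            acc += int64_t{coefs[k]} * samples[i - 1 - k];
        }
        // Arithmetic shift on negative values is guaranteed from C++20;
        // earlier standards leave it implementation-defined.
        const int64_t pred = acc >> 15;
        const int64_t diff = int64_t{samples[i]} - pred;
        // The check sits on the same value that is about to be written,
        // so no separate pass has to recompute pred/diff identically.
        if (diff < std::numeric_limits<int32_t>::min() ||
            diff > std::numeric_limits<int32_t>::max()) {
            return false;
        }
        out[i] = static_cast<int32_t>(diff);
    }
    return true;
}
```

A NEON specialization could keep the vector arithmetic and still report overflow from the writing loop, for example by accumulating an out-of-range mask per vector and testing it before the block is accepted.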
Acceptance criteria
- One shared LPC-residual kernel (or a thin SIMD specialization of a shared scalar core) is used by all paths, so there is a single place defining the arithmetic.
- A test asserts NEON output is byte-identical to the scalar reference across representative blocks, including full-range and near-overflow inputs.
- Range safety is enforced in the path that writes residuals (no reliance on a separate check pass computing the same values), or the equivalence is asserted by test.
- Fallback-order selection logic has one definition shared by the scalar and SIMD entry points.
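The byte-identity criterion could be enforced with a small comparison helper along these lines. The ResidualFn signature, both helper names, and the choice of input blocks are hypothetical. The real engines would need thin adapters to fit this shape, and on non-ARM builds the NEON case would be skipped.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

// Signature assumed for illustration; the real engines may differ.
using ResidualFn = bool (*)(const int32_t* samples, std::size_t n,
                            const int16_t* coefs, std::size_t order,
                            int32_t* out);

// True when both engines agree on success or failure and, on success,
// produce byte-identical residuals.
inline bool engines_agree(ResidualFn reference, ResidualFn candidate,
                          const std::vector<int32_t>& samples,
                          const std::vector<int16_t>& coefs) {
    std::vector<int32_t> a(samples.size(), 0), b(samples.size(), 0);
    const bool ok_a = reference(samples.data(), samples.size(),
                                coefs.data(), coefs.size(), a.data());
    const bool ok_b = candidate(samples.data(), samples.size(),
                                coefs.data(), coefs.size(), b.data());
    if (ok_a != ok_b) return false;
    if (!ok_a) return true;  // both rejected the block
    return a.empty() ||
           std::memcmp(a.data(), b.data(), a.size() * sizeof(int32_t)) == 0;
}

// Silence, a small ramp, and alternating full-scale extremes. The last
// block drives residuals toward the int32_t limits.
inline std::vector<std::vector<int32_t>> representative_blocks(std::size_t n) {
    constexpr int32_t lo = std::numeric_limits<int32_t>::min();
    constexpr int32_t hi = std::numeric_limits<int32_t>::max();
    std::vector<int32_t> silence(n, 0), ramp(n), extremes(n);
    for (std::size_t i = 0; i < n; ++i) {
        ramp[i] = static_cast<int32_t>(i % 1024);
        extremes[i] = (i % 2 == 0) ? lo : hi;
    }
    return {silence, ramp, extremes};
}
```

A test would loop over representative_blocks for several block sizes and predictor orders and fail on the first engines_agree that returns false, so a tap-unroll change that alters pred or diff breaks the build instead of the stream.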