Skip to content

Consolidate duplicated v3 decode path, validation helpers, limits, and WAV writing #24

Description

@audexdev

Summary

The version-3 decode path, the PCM-range / mid-side validation helpers, the decoder resource limits, and the WAV header/sample writer are each implemented twice or more across the CLI, the library decoder, and WAV I/O. This duplicates attacker-facing parser logic and lets security-critical limits drift between copies.

Affected areas

  • src/main.cpp
  • src/codec/lac/decoder.cpp
  • src/io/wav_io.cpp

Problem details

  • Whole v3 decode path duplicated. decode_lac_v3_to_mapped_wav (main.cpp:183-430) reimplements block-table parsing, size validation, resource-limit checks, the worker-thread loop, per-block stereo handling, mid/side reconstruction, and PCM-range validation that already exist in LAC::Decoder::decode (decoder.cpp:77-294). On a v3 stream the CLI never falls back to LAC::Decoder (it returns on FastDecodeStatus::Failed, main.cpp:755-758), so the second implementation is not even redundancy — it is a second parser to harden and keep in sync.
  • Resource-limit constants triplicated. MAX_TOTAL_SAMPLES / MAX_DECODED_PCM_BYTES / MAX_BLOCK_COUNT are defined in both decoder.cpp:18-24 and main.cpp:39-46; MAX_DECODED_PCM_BYTES is defined a third time in wav_io.cpp:13. A change to one copy silently weakens the others.
  • PCM-range / mid-side helpers duplicated 4×. is_valid_sample_for_depth, validate_pcm_range, and reconstruct_mid_side_in_place appear in decoder.cpp:35-66, as cli_* copies in main.cpp:80-112, in wav_io.cpp:60-64, and the depth check again in encoder.cpp:83-91. The mid/side reconstruction expression l = m + ((s + (s&1))>>1) is hand-copied in two places (decoder.cpp:57, main.cpp:102).
  • WAV header + sample packing triplicated. write_wav_impl (wav_io.cpp:282-344) and write_wav_header_to_buffer + pack_pcm_to_wav_bytes (main.cpp:126-181) implement the same RIFF/fmt /data layout and 16/24-bit little-endian packing independently.
  • Two worker-pool task-distribution patterns. The encoder uses a std::mutex + std::queue (encoder.cpp:261-264,408-414); the decoder and CLI use std::atomic<uint32_t> fetch_add (decoder.cpp:244,258, main.cpp:324,342). Same fan-out, two mechanisms.

This is related to but broader than #3 (allocation bounding) and #5 (encoder thread exception-safety): the core problem here is duplication, not any single bound or unwind.

Acceptance criteria

  • The v3 block-table parse + validation + limit checks have one implementation shared by the CLI fast path and LAC::Decoder (the mmap output writer may remain a thin presentation layer over the shared parser).
  • MAX_TOTAL_SAMPLES / MAX_DECODED_PCM_BYTES / MAX_BLOCK_COUNT are defined once and referenced everywhere.
  • PCM-range validation and mid/side reconstruction each have one definition.
  • WAV header construction and PCM packing are shared between the streaming writer and the mmap writer.
  • Worker-pool task distribution uses one pattern across encode and decode.
  • No change to produced/accepted bytes; existing round-trip and malformed-input behavior is preserved.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requesthardeningInput validation and robustness hardeningsecuritySecurity-relevant parser, decoder, or memory-safety work

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions