Context
The first cut of WhisperMLXTranscriber.makeStreamingSession accumulates PCM buffers in memory across the lifetime of a single recording. Whisper's file-based decode runs once at finish().
Why this is OK for the first cut
- Simplicity: no scratch-file I/O, no second writer competing with LiveAudioEngine's AAC writer.
- Target device is iPhone 15 Pro Max (8 GB RAM). 16 kHz mono float32 = ~115 MB for 30 min of audio — comfortable headroom.
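The ~115 MB figure can be reproduced with a short calculation. The sketch below is illustrative only: `pcmBytes` is a hypothetical helper, not something in the codebase, and it counts raw sample storage only (no array or buffer overhead).

```swift
import Foundation

/// Hypothetical helper: bytes needed to hold `seconds` of uncompressed PCM in memory.
/// Defaults match the format named in this issue: 16 kHz, mono, Float32.
func pcmBytes(seconds: Double,
              sampleRate: Double = 16_000,
              channels: Int = 1,
              bytesPerSample: Int = MemoryLayout<Float>.size) -> Int {
    // samples = seconds * sampleRate; each sample costs channels * bytesPerSample bytes
    Int(seconds * sampleRate) * channels * bytesPerSample
}

let thirtyMinutes = pcmBytes(seconds: 30 * 60)
print(thirtyMinutes)                                  // 115200000
print(Double(thirtyMinutes) / 1_000_000, "MB")        // 115.2 MB
```

At this format the accumulator grows by 64,000 bytes per second, so memory scales linearly with recording length: a 60 min recording would need roughly 230 MB.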
When it stops being OK
- Recordings longer than ~30 min on the iPhone 15 Pro Max.
- Any length on lower-RAM devices, if the device scope expands beyond the 15 Pro Max.
- Memory-pressure warnings under dogfood.
Options when revisiting
- Stream PCM to a scratch WAV during feed, read back at finish. Lower peak memory, more disk I/O. Cleanest separation from the AAC writer.
- Reuse the in-progress AAC/m4a file that LiveAudioEngine already writes. Decode it back to PCM at finalize. Avoids the second writer entirely, but couples the transcriber to the recording-file format.
- Chunked decode with interim transcripts. Out of scope for the no-streaming first cut, but the natural follow-up if we want live partials from Whisper.
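For the first option, a minimal sketch of the spill-to-disk idea follows. It rests on several assumptions: `ScratchPCMFile` and its method names are hypothetical, it writes headerless Float32 samples rather than a real WAV (a WAV header would have to be added for a file-based decoder), and it uses only Foundation so that it stays self-contained.

```swift
import Foundation

/// Hypothetical helper: appends Float32 samples to a temporary file
/// instead of accumulating them in memory for the whole recording.
final class ScratchPCMFile {
    let url: URL
    private let handle: FileHandle

    init() throws {
        url = FileManager.default.temporaryDirectory
            .appendingPathComponent(UUID().uuidString)
            .appendingPathExtension("pcm")
        guard FileManager.default.createFile(atPath: url.path, contents: nil) else {
            throw CocoaError(.fileWriteUnknown)
        }
        handle = try FileHandle(forWritingTo: url)
    }

    /// Append one chunk of samples (the role `feed` would play).
    func append(_ samples: [Float]) throws {
        let data = samples.withUnsafeBufferPointer { Data(buffer: $0) }
        try handle.write(contentsOf: data)
    }

    /// Close the writer, read all samples back, and delete the scratch file
    /// (the role `finish` would play).
    func finish() throws -> [Float] {
        try handle.close()
        let data = try Data(contentsOf: url)
        try? FileManager.default.removeItem(at: url)
        var samples = [Float](repeating: 0, count: data.count / MemoryLayout<Float>.size)
        _ = samples.withUnsafeMutableBufferPointer { data.copyBytes(to: $0) }
        return samples
    }
}
```

A usage sketch: call `append(_:)` once per incoming buffer, then `finish()` when recording stops. Reading everything back into one array, as this sketch does, restores the full footprint at finish time. The memory saving only holds if the decoder can read the scratch file directly.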
Trigger to act
- Real notes regularly exceed ~20 min, OR
- Memory-pressure warnings in os_log/Instruments during dogfood, OR
- Device scope expands beyond the iPhone 15 Pro Max.
References
planning/notes.md — Transcription upgrades section (added in the same plan that created this issue).
planning/transcription-tuning.md — Tier 2 (Local ASR via MLX).