Skip to content

Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2)#32

Merged
StevenACZ merged 24 commits into
mainfrom
fix/audit-followups-2026-06-13
Jun 15, 2026
Merged

Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2)#32
StevenACZ merged 24 commits into
mainfrom
fix/audit-followups-2026-06-13

Conversation

@StevenACZ

Copy link
Copy Markdown
Owner

Resumen

Segunda ronda de audit follow-ups sobre la rama de deep-audit, más el primer slice del refactor estructural diferido. Build Release arm64, strict concurrency complete, 134 tests verdes, secrets-scan limpio.

Cambios principales

Fixed

  • Pérdida silenciosa de audio en el historial (persistencia serializada).
  • Retry pegaba audio viejo o texto duplicado.
  • Deepgram Flux perdía las últimas palabras al detener.
  • WhisperKit: guarda contra inferencia concurrente en el modelo compartido.
  • Fidelidad de números en AI polish (subsecuencia ordenada y duplicate-aware).
  • Versión mostrada y visuales de estado deshabilitado.

Performance

  • Paginación incremental del historial con orden estable.
  • Recorder hot path sin asignación de lock por buffer.

Refactor (paso 2 del plan diferido)

  • Protocolo TranscriptionEngineSession (readiness/busy) que retira 3 switch currentEngine duplicados tras un único mapeo; cada transcriber declara su estado. + tests de caracterización.

Infra / seguridad

  • Cierre de 2 data races del audio-stack bajo strict concurrency.
  • Redacción de secretos reforzada y clasificación 4xx no-reintentable.
  • CI en GitHub Actions (lint, secrets-scan, build, tests).

Detalle completo en CHANGELOG.md [2.4.2]. Release objetivo v2.4.2 (build 7).

StevenACZ added 24 commits June 13, 2026 22:26
- Read app version at runtime from the bundle (CFBundleShortVersionString /
  MARKETING_VERSION) instead of a hardcoded 2.3.0, so the About tab and the
  settings export never drift from the shipped version.
- Paint the neutral gray gradient for .noModel/.error in the menu bar action
  button via an exhaustive switch mirroring buttonColor, so foreground and
  background can no longer diverge.
- Add digit (0-9) and common punctuation labels to HotkeyKeyName so keycaps
  no longer render raw 'Key18' for Option+1 and similar combos.
- Reject silently-changed numbers in the polish fidelity guard: numeric anchors
  now match whole numeric tokens, so "5" no longer survives inside "15",
  "5.5", "5,000" or "5:30" (their punctuation is semantic).
- Add a non-retryable .clientError kind for HTTP 4xx (400/404/405/413/415/422)
  and map RecordingError.permissionDenied to .notConfigured, so the user no
  longer gets a Retry button for errors that a retry cannot fix.
- Resume (not drop) a pending stop continuation in both streaming transcribers'
  cleanupWebSocket, preventing a leaked CheckedContinuation and a hung stop.
- Gate the legacy temp-audio sweep by file age so a crash mid-write no longer
  has its still-recoverable WAV deleted.
- Name history audio files by UUID instead of a float timestamp to avoid silent
  collisions that lost audio.
- Make log credential redaction deterministic (ordered array, not a dict).
- AudioLevelMonitor: the mic-test sample file (sampleFile/sampleWriteActive) was
  written on the main actor and read+written from the nonisolated tap thread with
  no synchronization. Guard both with a dedicated os_unfair_lock and route the tap
  write through writeSampleBuffer (mic-test path, low-frequency).
- AudioRecorder: lastInputBufferTime was written by the tap and read by the
  capture diagnostics/health probe outside captureStateLock. Move its write into
  registerInputBuffer (already lock-held every buffer, so no new hot-path lock),
  read it under the lock, and reset it via a guarded helper.

PasteManager/VocabularyManager were also flagged but are false positives: with
SWIFT_DEFAULT_ACTOR_ISOLATION=MainActor those unannotated classes are already
MainActor-isolated, so there is no cross-isolation race.
The documented PR gate (make ci-check) was honor-system only: nothing ran the
suite unless a human invoked make. Add a macOS workflow that lints (swift-format
strict), scans for secrets, and runs the Debug build plus unit tests on push to
main and on every PR. No signing identity is needed: SigningDefaults.xcconfig
falls back to ad-hoc signing, so the build runs without the local override.
…ping

- Add VocabularyManagerTests for the replacement/keyterm logic that runs on every
  cloud session: whole-word + spoken-punctuation replacements, saved-terms-first
  payload ordering, length/count limits, and the singular Deepgram keyterm param.
  A new injectable fileURL init keeps the suite off the user's vocabulary.json.
- Extend TranscriptionFailureTests to cover from(): URLError/RecordingError/
  WhisperKitError mapping, the new non-retryable .clientError 4xx path, and
  RecordingError.permissionDenied -> .notConfigured.
processAudioBuffer allocated an OSAllocatedUnfairLock on every input buffer
(~47/s) just to back the converter input block's consumed flag, even though that
block runs synchronously inside convert() on the same thread (AVFAudio is
imported @preconcurrency, so the closure is not forced @sendable). Replace it
with a plain captured flag: no heap allocation, identical drain semantics.
- HistoryView.loadMoreIfNeeded now fetches only the next page by offset and
  appends it, instead of re-fetching every page loaded so far (which made a long
  scroll O(pages^2) in rows read). fetchEntries already supports offset on the
  FTS and LIKE paths.
- Tie-break the query by id DESC so rows sharing a timestamp keep a stable order
  across separate LIMIT/OFFSET page queries, preventing duplicated or skipped
  rows. Add tests for paging coverage and the stable newest-first order.
The app target already builds with SWIFT_STRICT_CONCURRENCY=complete, but the
test target was still on targeted, so test code (and the app code it links) was
not held to the same data-race checks. Raise it to complete: the suite compiles
and passes unchanged, closing the gap before any future Swift 6 language-mode move.
Note the test target now builds strict-complete; the fidelity guard's
whole-numeric-token matching; the 4xx/.clientError and permissionDenied failure
mapping; incremental + stable-order history paging and UUID audio names; CI on
GitHub Actions and the expanded test coverage. Add an owner-gated, slice-by-slice
plan for the deferred ViewModel/structural refactor.
The numeric fidelity check deduplicated anchors into a Set, so a number swap (5 tickets/sprint 6 -> 6 tickets/sprint 5) or a changed duplicate (5 and 5 -> 5 and 15) survived the guard. Extract numbers as an ordered sequence with duplicates and require them to survive as an ordered subsequence of the polished numbers; exclude digits embedded in URLs/emails (those keep their own literal anchor). Adds swap/duplicate/moved-URL coverage.
Redact Bearer/Token (Deepgram scheme) tokens even in colon-space header form: the generic key pattern stopped at the space and stranded the token, so the auth-scheme pattern now runs first and covers base64/JWT charsets. Map every other 4xx (406/409/410/...) to the non-retryable .clientError instead of falling through to retryable .unknown. Redact the OpenAI-compatible provider error body before it is logged (.public) or shown in Settings. SettingsTransferManager presence check gates on KeychainStore.hasValue so a settings import cannot trigger the macOS Keychain consent prompt.
… lock

The streaming twin of the recorder timestamp was written bare on the tap thread and read off audioSetupQueue in the health probe and diagnostics — the same data race already closed in AudioRecorder. Publish it inside registerInputBuffer under captureStateLock, read it via currentLastInputBufferTime(), and reset it via resetLastInputBufferTime().
Retry left lastFailedAudioURL pointing at a PRIOR session after a recording was interrupted (audio deleted) or blocked offline before start, so Retry could retranscribe and paste the wrong session. Clear the retry state on those non-persisting failure paths. retryTranscription also had no reentrancy guard, so a double Retry transcribed and pasted the same audio twice: guard with isRetryInFlight, cleared in the task defer (success and failure); a failed retry still preserves the audio for another attempt.
VocabularyManager.save() writes atomically so a crash mid-write cannot truncate vocabulary.json. AGENTS.md documents the StreamingAudioCapture lock and the ordered numeric-subsequence fidelity rule.
…up queue

audioEngine/recordingURL were re-assigned off-queue right after being set inside audioSetupQueue, and pause/resume touched audioEngine on the main actor. Both could race recoverCapture, which nils and rebuilds audioEngine on audioSetupQueue when a USB mic fires an AVAudioEngineConfigurationChange at capture start, concurrently retain/releasing the engine reference.

Return Void from the setup continuation (engine and url are already assigned once inside the queue) and wrap the pause/resume engine calls in audioSetupQueue.sync, so the engine lifecycle is single-threaded on that queue like stopRecording.
WhisperKit wraps a single non-reentrant model. The hotkey start gate blocks a colliding live dictation, but retranscribing a history entry reaches transcribe() on a path that bypasses it, so two inferences could run on the model at once. Reject the second call with WhisperKitError.transcriptionInProgress, mapped through TranscriptionFailure.from as a transient retryable failure.
isStopping is set before finishAndWait and before waitForFinalTranscript installs its continuation, so the EndOfTurn the server flushes after CloseStream could arrive while stopContinuation was nil and be dropped, making stop ride the full 4s timeout. Track that CloseStream was requested and that the final turn arrived after it, and return the accumulated transcript immediately, mirroring the ElevenLabs committed-count guard.
save() runs the orphan/size sweep on every insert, but completed dictations
copied their WAV before inserting the row on detached tasks, so a concurrent
save's sweep could delete a freshly copied WAV before its row referenced it —
losing saved audio silently and breaking retry/playback.

Serialize copy -> insert -> sweep through a recursive persistenceLock, add an
atomic persistEntry API, and drop the saveAudioFile forwarding so the split
cannot be reintroduced. Match orphaned files by their UUID name instead of the
full path so path normalization differences never strand a referenced file.
Add a targeted duration(for:) query so retry no longer materializes the whole
table via fetchAll().

Adds HistoryAudioPersistenceTests covering the persist invariant and the
concurrent persist-vs-sweep regression.
Replace the two byte-identical DeepgramModeButton/ElevenLabsModeButton views
with a single reusable EngineModeButton. Remove the unused ObservableObject
conformance, cancellables set, and Combine import from DockIconManager.
…ptionEngineSession

Engine readiness/busy state was branched in three places (isEngineReady,
isSelectedEngineBusy, canRecord), each repeating which transcriber and which
property back each logical engine. Introduce a read-only
TranscriptionEngineSession protocol (isReady/isBusy) that every transcriber
declares next to its own state, plus an EngineSessions value type deriving
readiness (single source) and busy (OR across the engine's sessions). The
ViewModel maps an engine to its sessions in one place and derives all three
queries from it; behavior is unchanged.

Add characterization tests pinning the aggregation semantics (including the
live-sibling-busy case) and the engine-to-sessions mapping by identity.
Bump MARKETING_VERSION to 2.4.2 (build 7) and record the audit follow-up
reliability fixes, faster history paging, and internal hardening in the
changelog.
The runner's bundled swift-format (Xcode 16.4 → 6.1) can't decode the
.swift-format config authored against current Xcode's 6.3. Install
swift-format via Homebrew and point the lint step at it so CI matches the
local toolchain.
The runner has no gitleaks, so the secret scan errored out. Install it
alongside swift-format in the tools step.
The project relies on SWIFT_DEFAULT_ACTOR_ISOLATION = MainActor and
SWIFT_APPROACHABLE_CONCURRENCY, which need Swift 6.2+ (Xcode 26). The
macos-15 image ships Xcode 16.4 / Swift 6.1, which ignores those settings
and fails the build with nonisolated-context errors (e.g. PasteManager).
Move CI to the GA macos-26 image so it matches the local toolchain.
@StevenACZ StevenACZ merged commit 2b755fd into main Jun 15, 2026
2 checks passed
@StevenACZ StevenACZ deleted the fix/audit-followups-2026-06-13 branch June 15, 2026 01:41
StevenACZ pushed a commit that referenced this pull request Jun 21, 2026
Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant