Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2) by StevenACZ · Pull Request #32 · StevenACZ/SapoWhisper

StevenACZ · 2026-06-15T01:16:01Z

Resumen

Segunda ronda de audit follow-ups sobre la rama de deep-audit, más el primer slice del refactor estructural diferido. Build Release arm64, strict concurrency complete, 134 tests verdes, secrets-scan limpio.

Cambios principales

Fixed

Pérdida silenciosa de audio en el historial (persistencia serializada).
Retry pegaba audio viejo o texto duplicado.
Deepgram Flux perdía las últimas palabras al detener.
WhisperKit: guarda contra inferencia concurrente en el modelo compartido.
Fidelidad de números en AI polish (subsecuencia ordenada y duplicate-aware).
Versión mostrada y visuales de estado deshabilitado.

Performance

Paginación incremental del historial con orden estable.
Recorder hot path sin asignación de lock por buffer.

Refactor (paso 2 del plan diferido)

Protocolo TranscriptionEngineSession (readiness/busy) que retira 3 switch currentEngine duplicados tras un único mapeo; cada transcriber declara su estado. + tests de caracterización.

Infra / seguridad

Cierre de 2 data races del audio-stack bajo strict concurrency.
Redacción de secretos reforzada y clasificación 4xx no-reintentable.
CI en GitHub Actions (lint, secrets-scan, build, tests).

Detalle completo en CHANGELOG.md [2.4.2]. Release objetivo v2.4.2 (build 7).

- Read app version at runtime from the bundle (CFBundleShortVersionString / MARKETING_VERSION) instead of a hardcoded 2.3.0, so the About tab and the settings export never drift from the shipped version. - Paint the neutral gray gradient for .noModel/.error in the menu bar action button via an exhaustive switch mirroring buttonColor, so foreground and background can no longer diverge. - Add digit (0-9) and common punctuation labels to HotkeyKeyName so keycaps no longer render raw 'Key18' for Option+1 and similar combos.

- Reject silently-changed numbers in the polish fidelity guard: numeric anchors now match whole numeric tokens, so "5" no longer survives inside "15", "5.5", "5,000" or "5:30" (their punctuation is semantic). - Add a non-retryable .clientError kind for HTTP 4xx (400/404/405/413/415/422) and map RecordingError.permissionDenied to .notConfigured, so the user no longer gets a Retry button for errors that a retry cannot fix. - Resume (not drop) a pending stop continuation in both streaming transcribers' cleanupWebSocket, preventing a leaked CheckedContinuation and a hung stop. - Gate the legacy temp-audio sweep by file age so a crash mid-write no longer has its still-recoverable WAV deleted. - Name history audio files by UUID instead of a float timestamp to avoid silent collisions that lost audio. - Make log credential redaction deterministic (ordered array, not a dict).

- AudioLevelMonitor: the mic-test sample file (sampleFile/sampleWriteActive) was written on the main actor and read+written from the nonisolated tap thread with no synchronization. Guard both with a dedicated os_unfair_lock and route the tap write through writeSampleBuffer (mic-test path, low-frequency). - AudioRecorder: lastInputBufferTime was written by the tap and read by the capture diagnostics/health probe outside captureStateLock. Move its write into registerInputBuffer (already lock-held every buffer, so no new hot-path lock), read it under the lock, and reset it via a guarded helper. PasteManager/VocabularyManager were also flagged but are false positives: with SWIFT_DEFAULT_ACTOR_ISOLATION=MainActor those unannotated classes are already MainActor-isolated, so there is no cross-isolation race.

The documented PR gate (make ci-check) was honor-system only: nothing ran the suite unless a human invoked make. Add a macOS workflow that lints (swift-format strict), scans for secrets, and runs the Debug build plus unit tests on push to main and on every PR. No signing identity is needed: SigningDefaults.xcconfig falls back to ad-hoc signing, so the build runs without the local override.

…ping - Add VocabularyManagerTests for the replacement/keyterm logic that runs on every cloud session: whole-word + spoken-punctuation replacements, saved-terms-first payload ordering, length/count limits, and the singular Deepgram keyterm param. A new injectable fileURL init keeps the suite off the user's vocabulary.json. - Extend TranscriptionFailureTests to cover from(): URLError/RecordingError/ WhisperKitError mapping, the new non-retryable .clientError 4xx path, and RecordingError.permissionDenied -> .notConfigured.

processAudioBuffer allocated an OSAllocatedUnfairLock on every input buffer (~47/s) just to back the converter input block's consumed flag, even though that block runs synchronously inside convert() on the same thread (AVFAudio is imported @preconcurrency, so the closure is not forced @sendable). Replace it with a plain captured flag: no heap allocation, identical drain semantics.

- HistoryView.loadMoreIfNeeded now fetches only the next page by offset and appends it, instead of re-fetching every page loaded so far (which made a long scroll O(pages^2) in rows read). fetchEntries already supports offset on the FTS and LIKE paths. - Tie-break the query by id DESC so rows sharing a timestamp keep a stable order across separate LIMIT/OFFSET page queries, preventing duplicated or skipped rows. Add tests for paging coverage and the stable newest-first order.

The app target already builds with SWIFT_STRICT_CONCURRENCY=complete, but the test target was still on targeted, so test code (and the app code it links) was not held to the same data-race checks. Raise it to complete: the suite compiles and passes unchanged, closing the gap before any future Swift 6 language-mode move.

Note the test target now builds strict-complete; the fidelity guard's whole-numeric-token matching; the 4xx/.clientError and permissionDenied failure mapping; incremental + stable-order history paging and UUID audio names; CI on GitHub Actions and the expanded test coverage. Add an owner-gated, slice-by-slice plan for the deferred ViewModel/structural refactor.

The numeric fidelity check deduplicated anchors into a Set, so a number swap (5 tickets/sprint 6 -> 6 tickets/sprint 5) or a changed duplicate (5 and 5 -> 5 and 15) survived the guard. Extract numbers as an ordered sequence with duplicates and require them to survive as an ordered subsequence of the polished numbers; exclude digits embedded in URLs/emails (those keep their own literal anchor). Adds swap/duplicate/moved-URL coverage.

Redact Bearer/Token (Deepgram scheme) tokens even in colon-space header form: the generic key pattern stopped at the space and stranded the token, so the auth-scheme pattern now runs first and covers base64/JWT charsets. Map every other 4xx (406/409/410/...) to the non-retryable .clientError instead of falling through to retryable .unknown. Redact the OpenAI-compatible provider error body before it is logged (.public) or shown in Settings. SettingsTransferManager presence check gates on KeychainStore.hasValue so a settings import cannot trigger the macOS Keychain consent prompt.

… lock The streaming twin of the recorder timestamp was written bare on the tap thread and read off audioSetupQueue in the health probe and diagnostics — the same data race already closed in AudioRecorder. Publish it inside registerInputBuffer under captureStateLock, read it via currentLastInputBufferTime(), and reset it via resetLastInputBufferTime().

Retry left lastFailedAudioURL pointing at a PRIOR session after a recording was interrupted (audio deleted) or blocked offline before start, so Retry could retranscribe and paste the wrong session. Clear the retry state on those non-persisting failure paths. retryTranscription also had no reentrancy guard, so a double Retry transcribed and pasted the same audio twice: guard with isRetryInFlight, cleared in the task defer (success and failure); a failed retry still preserves the audio for another attempt.

VocabularyManager.save() writes atomically so a crash mid-write cannot truncate vocabulary.json. AGENTS.md documents the StreamingAudioCapture lock and the ordered numeric-subsequence fidelity rule.

…up queue audioEngine/recordingURL were re-assigned off-queue right after being set inside audioSetupQueue, and pause/resume touched audioEngine on the main actor. Both could race recoverCapture, which nils and rebuilds audioEngine on audioSetupQueue when a USB mic fires an AVAudioEngineConfigurationChange at capture start, concurrently retain/releasing the engine reference. Return Void from the setup continuation (engine and url are already assigned once inside the queue) and wrap the pause/resume engine calls in audioSetupQueue.sync, so the engine lifecycle is single-threaded on that queue like stopRecording.

WhisperKit wraps a single non-reentrant model. The hotkey start gate blocks a colliding live dictation, but retranscribing a history entry reaches transcribe() on a path that bypasses it, so two inferences could run on the model at once. Reject the second call with WhisperKitError.transcriptionInProgress, mapped through TranscriptionFailure.from as a transient retryable failure.

isStopping is set before finishAndWait and before waitForFinalTranscript installs its continuation, so the EndOfTurn the server flushes after CloseStream could arrive while stopContinuation was nil and be dropped, making stop ride the full 4s timeout. Track that CloseStream was requested and that the final turn arrived after it, and return the accumulated transcript immediately, mirroring the ElevenLabs committed-count guard.

save() runs the orphan/size sweep on every insert, but completed dictations copied their WAV before inserting the row on detached tasks, so a concurrent save's sweep could delete a freshly copied WAV before its row referenced it — losing saved audio silently and breaking retry/playback. Serialize copy -> insert -> sweep through a recursive persistenceLock, add an atomic persistEntry API, and drop the saveAudioFile forwarding so the split cannot be reintroduced. Match orphaned files by their UUID name instead of the full path so path normalization differences never strand a referenced file. Add a targeted duration(for:) query so retry no longer materializes the whole table via fetchAll(). Adds HistoryAudioPersistenceTests covering the persist invariant and the concurrent persist-vs-sweep regression.

Replace the two byte-identical DeepgramModeButton/ElevenLabsModeButton views with a single reusable EngineModeButton. Remove the unused ObservableObject conformance, cancellables set, and Combine import from DockIconManager.

…ptionEngineSession Engine readiness/busy state was branched in three places (isEngineReady, isSelectedEngineBusy, canRecord), each repeating which transcriber and which property back each logical engine. Introduce a read-only TranscriptionEngineSession protocol (isReady/isBusy) that every transcriber declares next to its own state, plus an EngineSessions value type deriving readiness (single source) and busy (OR across the engine's sessions). The ViewModel maps an engine to its sessions in one place and derives all three queries from it; behavior is unchanged. Add characterization tests pinning the aggregation semantics (including the live-sibling-busy case) and the engine-to-sessions mapping by identity.

Bump MARKETING_VERSION to 2.4.2 (build 7) and record the audit follow-up reliability fixes, faster history paging, and internal hardening in the changelog.

The runner's bundled swift-format (Xcode 16.4 → 6.1) can't decode the .swift-format config authored against current Xcode's 6.3. Install swift-format via Homebrew and point the lint step at it so CI matches the local toolchain.

The runner has no gitleaks, so the secret scan errored out. Install it alongside swift-format in the tools step.

The project relies on SWIFT_DEFAULT_ACTOR_ISOLATION = MainActor and SWIFT_APPROACHABLE_CONCURRENCY, which need Swift 6.2+ (Xcode 26). The macos-15 image ships Xcode 16.4 / Swift 6.1, which ignores those settings and fails the build with nonisolated-context errors (e.g. PasteManager). Move CI to the GA macos-26 image so it matches the local toolchain.

Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2)

StevenACZ added 24 commits June 13, 2026 22:26

chore(vocab): atomic vocabulary write and audit-followup doc notes

3576b35

VocabularyManager.save() writes atomically so a crash mid-write cannot truncate vocabulary.json. AGENTS.md documents the StreamingAudioCapture lock and the ordered numeric-subsequence fidelity rule.

chore(release): prepare 2.4.2

16d33d9

Bump MARKETING_VERSION to 2.4.2 (build 7) and record the audit follow-up reliability fixes, faster history paging, and internal hardening in the changelog.

ci: lint with a config-compatible swift-format

c729b7c

The runner's bundled swift-format (Xcode 16.4 → 6.1) can't decode the .swift-format config authored against current Xcode's 6.3. Install swift-format via Homebrew and point the lint step at it so CI matches the local toolchain.

ci: install gitleaks for the secret-scan step

0291b80

The runner has no gitleaks, so the secret scan errored out. Install it alongside swift-format in the tools step.

StevenACZ merged commit 2b755fd into main Jun 15, 2026
2 checks passed

StevenACZ deleted the fix/audit-followups-2026-06-13 branch June 15, 2026 01:41

StevenACZ pushed a commit that referenced this pull request Jun 21, 2026

Merge pull request #32 from sapowhisper/fix/audit-followups-2026-06-13

4e7413d

Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2)#32

Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2)#32
StevenACZ merged 24 commits into
mainfrom
fix/audit-followups-2026-06-13

StevenACZ commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

StevenACZ commented Jun 15, 2026

Resumen

Cambios principales

Fixed

Performance

Refactor (paso 2 del plan diferido)

Infra / seguridad

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant