Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2)#32
Merged
Conversation
- Read app version at runtime from the bundle (CFBundleShortVersionString / MARKETING_VERSION) instead of a hardcoded 2.3.0, so the About tab and the settings export never drift from the shipped version. - Paint the neutral gray gradient for .noModel/.error in the menu bar action button via an exhaustive switch mirroring buttonColor, so foreground and background can no longer diverge. - Add digit (0-9) and common punctuation labels to HotkeyKeyName so keycaps no longer render raw 'Key18' for Option+1 and similar combos.
- Reject silently-changed numbers in the polish fidelity guard: numeric anchors now match whole numeric tokens, so "5" no longer survives inside "15", "5.5", "5,000" or "5:30" (their punctuation is semantic). - Add a non-retryable .clientError kind for HTTP 4xx (400/404/405/413/415/422) and map RecordingError.permissionDenied to .notConfigured, so the user no longer gets a Retry button for errors that a retry cannot fix. - Resume (not drop) a pending stop continuation in both streaming transcribers' cleanupWebSocket, preventing a leaked CheckedContinuation and a hung stop. - Gate the legacy temp-audio sweep by file age so a crash mid-write no longer has its still-recoverable WAV deleted. - Name history audio files by UUID instead of a float timestamp to avoid silent collisions that lost audio. - Make log credential redaction deterministic (ordered array, not a dict).
- AudioLevelMonitor: the mic-test sample file (sampleFile/sampleWriteActive) was written on the main actor and read+written from the nonisolated tap thread with no synchronization. Guard both with a dedicated os_unfair_lock and route the tap write through writeSampleBuffer (mic-test path, low-frequency). - AudioRecorder: lastInputBufferTime was written by the tap and read by the capture diagnostics/health probe outside captureStateLock. Move its write into registerInputBuffer (already lock-held every buffer, so no new hot-path lock), read it under the lock, and reset it via a guarded helper. PasteManager/VocabularyManager were also flagged but are false positives: with SWIFT_DEFAULT_ACTOR_ISOLATION=MainActor those unannotated classes are already MainActor-isolated, so there is no cross-isolation race.
The documented PR gate (make ci-check) was honor-system only: nothing ran the suite unless a human invoked make. Add a macOS workflow that lints (swift-format strict), scans for secrets, and runs the Debug build plus unit tests on push to main and on every PR. No signing identity is needed: SigningDefaults.xcconfig falls back to ad-hoc signing, so the build runs without the local override.
…ping - Add VocabularyManagerTests for the replacement/keyterm logic that runs on every cloud session: whole-word + spoken-punctuation replacements, saved-terms-first payload ordering, length/count limits, and the singular Deepgram keyterm param. A new injectable fileURL init keeps the suite off the user's vocabulary.json. - Extend TranscriptionFailureTests to cover from(): URLError/RecordingError/ WhisperKitError mapping, the new non-retryable .clientError 4xx path, and RecordingError.permissionDenied -> .notConfigured.
processAudioBuffer allocated an OSAllocatedUnfairLock on every input buffer (~47/s) just to back the converter input block's consumed flag, even though that block runs synchronously inside convert() on the same thread (AVFAudio is imported @preconcurrency, so the closure is not forced @sendable). Replace it with a plain captured flag: no heap allocation, identical drain semantics.
- HistoryView.loadMoreIfNeeded now fetches only the next page by offset and appends it, instead of re-fetching every page loaded so far (which made a long scroll O(pages^2) in rows read). fetchEntries already supports offset on the FTS and LIKE paths. - Tie-break the query by id DESC so rows sharing a timestamp keep a stable order across separate LIMIT/OFFSET page queries, preventing duplicated or skipped rows. Add tests for paging coverage and the stable newest-first order.
The app target already builds with SWIFT_STRICT_CONCURRENCY=complete, but the test target was still on targeted, so test code (and the app code it links) was not held to the same data-race checks. Raise it to complete: the suite compiles and passes unchanged, closing the gap before any future Swift 6 language-mode move.
Note the test target now builds strict-complete; the fidelity guard's whole-numeric-token matching; the 4xx/.clientError and permissionDenied failure mapping; incremental + stable-order history paging and UUID audio names; CI on GitHub Actions and the expanded test coverage. Add an owner-gated, slice-by-slice plan for the deferred ViewModel/structural refactor.
The numeric fidelity check deduplicated anchors into a Set, so a number swap (5 tickets/sprint 6 -> 6 tickets/sprint 5) or a changed duplicate (5 and 5 -> 5 and 15) survived the guard. Extract numbers as an ordered sequence with duplicates and require them to survive as an ordered subsequence of the polished numbers; exclude digits embedded in URLs/emails (those keep their own literal anchor). Adds swap/duplicate/moved-URL coverage.
Redact Bearer/Token (Deepgram scheme) tokens even in colon-space header form: the generic key pattern stopped at the space and stranded the token, so the auth-scheme pattern now runs first and covers base64/JWT charsets. Map every other 4xx (406/409/410/...) to the non-retryable .clientError instead of falling through to retryable .unknown. Redact the OpenAI-compatible provider error body before it is logged (.public) or shown in Settings. SettingsTransferManager presence check gates on KeychainStore.hasValue so a settings import cannot trigger the macOS Keychain consent prompt.
… lock The streaming twin of the recorder timestamp was written bare on the tap thread and read off audioSetupQueue in the health probe and diagnostics — the same data race already closed in AudioRecorder. Publish it inside registerInputBuffer under captureStateLock, read it via currentLastInputBufferTime(), and reset it via resetLastInputBufferTime().
Retry left lastFailedAudioURL pointing at a PRIOR session after a recording was interrupted (audio deleted) or blocked offline before start, so Retry could retranscribe and paste the wrong session. Clear the retry state on those non-persisting failure paths. retryTranscription also had no reentrancy guard, so a double Retry transcribed and pasted the same audio twice: guard with isRetryInFlight, cleared in the task defer (success and failure); a failed retry still preserves the audio for another attempt.
VocabularyManager.save() writes atomically so a crash mid-write cannot truncate vocabulary.json. AGENTS.md documents the StreamingAudioCapture lock and the ordered numeric-subsequence fidelity rule.
…up queue audioEngine/recordingURL were re-assigned off-queue right after being set inside audioSetupQueue, and pause/resume touched audioEngine on the main actor. Both could race recoverCapture, which nils and rebuilds audioEngine on audioSetupQueue when a USB mic fires an AVAudioEngineConfigurationChange at capture start, concurrently retain/releasing the engine reference. Return Void from the setup continuation (engine and url are already assigned once inside the queue) and wrap the pause/resume engine calls in audioSetupQueue.sync, so the engine lifecycle is single-threaded on that queue like stopRecording.
WhisperKit wraps a single non-reentrant model. The hotkey start gate blocks a colliding live dictation, but retranscribing a history entry reaches transcribe() on a path that bypasses it, so two inferences could run on the model at once. Reject the second call with WhisperKitError.transcriptionInProgress, mapped through TranscriptionFailure.from as a transient retryable failure.
isStopping is set before finishAndWait and before waitForFinalTranscript installs its continuation, so the EndOfTurn the server flushes after CloseStream could arrive while stopContinuation was nil and be dropped, making stop ride the full 4s timeout. Track that CloseStream was requested and that the final turn arrived after it, and return the accumulated transcript immediately, mirroring the ElevenLabs committed-count guard.
save() runs the orphan/size sweep on every insert, but completed dictations copied their WAV before inserting the row on detached tasks, so a concurrent save's sweep could delete a freshly copied WAV before its row referenced it — losing saved audio silently and breaking retry/playback. Serialize copy -> insert -> sweep through a recursive persistenceLock, add an atomic persistEntry API, and drop the saveAudioFile forwarding so the split cannot be reintroduced. Match orphaned files by their UUID name instead of the full path so path normalization differences never strand a referenced file. Add a targeted duration(for:) query so retry no longer materializes the whole table via fetchAll(). Adds HistoryAudioPersistenceTests covering the persist invariant and the concurrent persist-vs-sweep regression.
Replace the two byte-identical DeepgramModeButton/ElevenLabsModeButton views with a single reusable EngineModeButton. Remove the unused ObservableObject conformance, cancellables set, and Combine import from DockIconManager.
…ptionEngineSession Engine readiness/busy state was branched in three places (isEngineReady, isSelectedEngineBusy, canRecord), each repeating which transcriber and which property back each logical engine. Introduce a read-only TranscriptionEngineSession protocol (isReady/isBusy) that every transcriber declares next to its own state, plus an EngineSessions value type deriving readiness (single source) and busy (OR across the engine's sessions). The ViewModel maps an engine to its sessions in one place and derives all three queries from it; behavior is unchanged. Add characterization tests pinning the aggregation semantics (including the live-sibling-busy case) and the engine-to-sessions mapping by identity.
Bump MARKETING_VERSION to 2.4.2 (build 7) and record the audit follow-up reliability fixes, faster history paging, and internal hardening in the changelog.
The runner's bundled swift-format (Xcode 16.4 → 6.1) can't decode the .swift-format config authored against current Xcode's 6.3. Install swift-format via Homebrew and point the lint step at it so CI matches the local toolchain.
The runner has no gitleaks, so the secret scan errored out. Install it alongside swift-format in the tools step.
The project relies on SWIFT_DEFAULT_ACTOR_ISOLATION = MainActor and SWIFT_APPROACHABLE_CONCURRENCY, which need Swift 6.2+ (Xcode 26). The macos-15 image ships Xcode 16.4 / Swift 6.1, which ignores those settings and fails the build with nonisolated-context errors (e.g. PasteManager). Move CI to the GA macos-26 image so it matches the local toolchain.
StevenACZ
pushed a commit
that referenced
this pull request
Jun 21, 2026
Audit follow-ups: reliability hardening + engine-state refactor (v2.4.2)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Resumen
Segunda ronda de audit follow-ups sobre la rama de deep-audit, más el primer slice del refactor estructural diferido. Build Release arm64, strict concurrency
complete, 134 tests verdes, secrets-scan limpio.Cambios principales
Fixed
Performance
Refactor (paso 2 del plan diferido)
TranscriptionEngineSession(readiness/busy) que retira 3switch currentEngineduplicados tras un único mapeo; cada transcriber declara su estado. + tests de caracterización.Infra / seguridad
Detalle completo en
CHANGELOG.md[2.4.2]. Release objetivo v2.4.2 (build 7).