Follow-up from the cleanup-personalization feature (CHANGE_LOG 2026-06-15).
Context
We shipped a "Cleanup personalization" group in the Tuning sheet: two free-text fields (Domains, Terms & acronyms) persisted in Tunings.cleanup (CleanupSettings) and surfaced to the cleanup LLM as CleanupPersonalization via CleanupPrompt.system(personalization:). Scope was deliberately kept cleanup-only.
Idea
The Terms & acronyms the user lists for cleanup are exactly the kind of names and jargon that Apple Speech's contextual biasing (AppleSpeechOptions.contextualStrings) is meant to help recognize. Feeding them to the transcriber too would fix domain words at the source, before cleanup ever runs: "list once, help both stages."
Why it's deferred (the tension to resolve)
- There is already a separate Contextual biasing field on AppleSpeechSettingsSection (tunings.apple.contextualStringsText). Two overlapping inputs would confuse users, so we'd need to decide: merge them, derive one from the other, or keep both and union at the boundary.
- Contextual biasing is Apple-only; the cleanup terms are engine-independent. Coupling them blurs that line.
- Cleanup terms are free prose ("E2B is a Gemma variant"); contextual strings want short discrete tokens. Some parsing/extraction would be needed.
Sketch
- Option A: at the transcriptionOptions boundary for .apple, union apple.contextualStrings with the comma/term-split of cleanup.terms.
- Option B: collapse the two UI fields into one shared "Terms & acronyms" that feeds both transcription biasing and cleanup, and drop the separate Contextual biasing field.
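For illustration only, Option A might look roughly like the sketch below. Both function names, the separator set (comma, semicolon, newline), and the maxWords cutoff are assumptions made for this note, not existing code. Prose-like entries such as "E2B is a Gemma variant" are simply skipped rather than mined for tokens, which is the parsing gap called out above.

```swift
import Foundation

// Hypothetical helper (not in the codebase): split the free-text
// "Terms & acronyms" field into candidate biasing strings.
func biasingTerms(fromCleanupTerms text: String, maxWords: Int = 3) -> [String] {
    let separators = CharacterSet(charactersIn: ",;\n")
    return text
        .components(separatedBy: separators)
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty }
        // Assumption: entries longer than maxWords are prose, not terms, so drop them.
        .filter { $0.split(separator: " ").count <= maxWords }
}

// Hypothetical helper: union the existing contextual strings with the
// cleanup-derived terms, keeping first-seen order and de-duplicating
// case-insensitively.
func unionedContextualStrings(existing: [String], cleanupTerms: String) -> [String] {
    var seen = Set<String>()
    var result: [String] = []
    for term in existing + biasingTerms(fromCleanupTerms: cleanupTerms) {
        if seen.insert(term.lowercased()).inserted {
            result.append(term)
        }
    }
    return result
}
```

Keeping the union in a pure function at the boundary would leave both persisted fields untouched, so the UX decision (one field vs two) stays open.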
Decide the UX first (one field vs two), then wire. No behavior change until then.