Cleanup personalization: optionally bias Apple Speech transcription with the same domain terms #14

Description

@samkeen

Follow-up from the cleanup-personalization feature (CHANGE_LOG 2026-06-15).

Context

We shipped a "Cleanup personalization" group in the Tuning sheet: two free-text fields (Domains, Terms & acronyms) persisted in Tunings.cleanup (CleanupSettings) and surfaced to the cleanup LLM as CleanupPersonalization via CleanupPrompt.system(personalization:). Scope was deliberately kept cleanup-only.

Idea

The Terms & acronyms the user lists for cleanup are exactly the kind of names and jargon that Apple Speech's contextual biasing (AppleSpeechOptions.contextualStrings) is meant to help recognize. Feeding them to the transcriber as well would fix domain words at the source, before cleanup ever runs: "list once, help both stages."

Why it's deferred (the tension to resolve)

  • There is already a separate Contextual biasing field on AppleSpeechSettingsSection (tunings.apple.contextualStringsText). Two overlapping inputs would be confusing, so we'd need to decide: merge them, derive one from the other, or keep both and union at the boundary.
  • Contextual biasing is Apple-only; the cleanup terms are engine-independent. Coupling them blurs that line.
  • Cleanup terms are free prose ("E2B is a Gemma variant"); contextual strings want short discrete tokens. Some parsing/extraction would be needed.

Sketch

  • Option A: at the transcriptionOptions boundary for .apple, union apple.contextualStrings with the terms obtained by splitting cleanup.terms on commas.
  • Option B: collapse the two UI fields into one shared "Terms & acronyms" that feeds both transcription biasing and cleanup, and drop the separate Contextual biasing field.

Decide the UX first (one field vs two), then wire. No behavior change until then.
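
For discussion, Option A could be sketched roughly as below. This is a minimal sketch, not the app's actual code: candidateTerms(from:maxWords:) and mergedContextualStrings(explicit:cleanupTerms:) are hypothetical helpers, the three-word cutoff for dropping free-prose entries is an assumed heuristic, and the sample strings are illustrative only.

```swift
import Foundation

/// Hypothetical helper: splits the free-text "Terms & acronyms" field into
/// short candidate strings. Entries longer than `maxWords` words are treated
/// as prose (e.g. "E2B is a Gemma variant") and skipped. The cutoff is an
/// assumption, not something decided in this issue.
func candidateTerms(from text: String, maxWords: Int = 3) -> [String] {
    let separators = CharacterSet(charactersIn: ",;\n")
    return text
        .components(separatedBy: separators)
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty && $0.split(separator: " ").count <= maxWords }
}

/// Hypothetical helper: unions the explicit contextual strings with the terms
/// derived from the cleanup field. Explicit entries come first, order is
/// preserved, and duplicates are dropped case-insensitively.
func mergedContextualStrings(explicit: [String], cleanupTerms: String) -> [String] {
    var seen = Set<String>()
    var result: [String] = []
    for term in explicit + candidateTerms(from: cleanupTerms) {
        let trimmed = term.trimmingCharacters(in: .whitespacesAndNewlines)
        guard !trimmed.isEmpty else { continue }
        if seen.insert(trimmed.lowercased()).inserted {
            result.append(trimmed)
        }
    }
    return result
}

// Illustrative inputs: the explicit list stands in for apple.contextualStrings,
// the string stands in for cleanup.terms.
let merged = mergedContextualStrings(
    explicit: ["Gemma", "E2B"],
    cleanupTerms: "E2B is a Gemma variant, gemma, MLX\nCoreML"
)
print(merged)
```

Keeping the union in a pure function at the boundary would leave both settings untouched, so it stays compatible with either UX outcome (one field or two).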

Metadata

Labels: enhancement (New feature or request)
