
feat(aws): add auto language detection and mid-stream language switch…#5435

Merged
tinalenguyen merged 8 commits into livekit:main from cldsime:feature/auto-language-detection
Apr 24, 2026

Conversation

@cldsime
Contributor

@cldsime cldsime commented Apr 13, 2026

Summary

Adds support for Amazon Transcribe's automatic language identification parameters to the livekit-plugins-aws STT plugin, enabling auto language detection and mid-stream language switching without requiring users to manually specify a language_code.

Changes

6 new parameters added to STTOptions and STT.__init__():

  • identify_language — detect the dominant language for the stream
  • identify_multiple_languages — detect language switches mid-stream
  • language_options — comma-separated list of expected language codes (2–12 required)
  • preferred_language — bias detection toward a specific language
  • vocabulary_names — custom vocabularies per language
  • vocabulary_filter_names — vocabulary filters per language

All default to disabled (False / NOT_GIVEN), preserving full backward compatibility.
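To make the language_options constraint concrete, here is a minimal sketch of how a caller might check the comma-separated string before constructing the STT. The helper name parse_language_options is hypothetical and not part of livekit-plugins-aws; the 2 to 12 bound is the one stated in the parameter list above.

```python
def parse_language_options(language_options: str) -> list[str]:
    """Split a comma-separated language_options string and check the count.

    Hypothetical helper for illustration only; not part of the plugin.
    """
    # Tolerate stray whitespace and ignore empty segments such as "en-US,,es-US".
    codes = [code.strip() for code in language_options.split(",") if code.strip()]
    # The PR description states that 2 to 12 language codes are required.
    if not 2 <= len(codes) <= 12:
        raise ValueError(f"expected 2 to 12 language codes, got {len(codes)}")
    return codes
```

Validating on the caller side like this would surface a bad list before a stream is opened, rather than as an error from the service.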

Conditional config building in SpeechStream._run():

language_code and identify_language/identify_multiple_languages are mutually exclusive per the AWS API. The config builder now conditionally sends one or the other:

if self._opts.identify_language:
    live_config["identify_language"] = True
    ...
elif self._opts.identify_multiple_languages:
    live_config["identify_multiple_languages"] = True
    ...
else:
    live_config["language_code"] = self._opts.language
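The branching above can be sketched as a standalone function. This is an illustration under stated assumptions, not the merged code: SketchOptions is a hypothetical stand-in for STTOptions, build_live_config is a hypothetical name, and forwarding language_options and preferred_language inside the identification branches is a guess at what the elided lines do.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchOptions:
    """Hypothetical stand-in for STTOptions, reduced to the fields used here."""

    language: str = "en-US"
    identify_language: bool = False
    identify_multiple_languages: bool = False
    language_options: Optional[str] = None
    preferred_language: Optional[str] = None


def _add_identification_extras(config: dict[str, Any], opts: SketchOptions) -> None:
    # Assumption: these keys accompany either identification mode.
    if opts.language_options is not None:
        config["language_options"] = opts.language_options
    if opts.preferred_language is not None:
        config["preferred_language"] = opts.preferred_language


def build_live_config(opts: SketchOptions) -> dict[str, Any]:
    """Send either an identification flag or language_code, never both."""
    config: dict[str, Any] = {}
    if opts.identify_language:
        config["identify_language"] = True
        _add_identification_extras(config, opts)
    elif opts.identify_multiple_languages:
        config["identify_multiple_languages"] = True
        _add_identification_extras(config, opts)
    else:
        # No identification requested: fall back to the fixed language.
        config["language_code"] = opts.language
    return config
```

With default options the sketch yields only language_code; with identify_multiple_languages=True it yields the flag plus any language options, and language_code is absent.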

Bug fix: filtered_config boolean handling

The original filter {k: v for k, v in live_config.items() if v and is_given(v)} tests each value for truthiness, so it silently drops any option explicitly set to False (and would equally drop a numeric 0). It is replaced with explicit type checking that correctly preserves booleans, numbers, and NOT_GIVEN values.
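The difference between the two filters can be reproduced in isolation. The sketch below is not the merged implementation: NOT_GIVEN and is_given are hypothetical stand-ins for the framework's sentinel and helper, explicit_filter is one possible shape of "explicit type checking", and it assumes that unset entries are simply omitted from the request. The config keys are illustrative.

```python
from typing import Any


class _NotGiven:
    """Hypothetical stand-in for the NOT_GIVEN sentinel (falsy, like an unset value)."""

    def __bool__(self) -> bool:
        return False


NOT_GIVEN = _NotGiven()


def is_given(value: Any) -> bool:
    """Hypothetical stand-in for the framework's is_given helper."""
    return not isinstance(value, _NotGiven)


def truthiness_filter(config: dict[str, Any]) -> dict[str, Any]:
    # The original expression: any falsy value, including False, is dropped.
    return {k: v for k, v in config.items() if v and is_given(v)}


def explicit_filter(config: dict[str, Any]) -> dict[str, Any]:
    kept: dict[str, Any] = {}
    for key, value in config.items():
        if not is_given(value) or value is None:
            continue  # assumption: unset entries are omitted from the request
        if isinstance(value, (bool, int, float)):
            kept[key] = value  # keep False and 0 as explicit settings
        elif value:
            kept[key] = value  # keep non-empty strings and collections
    return kept


config = {
    "language_code": "en-US",
    "enable_partial_results_stabilization": False,
    "vocabulary_name": NOT_GIVEN,
}
```

Here truthiness_filter(config) keeps only language_code, while explicit_filter(config) also keeps the explicit False.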

Usage

# Assumes the plugin's usual import path
from livekit.plugins.aws import STT

# Existing behavior (unchanged)
stt = STT(language="en-US")

# Single-language auto-detection
stt = STT(identify_language=True, language_options="en-US,es-US,fr-FR")

# Multi-language mid-stream switching
stt = STT(
    identify_multiple_languages=True,
    language_options="en-US,es-US,fr-FR,de-DE,ja-JP,ko-KR,zh-HK,pt-BR,hi-IN,vi-VN,pl-PL,ru-RU",
)

Test Results

Tested with identify_multiple_languages=True and 12 language codes. All 12 configured languages were successfully detected across test sessions with mid-stream switching:

Language     Code    Detected
English      en-US   yes
Spanish      es-US   yes
French       fr-FR   yes
German       de-DE   yes
Japanese     ja-JP   yes
Korean       ko-KR   yes
Cantonese    zh-HK   yes
Portuguese   pt-BR   yes
Hindi        hi-IN   yes
Vietnamese   vi-VN   yes
Polish       pl-PL   yes
Russian      ru-RU   yes

Sample output showing mid-stream language switching in a single session:

[FINAL] [en-US] Good afternoon, everyone. Welcome to day 4.
[FINAL] [zh-HK] 你聽緊嘅係SBS電台廣東話節目
[FINAL] [vi-VN] Xin kính chào quý vị và các bạn
[FINAL] [pt-BR] Nós estamos aqui de azul hoje porque azul é a cor mais celebrando
[FINAL] [pl-PL] Dzień dobry, przy mikrofonie Joanna Borkowska Surucić

Backward Compatibility

All new parameters default to disabled. Existing code continues to work without any changes.


@CLAassistant

CLAassistant commented Apr 13, 2026

CLA assistant check
All committers have signed the CLA.

devin-ai-integration[bot]

This comment was marked as resolved.

Member

@tinalenguyen tinalenguyen left a comment


we also now support passing multiple detected languages in SpeechData via the source_languages field, could you pass the detected languages there as well?

Comment thread livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/stt.py
devin-ai-integration[bot]

This comment was marked as resolved.

@cldsime cldsime force-pushed the feature/auto-language-detection branch from 3f16cbc to 2a2b9f5 on April 23, 2026 20:17
@cldsime cldsime closed this Apr 23, 2026
@cldsime cldsime reopened this Apr 23, 2026
Member

@tinalenguyen tinalenguyen left a comment


thanks for iterating!

@tinalenguyen tinalenguyen merged commit cfb853f into livekit:main Apr 24, 2026
34 checks passed