feat: restrict lingua validator to 24 supported languages #145

Merged
MaleicAcid merged 1 commit into zh-plus:master from iautolab:feat/trim-lingua-models
May 22, 2026
@MaleicAcid (Collaborator)

Description:
ChunkedTranslateValidator and AtomicTranslateValidator previously used LanguageDetectorBuilder.from_all_languages(), which pulled all 75 lingua language models (~88 MB) into the binary at build time.
Switch to from_languages() with the 24 languages already defined in defaults.supported_languages_lingua, the same restricted set used by utils.detect_lang(). The 51 unused language model directories (~56 MB) can now be safely removed at packaging time.
Also fix a coincidental name collision: importing Language from lingua shadowed the earlier from langcodes import Language, which broke Language.get(). The langcodes import is now aliased as Langcode and the lingua import as LinguaLanguage.

@MaleicAcid MaleicAcid merged commit 8480c37 into zh-plus:master May 22, 2026
11 checks passed