Releases: FunAudioLLM/Fun-ASR

v1.0.0: Fun-ASR-Nano — 31-Language End-to-End Speech Recognition

25 May 16:40
4d84613

Fun-ASR-Nano v1.0.0

The first official release of Fun-ASR-Nano, a large end-to-end speech recognition model trained on tens of millions of hours of real speech data.

Highlights

  • 31 languages — Chinese (+ 7 dialects, 26 regional accents), English, Japanese, Korean, Vietnamese, Indonesian, Thai, Malay, Filipino, Arabic, Hindi, and 20+ European languages
  • Speaker diarization — identify who spoke when
  • Timestamps — word-level and sentence-level
  • Hotwords — boost recognition of domain-specific terms
  • Lyrics recognition — accurate transcription of vocals over background music
  • vLLM acceleration — 3-5x faster batch inference with WebSocket streaming

Quick Start

```python
from funasr import AutoModel

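model = AutoModel(
    model="FunAudioLLM/Fun-ASR-Nano-2512",
    trust_remote_code=True,  # allow model code shipped with the checkpoint to run
    device="cuda:0",         # first CUDA GPU
    hub="hf",                # download from Hugging Face
)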
result = model.generate(input="audio.wav")
```
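The Quick Start stores the output in `result` without showing its shape. As a minimal sketch, assuming `generate` follows the usual FunASR convention of returning a list with one dict per input, each holding the transcript under a `"text"` key (an assumption, not something these release notes confirm), the text could be pulled out as below. `transcripts` is a hypothetical helper name, and `sample_result` is made-up stand-in data.

```python
from typing import Any


def transcripts(results: list[dict[str, Any]]) -> list[str]:
    """Collect the transcript of every entry in a generate() result.

    Assumes each entry is a dict with a "text" key; entries without
    one yield an empty string instead of raising KeyError.
    """
    return [str(entry.get("text", "")) for entry in results]


# Stand-in for model.generate(input="audio.wav") so the sketch runs
# without a GPU or a model download; real output may carry more keys.
sample_result = [{"key": "audio", "text": "hello world"}]
print(transcripts(sample_result))  # ['hello world']
```

With a loaded model, the same helper would be called as `transcripts(result)`, provided the assumed output shape holds.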

Models

| Model | Languages | Parameters | Download |
|---|---|---|---|
| Fun-ASR-Nano-2512 | Chinese/English/Japanese + dialects | 800M | ModelScope · HuggingFace |
| Fun-ASR-MLT-Nano-2512 | 31 languages | 800M | ModelScope · HuggingFace |
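The table suggests a simple selection rule: use Fun-ASR-Nano-2512 for Chinese, English, or Japanese audio, and Fun-ASR-MLT-Nano-2512 for everything else. The sketch below encodes that rule. The function name, the lowercase language labels, and the `FunAudioLLM/Fun-ASR-MLT-Nano-2512` repository ID (inferred by analogy with the Quick Start ID) are all assumptions.

```python
# Languages covered by Fun-ASR-Nano-2512 according to the Models table.
NANO_LANGUAGES = {"chinese", "english", "japanese"}

NANO_MODEL = "FunAudioLLM/Fun-ASR-Nano-2512"  # ID used in the Quick Start
# Assumed ID, inferred by analogy; check the download links before use.
MLT_MODEL = "FunAudioLLM/Fun-ASR-MLT-Nano-2512"


def pick_model(language: str) -> str:
    """Return the model ID to load for a given language name."""
    if language.strip().lower() in NANO_LANGUAGES:
        return NANO_MODEL
    # Every other language falls through to the 31-language model.
    return MLT_MODEL


print(pick_model("Japanese"))  # FunAudioLLM/Fun-ASR-Nano-2512
print(pick_model("Korean"))    # FunAudioLLM/Fun-ASR-MLT-Nano-2512
```

The helper does not check that the language is actually among the 31 supported ones; anything outside the first set is routed to the multilingual model.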

Links