Releases: FunAudioLLM/Fun-ASR
v1.0.0: Fun-ASR-Nano — 31-Language End-to-End Speech Recognition
Fun-ASR-Nano v1.0.0
The first official release of Fun-ASR-Nano, a large end-to-end speech recognition model trained on tens of millions of hours of real speech data.
Highlights
- 31 languages: Chinese (plus 7 dialects and 26 regional accents), English, Japanese, Korean, Vietnamese, Indonesian, Thai, Malay, Filipino, Arabic, Hindi, and 20+ European languages
- Speaker diarization: identify who spoke when
- Timestamps: word-level and sentence-level
- Hotwords: boost recognition of domain-specific terms
- Lyrics recognition: accurate transcription of vocals over background music
- vLLM acceleration: 3-5x faster batch inference, with WebSocket streaming
Quick Start
```python
from funasr import AutoModel

# Load the model from the Hugging Face hub onto the first GPU.
model = AutoModel(
    model="FunAudioLLM/Fun-ASR-Nano-2512",
    trust_remote_code=True,
    device="cuda:0",
    hub="hf",
)

# Transcribe a local audio file.
result = model.generate(input="audio.wav")
```
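The release notes do not describe the shape of `result`. The sketch below assumes it is a list of dicts that each carry a `"text"` key, as is common for models in the FunASR toolkit. `extract_texts` is a hypothetical helper for illustration, not part of `funasr`, and the sample data stands in for a real `generate` call.

```python
from typing import Any


def extract_texts(result: list[dict[str, Any]]) -> list[str]:
    """Collect the transcript string from each entry of a generate() result.

    Assumption: each entry is a dict with a "text" key. Entries without
    one are skipped rather than raising.
    """
    return [entry["text"].strip() for entry in result if "text" in entry]


# Stand-in for the value returned by model.generate(input="audio.wav").
sample_result = [{"key": "audio", "text": " hello world "}]
print(extract_texts(sample_result))
```

If the real output uses a different structure, only the key lookup inside `extract_texts` should need to change.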
Models
| Model | Languages | Parameters | Download |
|---|---|---|---|
| Fun-ASR-Nano-2512 | Chinese/English/Japanese + dialects | 800M | ModelScope · HuggingFace |
| Fun-ASR-MLT-Nano-2512 | 31 languages | 800M | ModelScope · HuggingFace |
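One way to choose between the two checkpoints in the table can be sketched as follows. `pick_model_id` is a hypothetical helper, and the hub ID used for the multilingual model is an assumption that mirrors the naming of the ID in the Quick Start; check the download links before relying on it.

```python
# Languages the table lists for Fun-ASR-Nano-2512 (dialects fall under Chinese).
NANO_LANGUAGES = {"chinese", "english", "japanese"}

NANO_ID = "FunAudioLLM/Fun-ASR-Nano-2512"
# Assumption: the multilingual model follows the same naming pattern on the hub.
MLT_ID = "FunAudioLLM/Fun-ASR-MLT-Nano-2512"


def pick_model_id(languages: list[str]) -> str:
    """Return the Nano ID if it covers every requested language, else the MLT ID."""
    wanted = {lang.strip().lower() for lang in languages}
    return NANO_ID if wanted <= NANO_LANGUAGES else MLT_ID


print(pick_model_id(["Chinese", "English"]))
print(pick_model_id(["Korean"]))
```

The returned string can be passed as the `model` argument of `AutoModel` in the Quick Start.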
Links
- Documentation: https://www.funasr.com
- vLLM Guide: docs/vllm_guide.md
- FunASR toolkit: https://github.com/modelscope/FunASR