🎙️ KuralHub: A Comprehensive Review of Speech Emotion Recognition (SER) Datasets



🔥 What is KuralHub?

KuralHub is a comprehensive repository that reviews and benchmarks Speech Emotion Recognition (SER) datasets across multiple languages.
It provides detailed metadata, access links, and benchmarks using fine-tuned monolingual models for SER.

📄 Paper: Accepted at Interspeech 2026 (to appear)
🌐 Website: https://aaivu.github.io/KuralHub/


🗂 Survey Organisation

KuralHub/
│── datasets/             # Language-specific datasets
│   ├── english/
│   │   ├── README.md     # Overview of English SER datasets
│   │   ├── ravdess.md    # Dataset-specific details
│   ├── spanish/
│   │   ├── README.md
│   │   ├── dataset1.md
....

📊 SER Datasets Coverage

This survey covers more than 70 languages, 29 of which are benchmarked, and includes both open-source and restricted-access datasets.
If a language has no available dataset, it is marked accordingly.

[Figure: Language Coverage]


🚀 Benchmarks

We fine-tune pre-trained SER models on each dataset individually and report the resulting performance.

Performance by Datasets

[Figure: Model Dataset Performance]

Average Performance by Languages

[Figure: Model Language Performance]
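
As a rough illustration of how per-dataset scores could be rolled up into a per-language average, here is a minimal Python sketch. It assumes an unweighted mean over the datasets in each language; the README does not state the metric or the weighting KuralHub actually uses. The function name, dataset names, and scores are hypothetical placeholders, not reported results.

```python
from collections import defaultdict
from statistics import mean


def average_by_language(results):
    """Average per-dataset scores within each language.

    `results` is an iterable of (language, dataset, score) tuples.
    Assumption: every dataset counts equally (unweighted mean).
    """
    scores = defaultdict(list)
    for language, _dataset, score in results:
        scores[language].append(score)
    # Sort by language name so the output order is stable.
    return {language: mean(values) for language, values in sorted(scores.items())}


# Placeholder values for illustration only; these are NOT KuralHub results.
example_results = [
    ("english", "dataset_a", 0.50),
    ("english", "dataset_b", 0.75),
    ("spanish", "dataset_c", 0.25),
]

for language, score in average_by_language(example_results).items():
    print(f"{language}: {score:.3f}")
```

An unweighted mean treats small and large datasets alike. Weighting by the number of utterances is an equally plausible choice and would need a sample count in each tuple.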


📥 How to Use

  1. Browse Datasets: Navigate to datasets/ for language-specific SER datasets.
  2. Download Datasets: Follow access links in each dataset file.
  3. Run Benchmarks: Check benchmarks/ for model performance.
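
Step 1 can also be done programmatically. The following Python sketch lists the dataset description files found under each language folder. It assumes the layout shown in the Survey Organisation section (one folder per language, one `.md` file per dataset, plus a `README.md` overview). The function name `list_datasets` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_datasets(root):
    """Map each language folder under `root` to its dataset file names.

    Assumes the datasets/<language>/<dataset>.md layout. README.md is
    treated as the language overview and skipped.
    """
    root = Path(root)
    if not root.is_dir():
        return {}
    catalog = {}
    for lang_dir in sorted(root.iterdir()):
        if not lang_dir.is_dir():
            continue  # ignore stray files at the top level
        catalog[lang_dir.name] = sorted(
            path.stem
            for path in lang_dir.glob("*.md")
            if path.name.lower() != "readme.md"
        )
    return catalog


if __name__ == "__main__":
    # Run from the root of a local clone of the repository.
    for language, datasets in list_datasets("datasets").items():
        print(f"{language}: {', '.join(datasets) or 'no dataset files'}")
```

The access links themselves live inside each dataset file, so this only tells you which files to open for step 2.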

🎯 Contribute to KuralHub

💡 Know of a missing dataset? Help us expand KuralHub!
📩 Submit a pull request or open an issue with new datasets.

📖 Contribution Guidelines


📜 Citation

If you use our research findings, please cite the following paper:

Citation details will be finalized once the paper is published.

@inproceedings{kuralhub2026,
  title     = {KuralHub: A Comprehensive Review of Speech Emotion Recognition Datasets},
  author    = {Thavarasa, Luxshan and Thevakumar, Jubeerathan and Sivatheepan, Thanikan and Thayasivam, Uthayasanker},
  booktitle = {Interspeech},
  year      = {2026},
  note      = {To appear}
}

📬 Contact

🏷️ Name | 📧 Email | 🔗 LinkedIn | 📚 Google Scholar
Luxshan Thavarasa | luxshan.20@cse.mrt.ac.lk | LinkedIn |
Jubeerathan Thevakumar | jubeerathan.20@cse.mrt.ac.lk | LinkedIn |
Thanikan Sivatheepan | thanikan.20@cse.mrt.ac.lk | LinkedIn |
Uthayasanker Thayasivam (supervisor, corresponding author) | rtuthaya@cse.mrt.ac.lk | LinkedIn |

All authors are with the Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka.

🙏 Acknowledgment

We would like to thank Dr. Uthayasanker Thayasivam for his guidance as our supervisor, Braveenan Sritharan for his mentorship, and all the dataset owners for making their datasets available to us through open access or upon request. Your support has been invaluable.

About

An extensive collection of Speech Emotion Recognition (SER) datasets across multiple languages, including English, Mandarin, Hindi, Spanish, Tamil, Arabic, and more. Perfect for training emotion detection models in diverse linguistic and cultural contexts.
