Simple Diarization in a Colab

This is a very simple colab that combines a few models from Hugging Face into a basic audio-to-text and diarization script. The audio-to-text step uses different versions of Whisper, including a small one fine-tuned for Swiss German.

I was looking for this functionality online and only found confusing blog posts that explained unnecessary features and somewhat obscure pipelines, so I decided to re-implement it in what I believe is a very simple way.

The colab does 4 things:

1. Lets you upload an audio file.
2. Runs Whisper over the file to get a transcription.
3. Runs an additional model to classify speakers and map them to time intervals.
4. Maps the Whisper transcriptions to the speakers to print a diarization.
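
The last step, mapping transcriptions to speakers, can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes the transcription is available as (start, end, text) tuples and the speaker model's output as (start, end, speaker) tuples, both in seconds. The function names `overlap`, `assign_speakers` and `format_diarization` are hypothetical, and picking the speaker with the largest time overlap is just one plausible matching rule.

```python
from typing import List, Tuple

# Assumed shapes (hypothetical, for illustration only):
Segment = Tuple[float, float, str]  # (start_sec, end_sec, transcribed text)
Turn = Tuple[float, float, str]     # (start_sec, end_sec, speaker label)


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments: List[Segment], turns: List[Turn]) -> List[Tuple[str, str]]:
    """Label each text segment with the speaker whose turn overlaps it most."""
    labeled = []
    for start, end, text in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for t_start, t_end, speaker in turns:
            shared = overlap(start, end, t_start, t_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labeled.append((best_speaker, text.strip()))
    return labeled


def format_diarization(labeled: List[Tuple[str, str]]) -> str:
    """Merge consecutive segments by the same speaker into one line each."""
    merged: List[Tuple[str, str]] = []
    for speaker, text in labeled:
        if merged and merged[-1][0] == speaker:
            merged[-1] = (speaker, merged[-1][1] + " " + text)
        else:
            merged.append((speaker, text))
    return "\n".join(f"{speaker}: {text}" for speaker, text in merged)


if __name__ == "__main__":
    # Made-up example data, not real model output.
    segments = [(0.0, 2.5, " Hello there."), (2.5, 4.0, " How are you?"),
                (4.2, 6.0, " Fine, thanks.")]
    turns = [(0.0, 4.1, "SPEAKER_00"), (4.1, 6.5, "SPEAKER_01")]
    print(format_diarization(assign_speakers(segments, turns)))
```

Segments that overlap no speaker turn are labeled `UNKNOWN` in this sketch rather than being dropped, so no transcribed text is lost.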

Run the colab on a T4 GPU runtime for reasonably decent speed.

Disclaimer

This is not a supported product. It is meant to be a one-off, and it might break if the imported libraries break.
