Deepspeech

This repository contains a TensorFlow implementation of Baidu's Deep Speech architecture, a neural speech-to-text model. It was inspired by Mozilla's implementation.

Architecture

The model is a sequence-to-sequence artificial neural network (ANN) trained with Connectionist Temporal Classification (CTC). It maps an audio file to a sequence of graphemes (characters), e.g. the English alphabet a-z.
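To illustrate how CTC turns per-frame predictions into a grapheme sequence, here is a minimal sketch of greedy CTC decoding: repeated labels are merged, then the blank symbol is dropped. The label layout (`BLANK`, `ALPHABET`) and the function name `ctc_greedy_decode` are assumptions made for this example and are not taken from the repository.

```python
from itertools import groupby

# Hypothetical label set: index 0 is the CTC blank, then space, then a-z.
BLANK = 0
ALPHABET = ["<blank>", " "] + [chr(c) for c in range(ord("a"), ord("z") + 1)]


def ctc_greedy_decode(frame_labels):
    """Collapse a per-frame label path into a grapheme string."""
    # Step 1: merge runs of identical consecutive labels.
    collapsed = [label for label, _ in groupby(frame_labels)]
    # Step 2: drop the blank symbol.
    return "".join(ALPHABET[i] for i in collapsed if i != BLANK)


# Per-frame best labels: t t _ o o _ o _ _
# The blank between the two runs of 'o' keeps them as separate characters.
path = [21, 21, 0, 16, 16, 0, 16, 0, 0]
print(ctc_greedy_decode(path))
```

In practice the per-frame labels would come from taking the argmax of the model's logits at each time step.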

Input

tools.py contains a function that converts a raw audio file sampled at 16 kHz into a sequence of mel-spaced log filterbank features, using a window size of 20 ms and a stride of 10 ms.
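The window and stride translate into sample counts as follows. This sketch only covers the framing arithmetic, not the filterbank computation, and it assumes no padding, so a trailing partial window is dropped; tools.py may handle edges differently. The name `frame_bounds` is hypothetical.

```python
SAMPLE_RATE = 16_000  # Hz, as stated above
WINDOW_MS = 20        # window size in milliseconds
STRIDE_MS = 10        # stride in milliseconds


def frame_bounds(num_samples, sample_rate=SAMPLE_RATE,
                 window_ms=WINDOW_MS, stride_ms=STRIDE_MS):
    """Return (start, end) sample indices for each analysis frame."""
    window = sample_rate * window_ms // 1000  # 320 samples at 16 kHz
    stride = sample_rate * stride_ms // 1000  # 160 samples at 16 kHz
    if num_samples < window:
        return []
    # Assumption: no padding, so only full windows are kept.
    count = 1 + (num_samples - window) // stride
    return [(i * stride, i * stride + window) for i in range(count)]


# One second of audio gives 99 overlapping frames.
bounds = frame_bounds(16_000)
print(len(bounds), bounds[0], bounds[-1])
```

Each frame overlaps the next by half a window, so one second of audio yields roughly 100 feature vectors.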

Model

The model consists of three linear layers followed by a recurrent network, which can optionally be bidirectional and supports a choice of cell type, e.g. GRU, LSTM or RNN. This is followed by another linear layer and, finally, a linear projection that produces the unnormalised log probabilities (logits) used as the CTC input.
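As a rough guide to the layer stack described above, the following sketch traces the per-utterance tensor shapes from input features to logits. It is not the repository's code: the function name, the layer names, and the example sizes are hypothetical, and it assumes a bidirectional layer concatenates its forward and backward outputs (an implementation could sum them instead).

```python
def layer_shapes(time_steps, num_features, hidden_units, num_classes,
                 bidirectional=True):
    """List (layer name, output shape) pairs for one utterance."""
    shapes = [("input", (time_steps, num_features))]
    # Three linear layers applied independently at every time step.
    for i in range(1, 4):
        shapes.append((f"linear_{i}", (time_steps, hidden_units)))
    # Assumption: forward and backward outputs are concatenated.
    rnn_width = hidden_units * 2 if bidirectional else hidden_units
    shapes.append(("recurrent", (time_steps, rnn_width)))
    # One more linear layer, then the projection to the CTC classes.
    shapes.append(("linear_4", (time_steps, hidden_units)))
    shapes.append(("logits", (time_steps, num_classes)))
    return shapes


# Hypothetical sizes: 99 frames, 26 features, 512 hidden units,
# 28 classes (a-z, space, and the CTC blank).
for name, shape in layer_shapes(99, 26, 512, 28):
    print(name, shape)
```

Note that the number of output classes is one more than the number of graphemes, because CTC needs an extra blank label.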

