Training a Speech Recognizer

This template implements a simple speech recognizer trained on mini-librispeech. In particular, it implements an offline end-to-end attention-based speech recognizer. A tokenizer is used to determine the word tokens to estimate. Search relies on beam search coupled with an RNN language model.

Training such a system requires the following steps:

Train a tokenizer. Given the training transcriptions, the tokenizer decides which word pieces to use as targets for training. The most atomic units are characters, and the least atomic units are words. Most of the time, it is convenient to use tokens that fall somewhere between characters and full words. SpeechBrain relies on the popular SentencePiece library for tokenization. To train the tokenizer:

cd Tokenizer
python train.py tokenizer.yaml
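To make the idea of tokens "between characters and full words" concrete, here is a minimal sketch of byte-pair-encoding (BPE) style merging in plain Python. It is only an illustration: it is not the SentencePiece implementation and not what `train.py` runs, and the function names (`learn_bpe_merges`, `apply_merge`, `tokenize`) are hypothetical helpers introduced for this example.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in a list of symbols."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe_merges(words, num_merges):
    """Learn merges by repeatedly fusing the most frequent adjacent pair."""
    # Every word starts as a sequence of characters (the most atomic units).
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for left, right in zip(symbols, symbols[1:]):
                pair_counts[(left, right)] += freq
        if not pair_counts:
            break  # every word is already a single token
        # Break ties on the pair itself so the result is deterministic.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(apply_merge(list(symbols), best))] += freq
        vocab = new_vocab
    return merges


def tokenize(word, merges):
    """Split a word into characters, then replay the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols
```

With zero merges the tokens are single characters. With enough merges, frequent words collapse into single tokens. The number of merges therefore plays a role similar to the vocabulary size chosen for the tokenizer.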

Train an LM. Once we have our target tokens, we can train a language model on top of them. To do so, we need a large text corpus (ideally from the same language domain as your target application). In this example, we simply train the LM on the training transcriptions:

cd ../LM
python train.py RNNLM.yaml

In a real case, training an LM is computationally very demanding. It is thus good practice to reuse an existing LM or to fine-tune one.
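As a rough illustration of what "training an LM on the transcriptions" means, the sketch below fits a count-based bigram model with add-one smoothing over token sequences. This is a deliberately simple stand-in, not the RNN LM that `RNNLM.yaml` configures. The names `train_bigram_lm`, `log_prob`, and `sentence_log_prob` are hypothetical and exist only for this example.

```python
import math
from collections import Counter

BOS, EOS = "<bos>", "<eos>"


def train_bigram_lm(sentences):
    """Count bigrams over tokenized sentences (each a list of tokens)."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = {EOS}  # symbols the model can predict
    for tokens in sentences:
        seq = [BOS] + list(tokens) + [EOS]
        vocab.update(tokens)
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def log_prob(model, prev, cur):
    """Add-one smoothed log P(cur | prev)."""
    context_counts, bigram_counts, vocab = model
    numerator = bigram_counts[(prev, cur)] + 1
    denominator = context_counts[prev] + len(vocab)
    return math.log(numerator / denominator)


def sentence_log_prob(model, tokens):
    """Log-probability of a whole token sequence, including the end marker."""
    seq = [BOS] + list(tokens) + [EOS]
    return sum(log_prob(model, p, c) for p, c in zip(seq, seq[1:]))
```

A model like this assigns a higher score to token orders it has seen in training, which is the property the recognizer exploits later when the LM is combined with beam search.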

Train the speech recognizer. At this point, we can train our speech recognizer. In this case, we use a simple CRDNN model with an autoregressive GRU decoder. An attention mechanism is employed between the encoder and the decoder. The final sequence of words is retrieved with beam search coupled with the RNN LM trained in the previous step. To train the ASR:

cd ../ASR
python train.py train.yaml
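The sketch below shows, in plain Python, how beam search can combine decoder scores with LM scores (often called shallow fusion). It is a simplified illustration under stated assumptions, not SpeechBrain's beam searcher: `am_step` and `lm_step` are hypothetical callables standing in for the attention decoder and the RNN LM, `lm_weight` is an assumed interpolation weight, and details such as length normalization are omitted.

```python
import math

EOS = "<eos>"


def beam_search(am_step, lm_step, beam_size=2, lm_weight=0.5, max_len=10):
    """Return the best (tokens, score) pair found by a simple beam search.

    am_step(prefix) -> dict mapping each next token to its log-probability.
    lm_step(prefix, token) -> LM log-probability of `token` after `prefix`.
    """
    beams = [((), 0.0)]  # (token prefix, accumulated score)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, am_lp in am_step(prefix).items():
                total = score + am_lp + lm_weight * lm_step(prefix, token)
                if token == EOS:
                    finished.append((prefix, total))
                else:
                    candidates.append((prefix + (token,), total))
        if not candidates:
            break
        # Keep only the `beam_size` highest-scoring partial hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    if not finished:
        finished = beams  # nothing emitted EOS within max_len
    return max(finished, key=lambda h: h[1])


# Toy scorers (made-up numbers): the decoder slightly prefers "cut",
# while the LM prefers "cat".
def toy_am(prefix):
    table = [
        {"the": math.log(0.9), "a": math.log(0.1)},
        {"cut": math.log(0.55), "cat": math.log(0.45)},
    ]
    if len(prefix) < len(table):
        return table[len(prefix)]
    return {EOS: 0.0}


def toy_lm(prefix, token):
    likely = ("the", "cat", EOS)
    return math.log(0.9) if token in likely else math.log(0.1)


without_lm = beam_search(toy_am, toy_lm, lm_weight=0.0)
with_lm = beam_search(toy_am, toy_lm, lm_weight=1.0)
```

In this toy setup the hypothesis chosen without the LM ends in "cut", while adding the LM score flips the choice to "cat". This is the kind of correction the LM is meant to provide during decoding.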

This template can help you figure out how to set up SpeechBrain to implement an end-to-end (e2e) speech recognizer. However, in a real case, the system must be trained with much more data to provide acceptable performance. For a competitive recipe with more data, see for instance our recipes on LibriSpeech (https://github.com/speechbrain/speechbrain/tree/develop/recipes/LibriSpeech/ASR).

For more information, please take a look at the "ASR from scratch" tutorial.
