
Template for Speech Recognition

This folder provides a working, well-documented example for training a seq2seq (+ CTC) speech recognizer model from scratch, based on a few hours of data.

There are three files here:

train.py: the main code file, which outlines the entire training process.
train.yaml: the hyperparameters file, which sets all execution parameters.
mini_librispeech_prepare.py: if necessary, downloads the data and prepares the data manifests.

To train the speech recognition model, run the following from the command line:

python train.py train.yaml
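
As a convenience, the command above can also be launched from Python after checking that the three files listed earlier are present. This is a minimal sketch, not part of the template: the helper name `build_train_command` and the `REQUIRED_FILES` tuple are hypothetical and exist only for illustration.

```python
import sys
from pathlib import Path

# File names taken from the file list in this README.
REQUIRED_FILES = ("train.py", "train.yaml", "mini_librispeech_prepare.py")


def build_train_command(folder="."):
    """Return the argv list for training, or raise if a file is missing.

    Hypothetical helper for illustration; it is not shipped with the template.
    """
    folder = Path(folder)
    missing = [name for name in REQUIRED_FILES if not (folder / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {folder}: {', '.join(missing)}")
    # Equivalent to running: python train.py train.yaml
    return [sys.executable, "train.py", "train.yaml"]
```

From inside this folder, the returned list can be passed to `subprocess.run(build_train_command(), check=True)` to start training with the same interpreter that runs the check.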

We assume you have already trained the tokenizer (see ../Tokenizer) and the language model (see ../LM). Training uses the mini-librispeech dataset, with a CRDNN model for encoding and a GRU for decoding. We pre-train with a larger model to ensure convergence (mini-librispeech is too small for training an end-to-end model from scratch). In a real use case, you can skip pre-training and train from scratch on a larger dataset.
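
The "seq2seq (+ CTC)" objective mentioned at the top of this README is often implemented as a weighted sum of the two losses. The sketch below shows that common pattern only. Whether train.py combines the losses exactly this way is an assumption that should be checked against the code, and the names `joint_loss` and `ctc_weight` are hypothetical.

```python
def joint_loss(ctc_loss, seq2seq_loss, ctc_weight=0.5):
    """Interpolate between a CTC loss and a seq2seq loss.

    Illustrative only: ctc_weight is a hypothetical hyperparameter in [0, 1].
    A weight of 0 uses only the seq2seq loss; a weight of 1 uses only CTC.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be between 0 and 1")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * seq2seq_loss
```

In a setup like this, the weight would typically live in the hyperparameters file so it can be tuned without touching the training code.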
