This folder contains the scripts to train a tokenizer using SentencePiece (https://github.com/google/sentencepiece). The tokenizer is trained on the training transcriptions.
To train the tokenizer, run:
python train.py tokenizer.yaml
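
For readers who want to see what training a SentencePiece tokenizer on transcriptions roughly involves, here is a minimal sketch using the `sentencepiece` Python package directly. It is not the contents of `train.py` or `tokenizer.yaml`, which are not shown here. The helper names (`write_transcripts`, `build_trainer_args`, `train_tokenizer`), the file names, and the vocabulary size, model type, and character coverage are all assumptions chosen for illustration.

```python
from pathlib import Path


def write_transcripts(transcripts, text_path):
    """Write one transcription per line, the plain-text input SentencePiece reads."""
    lines = [t.strip() for t in transcripts if t.strip()]
    Path(text_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def build_trainer_args(text_path, model_prefix, vocab_size=1000, model_type="unigram"):
    """Collect keyword arguments for SentencePieceTrainer.train.

    The default values are illustrative assumptions, not the recipe's settings.
    """
    return {
        "input": str(text_path),
        "model_prefix": model_prefix,
        "vocab_size": vocab_size,
        "model_type": model_type,
        "character_coverage": 1.0,
    }


def train_tokenizer(transcripts, workdir, **kwargs):
    """Train a model on the given transcriptions and return a loaded processor."""
    # Imported lazily so the helpers above work without the package installed.
    import sentencepiece as spm

    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    text_path = workdir / "train.txt"  # hypothetical file name
    write_transcripts(transcripts, text_path)

    args = build_trainer_args(text_path, str(workdir / "tokenizer"), **kwargs)
    spm.SentencePieceTrainer.train(**args)
    return spm.SentencePieceProcessor(model_file=args["model_prefix"] + ".model")
```

With a reasonably large list of transcriptions, something like `train_tokenizer(transcripts, "out", vocab_size=500)` should produce `out/tokenizer.model` and `out/tokenizer.vocab`. SentencePiece rejects a vocabulary size that is too large for the amount of training text, so the value has to be adjusted to the data.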