Open
Labels
bug (Something isn't working)
Description
Describe the bug
I am trying to retrain an ASR model on LibriSpeech from scratch with this recipe: /speechbrain/templates/speech_recognition/ASR.
We are training on an A100 GPU, and even with a low batch size of 8 the model consumes a very large amount of memory. Could you advise why this might be happening?
We are also observing that training does not seem to progress: the train loss does not change, and CER and WER remain very high. Any insights into what could be causing this?
Thank you.
I attach the ASR configuration: train.yaml.
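If it helps with triage, here is a minimal, recipe-independent sketch of the kind of check that rules out one common cause of a frozen train loss: parameters that never receive gradients. This is plain PyTorch on a toy model; `find_dead_params` is a hypothetical helper I wrote for debugging, not part of SpeechBrain or the recipe.

```python
import torch

def find_dead_params(model):
    """Return names of trainable parameters with a missing or all-zero gradient
    after a backward pass. A non-empty result after a real training step
    suggests part of the model is disconnected from the loss."""
    dead = []
    for name, p in model.named_parameters():
        if p.requires_grad and (p.grad is None or p.grad.abs().sum().item() == 0):
            dead.append(name)
    return dead

# Toy example: one backward pass through a linear layer on random data.
model = torch.nn.Linear(4, 2)
loss = model(torch.randn(3, 4)).sum()
loss.backward()
print(find_dead_params(model))  # an empty list means every parameter got a gradient
```

Running the same check on the recipe's model right after the first `backward()` would show whether the encoder/decoder parameters are actually being updated.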
Expected behaviour
The model should train normally on LibriSpeech from scratch, with train loss gradually decreasing and CER/WER improving over epochs, without consuming excessive GPU memory even at batch size 8.
To Reproduce
No response
Environment Details
No response
Relevant Log Output
Additional Context
No response