Modified conformer warmup by asumagic · Pull Request #2566 · speechbrain/speechbrain

asumagic · 2024-06-06T13:43:34Z

What does this PR do?

Very simple experiment to try switching away from Noam scheduling with warmup to a layer-wise skip mechanism for warmup inspired by k2 (as explained in https://medium.com/@nadirapovey/next-gen-kaldi-reworked-conformer-model-8a3828f364af).

In theory, this might allow initial convergence to happen much earlier in training.

Will turn into a proper PR if this works well.
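For readers unfamiliar with the idea, here is a minimal sketch of what a layer-wise skip warmup can look like. It is not the code in this PR or in k2: the function names, the linear ramp, and the `alpha_min=0.1` starting weight are all assumptions for illustration, and plain Python lists stand in for tensors. Early in training each layer's output is mostly replaced by its input, and the layer's own contribution is blended in as training progresses.

```python
def warmup_alpha(step, warmup_steps, alpha_min=0.1):
    """Blend weight for a layer's output (hypothetical helper).

    Ramps linearly from alpha_min at step 0 to 1.0 at warmup_steps,
    then stays at 1.0. The linear shape and alpha_min are assumptions.
    """
    if warmup_steps <= 0:
        return 1.0
    progress = min(max(step / warmup_steps, 0.0), 1.0)
    return alpha_min + (1.0 - alpha_min) * progress


def blend_with_skip(layer_input, layer_output, alpha):
    """Mix a layer's output with its input.

    alpha == 1.0 gives the plain layer output; smaller values let the
    input bypass ("skip") the layer to a larger degree.
    """
    return [
        alpha * out + (1.0 - alpha) * inp
        for inp, out in zip(layer_input, layer_output)
    ]


def toy_layer(x):
    # Stand-in for an encoder block; any function of x works here.
    return [2.0 * v + 1.0 for v in x]


if __name__ == "__main__":
    x = [1.0, -2.0, 0.5]
    for step in (0, 500, 1000, 5000):
        alpha = warmup_alpha(step, warmup_steps=1000)
        y = blend_with_skip(x, toy_layer(x), alpha)
        print(step, round(alpha, 3), [round(v, 3) for v in y])
```

Unlike Noam scheduling, nothing here touches the learning rate: the warmup acts on the forward pass of each layer, so it can in principle be combined with any LR schedule.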

Before submitting

Did you read the contributor guideline?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Does your code adhere to project-specific code style and conventions?

PR review

Reviewer checklist

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified
Confirm that the changes adhere to compatibility requirements (e.g., Python version, platform)
Review the self-review checklist to ensure the code is ready for review

mravanelli · 2024-06-17T15:58:47Z

@asumagic, what stage are we at with this? It looks like some tests are failing.

mravanelli · 2024-06-17T16:20:25Z

@asumagic, what is the status of this PR?

Use modified Conformer warmup (from k2) in LibriSpeech RNN-T

asumagic · 2024-11-05T09:53:09Z

Putting on hold, may or may not pick it up again after other changes that warrant retraining the model.

The approach is generic enough that it probably doesn't need to be implemented at the Conformer level; it could live at the TransformerASR level instead. Also, should it maybe be renamed away from "scheduler", since this is not an LR scheduler?
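To make the "generic" point concrete, here is a hedged sketch of a wrapper that applies the skip blend around any callable layer, independent of the layer type. The class name `LayerSkipWarmup`, its attributes, and the linear ramp are hypothetical and are not SpeechBrain or k2 APIs; plain Python lists stand in for tensors.

```python
class LayerSkipWarmup:
    """Wraps any callable layer and blends its output with its input.

    Hypothetical illustration: the wrapper owns a step counter and a
    blend weight, and never touches the optimizer or learning rate.
    """

    def __init__(self, layer, warmup_steps, alpha_min=0.1):
        self.layer = layer
        self.warmup_steps = warmup_steps
        self.alpha_min = alpha_min  # assumed starting blend weight
        self.step = 0
        self.training = True

    def alpha(self):
        # Outside training (or with warmup disabled), use the plain layer.
        if not self.training or self.warmup_steps <= 0:
            return 1.0
        progress = min(self.step / self.warmup_steps, 1.0)
        return self.alpha_min + (1.0 - self.alpha_min) * progress

    def advance(self):
        # Call once per optimizer step.
        self.step += 1

    def __call__(self, x):
        a = self.alpha()
        out = self.layer(x)
        return [a * o + (1.0 - a) * i for i, o in zip(x, out)]


if __name__ == "__main__":
    # Wrap every layer of a toy stack; the layers can be of any kind.
    stack = [
        LayerSkipWarmup(lambda x: [2.0 * v for v in x], warmup_steps=10)
        for _ in range(3)
    ]
    h = [1.0, 0.5]
    for wrapped in stack:
        h = wrapped(h)
    print(h)
```

Because the wrapper only needs a callable and a step count, it could sit around the layers of any encoder stack, which is one argument for a name that does not include "scheduler".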

mravanelli requested a review from TParcollet June 17, 2024 16:20

mravanelli assigned asumagic Jun 17, 2024

mravanelli added the enhancement label Jun 17, 2024

Implement modified conformer warmup

ca93780

asumagic force-pushed the modified-conformer-warmup branch from 277f653 to ca93780 November 5, 2024 09:21

Document the warmup scheduler

aee71b4

asumagic force-pushed the modified-conformer-warmup branch from 22ac4d6 to aee71b4 November 5, 2024 09:30

asumagic added the on hold label Nov 5, 2024

asumagic wants to merge 2 commits into speechbrain:develop from asumagic:modified-conformer-warmup
