Modified conformer warmup #2566

Draft
asumagic wants to merge 2 commits into speechbrain:develop from asumagic:modified-conformer-warmup

Conversation

@asumagic (Collaborator) commented Jun 6, 2024

What does this PR do?

A very simple experiment: switch from Noam scheduling with warmup to a layer-wise skip mechanism for warmup, inspired by k2 (as explained in https://medium.com/@nadirapovey/next-gen-kaldi-reworked-conformer-model-8a3828f364af).

In theory, this might allow initial convergence to happen much earlier in training.
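To make the contrast concrete, here is a minimal sketch in plain Python, not the code in this PR. `noam_lr` follows the standard Noam formula, normalized so the peak equals `base_lr`. `bypass_scale`, `blend`, the linear ramp, and the `floor=0.1` default are hypothetical names and values chosen for illustration; the exact ramp used by k2 or by this branch may differ.

```python
def noam_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Noam schedule, normalized so the peak (at step == warmup_steps) is base_lr."""
    step = max(step, 1)  # avoid 0 ** -0.5
    scale = warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
    return base_lr * scale


def bypass_scale(step: int, warmup_steps: int, floor: float = 0.1) -> float:
    """Hypothetical ramp: fraction of each layer's output kept at this step.

    Starts at `floor` and grows linearly to 1.0 over `warmup_steps`
    (assumed shape, for illustration only).
    """
    return min(1.0, floor + (1.0 - floor) * step / warmup_steps)


def blend(layer_in: float, layer_out: float, alpha: float) -> float:
    """Mix a layer's output with its input; alpha == 1.0 means fully active."""
    return alpha * layer_out + (1.0 - alpha) * layer_in


if __name__ == "__main__":
    # Noam warms up the learning rate; the skip mechanism instead warms up
    # how much each layer contributes, leaving the learning rate untouched.
    for step in (1, 50, 100, 400):
        print(step, noam_lr(step, 1e-3, 100), bypass_scale(step, 100))
```

With the Noam schedule, early updates are small because the learning rate itself is small. With the skip mechanism, early layers behave close to an identity mapping, so the optimizer can in principle start with a larger learning rate.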

Will turn into a proper PR if this works well.

Before submitting
  • Did you read the contributor guideline?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Does your code adhere to project-specific code style and conventions?

PR review

Reviewer checklist
  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified
  • Confirm that the changes adhere to compatibility requirements (e.g., Python version, platform)
  • Review the self-review checklist to ensure the code is ready for review

@mravanelli (Collaborator)

@asumagic, what stage are we at with this? It looks like some tests are failing.

@mravanelli requested a review from TParcollet June 17, 2024 16:20
@mravanelli (Collaborator)

@asumagic, what is the status of this PR?

@mravanelli added the enhancement (New feature or request) label Jun 17, 2024
Use modified Conformer warmup (from k2) in LibriSpeech RNN-T
@asumagic force-pushed the modified-conformer-warmup branch from 277f653 to ca93780 November 5, 2024 09:21
@asumagic force-pushed the modified-conformer-warmup branch from 22ac4d6 to aee71b4 November 5, 2024 09:30
@asumagic (Collaborator, Author) commented Nov 5, 2024

Putting this on hold; I may or may not pick it up again after other changes that warrant retraining the model.

The approach is generic enough that it probably doesn't need to be implemented at the Conformer level (rather, at the TransformerASR level). Also, maybe it should be renamed away from "scheduler", since this is not an LR scheduler?
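As a rough illustration of why the mechanism is not tied to a particular layer type, here is a sketch of a wrapper that works on any callable layer and keeps its own step counter, separate from the optimizer. `WarmupBypass`, `advance`, and the linear ramp with `floor=0.1` are hypothetical and are not names from SpeechBrain or this branch.

```python
from typing import Callable, List

Vector = List[float]


class WarmupBypass:
    """Hypothetical wrapper: blends any layer's output with its input during warmup.

    It does not touch the learning rate, so it is not an LR scheduler.
    """

    def __init__(
        self,
        layer: Callable[[Vector], Vector],
        warmup_steps: int,
        floor: float = 0.1,  # assumed starting contribution of the layer
    ) -> None:
        self.layer = layer
        self.warmup_steps = warmup_steps
        self.floor = floor
        self.step = 0  # advanced by the training loop

    def advance(self) -> None:
        """Call once per training step."""
        self.step += 1

    def alpha(self) -> float:
        """Current fraction of the layer output that is kept (linear ramp)."""
        progress = min(1.0, self.step / self.warmup_steps)
        return self.floor + (1.0 - self.floor) * progress

    def __call__(self, x: Vector) -> Vector:
        a = self.alpha()
        y = self.layer(x)
        # Residual-style mix: mostly the input early on, the full output later.
        return [a * yi + (1.0 - a) * xi for xi, yi in zip(x, y)]


if __name__ == "__main__":
    wrapped = WarmupBypass(lambda v: [2.0 * t for t in v], warmup_steps=10)
    print(wrapped([1.0]))  # early in warmup: close to the input
    for _ in range(10):
        wrapped.advance()
    print(wrapped([1.0]))  # after warmup: the plain layer output
```

Because the wrapper only needs a callable and a step count, the same idea could sit around encoder layers of any Transformer-style model, which is the point made above about placing it at a more generic level.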

Labels

enhancement (New feature or request), on hold
