[Gemma Embedding] Fix SWA
#40700
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Alternatively, we could change the
Edit: See https://github.com/vasqu/transformers/pull/1/files
Nice! Thanks for the fix! Indeed, I'd rather change the flag in the init if you don't mind, so we don't have to change the sdpa integration! 🤗
Title changed from "[Gemma Embedding] Fix Flash Attention usage" to "[Gemma Embedding] Fix SWA"
[For maintainers] Suggested jobs to run (before merge)
run-slow: gemma3, gemma3n
LGTM, thanks a lot!
* fix gemma embedding flash attention
* fix sdpa
* fix atttempt number 2
* alternative gemma fix
* fix modular
The issue at hand
(SW-1, SW // 2) instead of the intended (SW // 2, SW // 2)
(SW-1, SW-1)
The fix
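To make the window tuples from the issue description concrete, here is a minimal sketch (not the PR's code) of what a bidirectional sliding-window mask looks like for different `(left, right)` sizes. It assumes an inclusive convention where query `i` may attend to key `j` iff `i - left <= j <= i + right`; the helper names, `SW = 4`, and `seq_len = 8` are hypothetical values chosen for illustration.

```python
def sliding_window_mask(seq_len, left, right):
    """Boolean mask: query i may attend to key j iff i - left <= j <= i + right.

    The inclusive (left, right) convention is an assumption for this sketch.
    """
    return [
        [(i - left) <= j <= (i + right) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def render(mask):
    """Render the mask as text: 'x' = attended, '.' = masked out."""
    return "\n".join("".join("x" if v else "." for v in row) for row in mask)


SW = 4       # hypothetical sliding window size
seq_len = 8  # hypothetical sequence length

# Asymmetric window described in the issue vs. the intended symmetric one.
buggy = sliding_window_mask(seq_len, SW - 1, SW // 2)
intended = sliding_window_mask(seq_len, SW // 2, SW // 2)

print("(SW-1, SW // 2):")
print(render(buggy))
print()
print("(SW // 2, SW // 2):")
print(render(intended))
```

Under this convention the `(SW // 2, SW // 2)` mask is symmetric around the diagonal, while `(SW-1, SW // 2)` lets each query see further to the left than to the right.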
Scripts
As a sanity check, use the following script:
For visualization:
And insert/debug right after the masks in the models with
Before fix:


After fix:
cc @Cyrilvallez @tomaarsen