Fix: Quantized Gemma3 was missing Sliding Window attention and GELU activations #3326
Open
DrJesseGlass wants to merge 3 commits into huggingface:main from
Conversation
Changes the quantized Gemma3 MLP activation from SiLU to GELU to match the model's `hidden_activation: gelu_pytorch_tanh` config.

Issue #3299 reported strange behavior in quantized_gemma3, which was narrowed down to sliding window attention: the existing quantized_gemma3 implementation did not implement it at all. This PR ports the sliding-window cache design from the working gemma3 implementation into quantized_gemma3, and the result has been verified to work.
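For reference, a minimal standalone sketch (plain Rust, not the candle code in this PR) of the two behaviors involved: the tanh-approximated GELU used by `gelu_pytorch_tanh`, and a causal mask restricted to a sliding window. The function names, the window semantics (a token attends to itself and the previous `window - 1` positions), and the boolean-mask representation are illustrative assumptions, not the actual API.

```rust
// Sketch only: illustrates the activations and the sliding-window causal mask,
// not the candle implementation.

/// gelu_pytorch_tanh: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
fn gelu_tanh(x: f32) -> f32 {
    let c = (2.0_f32 / std::f32::consts::PI).sqrt();
    0.5 * x * (1.0 + (c * (x + 0.044715 * x * x * x)).tanh())
}

/// SiLU (what the quantized model used before this fix): x * sigmoid(x)
fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

/// Causal mask limited to a sliding window: position i may attend to position j
/// only if j <= i and i - j < window. `true` means "masked out".
fn sliding_window_mask(seq_len: usize, window: usize) -> Vec<Vec<bool>> {
    (0..seq_len)
        .map(|i| (0..seq_len).map(|j| j > i || i - j >= window).collect())
        .collect()
}

fn main() {
    // The two activations diverge noticeably for negative inputs.
    for x in [-2.0_f32, -0.5, 0.5, 2.0] {
        println!("x = {x:5.2}  silu = {:7.4}  gelu_tanh = {:7.4}", silu(x), gelu_tanh(x));
    }
    // With a window of 3, token 4 can see tokens 2..=4 but not 0 or 1.
    for row in sliding_window_mask(6, 3) {
        let line: String = row.iter().map(|&m| if m { 'x' } else { '.' }).collect();
        println!("{line}");
    }
}
```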
Impact
The SiLU-to-GELU change gives an immediate improvement in output quality for models that inherit from Gemma3, notably TranslateGemma. With SiLU, TranslateGemma produced partially untranslated output (English tokens leaking through); with GELU, translations are fully in the target language.
Testing
Tested with TranslateGemma-4b-it (Q4_K_M quantization):