Update Hugging Face LLM example for Gemma 4 #1523

Open
bbrianxiao wants to merge 1 commit into google-deepmind:master from bbrianxiao:hf_gemma4_update

@bbrianxiao

Summary

Updates the Hugging Face LLM fine-tuning example to support Gemma 4 E2B IT.

Main changes:

  • Switches the model to google/gemma-4-E2B-it
  • Updates Hugging Face dependencies for Gemma 4 compatibility
  • Uses QLoRA with 4-bit quantization
  • Sets use_reentrant=False for gradient checkpointing
  • Freezes non-language Gemma 4 modules for text-only fine-tuning
  • Uses target_modules="all-linear" for LoRA to avoid Gemma 4 wrapper-module incompatibility
  • Adds EOS tokens to response labels during tokenization
  • Fixes inference to slice generated tokens instead of slicing decoded strings
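The last two items in the list above can be illustrated without loading a model. This is a minimal sketch, not the notebook's code: `build_example`, `new_tokens`, and the toy token IDs are hypothetical names and values, and masking the prompt with `-100` (the default ignore index of PyTorch's cross-entropy loss) is an assumption about how the labels are built.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # assumed mask value; PyTorch cross-entropy skips it by default


def build_example(prompt_ids: List[int], response_ids: List[int],
                  eos_id: int) -> Tuple[List[int], List[int]]:
    """Return (input_ids, labels) for one prompt/response pair.

    Prompt positions are masked out; the response plus a trailing EOS are
    kept as targets, so the model is also trained to stop after its answer.
    """
    if not response_ids or response_ids[-1] != eos_id:
        response_ids = response_ids + [eos_id]  # append EOS only if missing
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return input_ids, labels


def new_tokens(output_ids: List[int], prompt_len: int) -> List[int]:
    """Keep only the token IDs produced after the prompt."""
    return output_ids[prompt_len:]


# Toy IDs: prompt = [5, 6, 7], response = [8, 9], EOS = 1 (all hypothetical).
input_ids, labels = build_example([5, 6, 7], [8, 9], eos_id=1)

# A generate()-style output holds the prompt tokens followed by the new ones.
output_ids = [5, 6, 7, 8, 9, 1]
move_ids = new_tokens(output_ids, prompt_len=3)
```

Slicing by token count is the safer choice because decoded text does not always reproduce the prompt string character for character (special tokens and whitespace handling can differ), so cutting the decoded string at `len(prompt)` can land in the wrong place.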

Motivation

The existing Hugging Face example was written for an earlier Gemma model. Gemma 4 has a different architecture with multimodal components, so the example needs a few updates to work reliably for text-only game-playing fine-tuning.

This keeps the OpenSpiel/MCTS data generation flow unchanged, while updating the Hugging Face training path for Gemma 4.

Testing

Tested the notebook in Colab using an H100 GPU.

The QLoRA setup successfully loaded google/gemma-4-E2B-it, attached LoRA adapters, and trained with a small fraction of trainable parameters:

trainable params: 37,920,768
all params: 5,142,218,272
trainable%: 0.7374
model loaded. first param device: cuda:0
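As a quick sanity check, the reported percentage can be recomputed from the two parameter counts in the log above. The variable names are illustrative only.

```python
trainable_params = 37_920_768
all_params = 5_142_218_272

# Share of parameters that receive gradients, as a percentage.
trainable_pct = 100 * trainable_params / all_params
print(f"trainable%: {trainable_pct:.4f}")  # trainable%: 0.7374
```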

@google-cla

google-cla Bot commented Apr 26, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@bbrianxiao bbrianxiao marked this pull request as draft April 26, 2026 05:22
@bbrianxiao bbrianxiao marked this pull request as ready for review April 26, 2026 05:22
@lanctot
Collaborator

lanctot commented Apr 26, 2026

Thanks!
