Update Hugging Face LLM example for Gemma 4 by bbrianxiao · Pull Request #1523 · google-deepmind/open_spiel

bbrianxiao · 2026-04-26T05:13:14Z

Summary

Updates the Hugging Face LLM fine-tuning example to support Gemma 4 E2B IT.

Main changes:

Switches the model to google/gemma-4-E2B-it
Updates Hugging Face dependencies for Gemma 4 compatibility
Uses QLoRA with 4-bit quantization
Sets use_reentrant=False for gradient checkpointing
Freezes non-language Gemma 4 modules for text-only fine-tuning
Uses target_modules="all-linear" for LoRA to avoid Gemma 4 wrapper-module incompatibility
Adds EOS tokens to response labels during tokenization
Fixes inference to slice generated tokens instead of slicing decoded strings
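
The last two items can be illustrated without any model. The following is a minimal, library-free sketch and not the notebook's actual code: the toy vocabulary, the `decode` helper, and the names `build_example` and `new_tokens` are hypothetical, and masking prompt positions with -100 (the default `ignore_index` of PyTorch's cross-entropy loss) is an assumption about how labels are built. It shows EOS being appended to the supervised labels, and why slicing the generated token ids is safer than slicing the decoded string by the prompt's character length.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # assumed prompt mask; default ignore_index of PyTorch cross-entropy

# Hypothetical toy vocabulary standing in for a real tokenizer.
VOCAB = {0: "<eos>", 1: "Board:", 2: "x.o", 3: "Move:", 4: "b2"}


def decode(ids: List[int]) -> str:
    """Toy decoder: joins tokens with single spaces, so the original whitespace is lost."""
    return " ".join(VOCAB[i] for i in ids)


def build_example(prompt_ids: List[int], response_ids: List[int],
                  eos_id: int) -> Tuple[List[int], List[int]]:
    """Append EOS to the response so the model is also trained to stop."""
    input_ids = prompt_ids + response_ids + [eos_id]
    # Only the response tokens and the EOS token contribute to the loss.
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids + [eos_id]
    return input_ids, labels


def new_tokens(output_ids: List[int], prompt_len: int) -> List[int]:
    """Keep only the tokens produced after the prompt."""
    return output_ids[prompt_len:]


prompt_text = "Board:   x.o\n\nMove:"  # original prompt with irregular whitespace
prompt_ids = [1, 2, 3]
output_ids = prompt_ids + [4, 0]       # pretend the model generated "b2" then EOS

input_ids, labels = build_example(prompt_ids, [4], eos_id=0)

# Slicing the decoded string assumes decode(prompt_ids) == prompt_text, which fails here.
by_string = decode(output_ids)[len(prompt_text):]
# Slicing token ids does not depend on how the prompt round-trips through decoding.
by_tokens = decode(new_tokens(output_ids, len(prompt_ids)))

print(repr(by_string))
print(repr(by_tokens))
```

In this toy case the string slice drops the move entirely because the decoded prompt is shorter than the original prompt text, while the token slice returns the move followed by the EOS marker.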

Motivation

The existing Hugging Face example was written for an earlier Gemma model. Gemma 4 has a different architecture with multimodal components, so the example needs a few updates to work reliably for text-only game-playing fine-tuning.

This keeps the OpenSpiel/MCTS data generation flow unchanged, while updating the Hugging Face training path for Gemma 4.

Testing

Tested the notebook in Colab using an H100 GPU.

The QLoRA setup successfully loaded google/gemma-4-E2B-it, attached LoRA adapters, and trained with a small fraction of trainable parameters:

trainable params: 37,920,768
all params: 5,142,218,272
trainable%: 0.7374
model loaded. first param device: cuda:0
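
As a quick sanity check on the log above, the reported percentage can be recomputed from the two parameter counts. This is only arithmetic on the figures quoted in the log; the variable names are illustrative.

```python
# Parameter counts copied from the log above.
trainable_params = 37_920_768
all_params = 5_142_218_272

# Share of parameters updated during fine-tuning, as a percentage.
trainable_pct = 100 * trainable_params / all_params
print(f"trainable%: {trainable_pct:.4f}")
```

The result matches the `trainable%: 0.7374` line, so well under 1% of the weights are updated during fine-tuning.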

google-cla · 2026-04-26T05:13:19Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

lanctot · 2026-04-26T12:13:43Z

Thanks!

Update Hugging Face LLM example for Gemma 4

ce7ce88

bbrianxiao force-pushed the hf_gemma4_update branch from 865141b to ce7ce88 · April 26, 2026 05:21

bbrianxiao marked this pull request as draft April 26, 2026 05:22

bbrianxiao marked this pull request as ready for review April 26, 2026 05:22

bbrianxiao wants to merge 1 commit into google-deepmind:master from bbrianxiao:hf_gemma4_update
