GRPO Trainer#1020

Open
michaelbenayoun wants to merge 84 commits into main from grpo

@michaelbenayoun (Member) commented Nov 4, 2025

What does this PR do?

This PR adds partial support for GRPO.

It was broken down into smaller PRs:

It adds the NeuronGRPOTrainer, along with a set of optimizations and modifications for the Torch XLA backend used to run on Trainium instances. Several core features are still missing:

  • Integration with vLLM: we use a custom CPU vLLM hack for now. The plan is to work on the vLLM part in a separate PR.
  • Weight synchronization between NeuronGRPOTrainer and vLLM
  • Tensor parallelism support

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions

This PR is stale because it has been open 15 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Jan 21, 2026
@github-actions

This PR was closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this Jan 26, 2026
@github-actions github-actions bot removed the Stale label Jan 31, 2026
Comment on lines +59 to +63
if not self.experimental:
    raise ValueError(
        "NeuronGRPOTrainer is experimental and not production-ready. To proceed, set `experimental=True` in "
        "your NeuronGRPOConfig. This flag exists to ensure users are aware of the current state of the implementation."
    )
Member Author

For now, we disable access to the NeuronGRPOTrainer unless the `experimental` flag is set.
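
To illustrate the opt-in guard discussed in this review comment, here is a minimal, self-contained sketch. `SketchGRPOConfig` and `try_build` are hypothetical names used only for illustration. The `False` default and placing the check in `__post_init__` are assumptions; the diff only shows that the check reads `self.experimental` and raises a `ValueError`.

```python
from dataclasses import dataclass


@dataclass
class SketchGRPOConfig:
    """Hypothetical stand-in for NeuronGRPOConfig; only the opt-in flag is modeled."""

    experimental: bool = False  # assumed default: the user must opt in explicitly

    def __post_init__(self):
        # Fail fast unless the user acknowledges the experimental status.
        if not self.experimental:
            raise ValueError(
                "NeuronGRPOTrainer is experimental and not production-ready. "
                "To proceed, set `experimental=True` in your NeuronGRPOConfig."
            )


def try_build(**kwargs):
    """Return the config, or the error message if the guard rejects it."""
    try:
        return SketchGRPOConfig(**kwargs)
    except ValueError as err:
        return str(err)
```

With this sketch, `try_build()` returns the error message, while `try_build(experimental=True)` returns a config instance, so the trainer cannot be used without an explicit acknowledgement.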

@michaelbenayoun michaelbenayoun marked this pull request as ready for review February 4, 2026 17:16