Pull requests: huggingface/trl

Pull requests list

Fix test_train_with_chat_template_kwargs
#4971 opened Feb 4, 2026 by qgallouedec
Revert change in GRPO from NeMo-Gym Integration
#4970 opened Feb 4, 2026 by qgallouedec
Deprecate parameters in DPOConfig
#4969 opened Feb 4, 2026 by qgallouedec
perf: Qwen SAPO loss optimization
#4956 opened Feb 3, 2026 by casinca
2 of 5 tasks
fix: add gradient checkpointing to PolicyAndValueWrapper
#4955 opened Feb 3, 2026 by lvhungdev
3 of 5 tasks
Fix ZeRO-3 + PEFT + gradient checkpointing
#4951 opened Feb 2, 2026 by qgallouedec
OpenEnv clients async support update
#4949 opened Feb 2, 2026 by sergiopaniego
5 tasks
[Experimental] Add SDFT trainer, config, docs, and tests
#4941 opened Jan 31, 2026 by Shekswess
4 of 5 tasks
Update RewardFunc type to use RewardCallable protocol
#4938 opened Jan 31, 2026 by amit9oct
2 of 5 tasks
Expose generation index to tool callables in GRPOTrainer
#4894 opened Jan 25, 2026 by lukehinds
4 tasks done
Upgrade GitHub Actions to latest versions
#4893 opened Jan 24, 2026 by salmanmkc
[GRPO] feat: Geometric Sequence Masking
#4891 opened Jan 24, 2026 by LeonEricsson
5 tasks
Fix GRPO tool calling
#4890 opened Jan 23, 2026 by akshayballal95
2 tasks done