generated from fastai/nbdev_template
Pull requests: huggingface/trl
Add num_generations_eval parameter for efficient evaluation
#4458
opened Nov 5, 2025 by
mingxuetian
Prevent upcasting layers in prepare_model_for_kbit_training
#4457
opened Nov 5, 2025 by
sergiopaniego
5 tasks
[GFPO] fix the GFPO loss calculation error caused by unmodified old_per_token_logps
#4454
opened Nov 5, 2025 by
Peter-Chou
2 tasks done
added 10 papers (+trainer cross-links) for #4407
#4441
opened Nov 3, 2025 by
SSusantAchary
4 tasks done
docs: add KTO (2402.01306) to Paper Index + link ref to KTOTrainer
#4440
opened Nov 3, 2025 by
SSusantAchary
refactor: Move judges to experimental submodule
#4439
opened Nov 3, 2025 by
behroozazarkhalili
refactor: Move Mergekit integration to experimental submodule
#4438
opened Nov 3, 2025 by
behroozazarkhalili
docs: Unify model examples to use trl-lib namespace
#4431
opened Nov 2, 2025 by
behroozazarkhalili
docs: Add PEFT subsection to reducing memory usage guide
#4430
opened Nov 2, 2025 by
behroozazarkhalili
docs: Expand speeding up training guide with acceleration methods
#4428
opened Nov 2, 2025 by
behroozazarkhalili
docs: Expand training customization examples
#4427
opened Nov 2, 2025 by
behroozazarkhalili
4 tasks done
Replace flash attention2 with kernels-community/flash-attn2
#4426
opened Nov 2, 2025 by
tamoghnokandar
4 of 5 tasks
docs: Extend CLI basic usage examples to all supported CLIs
#4425
opened Nov 2, 2025 by
behroozazarkhalili
docs: Rewrite PEFT integration guide with comprehensive examples
#4421
opened Nov 2, 2025 by
behroozazarkhalili
[OpenENV] Openenv rollout_func signature proposal
#4344
opened Oct 27, 2025 by
kashif
5 tasks
Use explicit tiny-Qwen2ForCausalLM-2.5 model_id param in CI tests
#4331
opened Oct 23, 2025 by
albertvillanova
refactor: simplify parameter freezing in modeling_base.py
#4305
opened Oct 20, 2025 by
Ki-Seki
2 of 5 tasks
[SFT] Log mean token accuracy from Liger kernel
#4302
opened Oct 18, 2025 by
kashif
5 tasks