Pull requests: huggingface/trl

Pull requests list

GRPO: ScaleRL -> Support casting LM Head to FP32
#4303 opened Oct 18, 2025 by pramodith
4 of 5 tasks
[SFT] Log mean token accuracy from Liger kernel
#4302 opened Oct 18, 2025 by kashif
5 tasks
⚰️ Remove deprecated
#4301 opened Oct 18, 2025 by qgallouedec
5 tasks
Tool call
#4300 opened Oct 18, 2025 by qgallouedec Draft
5 tasks
Add CISPO loss option and documentation
#4298 opened Oct 16, 2025 by gustavorubim
switch to sleep level=2 and split wake-ups in GRPO and RLOO trainers
#4296 opened Oct 16, 2025 by xxrjun
1 of 5 tasks
fix CI issue for vlm_gemma_3n model
#4278 opened Oct 15, 2025 by kaixuanliu
Add Humanline
#4261 opened Oct 13, 2025 by Muennighoff
Remove BestOfNSampler class
#4259 opened Oct 11, 2025 by behroozazarkhalili
Fix DPO Trainer Bug For Qwen2-VL (Issue 2660)
#4257 opened Oct 11, 2025 by FabianSchuetze
1 of 3 tasks
Online-dpo-ben
#4252 opened Oct 10, 2025 by burtenshaw Draft
5 tasks
Added SFT LoRA notebook
#4244 opened Oct 9, 2025 by sergiopaniego
5 tasks
[Utils] fix DataCollatorForChatML
#4231 opened Oct 8, 2025 by kashif Draft
Add support for Python 3.14
#4225 opened Oct 8, 2025 by albertvillanova
Update max_length explanation for VLM trainers
#4220 opened Oct 7, 2025 by sergiopaniego
5 tasks
Add trust_remote_code to GRPOConfig
#4186 opened Oct 1, 2025 by muupan
3 of 4 tasks
🐍 Drop Python 3.9
#4183 opened Sep 30, 2025 by qgallouedec
Fix GKD Liger memory spike
#4140 opened Sep 24, 2025 by qgallouedec