[kimi25 rl part1.1] add weight conversion for kimi25 (weight update for train-infer disaggregation) #1532
Open
GeLee-Q wants to merge 6 commits into THUDM:main
Co-authored-by: Chokoyo <zcgu@qq.com>
Co-authored-by: sxl1993 <1218197792@qq.com>
Co-authored-by: Gao016 <yngao016@163.com>
Co-authored-by: yefei12 <xjtu_yefeichen@163.com>
77c1f72 to 0cc8e37
Thanks to the SGLang RL community members @yefei12 @Chokoyo @gxlvera for their help.
Thanks to the NVIDIA team @wplf for their guidance on Megatron Bridge.
Thanks to our colleagues on the AQ Infra team @Gao016 @sxl1993 and Algorithms team @Swayyyyy @yzlnew for their support.
[kimi25 rl part1.1] Add weight conversion for kimi25 (weight update for train-infer disaggregation) #1532
[kimi25 rl part1.2] Support kimi25 q-lora pairing in bridge update path (weight update for train-infer colocate) #1753
[kimi25 rl part2] Pass Megatron Bridge provider arguments from the slime config #1754
[kimi25 rl part3] Support the K25 VL rollout processor and train-time token expansion #1755
[kimi25 rl part4] Support K25 HF weight conversion between BF16/FP8/INT4 #1757
[Megatron Bridge] https://github.com/fzyzcjy/Megatron-Bridge/pull/7/commits
All of the code still requires further large-scale validation. The experimental results will be made public after validation is complete.