@pkuzyc (Contributor) commented Sep 24, 2025

PR types

New features

PR changes

Models

Description

Supports dropping masked tokens in MoE dispatching and fixes some sequence-parallel bugs for the DeepSeek-V3 model.
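
For readers unfamiliar with the idea, here is a minimal sketch of what dropping masked tokens during MoE dispatch could look like. It is not taken from this PR's diff: the function name `dispatch_dropping_masked`, its parameters, and the top-1 routing assumption are all hypothetical, and plain Python lists stand in for tensors.

```python
# Hypothetical illustration only; not the code changed in this PR.
from typing import Dict, List, Sequence


def dispatch_dropping_masked(
    expert_ids: Sequence[int],
    token_mask: Sequence[bool],
    num_experts: int,
) -> Dict[int, List[int]]:
    """Group the indices of unmasked tokens by their assigned expert.

    expert_ids[i] is the (assumed top-1) expert chosen for token i.
    token_mask[i] is True when token i is kept and False when it is
    masked (for example, padding) and should be dropped.
    """
    if len(expert_ids) != len(token_mask):
        raise ValueError("expert_ids and token_mask must have equal length")

    # One bucket of token indices per expert.
    buckets: Dict[int, List[int]] = {e: [] for e in range(num_experts)}
    for token_idx, (expert, keep) in enumerate(zip(expert_ids, token_mask)):
        if not keep:
            continue  # masked token: never sent to any expert
        buckets[expert].append(token_idx)
    return buckets


routed = dispatch_dropping_masked(
    expert_ids=[0, 1, 0, 1, 0],
    token_mask=[True, True, False, True, False],
    num_experts=2,
)
print(routed)  # {0: [0], 1: [1, 3]}
```

In this sketch, tokens 2 and 4 are masked, so they never reach an expert and take up no expert capacity or communication volume.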

paddle-bot bot commented Sep 24, 2025

Thanks for your contribution!