prevent saving a large number of small files #78
Originally, the `_save_weight_fast` function saved each small weight as an individual file. When the number of weights is large, this produces a concentrated burst of creating and deleting many new files in a short period, which can put pressure on the distributed file system and is also relatively inefficient. I have therefore added new logic here to save weights in batched files.

I have verified the correctness, and testing before and after the modification showed that the `save_weights` time for an 80B MoE model on 16 GPUs was reduced from 250s to 190s, a decrease of 24%.
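To illustrate the batching idea, here is a minimal sketch (not the actual PR code). It assumes weights are given as a name-to-bytes mapping and groups them into a few large batch files instead of one file per weight; the function name, the `.pkl` format, and the 64 MiB threshold are all hypothetical choices for this example.

```python
import os
import pickle

def save_weights_batched(weights, out_dir, batch_size_bytes=64 * 1024 * 1024):
    """Group many small weights into a few batch files instead of one
    file per weight, reducing filesystem metadata pressure.

    `weights` maps weight name -> raw bytes. This is an illustrative
    sketch; the real _save_weight_fast operates on model tensors.
    Returns an index mapping each weight name to the batch file
    that contains it."""
    os.makedirs(out_dir, exist_ok=True)
    batch = {}          # weights accumulated for the current batch file
    batch_bytes = 0     # running size of the current batch
    batch_idx = 0       # sequence number used in batch file names
    index = {}          # weight name -> batch file path

    def flush():
        nonlocal batch, batch_bytes, batch_idx
        if not batch:
            return
        path = os.path.join(out_dir, f"batch_{batch_idx}.pkl")
        with open(path, "wb") as f:
            pickle.dump(batch, f)
        for name in batch:
            index[name] = path
        batch = {}
        batch_bytes = 0
        batch_idx += 1

    for name, blob in weights.items():
        batch[name] = blob
        batch_bytes += len(blob)
        # Write out a batch once it reaches the size threshold, so each
        # file is large enough to amortize per-file overhead.
        if batch_bytes >= batch_size_bytes:
            flush()
    flush()  # write any remaining weights in a final partial batch
    return index
```

With this scheme, saving 100 small weights with a small threshold produces only a handful of files rather than 100, which is the effect the PR is after on the distributed file system.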