wraps DDP models with DSD by LucasLLC · Pull Request #857 · meta-pytorch/tnt

LucasLLC · 2024-07-02T14:50:34Z

Summary:
Distributed State Dict (DSD) is the approach PyTorch currently recommends for ensuring that the state dicts of parallelized models stay compatible with saving and loading in single-process or re-sharding scenarios.

This diff updates dcp_saver to use DSD for DDP models. It would be a good idea to wrap all models in TNT with DSD, since this could replace some of the wrapper logic for FSDP and would guarantee future compatibility.

N5551629 also contains a workaround for existing DDP models that were saved before this diff: manually removing the "module." prefix from the checkpoint.
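For readers without access to N5551629, the prefix-removal idea can be sketched roughly as follows. This is not the notebook's code. The helper name `strip_ddp_prefix` and the plain-dict representation of the checkpoint are assumptions made for illustration, and a real checkpoint would hold tensors rather than the placeholder values used here.

```python
from collections import OrderedDict
from typing import Any, Mapping

# DDP exposes the wrapped model as `.module`, so keys in a DDP state dict
# start with "module." (for example "module.linear.weight").
DDP_PREFIX = "module."


def strip_ddp_prefix(
    state_dict: Mapping[str, Any], prefix: str = DDP_PREFIX
) -> "OrderedDict[str, Any]":
    """Return a copy of `state_dict` with a leading `prefix` removed from each key.

    Hypothetical helper for illustration; keys without the prefix are kept as-is.
    """
    cleaned: "OrderedDict[str, Any]" = OrderedDict()
    for key, value in state_dict.items():
        # Only strip the prefix at the start of the key, never in the middle.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        if new_key in cleaned:
            # Refuse to silently overwrite an entry if two keys collapse to one.
            raise ValueError(f"duplicate key after stripping prefix: {new_key!r}")
        cleaned[new_key] = value
    return cleaned


# Placeholder values stand in for tensors loaded from an old checkpoint.
old_checkpoint = {"module.linear.weight": [1.0, 2.0], "module.linear.bias": [0.5]}
new_checkpoint = strip_ddp_prefix(old_checkpoint)
```

The result can then be loaded into an unwrapped model, or into a model whose state dict is produced through DSD, since both use keys without the "module." prefix.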

Differential Revision: D59234083

facebook-github-bot · 2024-07-02T14:50:42Z

This pull request was exported from Phabricator. Differential Revision: D59234083



facebook-github-bot added the cla signed label Jul 2, 2024

facebook-github-bot added the fb-exported label Jul 2, 2024

LucasLLC force-pushed the export-D59234083 branch from 5818bb8 to 435b1cb Compare July 8, 2024 15:32

LucasLLC wants to merge 1 commit into master from export-D59234083
