Make DDP/FSDP a regular transform #122
Comments
What is meant by making DDP/FSDP a regular transform? What are you planning to do? I don't see any other way for sharding to happen at some point after the
I would like to move 3 up (for thunder.jit).
Is the preferred order then 3 -> 1 -> 2?
So per discussions with @crcrpar and @IvanYashchuk (thank you!)
(obviously the good ideas are from Masaki and Ivan, the not so good ones are my own)
triage review: let's start the design review with a draft PR to discuss
This is done.
🚀 Feature
Make DDP/FSDP a regular transform (to a large part including making transforms flexible enough to support this).
Motivation
Currently DDP/FSDP is not a regular transform, leading to things like #94 and limiting composability / sequencing.
A key piece is that a DDP/FSDP transform would itself need to make the prologue adjustments that we currently make during tracing when DDP/FSDP is in use, so transforms must be allowed to mutate prologues. This is in line with similar needs of other transforms (LoRA, quantization, and also value-and-grad-style transforms) that change prologue signatures, so this generalization should happen.
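To make the idea of prologue mutation concrete, here is a minimal, self-contained sketch. It does not use thunder's actual API: `Traces`, `Transform`, `ToyShardTransform`, and `apply_transforms` are hypothetical names invented for illustration, and the "prologue" is modeled as a plain function that unpacks parameters before the computation runs. The point is only that a sharding step can be expressed as an ordinary transform once transforms are allowed to replace the prologue.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Traces:
    # Hypothetical stand-in for a (prologue, computation) pair.
    prologue: Callable[[Sequence[float]], Tuple[List[float], ...]]
    computation: Callable[..., float]


class Transform:
    # Hypothetical base class: a transform may return new traces,
    # including a modified prologue.
    def transform_traces(self, traces: Traces) -> Traces:
        return traces


class ToyShardTransform(Transform):
    # Toy FSDP-like step: the rewritten prologue hands only this
    # rank's slice of the parameters to the computation.
    def __init__(self, rank: int, world_size: int) -> None:
        self.rank = rank
        self.world_size = world_size

    def transform_traces(self, traces: Traces) -> Traces:
        old_prologue = traces.prologue

        def sharded_prologue(params: Sequence[float]) -> Tuple[List[float], ...]:
            (full,) = old_prologue(params)
            n = len(full) // self.world_size
            return (full[self.rank * n:(self.rank + 1) * n],)

        return Traces(prologue=sharded_prologue, computation=traces.computation)


def apply_transforms(traces: Traces, transforms: Sequence[Transform]) -> Traces:
    # Transforms are applied in sequence, so ordering is explicit.
    for t in transforms:
        traces = t.transform_traces(traces)
    return traces


base = Traces(
    prologue=lambda params: (list(params),),
    computation=lambda shard: sum(shard),
)
sharded = apply_transforms(base, [ToyShardTransform(rank=1, world_size=2)])
result = sharded.computation(*sharded.prologue([1.0, 2.0, 3.0, 4.0]))
```

Because the sharding logic lives in a transform rather than in tracing, it can be sequenced with other prologue-changing transforms simply by changing the order of the list passed to `apply_transforms`.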
cc @carmocca @awaelchli @crcrpar