Hi, good question. I haven't tried it myself and don't have much experience with PEFT. Do you have a use case in hand?
For the forward pass, it should still work if you provide the PEFT'ed model to PP's API.
For the backward pass, we rely on the assumption that the backward flow of gradients has the same size as the forward flow of activations. Do you think this assumption still holds in the PEFT case?
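To make that assumption concrete, here is a minimal pure-Python sketch of a single stage with a frozen base weight and a small trainable adapter. It is not the PP implementation and does not use the PEFT library; the LoRA-style layer and every name in it (`stage_forward`, `stage_backward`, `W`, `A`, `B`) are hypothetical and chosen only for illustration. It suggests that freezing parameters changes which parameter gradients are computed, while the gradient sent back across the stage boundary keeps the shape of the activation that was sent forward.

```python
# Toy stage (hypothetical): y = x @ W + (x @ A) @ B
# W is a frozen base weight; A and B are small trainable adapter matrices.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def shape(m):
    return (len(m), len(m[0]))

def stage_forward(x, W, A, B):
    """Forward pass: the result is the activation sent to the next stage."""
    return add(matmul(x, W), matmul(matmul(x, A), B))

def stage_backward(x, grad_y, W, A, B):
    """Backward pass: returns the gradient for the previous stage and
    parameter gradients for the trainable adapter only."""
    grad_xa = matmul(grad_y, transpose(B))           # gradient w.r.t. (x @ A)
    grad_x = add(matmul(grad_y, transpose(W)),       # path through frozen W
                 matmul(grad_xa, transpose(A)))      # path through the adapter
    grad_A = matmul(transpose(x), grad_xa)
    grad_B = matmul(transpose(matmul(x, A)), grad_y)
    # No gradient is produced for W because it is frozen.
    return grad_x, {"A": grad_A, "B": grad_B}

x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]               # input activation, (2, 3)
W = [[0.1] * 4 for _ in range(3)]                    # frozen, (3, 4)
A = [[0.1], [0.2], [0.3]]                            # trainable, (3, 1)
B = [[0.0, 0.0, 0.0, 0.0]]                           # trainable, (1, 4)

y = stage_forward(x, W, A, B)                        # sent forward
grad_y = [[1.0] * 4 for _ in range(2)]               # received from next stage
grad_x, grads = stage_backward(x, grad_y, W, A, B)   # grad_x is sent backward

print(shape(y), shape(grad_y))   # (2, 4) (2, 4)
print(shape(x), shape(grad_x))   # (2, 3) (2, 3)
print(sorted(grads))             # ['A', 'B']
```

In this toy setup the tensor crossing each stage boundary has the same shape in both directions, which would be consistent with the assumption holding when most parameters are frozen. Whether a real PEFT-wrapped model behaves the same way under PP would still need to be checked.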
I'm not very familiar with pipeline parallelism. Can it work if most of the model's parameters are frozen?