[BUG] cannot capture your model as a full graph #1132
Comments
Same problem.
This can (at least temporarily) be fixed by getting rid of the autocast around `freqs = (inv_freq_expanded @ position_ids_expanded).transpose(1, 2)` in transformers/models/llama/modeling_llama.py (essentially taking the Llama model back to commit 7628b3a0f40212c0f264233fc6da0d9c9cf88853 of the transformers package). However, after doing this, there still seems to be a problem where the compiled (traced? split?) model graph does not match the original:
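For reference, here is a minimal, runnable sketch of that computation with the autocast wrapper removed. The variable names follow modeling_llama.py (LlamaRotaryEmbedding.forward), but the shapes, values, and surrounding code are illustrative assumptions, not the actual upstream file or a drop-in patch:

```python
import torch

# Sketch of the rotary-frequency computation referred to above, written
# WITHOUT the torch.autocast(...) context that newer transformers releases
# wrap around it in transformers/models/llama/modeling_llama.py.
# Shapes and values below are illustrative only.

batch_size, seq_len, head_dim = 2, 16, 64
inv_freq = 1.0 / (10000.0 ** (torch.arange(0, head_dim, 2).float() / head_dim))
position_ids = torch.arange(seq_len).unsqueeze(0).expand(batch_size, -1)

inv_freq_expanded = inv_freq[None, :, None].float().expand(batch_size, -1, 1)
position_ids_expanded = position_ids[:, None, :].float()

# The line in question, kept outside any `with torch.autocast(...):` block
# so the tracer sees a plain float32 matmul:
freqs = (inv_freq_expanded @ position_ids_expanded).transpose(1, 2)
emb = torch.cat((freqs, freqs), dim=-1)
cos, sin = emb.cos(), emb.sin()
print(cos.shape, sin.shape)  # torch.Size([2, 16, 64]) torch.Size([2, 16, 64])
```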
Can you tell me exactly which line you're replacing? And is the way to turn off the kv-cache to change "use_cache": true to false in config.json?
Sorry, the kv-cache thing was wrong. I was trying out gpt2 earlier to make sure I could at least run something. I also had the wrong commit number earlier. I was talking about reverting this change in the transformers library:
I was able to resolve this by reverting transformers to the last December 2023 commit that passes all tests (3b7675b2b844b02d4821b827871a21ad16dd446c) and PiPPy to the v0.2.0 tag. If you need batch chat template decoding, you also need to go find the updated tokenization utils base file and the `__init__.py` for that folder accordingly.
I encountered the same problem; is there any solution for it?
torch version: 2.5.0.dev20240616+cu121
python version: 3.8
I ran the llama example with torchrun --nproc-per-node 2 pippy_llama.py and got an error.