[ThunderFX][HF] ValueError: unrecognized type in arguments: <class 'NoneType'> #1482
Comments
In newer container releases, this error is shadowed by #1479.
The error is caused by the last output of lightning-thunder/thunder/executors/sdpaex.py, lines 278 to 282 in fef423b. According to https://github.com/pytorch/pytorch/blob/6b430c26bd78cf9f3736e0f9caf23f40e2a867f1/torch/_meta_registrations.py#L5318-L5329, grad_attn_mask is None when grad_input_mask[3] is False, even though attn_mask is not None.
In the actual execution, the last output is produced at lightning-thunder/thunder/executors/sdpaex.py, lines 300 to 321 in fef423b.
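To make the mismatch concrete, here is a minimal pure-Python sketch of the behavior described above. It is a toy model, not the code from sdpaex.py or PyTorch: the names toy_sdpa_backward and check_outputs are hypothetical, gradients are represented by placeholder strings instead of tensors, and the strict check only mimics the reported error message.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical placeholder for a gradient tensor; a real backward returns tensors.
Grad = Optional[str]


def toy_sdpa_backward(
    grad_input_mask: Sequence[bool], attn_mask_given: bool
) -> Tuple[Grad, Grad, Grad, Grad]:
    """Toy model of the backward outputs (grad_q, grad_k, grad_v, grad_attn_mask).

    Assumption for illustration: the first three gradients are always returned.
    The last one is None unless an attn_mask was passed AND grad_input_mask[3]
    is True, which mirrors the behavior quoted from the PyTorch meta function.
    """
    grad_attn_mask = (
        "grad_attn_mask" if attn_mask_given and grad_input_mask[3] else None
    )
    return ("grad_q", "grad_k", "grad_v", grad_attn_mask)


def check_outputs(outputs: Sequence[Grad]) -> Sequence[Grad]:
    """Hypothetical strict check that rejects None outputs."""
    for out in outputs:
        if out is None:
            raise ValueError(f"unrecognized type in arguments: {type(out)}")
    return outputs


# attn_mask is present, but no gradient is requested for it.
outputs = toy_sdpa_backward((True, True, True, False), attn_mask_given=True)
try:
    check_outputs(outputs)
except ValueError as err:
    print(err)
```

In this sketch, the mask being present is not enough to get a gradient for it. Any consumer that assumes "attn_mask is not None implies grad_attn_mask is a tensor" will hit the None case whenever grad_input_mask[3] is False.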
🐛 Bug
When running the training loop for Qwen/Qwen2.5-7B-Instruct, we get a ValueError in ctx.compiled_backward.
To Reproduce
- pjnl-20241120 container
- pip install datasets==3.0.2 for creating the dummy dataset
Code sample
As in the repro steps.
Expected behavior
It should run without errors, as it does in eager mode and with default torch.compile.
Environment
As in the container.
Additional context
Happy to provide any information if needed :)
cc @apaz-cli @tfogal