Enable fp8 on sm89 #3624
!test
Thank you! I can now see the same traces as in Lightning-AI/lightning-thunder#1551 in my environment with an RTX 6000 Ada, with a diff in Thunder.
Does that technically mean we only support CUDA 12+ for this feature?
Good call. I think I should relax this check conditionally, depending on the build-time CUDA version.
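To make the "conditionally relax" idea concrete, here is a minimal sketch of a check that depends on the CUDA toolkit version. It is not the code in this PR: the function names are hypothetical, and the threshold assumes that PTX ISA 8.1 first ships with CUDA 12.1, which should be verified against the CUDA release notes.

```python
def min_fp8_compute_capability(cuda_version: tuple[int, int]) -> tuple[int, int]:
    """Return the minimum (major, minor) compute capability for fp8 casts.

    Hypothetical helper, not nvFuser's actual check.
    """
    # Assumption: PTX ISA 8.1 (which allows fp8 cvt on sm_89) ships with CUDA 12.1.
    if cuda_version >= (12, 1):
        return (8, 9)
    # Older toolkits only expose fp8 conversions for sm_90.
    return (9, 0)


def fp8_supported(device_cc: tuple[int, int], cuda_version: tuple[int, int]) -> bool:
    """True if a device with compute capability device_cc can run fp8 casts."""
    return device_cc >= min_fp8_compute_capability(cuda_version)
```

With these assumptions, an sm_89 device such as an RTX 6000 Ada passes the check on CUDA 12.6 but not on CUDA 11.8, while an sm_90 device passes on both.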
!test
!test
Fix a version check for fp8 support. Bump the nvfuser version for PR #3624: framework integrations need to guard on the nvfuser version to decide whether to send fp8 operations to nvfuser.
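A framework-side guard of the kind described above could look roughly like the sketch below. All names here are hypothetical, and the minimum version is passed in as a parameter because the actual version number introduced by this bump is not stated in this thread. The sketch also assumes that sm_90 and newer devices could already receive fp8 operations before this change.

```python
import re


def parse_version(version: str) -> tuple[int, ...]:
    """Parse the leading dotted numeric part, e.g. "0.2.25+git1234" -> (0, 2, 25)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def can_send_fp8(
    nvfuser_version: str,
    device_cc: tuple[int, int],
    min_sm89_version: str,
) -> bool:
    """Decide whether fp8 operations may be sent to nvfuser (hypothetical guard)."""
    if device_cc >= (9, 0):
        # Assumption: fp8 on sm_90+ does not depend on this version bump.
        return True
    if device_cc == (8, 9):
        # sm_89 needs an nvfuser build that includes the relaxed check.
        return parse_version(nvfuser_version) >= parse_version(min_sm89_version)
    return False
```

The comparison is on integer tuples, so it is most reliable when both version strings have the same number of components.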
@jjsjann123 I'm seeing an error on RTX 6000 (sm_89):
|
Did I get the CUDA toolkit check wrong?! I thought the CUDA toolkit version would determine the PTX ISA version... Are you running this in a container? I'm curious what the setup is like.
This is on my own container with 12.6.
Wait, it's not complaining about fp8 though...
|
Looks like cvt to/from bf16 does require sm_90: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt
I wonder why our check only requires sm_80.
Fuser/csrc/device_lower/analysis/device_version.cpp, lines 17 to 21 in 6466834
Looks like this is just a test thing. I'll update that along with the checks. Thanks for raising the issue @naoyam
fp8 support has been extended to sm_89 as of PTX ISA 8.1, per https://docs.nvidia.com/cuda/parallel-thread-execution/