Use gfx10 and gfx12 generic targets #937
The HIP Runtime added support for generic targets in ROCm 6.4. We can reduce build size and compile times while enhancing compatibility by taking advantage of these targets where there are no known drawbacks.
The gfx12-generic target has identical code generation to gfx1200 and gfx1201 (which are identical to each other in all but name). It is therefore a safe replacement for those targets.
The gfx10-3-generic target has identical code generation to gfx1030, but functions on all RDNA 2 GPUs. It is therefore a good replacement for that target.
The gfx10-1-generic target appears to have identical code generation to gfx1010, but it is compatible with all RDNA 1 GPUs. It is therefore a good replacement for that target.
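The substitution described above can be sketched as a small mapping from specific targets to their generic replacements. This is only an illustration: `GENERIC_REPLACEMENTS` and `collapse_targets` are hypothetical names, not part of rocSOLVER or `rmake.py`, and the mapping contains only the four replacements named in this description.

```python
# Hypothetical helper for illustration; not part of rocSOLVER or rmake.py.
# The mapping covers only the replacements proposed in this PR description.
GENERIC_REPLACEMENTS = {
    "gfx1010": "gfx10-1-generic",  # RDNA 1
    "gfx1030": "gfx10-3-generic",  # RDNA 2
    "gfx1200": "gfx12-generic",
    "gfx1201": "gfx12-generic",
}


def collapse_targets(targets: str) -> str:
    """Replace specific targets with generic ones in a ';'-separated list.

    Targets without a known replacement pass through unchanged, and
    duplicates produced by the substitution are dropped while the
    original order is preserved.
    """
    seen = set()
    result = []
    for target in targets.split(";"):
        target = target.strip()
        if not target:
            continue  # skip empty entries such as a trailing ';'
        replacement = GENERIC_REPLACEMENTS.get(target, target)
        if replacement not in seen:
            seen.add(replacement)
            result.append(replacement)
    return ";".join(result)
```

For example, `collapse_targets("gfx1030;gfx1100;gfx1200;gfx1201")` yields `gfx10-3-generic;gfx1100;gfx12-generic`: two gfx12 entries collapse into one, which is where the build size and compile time savings would come from.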
This shouldn't auto-merge: you aren't building the targets you expected on Linux or Windows, you need to use the "gfxall" label, and perf validation is required.
The Windows build is "python rmake.py -ci -a 'gfx906:xnack-;gfx1030;gfx1100;gfx1101;gfx1102;gfx1151;gfx1200;gfx1201'". If this isn't tested, it shouldn't be auto-merged, so that whoever merges it explicitly assumes the risk. As I mentioned, generics fail in my tests.
Thanks for the tip on
AFAIK, rocSOLVER has not done performance tuning for any of these architectures. That's not true of other ROCm libraries, so it makes rocSOLVER a good guinea pig for this sort of change.
Good point. I don't necessarily think that's a problem, though. If the change is not used in the Windows releases then that arguably reduces the risk further.
Test failure is unrelated. Forcing the merge.