Use gfx10 and gfx12 generic targets #937

Merged
merged 3 commits into from
May 29, 2025

Conversation

cgmb
Collaborator

@cgmb cgmb commented Apr 29, 2025

The HIP Runtime added support for generic targets in ROCm 6.4. We can reduce build size and compile times while enhancing compatibility by taking advantage of these targets where there are no known drawbacks.

The gfx12-generic target has identical code generation to gfx1200 and gfx1201 (which are identical to each other in all but name). It is therefore a safe replacement for those targets.

The gfx10-3-generic target has identical code generation to gfx1030, but functions on all RDNA 2 GPUs. It is therefore a good replacement for that target.

The gfx10-1-generic target appears to have identical code generation to gfx1010, but it is compatible with all RDNA 1 GPUs. It is therefore a good replacement for that target.
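
As a rough illustration of the substitutions described above, the sketch below rewrites a semicolon-separated target list so that each specific target named in this description is replaced by its generic counterpart. The helper name `use_generic_targets` and the `GENERIC_REPLACEMENTS` table are hypothetical and are not part of rocSOLVER's build scripts; the table contains only the four replacements stated in this description and assumes nothing about other targets.

```python
# Hypothetical helper for illustration only; not taken from rmake.py or CMake.
# The mapping lists only the replacements named in the PR description.
GENERIC_REPLACEMENTS = {
    "gfx1010": "gfx10-1-generic",  # RDNA 1
    "gfx1030": "gfx10-3-generic",  # RDNA 2
    "gfx1200": "gfx12-generic",
    "gfx1201": "gfx12-generic",
}


def use_generic_targets(arch_list: str) -> str:
    """Replace known specific targets with generic ones, dropping duplicates.

    Targets without a listed replacement (for example "gfx906:xnack-")
    are passed through unchanged, and the original order is preserved.
    """
    result = []
    for target in arch_list.split(";"):
        target = target.strip()
        if not target:
            continue  # ignore empty entries such as a trailing ';'
        replacement = GENERIC_REPLACEMENTS.get(target, target)
        if replacement not in result:
            result.append(replacement)
    return ";".join(result)


if __name__ == "__main__":
    print(use_generic_targets("gfx1010;gfx1030;gfx1100;gfx1200;gfx1201"))
    # prints: gfx10-1-generic;gfx10-3-generic;gfx1100;gfx12-generic
```

Because gfx1200 and gfx1201 both map to gfx12-generic, the list gets shorter, which is where the reduction in build size and compile time would come from.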

@cgmb cgmb enabled auto-merge (squash) April 30, 2025 18:16
@TorreZuk TorreZuk added the gfxall PR to develop build all default gfx label May 1, 2025
@TorreZuk
Contributor

TorreZuk commented May 1, 2025

This shouldn't auto merge, you aren't building on Linux or Windows the targets you expected, use "gfxall" label, and perf validation is required.

@TorreZuk TorreZuk disabled auto-merge May 1, 2025 22:14
@TorreZuk
Contributor

TorreZuk commented May 1, 2025

Windows build is "python rmake.py -ci -a 'gfx906:xnack-;gfx1030;gfx1100;gfx1101;gfx1102;gfx1151;gfx1200;gfx1201'". If not tested, it shouldn't be auto-merged, so that the risk is assumed explicitly. As I mentioned, generics fail in my tests.

@cgmb
Collaborator Author

cgmb commented May 6, 2025

> This shouldn't auto merge, you aren't building on Linux or Windows the targets you expected, use "gfxall" label

Thanks for the tip on gfxall. You're right that the CI was not giving me the test coverage I expected. These targets are not working in rocPRIM. I'll file a bug.

> and perf validation is required.

AFAIK, rocSOLVER has not done performance tuning for any of these architectures. That's not true of other ROCm libraries, so it makes rocSOLVER a good guinea pig for this sort of change.

> Windows build is "python rmake.py -ci -a 'gfx906:xnack-;gfx1030;gfx1100;gfx1101;gfx1102;gfx1151;gfx1200;gfx1201'".

Good point. I don't necessarily think that's a problem, though. If the change is not used in the Windows releases, then that arguably reduces the risk further.

@tfalders
Collaborator

Test failure is unrelated. Forcing the merge.

@tfalders tfalders merged commit f971e80 into ROCm:develop May 29, 2025
14 of 19 checks passed