Tracking Issue for AVX512 intrinsics #111137
Comments
@rustbot label O-x86 |
Hello, what are the guidelines to potentially contribute intrinsics? Cheers |
Currently the main blocker for stabilizing AVX-512 intrinsics is that we are still missing some. See these files for the list of missing intrinsics:
There may also be missing intrinsics for some of the other AVX512 subsets; this should be double-checked. |
It seems like most of the intrinsics that are not yet implemented are labeled not in LLVM. Is stabilization blocked on those, or just the ones labeled |
The documents were made quite a few years ago, and should be checked against the equivalent intrinsics in the latest version of Clang. Regarding the "not in llvm", we can skip these since they are supported by neither Clang nor GCC. It seems these are only supported by icc for Xeon Phi targets. |
Not sure if this is a good place to ask, but I'm curious if there are any blockers for stabilizing I previously asked here without a reply: #44839 (comment) |
Yes, this is the right place to ask: essentially this is blocked on the AVX512 baseline intrinsics still being incomplete, see my comment above. |
What is considered baseline? I see that e.g. _mm512_cvtt_roundpd_epi64 from AVX512DQ is not available today, and I don't see an avx512dq.md file in the core arch dir |
I would consider F + VL/DQ/BW as the baseline for initial stabilization of AVX512 intrinsics. The MD files may be somewhat out of date and need someone to double-check against the full list of intrinsics. |
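For readers who want to check that proposed baseline on a given machine, here is a minimal sketch using the runtime detection macro from the standard library. The function name `has_avx512_baseline` is hypothetical and only for illustration; the feature strings are the usual names for the F, VL, DQ and BW subsets.

```rust
/// Returns true when the running CPU reports every subset of the
/// F + VL/DQ/BW baseline. `has_avx512_baseline` is an illustrative name,
/// not an API from stdarch.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn has_avx512_baseline() -> bool {
    std::arch::is_x86_feature_detected!("avx512f")
        && std::arch::is_x86_feature_detected!("avx512vl")
        && std::arch::is_x86_feature_detected!("avx512dq")
        && std::arch::is_x86_feature_detected!("avx512bw")
}

/// On non-x86 targets the detection macro does not exist, so report false.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn has_avx512_baseline() -> bool {
    false
}
```

A caller would typically branch once on `has_avx512_baseline()` and fall back to a portable code path when it returns false.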
We should resolve rust-lang/stdarch#1533 before stabilizing these intrinsics. |
We also need to consider how this interacts with AVX10 now. In #121088 I made all the |
A dumb question: since this appears to be blocked on some CPU instructions not having a corresponding wrapper function due to downstream compilers not supporting them yet, why not stabilize it piecemeal? The instructions that are already implemented (provided that they do work as advertised) would already help me out a lot. I don't really see why all AVX512 instruction wrappers need to be stabilized at the same time. |
Here is a more updated list of what is missing in stdarch:
# in llvm-project
llvm_512f=$(rg '(?s:static __inline.*?(?P<fn_name>[a-z0-9_]+?)\s*\(|#define (?P<def_name>[a-z0-9_]+)\()' --only-matching --multiline --no-filename -r '$fn_name$def_name' --color=auto clang/lib/Headers/avx512fintrin.h clang/lib/Headers/avx512vlintrin.h | sort)
llvm_512bw=$(rg '(?s:static __inline.*?(?P<fn_name>[a-z0-9_]+?)\s*\(|#define (?P<def_name>[a-z0-9_]+)\()' --only-matching --multiline --no-filename -r '$fn_name$def_name' --color=auto clang/lib/Headers/avx512bwintrin.h | sort)
# in stdarch
stdarch_512f=$(rg 'pub unsafe fn (\w+)' --only-matching -r '$1' --color=auto crates/core_arch/src/x86/avx512f.rs | sort)
stdarch_512bw=$(rg 'pub unsafe fn (\w+)' --only-matching -r '$1' --color=auto crates/core_arch/src/x86/avx512bw.rs | sort)
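# Find names that appear in only one of the two lists (in practice, mostly LLVM-only).
# printf puts a newline between the lists so the last LLVM name and the
# first stdarch name are not fused into one line.
missing_f=$(printf '%s\n%s\n' "$llvm_512f" "$stdarch_512f" | sort | uniq --unique)
missing_bw=$(printf '%s\n%s\n' "$llvm_512bw" "$stdarch_512bw" | sort | uniq --unique)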
# print things that aren't mentioned at all in stdarch
echo "$missing_f" | xargs -IINAME sh -c 'if ! rg INAME > /dev/null ; then echo INAME; fi'
echo "$missing_bw" | xargs -IINAME sh -c 'if ! rg INAME > /dev/null ; then echo INAME; fi'
The results are:
Missing avx512f intrinsics
Missing avx512bw intrinsics
Not mentioned avx512f intrinsics
Not mentioned avx512bw intrinsics:
|
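The comparison the script above performs is essentially a set difference between two lists of names. A minimal Rust sketch of that step follows; the function `missing_intrinsics` and its parameter names are hypothetical, and it assumes the two name lists have already been extracted (for example by the `rg` commands above).

```rust
use std::collections::BTreeSet;

/// Names present in `reference` (e.g. taken from the Clang headers) but
/// absent from `implemented` (e.g. taken from stdarch), in sorted order.
/// Both the function and its parameters are illustrative assumptions.
pub fn missing_intrinsics<'a>(reference: &[&'a str], implemented: &[&str]) -> Vec<&'a str> {
    let have: BTreeSet<&str> = implemented.iter().copied().collect();
    let want: BTreeSet<&'a str> = reference.iter().copied().collect();
    // BTreeSet iterates in sorted order, so the result is sorted and deduplicated.
    want.into_iter()
        .filter(|name| !have.contains(*name))
        .collect()
}
```

Unlike `sort | uniq --unique`, which keeps names that occur in exactly one of the two lists, this sketch is one-directional: a name that exists only on the stdarch side is never reported.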
It looks like we're also missing |
The untracked features |
We really need to upgrade the intrinsics list. Intel has since removed all the |
Who is in "charge" of that question on the Rust project side? It seems a lot of people have changes to the intrinsics lists to contribute, but it does not seem like it was updated recently. |
I am working on a PR to update many aspects of stdarch, including the intrinsics list (rust-lang/stdarch#1594) |
awesome 🙏 |
Generally this is libs team territory - or rather libs-api, I assume, since this is user-visible API. Sadly that team is particularly understaffed. The intrinsics are exposed via the stdarch module, for which @Amanieu seems to be the sole maintainer.
The usual process for API questions is to file an ACP but I do not know whether stdarch also uses that process.
|
We don't use ACPs for stdarch because we don't invent our own APIs and instead follow existing C APIs for arch-specific intrinsics. |
I don't think anything except for avx512vp2intersect is remaining, which needs compiler support for |
At least for intrinsics that operate on 512-bit vectors, we'd have to sort out the evex512 situation first. |
@IceTDrinker If we want to support AVX10/256 in the future, we can't have the implicit avx512 -> evex512 implication and presumably need to explicitly annotate all avx512 intrinsics that use 512-bit vectors with the (Intel is the gift that keeps on giving.) |
😵‍💫 @nikic no chance there is some metadata somewhere indicating whether an avx512-gated intrinsic uses ZMM registers, so that nobody has to do that manually? 😭 And yeah, poisoned gift at this point |
Very crudely, it seems that an Intel intrinsic uses at least one ZMM register if and only if its name starts with _mm512 👀 |
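The naming heuristic from the comment above could be written as a one-line check. This is only a sketch of that crude rule, not a verified classification; the function name `likely_uses_zmm` is hypothetical.

```rust
/// Crude heuristic from the discussion: treat an intrinsic as using
/// 512-bit (ZMM) vectors when its name starts with `_mm512`.
/// This is a guess based on naming, not on instruction metadata.
pub fn likely_uses_zmm(intrinsic_name: &str) -> bool {
    intrinsic_name.starts_with("_mm512")
}
```

Such a check could pre-sort a list of intrinsic names for manual review, with any exceptions to the naming rule still needing to be confirmed by hand.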
In the spirit of @AlexanderSchuetz97's question above:
Is there a way to enable use of compile time target feature detection (i.e. |
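For context on the question above, compile-time detection in Rust is commonly expressed with `cfg!(target_feature = "...")`, while runtime detection uses `is_x86_feature_detected!`. The sketch below combines the two under the assumption that this is the kind of check being asked about; the names `AVX512F_AT_COMPILE_TIME`, `avx512f_at_runtime` and `avx512f_available` are hypothetical.

```rust
/// True only when the compiler was told that the target supports AVX-512F,
/// for example through `-C target-feature=+avx512f`.
pub const AVX512F_AT_COMPILE_TIME: bool = cfg!(target_feature = "avx512f");

/// Runtime check, available on x86 and x86_64 only.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn avx512f_at_runtime() -> bool {
    std::arch::is_x86_feature_detected!("avx512f")
}

/// Other architectures never report the feature.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn avx512f_at_runtime() -> bool {
    false
}

/// Prefer the compile-time answer and fall back to the runtime check.
pub fn avx512f_available() -> bool {
    AVX512F_AT_COMPILE_TIME || avx512f_at_runtime()
}
```

When the constant is true the `||` short-circuits, so the runtime check is never evaluated on builds that already guarantee the feature.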
Feature gate:
#![feature(stdarch_x86_avx512)]
This is a tracking issue for the AVX-512 (and related extensions) intrinsics in core::arch.
Public API
This feature covers all of the intrinsics from the following features:
avx512bf16
avx512bitalg
avx512bw
avx512cd
avx512f
avx512ifma
avx512vbmi
avx512vbmi2
avx512vnni
avx512vpopcntdq
gfni
vaes
vpclmulqdq
VEX variants
avxifma
avxneconvert
avxvnni
avxvnniint16
avxvnniint8
Implementation History
avx512_target_feature
to include VEX variants #126617
Steps
Unresolved Questions and Other Concerns
_mm512_reduce_add_ps and friends are setting fast-math flags they should not set stdarch#1533
Footnotes
https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html ↩