_mm512_shrdv_* intrinsics have incorrect argument order #130365

Description

@as-com
Contributor

I tried this code:

#![feature(stdarch_x86_avx512)]

use std::arch::x86_64::*;

fn main() {
    unsafe {
        let a = _mm512_set1_epi32(0xffff);
        let b = _mm512_setzero_epi32();
        let c = _mm512_set1_epi32(1);
    
        let dst = _mm512_shrdv_epi32(a, b, c);
        println!("{}", _mm512_cvtsi512_si32(dst));    
    }
}

I expected to see this happen:

The code produces the same output as the equivalent C program:

#include <immintrin.h>
#include <stdio.h>

int main() {
    __m512i a = _mm512_set1_epi32(0xffff);
    __m512i b = _mm512_setzero_epi32();
    __m512i c = _mm512_set1_epi32(1);

    __m512i dst = _mm512_shrdv_epi32(a, b, c);
    printf("%u\n", _mm512_cvtsi512_si32(dst));
}

The C program outputs 32767.

Instead, the Rust program outputs -2147483648.


Intel's documentation (as linked in the rustdoc for the function) for _mm512_shrdv_epi32 states:

Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst.

FOR j := 0 to 15
	i := j*32
	dst[i+31:i] := ((b[i+31:i] << 32)[63:0] | a[i+31:i]) >> (c[i+31:i] & 31)
ENDFOR
dst[MAX:512] := 0

This means argument b supplies the upper 32 bits and a supplies the lower 32 bits. However, llvm.fshr.* takes its operands in the opposite order (its first operand is the upper half). Rust appears to pass a, b, and c to llvm.fshr in that order:

https://github.com/rust-lang/stdarch/blob/b1edbf90955cb9b057a323f761e2c19edb591e6f/crates/core_arch/src/x86/avx512vbmi2.rs#L997-L999

This likely also applies to all similar intrinsics that call llvm.fshr.
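
The mismatch described above can be reproduced with plain scalar arithmetic, without AVX-512 hardware. The following is a minimal sketch, not the stdarch implementation: `intel_shrdv_lane` and `fshr32` are hypothetical helper names that model a single 32-bit lane of Intel's pseudocode and the documented behavior of `llvm.fshr.i32` respectively.

```rust
/// Hypothetical scalar model of one 32-bit lane of `_mm512_shrdv_epi32`,
/// following Intel's pseudocode: `b` is the upper half, `a` the lower half.
fn intel_shrdv_lane(a: u32, b: u32, c: u32) -> u32 {
    let concat = ((b as u64) << 32) | a as u64;
    (concat >> (c & 31)) as u32
}

/// Hypothetical scalar model of `llvm.fshr.i32(hi, lo, shift)`:
/// the FIRST operand is the upper half of the concatenated value.
fn fshr32(hi: u32, lo: u32, shift: u32) -> u32 {
    let concat = ((hi as u64) << 32) | lo as u64;
    (concat >> (shift % 32)) as u32
}

fn main() {
    // Same inputs as the reproduction above.
    let (a, b, c) = (0xffff_u32, 0_u32, 1_u32);

    let expected = intel_shrdv_lane(a, b, c);
    let unswapped = fshr32(a, b, c); // operands passed straight through
    let swapped = fshr32(b, a, c); // operands swapped to match Intel's order

    println!("Intel pseudocode: {}", expected as i32);
    println!("fshr(a, b, c):    {}", unswapped as i32);
    println!("fshr(b, a, c):    {}", swapped as i32);

    assert_eq!(expected, 32767);
    assert_eq!(unswapped as i32, i32::MIN);
    assert_eq!(swapped, expected);
}
```

Under this model, passing the operands straight through reproduces the -2147483648 seen from the Rust program, while swapping the first two operands gives the 32767 that the C program prints.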

Meta

rustc --version --verbose:

rustc 1.83.0-nightly (0609062a9 2024-09-13)
binary: rustc
commit-hash: 0609062a91c8f445c3e9a0de57e402f9b1b8b0a7
commit-date: 2024-09-13
host: x86_64-unknown-linux-gnu
release: 1.83.0-nightly
LLVM version: 19.1.0

Activity

added label needs-triage (This issue may need triage. Remove it if it has been sufficiently triaged.) on Sep 14, 2024
workingjubilee (Member) commented on Sep 14, 2024

Ah, thank you for reporting this!

Stabilizing #111137 should be blocked on this (on top of the other reasons it cannot yet be stabilized).

workingjubilee (Member) commented on Sep 14, 2024

cc @minybot on the off chance you would like to do the follow-up (though I see no evidence they have been active on GitHub lately, so that is very likely a dead letter).

So: also cc'ing @sayantn

added labels on Sep 14, 2024:
O-x86_64 (Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64))
A-SIMD (Area: SIMD (Single Instruction Multiple Data))
bjorn3 (Member) commented on Sep 15, 2024

Do GCC's __builtin_ia32_vpshrdv_v*di intrinsics have the same reversed order as LLVM's intrinsic? If not, the GCC and LLVM backends don't agree on how to compile _mm512_shrdv_*.

removed label needs-triage (This issue may need triage. Remove it if it has been sufficiently triaged.) on Sep 18, 2024
sayantn (Contributor) commented on Sep 22, 2024

This bug was probably due to an inconsistency on Intel's part. The shld intrinsics pack like a || b, but the shrd intrinsics pack like b || a 🤦🏽. I don't think this will require any change on the compiler side, as the compiler intrinsics are (?) consistent about the packing order (i.e., they always pack <first> || <second>).
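
The packing asymmetry described in this comment can be sketched per lane as follows. This is an illustrative scalar model under stated assumptions, not the merged fix: `fshl32`, `fshr32`, `shldv_lane`, and `shrdv_lane` are hypothetical names, and the funnel-shift helpers assume the `<first> || <second>` packing the comment attributes to the compiler intrinsics.

```rust
/// Hypothetical model of a funnel shift left: pack `first || second`,
/// shift left, keep the upper 32 bits.
fn fshl32(first: u32, second: u32, shift: u32) -> u32 {
    let concat = ((first as u64) << 32) | second as u64;
    ((concat << (shift % 32)) >> 32) as u32
}

/// Hypothetical model of a funnel shift right: pack `first || second`,
/// shift right, keep the lower 32 bits.
fn fshr32(first: u32, second: u32, shift: u32) -> u32 {
    let concat = ((first as u64) << 32) | second as u64;
    (concat >> (shift % 32)) as u32
}

/// One lane of a shld-style intrinsic: Intel packs `a || b`,
/// so the operands can be forwarded unchanged.
fn shldv_lane(a: u32, b: u32, c: u32) -> u32 {
    fshl32(a, b, c)
}

/// One lane of a shrd-style intrinsic: Intel packs `b || a`,
/// so the first two operands must be swapped.
fn shrdv_lane(a: u32, b: u32, c: u32) -> u32 {
    fshr32(b, a, c)
}

fn main() {
    let (a, b, c) = (0x0000_ffff_u32, 0x8000_0000_u32, 1_u32);

    // shld: (a || b) << 1, upper half. The top bit of b moves into a's low bit.
    assert_eq!(shldv_lane(a, b, c), 0x0001_ffff);

    // shrd: (b || a) >> 1, lower half. The low bit of b (zero here) moves into bit 31.
    assert_eq!(shrdv_lane(a, b, c), 0x0000_7fff);

    println!("shldv lane: {:#010x}", shldv_lane(a, b, c));
    println!("shrdv lane: {:#010x}", shrdv_lane(a, b, c));
}
```

The design point is that only the shrd wrappers need the swap; the funnel-shift primitives themselves stay consistent, which matches the comment's expectation that no compiler-side change is required.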

sayantn (Contributor) commented on Sep 22, 2024

The fix has been merged in stdarch (rust-lang/stdarch#1644)

workingjubilee (Member) commented on Sep 25, 2024

Thank you for taking care of that!


Metadata

Assignees: No one assigned

Labels:
A-SIMD (Area: SIMD (Single Instruction Multiple Data))
A-intrinsics (Area: Intrinsics)
C-bug (Category: This is a bug.)
O-x86_64 (Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64))


Participants: @as-com, @saethlin, @bjorn3, @workingjubilee, @rustbot

Issue #130365 · rust-lang/rust