Skip to content

use correct LLVM intrinsic for min/max on floats#153343

Open
RalfJung wants to merge 1 commit intorust-lang:mainfrom
RalfJung:min-max-fix
Open

use correct LLVM intrinsic for min/max on floats#153343
RalfJung wants to merge 1 commit intorust-lang:mainfrom
RalfJung:min-max-fix

Conversation

@RalfJung
Copy link
Member

@RalfJung RalfJung commented Mar 3, 2026

View all comments

The Rust minnum/maxnum intrinsics are documented to return the other argument when one input is an SNaN. However, the LLVM lowering we currently choose for them does not match those semantics: we lower them to minnum/maxnum, which (since llvm/llvm-project#172012) is documented to non-deterministically return the other argument or NaN when one input is an SNaN.

LLVM does have an intrinsic with the intended semantics: minimumnum/maximumnum. Let's use that instead. We can set the nsz flag since we treat signed zero ordering as non-deterministic.

Also rename the intrinsics to follow the IEEE 2019 naming, since that is mostly (and in particular, as far as NaN are concerned) now what we do. Also, minimum_number and minimum are less easy to mix up than minnum and minimum.

r? @nikic
Cc @tgross35
Fixes #149537

@rustbot
Copy link
Collaborator

rustbot commented Mar 3, 2026

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @oli-obk, @lcnr

Some changes occurred to the CTFE machinery

cc @oli-obk, @lcnr

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 3, 2026
);
// `nsz` in minimumnum/maximumnum is special: its only effect is to make signed-zero
// ordering non-deterministic.
unsafe { llvm::LLVMRustSetNoSignedZeros(call) };
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea if the way I wired up nsz is correct.^^

let b = b.load_scalar(fx);

// FIXME: make sure this has the intended behavior for SNaN
// (returning the other argument).
Copy link
Member Author

@RalfJung RalfJung Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bjorn3 does cranelift have intrinsics that have the intended guarantee re: SNaN?
Or should we use the fallback impl here?

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me once the questions for other backends are answered.

View changes since this review

if (auto I = dyn_cast<Instruction>(unwrap<Value>(V))) {
I->setHasNoSignedZeros(true);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C bindings have a native LLVMSetFastMathFlags(), we should probably switch to that. But I guess we should do that consistently for the existing LLVMRustSetAlgebraicMath/LLVMRustSetAllowReassoc/LLVMRustSetFastMath as well, so I don't particularly mind this in the meantime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't know why we have a separate wrapper for each flag configuration here, but I figured I'd follow the existing pattern.

@RalfJung
Copy link
Member Author

RalfJung commented Mar 3, 2026

There are some odd things happening in CI

2026-03-03T12:31:32.4740066Z rustc-LLVM ERROR: Cannot select: 0xff67d41c22a0: f128 = fcanonicalize nsz 0xff67d41c2a10
2026-03-03T12:31:32.4740629Z   0xff67d41c2a10: f128 = AArch64ISD::CSEL 0xff67d41c5660, 0xff67d41c52e0, Constant:i32<11>, 0xff67d41c2930:1

Why did fcanonicalize end up with nsz? That was meant just for minimumnum.
And also it seems like f128 minimumnum is just broken on aarch64?

@RalfJung
Copy link
Member Author

RalfJung commented Mar 3, 2026

That was on the aarch64-gnu-llvm-20-1 runner. Maybe we have to fall back to minnum/maxnum for old LLVM versions?

@nikic
Copy link
Contributor

nikic commented Mar 3, 2026

That was on the aarch64-gnu-llvm-20-1 runner. Maybe we have to fall back to minnum/maxnum for old LLVM versions?

Ah yes, that's a good point. I believe minimumnum used to have some selection failures that were only fixed in LLVM 22.

@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member Author

RalfJung commented Mar 3, 2026

I guess that makes sense, the test fails on old LLVM where we (have to) use the wrong intrinsic.

@RalfJung
Copy link
Member Author

RalfJung commented Mar 3, 2026

@bors try jobs=x86_64-gnu,aarch64-gnu

@rust-bors

This comment has been minimized.

rust-bors bot pushed a commit that referenced this pull request Mar 3, 2026
rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering


try-job: x86_64-gnu
try-job: aarch64-gnu
@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member Author

RalfJung commented Mar 3, 2026

This is very strange, I tested the fallback implementation locally and it passes. Why does it fail on the aarch runner?

And it's also a very strange return value. The inputs are from_bits(0x7fbfffff) and -9.0 and the output is from_bits(0x7fffffff)?!?

@rust-log-analyzer

This comment has been minimized.

@rust-bors
Copy link
Contributor

rust-bors bot commented Mar 3, 2026

☀️ Try build successful (CI)
Build commit: 5fc4d3f (5fc4d3f9142818f2f2b292605ba61c2b9b55f112, parent: 1b7d722f429f09c87b08b757d89c689c6cf7f6e7)

@RalfJung
Copy link
Member Author

RalfJung commented Mar 3, 2026

now with the commit that always forces the fallback impl to be used
@bors try jobs=x86_64-gnu,aarch64-gnu,x86_64-gnu-gcc

@rust-bors

This comment has been minimized.

rust-bors bot pushed a commit that referenced this pull request Mar 3, 2026
rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering


try-job: x86_64-gnu
try-job: aarch64-gnu
try-job: x86_64-gnu-gcc
@RalfJung
Copy link
Member Author

RalfJung commented Mar 3, 2026

Seems like LLVM 20 straight-up miscompiles code like this

fn minimum_num(x: f32, y: f32) -> f32 {
    if x.is_nan() || y >= x {
        y
    } else {
        // Either y < x or y is a NaN.
        x
    }
}

const SNAN: f32 = f32::from_bits(f32::NAN.to_bits() - 1);

fn main() {
    dbg!(minimum_num(-9.0, std::hint::black_box(SNAN)));
}

I tried this on an aarch64 dev desktop: with Rust 1.87, an optimized build prints NaN, with latest stable Rust, it prints -9.0.

How do we handle library tests that trigger miscompilations on old LLVM versions...? We could just remove the black_box, but -- it'd be a shame to reduce test coverage on newer LLVM just because we also still test old LLVM.

Are we anywhere close to dropping LLVM 20? :D

@RalfJung
Copy link
Member Author

RalfJung commented Mar 3, 2026

The issue is that the fallback impl miscompiles on aarch64 with LLVM 20.^^ And the intrinsic crashes LLVM.

@tgross35
Copy link
Contributor

tgross35 commented Mar 3, 2026

Ah sorry, I just came to the same realization. I've come across that before and unfortunately there wasn't a good way #t-infra/bootstrap > LLVM version in std so I think what you have is about the best option for now.

Honestly I would love to have a very internal cfg for backend name and version so we don't need to jump through these kind of hoops when similar situations pop up.

@RalfJung
Copy link
Member Author

RalfJung commented Mar 3, 2026

The job aarch64-gnu failed! Check out the build log: (web) (plain enhanced) (plain)
Click to see the possible cause of the failure (guessed by this bot)

This is very strange, why does the fallback impl get miscompiled with the in-tree LLVM?

@RalfJung RalfJung force-pushed the min-max-fix branch 2 times, most recently from 8ce6631 to 977fa0b Compare March 3, 2026 22:46
@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member Author

RalfJung commented Mar 4, 2026

Ah... yeah this would not work with -Zmiri-force-intrinsic-fallback. Argh.

@rustbot
Copy link
Collaborator

rustbot commented Mar 4, 2026

The Miri subtree was changed

cc @rust-lang/miri

@RalfJung
Copy link
Member Author

RalfJung commented Mar 4, 2026

It seems like indeed the miscompilation still occurs with the in-tree LLVM on aarch64 -- just not in my minimal example. I have no idea how to minimize this...

@tgross35
Copy link
Contributor

tgross35 commented Mar 4, 2026

The job aarch64-gnu failed! Check out the build log: (web) (plain enhanced) (plain)
Click to see the possible cause of the failure (guessed by this bot)

This is very strange, why does the fallback impl get miscompiled with the in-tree LLVM?

Is it using the fallback implementation? I'd think in-tree is using the intrinsics since we're on 22.x. Maybe worth another test in tests/codegen-llvm/float to verify?

@RalfJung
Copy link
Member Author

RalfJung commented Mar 4, 2026 via email

@nikic
Copy link
Contributor

nikic commented Mar 4, 2026

The AArch64 miscompile is llvm/llvm-project#176624. And yes, that one still exists on current main. (The intrinsics are fine.)

@RalfJung
Copy link
Member Author

RalfJung commented Mar 4, 2026

@nikic thanks a ton, you just saved me a lot of digging. :)

@RalfJung
Copy link
Member Author

RalfJung commented Mar 4, 2026

So, for this PR -- I guess the best we can do is use the intrinsic on LLVM22+ and the fallback impl for older LLVM, even if the fallback impl sometimes gives the wrong results on aarch64 (for SNaN inputs). We already sometimes give wrong results on aarch64 before this PR so that's not even a regression.

Or can we get away with using the intrinsic on LLVM 21 already?

@RalfJung RalfJung changed the title rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering use correct LLVM intrinsic for min/max on floats Mar 4, 2026
@RalfJung
Copy link
Member Author

RalfJung commented Mar 5, 2026

@bors try jobs=x86_64-gnu,aarch64-gnu

@rust-bors

This comment has been minimized.

rust-bors bot pushed a commit that referenced this pull request Mar 5, 2026
use correct LLVM intrinsic for min/max on floats


try-job: x86_64-gnu
try-job: aarch64-gnu
@rust-bors
Copy link
Contributor

rust-bors bot commented Mar 5, 2026

💔 Test for e9e3af0 failed: CI

@RalfJung
Copy link
Member Author

RalfJung commented Mar 5, 2026

"The operation was canceled"?

@bors try jobs=x86_64-gnu,aarch64-gnu

@rust-bors
Copy link
Contributor

rust-bors bot commented Mar 5, 2026

⌛ Trying commit 1e50a40 with merge e5a358e

To cancel the try build, run the command @bors try cancel.

Workflow: https://github.com/rust-lang/rust/actions/runs/22736973220

rust-bors bot pushed a commit that referenced this pull request Mar 5, 2026
use correct LLVM intrinsic for min/max on floats


try-job: x86_64-gnu
try-job: aarch64-gnu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

f*::min/max do not behave as documented for signaling NaN on aarch64, or with optimizations

6 participants