use correct LLVM intrinsic for min/max on floats#153343
use correct LLVM intrinsic for min/max on floats#153343RalfJung wants to merge 1 commit intorust-lang:mainfrom
Conversation
|
Some changes occurred in compiler/rustc_codegen_cranelift cc @bjorn3 Some changes occurred to the CTFE / Miri interpreter cc @rust-lang/miri Some changes occurred in compiler/rustc_codegen_gcc Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter cc @rust-lang/miri, @oli-obk, @lcnr Some changes occurred to the CTFE machinery |
| ); | ||
| // `nsz` in minimumnum/maximumnum is special: its only effect is to make signed-zero | ||
| // ordering non-deterministic. | ||
| unsafe { llvm::LLVMRustSetNoSignedZeros(call) }; |
There was a problem hiding this comment.
I have no idea if the way I wired up nsz is correct.^^
| let b = b.load_scalar(fx); | ||
|
|
||
| // FIXME: make sure this has the intended behavior for SNaN | ||
| // (returning the other argument). |
There was a problem hiding this comment.
@bjorn3 does cranelift have intrinsics that have the intended guarantee re: SNaN?
Or should we use the fallback impl here?
| if (auto I = dyn_cast<Instruction>(unwrap<Value>(V))) { | ||
| I->setHasNoSignedZeros(true); | ||
| } | ||
| } |
There was a problem hiding this comment.
The C bindings have a native LLVMSetFastMathFlags(), we should probably switch to that. But I guess we should do that consistently for the existing LLVMRustSetAlgebraicMath/LLVMRustSetAllowReassoc/LLVMRustSetFastMath as well, so I don't particularly mind this in the meantime.
There was a problem hiding this comment.
Yeah I don't know why we have a separate wrapper for each flag configuration here, but I figured I'd follow the existing pattern.
|
There are some odd things happening in CI Why did fcanonicalize end up with nsz? That was meant just for minimumnum. |
|
That was on the aarch64-gnu-llvm-20-1 runner. Maybe we have to fall back to minnum/maxnum for old LLVM versions? |
Ah yes, that's a good point. I believe minimumnum used to have some selection failures that were only fixed in LLVM 22. |
This comment has been minimized.
This comment has been minimized.
|
I guess that makes sense, the test fails on old LLVM where we (have to) use the wrong intrinsic. |
|
@bors try jobs=x86_64-gnu,aarch64-gnu |
This comment has been minimized.
This comment has been minimized.
rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering try-job: x86_64-gnu try-job: aarch64-gnu
This comment has been minimized.
This comment has been minimized.
|
This is very strange, I tested the fallback implementation locally and it passes. Why does it fail on the aarch runner? And it's also a very strange return value. The inputs are |
This comment has been minimized.
This comment has been minimized.
|
now with the commit that always forces the fallback impl to be used |
This comment has been minimized.
This comment has been minimized.
rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering try-job: x86_64-gnu try-job: aarch64-gnu try-job: x86_64-gnu-gcc
|
Seems like LLVM 20 straight-up miscompiles code like this fn minimum_num(x: f32, y: f32) -> f32 {
if x.is_nan() || y >= x {
y
} else {
// Either y < x or y is a NaN.
x
}
}
const SNAN: f32 = f32::from_bits(f32::NAN.to_bits() - 1);
fn main() {
dbg!(minimum_num(-9.0, std::hint::black_box(SNAN)));
}I tried this on an aarch64 dev desktop: with Rust 1.87, an optimized build prints How do we handle library tests that trigger miscompilations on old LLVM versions...? We could just remove the Are we anywhere close to dropping LLVM 20? :D |
|
The issue is that the fallback impl miscompiles on aarch64 with LLVM 20.^^ And the intrinsic crashes LLVM. |
|
Ah sorry, I just came to the same realization. I've come across that before and unfortunately there wasn't a good way #t-infra/bootstrap > LLVM version in std so I think what you have is about the best option for now. Honestly I would love to have a very internal cfg for backend name and version so we don't need to jump through these kind of hoops when similar situations pop up. |
This is very strange, why does the fallback impl get miscompiled with the in-tree LLVM? |
8ce6631 to
977fa0b
Compare
This comment has been minimized.
This comment has been minimized.
|
Ah... yeah this would not work with |
|
The Miri subtree was changed cc @rust-lang/miri |
|
It seems like indeed the miscompilation still occurs with the in-tree LLVM on aarch64 -- just not in my minimal example. I have no idea how to minimize this... |
Is it using the fallback implementation? I'd think in-tree is using the intrinsics since we're on 22.x. Maybe worth another test in |
|
For that try build I added a hack in the backend to always use the fallback impl. That's also what I did to reproduce this on the dev desktop.
|
|
The AArch64 miscompile is llvm/llvm-project#176624. And yes, that one still exists on current main. (The intrinsics are fine.) |
|
@nikic thanks a ton, you just saved me a lot of digging. :) |
|
So, for this PR -- I guess the best we can do is use the intrinsic on LLVM22+ and the fallback impl for older LLVM, even if the fallback impl sometimes gives the wrong results on aarch64 (for SNaN inputs). We already sometimes give wrong results on aarch64 before this PR so that's not even a regression. Or can we get away with using the intrinsic on LLVM 21 already? |
|
@bors try jobs=x86_64-gnu,aarch64-gnu |
This comment has been minimized.
This comment has been minimized.
use correct LLVM intrinsic for min/max on floats try-job: x86_64-gnu try-job: aarch64-gnu
|
"The operation was canceled"? @bors try jobs=x86_64-gnu,aarch64-gnu |
|
⌛ Trying commit 1e50a40 with merge e5a358e… To cancel the try build, run the command Workflow: https://github.com/rust-lang/rust/actions/runs/22736973220 |
use correct LLVM intrinsic for min/max on floats try-job: x86_64-gnu try-job: aarch64-gnu
View all comments
The Rust
minnum/maxnumintrinsics are documented to return the other argument when one input is an SNaN. However, the LLVM lowering we currently choose for them does not match those semantics: we lower them tominnum/maxnum, which (since llvm/llvm-project#172012) is documented to non-deterministically return the other argument or NaN when one input is an SNaN.LLVM does have an intrinsic with the intended semantics:
minimumnum/maximumnum. Let's use that instead. We can set thenszflag since we treat signed zero ordering as non-deterministic.Also rename the intrinsics to follow the IEEE 2019 naming, since that is mostly (and in particular, as far as NaN are concerned) now what we do. Also,
minimum_numberandminimumare less easy to mix up thanminnumandminimum.r? @nikic
Cc @tgross35
Fixes #149537