Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement llvm.x86.aesni.* intrinsics #3101

Merged
merged 1 commit into from
Oct 9, 2023
Merged

Implement llvm.x86.aesni.* intrinsics #3101

merged 1 commit into from
Oct 9, 2023

Conversation

eduardosm
Copy link
Contributor

No description provided.

@RalfJung
Copy link
Member

RalfJung commented Oct 3, 2023

Thanks for the PR!

Now we are getting into territory where we'll have to make some non-trivial decisions. I am honestly not super excited about having to maintain an AES implementation. I cannot review this code, and implementing crypto is always dicey. So, in other words I see this PR as coming with a pretty high long-term cost, much more so than the other x86 intrinsic PRs we had so far. How strong is the need to have these intrinsics in Miri, compared to the cost? Can we avoid carrying our own AES implementation, can we import a (pure-Rust) crypto library instead?

@rust-lang/miri what do you think?

@workingjubilee
Copy link
Member

Has aes::hazmat been considered? Presumably with

aes = { version = "0.8", features = ["aes_force_soft"] }

@conradludgate
Copy link

Has aes::hazmat been considered? Presumably with

aes = { version = "0.8", features = ["aes_force_soft"] }

aes_force_soft seems to be a cfg flag and not a crate feature

@workingjubilee
Copy link
Member

doh.


/// SubWord - S-box transformation on 4 bytes
fn sub_word(w: u32) -> u32 {
u32::from_ne_bytes(w.to_ne_bytes().map(|byte| SBOX[usize::from(byte)]))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would strongly caution against any implementation of AES which uses lookup tables like this. They are notorious for cache timing and other microarchitectural sidechannels:

https://eprint.iacr.org/2018/1002.pdf

Modern implementations of AES, such as the one in the aes crate, use techniques like bitslicing/fixslicing to avoid such sidechannels.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody should ever run Miri on secret data anyway. So if this ever becomes relevant something already went seriously wrong.

But indeed that's one of the reasons why I don't want to maintain an AES implementation.^^

@RalfJung
Copy link
Member

RalfJung commented Oct 4, 2023

@tarcieri so aes is a solid and trustworthy implementation? That sounds like the most reasonably plan then, if we want to have these intrinsics in Miri.

aes_force_soft seems to be a cfg flag and not a crate feature

That sounds like we should ask the crate author to turn it into a crate feature?

@tarcieri
Copy link

tarcieri commented Oct 4, 2023

so I take you you are saying aes is a solid implementation?

I'm the author (or rather, translator), so take it with a grain of salt. However, it is using one of the best, most modern techniques available.

That sounds like we should ask the crate author to turn it into a crate feature?

It was a crate feature in the past. If one (potentially transitive) dependency enabled the feature, it would disable the use of AES-NI for every other application in the same binary, a problem which came up in practice. As it subtracts functionality, rather than adding it, it does not map well to Cargo features.

We switched to the use of a cfg attribute to move this decision to the toplevel binary, rather than one which could be made by any dependency with no way for the developer of the toplevel binary to disable it without patching their dependencies.

@RalfJung
Copy link
Member

RalfJung commented Oct 4, 2023

I'm the author (or rather, translator), so take it with a grain of salt. However, it is using one of the best, most modern techniques available.

Ah, good to know, thanks! That still give me confidence. :)

As it subtracts functionality, rather than adding it, it does not map well to Cargo features.

It's not subtractive as far as the API is concerned, right? I view it as the additive guarantee that no AES hardware instructions will be used. But maybe that's a somewhat stretched position. ;)

We switched to the use of a cfg attribute to move this decision to the toplevel binary, rather than one which could be made by any dependency with no way for the developer of the toplevel binary to disable it without patching their dependencies.

Miri as the toplevel binary still doesn't have a good way to make this decision though. It can't be set in Cargo.toml, I think? We'd have to make sure that every way of building Miri (both in this repo and as part of the rustc distribution) sets the --cfg flag, that's far from easy. And we'd hardly notice if the final distribution didn't have the flag...

@tarcieri
Copy link

tarcieri commented Oct 4, 2023

Is there a specific reason you need to disable support for hardware intrinsics?

The aes crate has CPU feature autodetection and will use the fastest backend available.

@RalfJung
Copy link
Member

RalfJung commented Oct 4, 2023

Yeah I guess in the end we don't really care how aes implements these operations as long as it implements them correctly. Addition in Miri is in the end also implemented with the host CPU's addition instructions, and unlike float operations (where we use a softfloat library) I wouldn't know that AES operations are notorious for different behavior on different hardware (to the contrary, their behavior is extremely precisely specified down to the last bit). We avoid unsafe code and non-Rust code as much as we can in Miri but of course we have plenty of that in our dependencies. We want to tightly control all sources of non-determinism but these operations are fully deterministic.

@eduardosm
Copy link
Contributor Author

+1 on using an external crate instead of implementing including our own implementation.

I implemented aesenc, aesded and aesimc directly with their corresponding aes::hazmat functions. aesenclast and aesdeclast required pairing aesenc + InvMixColumns and aesdec + MixColumns.

I still couldn't remove the S-Box transformation because it is needed by aeskeygenassist. Any suggestion?

@tarcieri
Copy link

tarcieri commented Oct 4, 2023

You can find a description of the steps performed by aeskeygenassist here: https://www.felixcloutier.com/x86/aeskeygenassist

Unfortunately that's the one part we don't have a portable wrapper for yet, though it would be something interesting for us to provide.

@RalfJung
Copy link
Member

RalfJung commented Oct 4, 2023

@eduardosm can you restrict this PR to the ones where we can use the aes crate?

@saethlin
Copy link
Member

saethlin commented Oct 4, 2023

what do you think?

Definitely meant to respond to this earlier, but in any case I approve of the reasoning and direction here. If I could choose, I would rather we have all the sort of data processing/vectorization intrinsics than these special purpose ones. I am more excited about being able to execute a crate like simd-json or atoi_simd in Miri than running cryptography code.

But I'm also just happy @eduardosm is on this intrinsics tear. I never really expected anyone would turn up and just do so many of them. Hero work.

@RalfJung
Copy link
Member

RalfJung commented Oct 4, 2023

@tarcieri since you're here anyway, could I ask you to take a look at how we are invoking aes? Does that looks sensible for an implementation of Intel's AES intrinsics?

@tarcieri
Copy link

tarcieri commented Oct 4, 2023

@RalfJung the portable low-level API we implement is specifically designed to resemble the AES-NI API, so it should be a fairly straightforward mapping. What I skimmed looked fine, but I can review in depth later.

@RalfJung
Copy link
Member

RalfJung commented Oct 4, 2023

@tarcieri it looks like some hacks are needed where the operation does more than we need: "undo the InvMixColumns with MixColumns". But maybe that's not as hacky as it looks to me. :)

@tarcieri
Copy link

tarcieri commented Oct 4, 2023

@RalfJung yeah, that's another gap in the API. We can try to provide a fully AES-NI compatible portable API, but...

The portability story is rather tricky, because we try to provide an API which is fast on both Intel and ARM, but Intel and ARM chose to implement AES in two completely different ways: ARM uses standard AES decryption and Intel uses a technique called the equivalent inverse cipher. On top of that, we also fall back to a portable software implementation, so everything needs to work on all three backends.

What we have implements the equivalent inverse cipher (i.e. AES-NI-like API) along with exposing various operations like mix columns and inverse mix columns, and while the latter are native instructions on ARM, on Intel we have to emulate mix columns on Intel by invoking inverse mix columns three times, and to provide all of the instructions like aeskeygenassist we'd need to implement backends for it using ARM and the portable implementation as well (although that wouldn't be too hard).

I would just suggest filing issues about the API gaps and we can decide if first-class support makes sense in the aes crate and go from there, but in the meantime if it's possible to emulate all of the instructions using what is available, that seems OK for now to me.

@RalfJung
Copy link
Member

RalfJung commented Oct 4, 2023

Okay, so the fact that we do some emulation by "un-inv-mixing" seems fine then, thanks. Looks like the keygenassist operation cannot be emulated (if I understood @eduardosm correctly); we'll just not support that intrinsic for now.

@bors
Copy link
Contributor

bors commented Oct 6, 2023

☔ The latest upstream changes (presumably #3108) made this pull request unmergeable. Please resolve the merge conflicts.

//@ignore-target-s390x
//@ignore-target-thumbv7em
//@ignore-target-wasm32
//@compile-flags: -C target-feature=+aes,+vaes,+avx512f
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avx512f should also be asserted below then.

@RalfJung
Copy link
Member

RalfJung commented Oct 9, 2023

But I'm also just happy @eduardosm is on this intrinsics tear. I never really expected anyone would turn up and just do so many of them. Hero work.

I can only agree, thanks a ton for all your work. :)

@bors r+

@bors
Copy link
Contributor

bors commented Oct 9, 2023

📌 Commit c01e7a9 has been approved by RalfJung

It is now in the queue for this repository.

@bors
Copy link
Contributor

bors commented Oct 9, 2023

⌛ Testing commit c01e7a9 with merge 1fffca4...

@bors
Copy link
Contributor

bors commented Oct 9, 2023

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 1fffca4 to master...

@bors bors merged commit 1fffca4 into rust-lang:master Oct 9, 2023
@eduardosm eduardosm deleted the x86-aes-intrinsics branch October 9, 2023 21:34
@eduardosm
Copy link
Contributor Author

Thanks everyone for your reviews and suggestions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants