-
-
Notifications
You must be signed in to change notification settings - Fork 14.4k
Make operational semantics of pattern matching independent of crate and module #150681
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Some changes occurred in match lowering cc @Nadrieril The Miri subtree was changed cc @rust-lang/miri |
|
|
|
Looks like I messed up the syntax slightly 😅 r? @RalfJung |
|
Ah, it's just that you can only request a review from one person at a time. Makes sense. I trust that the PR will make its way to the interested parties either way, but just in case, cc @theemathas and @traviscross, who were involved in previous discussions. Preparing a sister PR to the Rust Reference is on my TODO list. |
|
I'm also not really on the review rotation, so I won't be able to do the lead review here -- sorry. I'll try to take a look at the parts relevant to Miri, but the match lowering itself is outside my comfort zone. @rustbot reroll |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
For the tests that need to be adjusted, is it possible to make the adjustments before the main changes, and still have the tests pass? That can be a nicer approach, if it's viable. |
f8d7c97 to
2ef0ade
Compare
This comment has been minimized.
This comment has been minimized.
Depends on which tests we're talking about. The FileCheck changes in As for the changes in |
|
Resolving points (2) and (3) as proposed seems like a clear improvement to me. @rustbot reviewed FWIW, my assumption on point (1) would have been that we elide reading the discriminant if and only if the discriminant can be fully elided from the layout (and thus reading it is a no-op w.r.t. opsem). This feels like a special case optimization(?) for single-variant enums that was overlooked when introducing repr(int), as IIUC prior to that addition, all single-variant enums didn't store a discriminant. IIRC eliding the discriminant read was first done before MIR borrowck, even, meaning the "optimization" wouldn't even have served to make matching non-capturing until MIR borrowck came along. Certainly smells to me like an "optimization" to avoid emitting a "useless" MIR statement that turned out to be not so meaningless later on. |
|
@craterbot run mode=build-and-test crates=https://crater-reports.s3.amazonaws.com/pr-150681-1/retry-regressed-list.txt p=1 |
|
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
|
It looks like a lot of the preliminary test changes could potentially be merged on their own, separately from the main changes in this PR. Would it make sense to extract some of those into a separate PR to land first, to make this PR smaller and more focused on the actual proposed changes? |
|
Yeah, feels like the first four commits could probably be a separate PR. I've created #151207. |
…thar Preliminary match/capture test cleanup for PR 150681 Review for rust-lang#150681 requested that this cleanup gets extracted to a separate PR. r? @Zalathar
|
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
|
🎉 Experiment
Footnotes
|
|
@rfcbot reviewed In particular, I'm a fan of I do like like CAD's note in #150681 (comment) -- it looks like #[repr(u32)]
enum Foo {
Bar,
}is still |
|
@scottmcm I believe that the current iteration of the PR has #[non_exhaustive] also affect opsem. It just affects both external crates and the same crate equally. |
|
Yeah, the part where things are uniform independent of the number of variants or |
|
Taking a look at the crater run, it looks like I'll send a PR to fix the error, but I wonder: do we want to have some kind of "note: this used to be accepted" in the error message? I was thinking we could mark the "read discriminant" MIR instruction with some kind of marker of "this read used to not happen in previous versions of rustc, we were just assuming it'd always be this constant", but I'm not sure of the cost-benefit ratio of this. In particular, this is a migration strategy that we could later use for handling case (1), which occurred a couple more times. Regarding the option of making the read only happen for |
Fully agreed. But that was moved out of this PR -- do we have an issue / PR where we can have this discussion separately, and prepare a collection of arguments for when we move point 1 of the PR description to FCP? |
At the moment the code happens to work only because of an oversight in rustc: that oversight is likely to be fixed soon (rust-lang/rust#150681).
|
Thanks for pointing out the lrlex issue! FWIW, in our situation we had a one-variant |
|
Makes sense to me.
Justification: If you say you're going to add more variants, adding more variants should not introduce UB to your code.
Justification: This was too clever in the first place. I'd be happy of course if we apply these elisions as optimizations, but those shouldn't impact what's considered UB or not. @rfcbot reviewed |
Making an existing type My 2 cents are that such a message should either only recommend |
The question of "when does matching an enum against a pattern of one of its variants read its discriminant" is currently an underspecified part of the language, causing weird behavior around borrowck, drop order, and UB.
Of course, in the common cases, the discriminant must be read to distinguish the variant of the enum, but currently the following exceptions are implemented:
If the enum has only one variant, we currently skip the discriminant read.
This has the advantage that single-variant enums behave the same way as structs in this regard.
However, it means that if the discriminant exists in the layout, we can't say that this discriminant being invalid is UB. This makes me particularly uneasy in its interactions with niches – consider the following example (playground), where miri currently doesn't detect any UB (because the semantics don't specify any):
Example 1
For the purpose of the above, enums with marked with
#[non_exhaustive]are always considered to have multiple variants when observed from foreign crates, but the actual number of variants is considered in the current crate.matchconsequences for#[non_exhaustive]#147722#[non_exhaustive]affecting the runtime semantics, its presence or absence can change what gets captured by a closure, and by extension, the drop order: Single-variant exception formatchconsequences for#[non_exhaustive]#147722 (comment)#[non_exhaustive]can cause borrowck to suddenly start failing in another crate.Moreover, we currently make a more specific check: we only read the discriminant if there is more than one inhabited variant in the enum.
This means that the semantics can differ between
foo<!>, and a copy offoowhereTwas manually replaced with!: Adding generics affect whether code has UB or not, according to Miri #146803Moreover, due to the privacy rules for inhabitedness, it means that the semantics of code can depend on the module in which it is located.
Additionally, this inhabitedness rule is even uglier due to the fact that closure capture analysis needs to happen before we can determine whether types are uninhabited, which means that whether the discriminant read happens has a different answer specifically for capture analysis.
For the two above points, see the following example (playground):
Example 2
In light of the above, and following the discussion at #138961 and #147722, this PR
makes it so that, operationally, matching on an enum always reads its discriminant.introduces the following changes to this behavior:#[non_exhaustive]enum will always introduce a discriminant read, regardless of whether the enum is from an external crateAs per the discussion below, the resolution for point (1) above is that it should land as part of a separate PR, so that the subtler decision can be more carefully considered.
Note that this is a breaking change, due to the aforementioned changes in borrow checking behavior, new UB (or at least UB newly detected by miri), as well as drop order around closure captures. However, it seems to me that the combination of this PR with #138961 should have smaller real-world impact than #138961 by itself.
Fixes #142394
Fixes #146590
Fixes #146803 (though already marked as duplicate)
Fixes parts of #147722
Fixes rust-lang/miri#4778
r? @Nadrieril @RalfJung
@rustbot label +A-closures +A-patterns +T-opsem +T-lang