JIT: Constant fold SequenceEqual with the help of VN #121985
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Pull request overview
This PR adds a JIT optimization to constant fold SequenceEqual operations using Value Numbering (VN). When the length is known at compile time and within unrolling thresholds, the JIT can either fold to a constant (if both memory regions are immutable and known) or unroll into an efficient load/compare chain using XOR and OR operations.
Key Changes
- Adds `optVNBasedFoldExpr_Call_Memcmp` to unroll `NI_System_SpanHelpers_SequenceEqual` for constant-length comparisons
- Introduces `Memcmp` to the `UnrollKind` enum with platform-specific thresholds (4x maxRegSize, 12x on ARM64)
- Implements XOR+OR accumulation pattern for comparing memory chunks, similar to existing memcpy/memmove optimizations
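The XOR+OR accumulation pattern can be illustrated outside the JIT with a small standalone sketch. This is not the PR's implementation: `LoadU64` and `UnrolledSequenceEqual` are hypothetical names, the chunk size is fixed at 8 bytes rather than a SIMD register, and the overlapping load for the tail is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Unaligned 8-byte load; memcpy sidesteps alignment and aliasing issues.
static inline uint64_t LoadU64(const uint8_t* p)
{
    uint64_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// Hypothetical helper: branch-free equality of two N-byte regions,
// where N is a compile-time constant of at least 8 bytes.
template <size_t N>
bool UnrolledSequenceEqual(const uint8_t* a, const uint8_t* b)
{
    static_assert(N >= 8, "sketch only handles at least one 8-byte chunk");
    assert(a != nullptr && b != nullptr);

    uint64_t diff   = 0;
    size_t   offset = 0;

    // XOR yields zero for equal chunks; OR accumulates any nonzero bits.
    for (; offset + 8 <= N; offset += 8)
    {
        diff |= LoadU64(a + offset) ^ LoadU64(b + offset);
    }

    // Tail (assumption): one overlapping load covering the last 8 bytes.
    if (offset < N)
    {
        diff |= LoadU64(a + N - 8) ^ LoadU64(b + N - 8);
    }

    // A single comparison at the end replaces per-chunk branches.
    return diff == 0;
}
```

Because `N` is a constant, a compiler can fully unroll the loop, leaving a straight-line chain of loads, XORs, and ORs followed by one compare.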
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/coreclr/jit/compiler.h | Adds function declaration for optVNBasedFoldExpr_Call_Memcmp, adds Memcmp to UnrollKind enum, and sets unroll thresholds |
| src/coreclr/jit/assertionprop.cpp | Implements the core optimization logic to unroll SequenceEqual, including constant folding and XOR-based comparison chain generation |
If you want to fold SequenceEqual to true/false, then you can do that in the valuenum phase.

E.g. the point of

Yeah, I noticed this as well. There were also several minor differences in register selection and unaligned loads, by the way. Removed the IND nodes unrolling.
EgorBo
left a comment
LGTM, thanks. I'll move the importervectorization.cpp to that function eventually.
PS: Just Equality to 1/0 could've been done in the VN phase, but this is fine too as it might find more cases after assertions.
cc @dotnet/jit-contrib
src/coreclr/jit/assertionprop.cpp
uint8_t* buffer1 = new (this, CMK_AssertionProp) uint8_t[len];
uint8_t* buffer2 = new (this, CMK_AssertionProp) uint8_t[len];
if (GetImmutableDataFromAddress(arg1->GetNode(), (int)len, buffer1) &&
How often do these Get... methods fail? Do we need to split this into a "can I get the data" and "get the data" pair, so we avoid allocating when the data can't be gotten?
Just pushed a commit to make GetImmutableDataFromAddress allocate on-demand.
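The "check first, allocate on demand" idea raised in this thread could look roughly like the standalone sketch below. It is not the actual `GetImmutableDataFromAddress` code: `ImmutableSource` and `TryGetImmutableData` are hypothetical, and `std::unique_ptr` stands in for the JIT's arena allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical stand-in for a region of immutable bytes.
// data == nullptr means the bytes are not available.
struct ImmutableSource
{
    const uint8_t* data;
    size_t         size;
};

// Hypothetical helper: copies len bytes into *buffer on success.
// The buffer is allocated only after the cheap availability check passes,
// so a failed lookup costs no allocation.
bool TryGetImmutableData(const ImmutableSource& src, size_t len, std::unique_ptr<uint8_t[]>* buffer)
{
    assert(buffer != nullptr);

    // Cheap "can I get the data" check comes first.
    if (src.data == nullptr || len > src.size)
    {
        return false;
    }

    // Allocate lazily, only if the caller has not supplied a buffer yet.
    if (!*buffer)
    {
        *buffer = std::make_unique<uint8_t[]>(len);
    }

    std::memcpy(buffer->get(), src.data, len);
    return true;
}
```

Folding the check and the copy into one call keeps the call site as simple as before, while the failing path no longer pays for a buffer.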
But doing it in VN might unblock RBO, etc...
Yeah, pros & cons. Doing this in VN won't handle e.g.

    if (len == 42)
    {
        SequenceEqual(..,.., len);
    }

The best option is to do it in VN and then run JitOptRepeat if SeqEquals's len became a constant. But we can't rely on JitOptRepeat today. Also, if we move it to VN, we'll have to do some work in VN constant propagation that AP does - I think it doesn't handle calls today. Overall I don't have a preference. Judging by the diffs, it's a fairly niche case.
@AndyAyersMS PTAL
Try to fold `SequenceEqual` with the help of VN.
Codegen for
Before:
After: