Skip to content

Conversation

@hez2010
Copy link
Contributor

@hez2010 hez2010 commented Nov 26, 2025

Try to fold SequenceEqual with the help of VN.

Codegen for

Test.CompareStr();

class Test
{
    public static string Str1 => "abcdef";
    public static string Str2 => "bcdefg";

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool CompareStr()
    {
        return Str1.Equals(Str2);
    }
}

Before:

G_M64625_IG01:  ;; offset=0x0000
                                                ;; size=0 bbWeight=1 PerfScore 0.00
G_M64625_IG02:  ;; offset=0x0000
       mov      rax, 0x2DA4A1AEFF0      ; 'abcdef'
       add      rax, 12
       mov      rcx, 0x2DA4A1AF024
       mov      rdx, qword ptr [rax]
       mov      rax, qword ptr [rax+0x04]
       mov      r8, qword ptr [rcx]
       xor      rdx, r8
       xor      rax, qword ptr [rcx+0x04]
       or       rax, rdx
       sete     al
       movzx    rax, al
                                                ;; size=50 bbWeight=1 PerfScore 11.50
G_M64625_IG03:  ;; offset=0x0032
       ret
                                                ;; size=1 bbWeight=1 PerfScore 1.00

After:

G_M64625_IG01:  ;; offset=0x0000
                                                ;; size=0 bbWeight=1 PerfScore 0.00
G_M64625_IG02:  ;; offset=0x0000
       xor      eax, eax
                                                ;; size=2 bbWeight=1 PerfScore 0.25
G_M64625_IG03:  ;; offset=0x0002
       ret
                                                ;; size=1 bbWeight=1 PerfScore 1.00

Copilot AI review requested due to automatic review settings November 26, 2025 12:09
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 26, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Nov 26, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot finished reviewing on behalf of hez2010 November 26, 2025 12:12
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a JIT optimization to constant fold SequenceEqual operations using Value Numbering (VN). When the length is known at compile time and within unrolling thresholds, the JIT can either fold to a constant (if both memory regions are immutable and known) or unroll into an efficient load/compare chain using XOR and OR operations.

Key Changes

  • Adds optVNBasedFoldExpr_Call_Memcmp to unroll NI_System_SpanHelpers_SequenceEqual for constant-length comparisons
  • Introduces Memcmp to the UnrollKind enum with platform-specific thresholds (4x maxRegSize, 12x on ARM64)
  • Implements XOR+OR accumulation pattern for comparing memory chunks, similar to existing memcpy/memmove optimizations

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/coreclr/jit/compiler.h Adds function declaration for optVNBasedFoldExpr_Call_Memcmp, adds Memcmp to UnrollKind enum, and sets unroll thresholds
src/coreclr/jit/assertionprop.cpp Implements the core optimization logic to unroll SequenceEqual, including constant folding and XOR-based comparison chain generation

@EgorBo
Copy link
Member

EgorBo commented Nov 26, 2025

If you want to fold SequenceEquals to true/false, then you can do that in the valuenum phase.
For the unrolling to a set of IND nodes - I don't see what it can catch that LowerCallMemcmp won't handle to be honest, any example?

@EgorBo
Copy link
Member

EgorBo commented Nov 26, 2025

E.g. the point of optVNBasedFoldExpr_Call_Memmove to exist is that it can see RVA const data and emit const values to be memmoved - LowerMemmove cannot do it. In your case you don't take advantage of seeing string's content when you unroll to indirs (JitOptRepeat probably could help, but we cannot use it today)

@hez2010
Copy link
Contributor Author

hez2010 commented Nov 26, 2025

For the unrolling to a set of IND nodes - I don't see what it can catch that LowerCallMemcmp won't handle to be honest, any example?

Yeah. I noticed this as well. Well there was several minor differences on register selection and unaligned loads btw. Removed the IND nodes unrolling.

@hez2010
Copy link
Contributor Author

hez2010 commented Nov 26, 2025

@MihuBot

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks. I'll move the importervectorization.cpp to that function eventually.

PS: Just Equality to 1/0 could've been done in the VN phase, but this is fine too as it might find more cases after assertions.

@EgorBo
Copy link
Member

EgorBo commented Nov 26, 2025

cc @dotnet/jit-contrib


uint8_t* buffer1 = new (this, CMK_AssertionProp) uint8_t[len];
uint8_t* buffer2 = new (this, CMK_AssertionProp) uint8_t[len];
if (GetImmutableDataFromAddress(arg1->GetNode(), (int)len, buffer1) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How often do these Get... methods fail? Do we need to split this into a "can I get the data" and "get the data" pair, so we avoid allocating when the data can't be gotten?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed a commit to make GetImmutableDataFromAddress allocate on-demand.

@AndyAyersMS
Copy link
Member

PS: Just Equality to 1/0 could've been done in the VN phase, but this is fine too as it might find more cases after assertions.

But doing it in VN might unblock RBO, etc...

@EgorBo
Copy link
Member

EgorBo commented Nov 26, 2025

PS: Just Equality to 1/0 could've been done in the VN phase, but this is fine too as it might find more cases after assertions.

But doing it in VN might unblock RBO, etc...

Yeah, pros & cons. Doing this in VN won't handle e.g.

if (len == 42)
{
    SequenceEqual(..,.., len);
}

The best option is to do it in VN and then run JitOptRepeat if SeqEquals's len became a constant. But we can't rely on JitOptRepeat today. Also, if we move it to VN, we'll have to do some work in VN constant propagation that AP does - I think it doesn't handle calls today.

Overall I don't have a preference. Judging by the diffs, it's a fairly niche case.

@hez2010
Copy link
Contributor Author

hez2010 commented Nov 27, 2025

@AndyAyersMS PTAL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants