@dianqk dianqk commented Nov 29, 2025

Fixes #169996 (once #168353 is relanded again).

#168353 transforms `%1:gr64 = SUBREG_TO_REG 0, %2:gr32, %subreg.sub_32bit` into `undef %1.sub_32bit:gr64_with_sub_8bit = COPY %0.sub_32bit, implicit-def %1` to preserve the zero-extension semantics, but RegisterCoalescer does not check the implicit-def, so it may merge %1 into %0 and drop the zero extension.
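
To illustrate why the implicit-def matters, here is a minimal value-level sketch. It is not LLVM code: the function names are hypothetical, and it assumes, as the description states, that the COPY with the implicit-def leaves the upper 32 bits of %1 zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of
//   undef %1.sub_32bit = COPY %0.sub_32bit, implicit-def %1
// The low 32 bits are copied and the upper 32 bits of %1 are zero.
constexpr std::uint64_t copyLow32ZeroExtended(std::uint64_t src) {
  return static_cast<std::uint32_t>(src);
}

// Hypothetical model of what %1 would hold if it were merged into %0:
// the full 64-bit source value, upper bits included.
constexpr std::uint64_t coalescedIntoSource(std::uint64_t src) {
  return src;
}

// The two results differ whenever the source has any upper bit set.
static_assert(copyLow32ZeroExtended(0xFFFFFFFF00000001ULL) !=
                  coalescedIntoSource(0xFFFFFFFF00000001ULL),
              "merging would change the observable value");
```

If the source value has no upper bits set, both functions agree, which is why the miscompile only shows up for some inputs.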

llvmbot commented Nov 29, 2025

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-regalloc

Author: dianqk (dianqk)

Full diff: https://github.com/llvm/llvm-project/pull/169997.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/RegisterCoalescer.cpp (+9)
  • (added) llvm/test/CodeGen/X86/coalesce-implicit-def.mir (+24)
  • (added) llvm/test/CodeGen/X86/pr169996.ll (+19)
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index e624088a0964e..ebf50bc366cf1 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -507,6 +507,15 @@ bool CoalescerPair::setRegisters(const MachineInstr *MI) {
       if (Src == Dst && SrcSub != DstSub)
         return false;
 
+      // The implicit-def of the super register is zero extended.
+      for (unsigned I = MI->getDesc().getNumOperands(),
+                    E = MI->getNumOperands();
+           I != E; ++I) {
+        const MachineOperand &MO = MI->getOperand(I);
+        if (MO.isReg() && MO.isDef() && MO.getReg() == Dst)
+          return false;
+      }
+
       NewRC = TRI.getCommonSuperRegClass(SrcRC, SrcSub, DstRC, DstSub, SrcIdx,
                                          DstIdx);
       if (!NewRC)
diff --git a/llvm/test/CodeGen/X86/coalesce-implicit-def.mir b/llvm/test/CodeGen/X86/coalesce-implicit-def.mir
new file mode 100644
index 0000000000000..bd9de3b933394
--- /dev/null
+++ b/llvm/test/CodeGen/X86/coalesce-implicit-def.mir
@@ -0,0 +1,24 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 6
+# RUN: llc -run-pass register-coalescer -mtriple x86_64-unknown-linux-gnu -o - %s | FileCheck %s
+
+# Checks that we do not merge %1 into %0.
+# `undef %1.sub_32bit = COPY %0.sub_32bit, implicit-def %1` is zero extended.
+
+---
+name:            fn1
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $rdi
+
+    ; CHECK-LABEL: name: fn1
+    ; CHECK: liveins: $rdi
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr64_with_sub_8bit = COPY $rdi
+    ; CHECK-NEXT: undef [[COPY1:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = COPY [[COPY]].sub_32bit, implicit-def [[COPY1]]
+    ; CHECK-NEXT: $rax = COPY [[COPY1]]
+    %0:gr64_with_sub_8bit = COPY $rdi
+    undef %1.sub_32bit:gr64_with_sub_8bit = COPY %0.sub_32bit, implicit-def %1
+    $rax = COPY %1
+...
+
diff --git a/llvm/test/CodeGen/X86/pr169996.ll b/llvm/test/CodeGen/X86/pr169996.ll
new file mode 100644
index 0000000000000..71e50489afd4d
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr169996.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu | FileCheck %s
+
+; FIXME: The first instruction should be `movl %edi, %eax`.
+
+define i64 @fn1(i64 %arg, ptr %arg1) {
+; CHECK-LABEL: fn1:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    movq %rdi, %rax
+; CHECK-NEXT:    movb (%rsi), %al
+; CHECK-NEXT:    retq
+  %i = trunc i64 %arg to i32
+  %i4 = load i8, ptr %arg1
+  %i5 = zext i8 %i4 to i32
+  %i6 = and i32 %i, -256
+  %i7 = or i32 %i6, %i5
+  %i12 = zext i32 %i7 to i64
+  ret i64 %i12
+}
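
The check added to `CoalescerPair::setRegisters` can be sketched outside LLVM as follows. This is a toy model, not the real API: `Operand`, `Instr`, and `hasExtraDefOf` are hypothetical stand-ins, and `numDescOperands` plays the role of `MI->getDesc().getNumOperands()`, so the loop only visits operands beyond those declared by the opcode.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine operand (hypothetical, not llvm::MachineOperand).
struct Operand {
  bool isReg;
  bool isDef;
  unsigned reg;
};

// Toy stand-in for a machine instruction (hypothetical).
struct Instr {
  std::size_t numDescOperands;   // operands declared by the opcode description
  std::vector<Operand> operands; // declared operands followed by extra ones
};

// Returns true if any operand past the declared ones is a register def of
// `dst`. In the patch, this condition makes setRegisters return false.
bool hasExtraDefOf(const Instr &mi, unsigned dst) {
  assert(mi.numDescOperands <= mi.operands.size());
  for (std::size_t i = mi.numDescOperands; i != mi.operands.size(); ++i) {
    const Operand &mo = mi.operands[i];
    if (mo.isReg && mo.isDef && mo.reg == dst)
      return true;
  }
  return false;
}
```

A plain two-operand copy has nothing past its declared operands, so the helper returns false; a copy that carries an extra def of its destination returns true.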

MacDue commented Nov 29, 2025

Is this change needed given #168353 was reverted (again)?
