[X86] Try to shrink signed i64 compares if the input has enough one bits #149719

Open: wants to merge 4 commits into base: main

AZero13 (Contributor) commented Jul 20, 2025

We have to check for SIGN_EXT because, unlike in the zero-extension case, the DAG cannot automatically know that the top bits are 0 (hence the name zero extension).
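
As a rough illustration of the point above, here is a minimal standalone C++ sketch (not LLVM code) of what the upper 32 bits look like after each kind of extension. The helpers `zext32`, `sext32`, and `highHalf` are hypothetical names introduced only for this example. After zero extension the top half is always zero; after sign extension it is a copy of bit 31, so it is all ones or all zeros depending on the input.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM APIs): model zext/sext from i32 to i64.
constexpr uint64_t zext32(uint32_t v) { return static_cast<uint64_t>(v); }
constexpr uint64_t sext32(uint32_t v) {
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(v)));
}
// Upper 32 bits of a 64-bit value.
constexpr uint64_t highHalf(uint64_t v) { return v >> 32; }

// 0xFFFFFDF5 is the 32-bit pattern of -523, the constant used in the test.
static_assert(highHalf(zext32(0xFFFFFDF5u)) == 0,
              "zext: top 32 bits are always zero");
static_assert(highHalf(sext32(0xFFFFFDF5u)) == 0xFFFFFFFFu,
              "sext of a negative value: top 32 bits are all ones");
static_assert(highHalf(sext32(523u)) == 0,
              "sext of a non-negative value: top 32 bits are all zeros");
```

Because the top half of a sign-extended value depends on the input, a fixed mask check for all zeros or all ones cannot prove anything about it when the input is unknown.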

llvmbot (Member) commented Jul 20, 2025

@llvm/pr-subscribers-backend-x86

Author: AZero13 (AZero13)

Changes

We have to check for SIGN_EXT because, unlike in the zero-extension case, the DAG cannot automatically know that the top bits are 0 (hence the name zero extension).


Full diff: https://github.com/llvm/llvm-project/pull/149719.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+15-1)
  • (modified) llvm/test/CodeGen/X86/cmp.ll (+12)
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index d91ea1ea1bb1b..5d5d0c23376c7 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -23479,7 +23479,6 @@ static SDValue EmitCmp(SDValue Op0, SDValue Op1, X86::CondCode X86CC,
   }
 
   // Try to shrink i64 compares if the input has enough zero bits.
-  // TODO: Add sign-bits equivalent for isX86CCSigned(X86CC)?
   if (CmpVT == MVT::i64 && !isX86CCSigned(X86CC) &&
       Op0.hasOneUse() && // Hacky way to not break CSE opportunities with sub.
       DAG.MaskedValueIsZero(Op1, APInt::getHighBitsSet(64, 32)) &&
@@ -23489,6 +23488,21 @@ static SDValue EmitCmp(SDValue Op0, SDValue Op1, X86::CondCode X86CC,
     Op1 = DAG.getNode(ISD::TRUNCATE, dl, CmpVT, Op1);
   }
 
+  // Try to shrink signed i64 compares if the input has enough one bits.
+  // Or the input is sign extended from a 32-bit value.
+  if (CmpVT == MVT::i64 && isX86CCSigned(X86CC) &&
+      Op0.hasOneUse() && // Hacky way to not break CSE opportunities with sub.
+      (DAG.MaskedValueIsAllOnes(Op1, APInt::getHighBitsSet(64, 32)) ||
+       Op1.getOpcode() == ISD::SIGN_EXTEND ||
+       Op1.getOpcode() == ISD::SIGN_EXTEND_INREG) &&
+      (DAG.MaskedValueIsAllOnes(Op0, APInt::getHighBitsSet(64, 32)) ||
+       Op0.getOpcode() == ISD::SIGN_EXTEND ||
+       Op0.getOpcode() == ISD::SIGN_EXTEND_INREG)) {
+    CmpVT = MVT::i32;
+    Op0 = DAG.getNode(ISD::TRUNCATE, dl, CmpVT, Op0);
+    Op1 = DAG.getNode(ISD::TRUNCATE, dl, CmpVT, Op1);
+  }
+
   // 0-x == y --> x+y == 0
   // 0-x != y --> x+y != 0
   if (Op0.getOpcode() == ISD::SUB && isNullConstant(Op0.getOperand(0)) &&
diff --git a/llvm/test/CodeGen/X86/cmp.ll b/llvm/test/CodeGen/X86/cmp.ll
index f3e141740b287..d71a7adafc652 100644
--- a/llvm/test/CodeGen/X86/cmp.ll
+++ b/llvm/test/CodeGen/X86/cmp.ll
@@ -956,3 +956,15 @@ define i1 @fold_test_and_with_chain(ptr %x, ptr %y, i32 %z) {
   store i32 %z, ptr %y
   ret i1 %c
 }
+
+define i1 @sext_mask(i32 %a) {
+; CHECK-LABEL: sext_mask:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    cmpl $-523, %edi # encoding: [0x81,0xff,0xf5,0xfd,0xff,0xff]
+; CHECK-NEXT:    # imm = 0xFDF5
+; CHECK-NEXT:    setl %al # encoding: [0x0f,0x9c,0xc0]
+; CHECK-NEXT:    retq # encoding: [0xc3]
+  %a64 = sext i32 %a to i64
+  %v1 = icmp slt i64 %a64, -523
+  ret i1 %v1
+}

@AZero13 AZero13 requested a review from RKSimon July 20, 2025 16:55
  if (CmpVT == MVT::i64 && isX86CCSigned(X86CC) &&
      Op0.hasOneUse() && // Hacky way to not break CSE opportunities with sub.
      (DAG.ComputeNumSignBits(Op1) > 32 ||
       Op1.getOpcode() == ISD::SIGN_EXTEND ||
Collaborator

Why are we checking for specific opcodes? Aren't they already handled by ComputeNumSignBits?

Collaborator

Remove the SIGN_EXTEND/SIGN_EXTEND_INREG checks - ComputeNumSignBits will handle them properly (also, the opcode checks don't check what type size you're extending from).
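
To make the type-size concern concrete, here is a minimal standalone C++ sketch (not LLVM code). It assumes a hypothetical `numSignBits` that counts sign bits on a concrete value, as a stand-in for the idea behind `ComputeNumSignBits`, plus hypothetical helpers `sextFrom` and `trunc32`. A value sign-extended from i40 would pass an opcode-only check, yet it can have as few as 25 sign bits, and shrinking the compare to 32 bits then changes the result.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are LLVM APIs.
constexpr unsigned bit(uint64_t v, unsigned i) { return (v >> i) & 1u; }

// Number of top bits equal to the sign bit, counting the sign bit itself.
constexpr unsigned numSignBits(uint64_t v, unsigned n = 1) {
  return (n < 64 && bit(v, 63 - n) == bit(v, 63)) ? numSignBits(v, n + 1) : n;
}

// Sign-extend the low `bits` bits of v to 64 bits (assumes 0 < bits < 64).
constexpr int64_t sextFrom(uint64_t v, unsigned bits) {
  return static_cast<int64_t>(
      ((v & ((1ULL << bits) - 1)) ^ (1ULL << (bits - 1))) -
      (1ULL << (bits - 1)));
}

// Keep the low 32 bits and reinterpret them as a signed value.
constexpr int32_t trunc32(int64_t v) {
  return static_cast<int32_t>(static_cast<uint32_t>(static_cast<uint64_t>(v)));
}

// sext from i9: more than 32 sign bits, so the 32-bit compare agrees.
constexpr int64_t wide9 = sextFrom(0x100ULL, 9); // -256
static_assert(numSignBits(static_cast<uint64_t>(wide9)) == 56, "i9 case");
static_assert((wide9 < -523) == (trunc32(wide9) < trunc32(-523)), "agrees");

// sext from i40: only 25 sign bits here, and the 32-bit compare disagrees.
constexpr int64_t wide40 = sextFrom(0x8000000000ULL, 40); // -2^39
static_assert(numSignBits(static_cast<uint64_t>(wide40)) == 25, "i40 case");
static_assert((wide40 < -523) != (trunc32(wide40) < trunc32(-523)), "differs");
```

The `> 32` threshold in the snippet under review corresponds to the i9 case here: only when both operands have more than 32 sign bits does truncation preserve the signed comparison.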

  // Try to shrink signed i64 compares if the input has enough one bits.
  // Or the input is sign extended from a 32-bit value.
  // TODO: Should we peek through freeze?
  // TODO: Is SIGN_EXTEND_INREG needed here?
Collaborator

Remove these unnecessary TODOs.

  if (CmpVT == MVT::i64 && isX86CCSigned(X86CC) &&
      Op0.hasOneUse() && // Hacky way to not break CSE opportunities with sub.
      (DAG.ComputeNumSignBits(Op1) > 32 ||
       Op1.getOpcode() == ISD::SIGN_EXTEND ||
Collaborator

Remove the SIGN_EXTEND/SIGN_EXTEND_INREG checks - ComputeNumSignBits will handle them properly (also, the opcode checks don't check what type size you're extending from).

@@ -956,3 +956,15 @@ define i1 @fold_test_and_with_chain(ptr %x, ptr %y, i32 %z) {
store i32 %z, ptr %y
ret i1 %c
}

define i1 @sext_mask(i32 %a) {
Collaborator

You need extra tests to better exercise this fold.

For the cmp with constant, something like these (I just copy+pasted - please add different CondCode and constants):

define i1 @sext_i9_mask(i9 %a) {
  %a64 = sext i9 %a to i64
  %v1 = icmp slt i64 %a64, -523
  ret i1 %v1
}

define i1 @sext_i32_mask(i32 %a) {
  %a64 = sext i32 %a to i64
  %v1 = icmp slt i64 %a64, -523
  ret i1 %v1
}

define i1 @i40(i40 %a) {
  %a64 = sext i40 %a to i64
  %v1 = icmp slt i64 %a64, -523
  ret i1 %v1
}

You should also add some tests with comparison of 2 (sign/zero extended) variables - possibly different source sizes.
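
Before writing the two-variable tests, the expected behaviour for mixed source sizes can be sanity-checked with a minimal standalone C++ sketch (not LLVM code, and not a substitute for the requested `.ll` tests). The helper names are hypothetical, and only signed less-than is covered; other condition codes would need the same check. It brute-forces a sext-from-i8 operand against a zext-from-i16 operand, where both have more than 32 sign bits, and shows a counterexample for zext from i32, which only guarantees 32.

```cpp
#include <cstdint>

// Keep the low 32 bits and reinterpret them as a signed value.
inline int32_t trunc32(int64_t v) {
  return static_cast<int32_t>(static_cast<uint32_t>(static_cast<uint64_t>(v)));
}

// True if signed less-than gives the same answer at 64 bits and after
// truncating both operands to 32 bits.
inline bool sameResult(int64_t a, int64_t b) {
  return (a < b) == (trunc32(a) < trunc32(b));
}

// sext i8 compared with zext i16: every pair should agree, because both
// operands have more than 32 sign bits.
inline bool narrowSourcesAgree() {
  for (int a = INT8_MIN; a <= INT8_MAX; ++a)
    for (unsigned b = 0; b <= UINT16_MAX; ++b)
      if (!sameResult(static_cast<int64_t>(a), static_cast<int64_t>(b)))
        return false;
  return true;
}

// zext i32 only guarantees 32 zero top bits, which is not enough:
// 0xFFFFFFFF < 1 is false at 64 bits, but -1 < 1 is true at 32 bits.
inline bool zext32CounterExample() {
  int64_t a = static_cast<int64_t>(UINT32_MAX); // zext i32 0xFFFFFFFF
  int64_t b = 1;                                // sext i32 1
  return !sameResult(a, b);
}
```

This suggests the tests should include at least one pair where the fold must fire (narrow sources on both sides) and one where it must not (a zero-extended i32 operand under a signed condition code).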
