[X86] Try to shrink signed i64 compares if the input has enough one bits #149719
@llvm/pr-subscribers-backend-x86
Author: AZero13 (AZero13)
Changes
We have to check for SIGN_EXTEND explicitly because, unlike in the zero-extension version, the DAG cannot automatically know the top bits: after a zero extension they are known to be 0 (hence the name), whereas after a sign extension they depend on the sign of the input.
Full diff: https://github.com/llvm/llvm-project/pull/149719.diff
2 Files Affected:
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index d91ea1ea1bb1b..5d5d0c23376c7 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -23479,7 +23479,6 @@ static SDValue EmitCmp(SDValue Op0, SDValue Op1, X86::CondCode X86CC,
   }

   // Try to shrink i64 compares if the input has enough zero bits.
-  // TODO: Add sign-bits equivalent for isX86CCSigned(X86CC)?
   if (CmpVT == MVT::i64 && !isX86CCSigned(X86CC) &&
       Op0.hasOneUse() && // Hacky way to not break CSE opportunities with sub.
       DAG.MaskedValueIsZero(Op1, APInt::getHighBitsSet(64, 32)) &&
@@ -23489,6 +23488,21 @@ static SDValue EmitCmp(SDValue Op0, SDValue Op1, X86::CondCode X86CC,
     Op1 = DAG.getNode(ISD::TRUNCATE, dl, CmpVT, Op1);
   }

+  // Try to shrink signed i64 compares if the input has enough one bits.
+  // Or the input is sign extended from a 32-bit value.
+  if (CmpVT == MVT::i64 && isX86CCSigned(X86CC) &&
+      Op0.hasOneUse() && // Hacky way to not break CSE opportunities with sub.
+      (DAG.MaskedValueIsAllOnes(Op1, APInt::getHighBitsSet(64, 32)) ||
+       Op1.getOpcode() == ISD::SIGN_EXTEND ||
+       Op1.getOpcode() == ISD::SIGN_EXTEND_INREG) &&
+      (DAG.MaskedValueIsAllOnes(Op0, APInt::getHighBitsSet(64, 32)) ||
+       Op0.getOpcode() == ISD::SIGN_EXTEND ||
+       Op0.getOpcode() == ISD::SIGN_EXTEND_INREG)) {
+    CmpVT = MVT::i32;
+    Op0 = DAG.getNode(ISD::TRUNCATE, dl, CmpVT, Op0);
+    Op1 = DAG.getNode(ISD::TRUNCATE, dl, CmpVT, Op1);
+  }
+
   // 0-x == y --> x+y == 0
   // 0-x != y --> x+y != 0
   if (Op0.getOpcode() == ISD::SUB && isNullConstant(Op0.getOperand(0)) &&
diff --git a/llvm/test/CodeGen/X86/cmp.ll b/llvm/test/CodeGen/X86/cmp.ll
index f3e141740b287..d71a7adafc652 100644
--- a/llvm/test/CodeGen/X86/cmp.ll
+++ b/llvm/test/CodeGen/X86/cmp.ll
@@ -956,3 +956,15 @@ define i1 @fold_test_and_with_chain(ptr %x, ptr %y, i32 %z) {
   store i32 %z, ptr %y
   ret i1 %c
 }
+
+define i1 @sext_mask(i32 %a) {
+; CHECK-LABEL: sext_mask:
+; CHECK: # %bb.0:
+; CHECK-NEXT: cmpl $-523, %edi # encoding: [0x81,0xff,0xf5,0xfd,0xff,0xff]
+; CHECK-NEXT: # imm = 0xFDF5
+; CHECK-NEXT: setl %al # encoding: [0x0f,0x9c,0xc0]
+; CHECK-NEXT: retq # encoding: [0xc3]
+  %a64 = sext i32 %a to i64
+  %v1 = icmp slt i64 %a64, -523
+  ret i1 %v1
+}
  if (CmpVT == MVT::i64 && isX86CCSigned(X86CC) &&
      Op0.hasOneUse() && // Hacky way to not break CSE opportunities with sub.
      (DAG.ComputeNumSignBits(Op1) > 32 ||
       Op1.getOpcode() == ISD::SIGN_EXTEND ||
Why are we checking for specific opcodes? Aren't they already handled by ComputeNumSignBits?
Remove the SIGN_EXTEND/SIGN_EXTEND_INREG checks - ComputeNumSignBits will handle them properly (also, the opcode checks don't check what type size you're extending from).
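The reviewer's point can be illustrated outside of LLVM with a minimal standalone sketch. It models only what a sign-bit count means; it is not LLVM's `ComputeNumSignBits` implementation, and the helper names `numSignBits`, `signExtend`, and `canShrinkSignedCompare` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

// Number of leading bits that equal the sign bit, counting the sign bit
// itself. Models the *meaning* of a sign-bit count; not LLVM's code.
unsigned numSignBits(int64_t v) {
  uint64_t u = static_cast<uint64_t>(v);
  uint64_t sign = u >> 63;
  unsigned n = 1;
  for (int bit = 62; bit >= 0 && ((u >> bit) & 1) == sign; --bit)
    ++n;
  return n;
}

// Sign-extend the low `fromBits` bits of `v` to 64 bits.
// Assumes two's-complement conversion (guaranteed since C++20).
int64_t signExtend(uint64_t v, unsigned fromBits) {
  assert(fromBits >= 1 && fromBits <= 64);
  if (fromBits == 64)
    return static_cast<int64_t>(v);
  uint64_t mask = (uint64_t{1} << fromBits) - 1;
  uint64_t signBit = uint64_t{1} << (fromBits - 1);
  uint64_t low = v & mask;
  return static_cast<int64_t>((low ^ signBit) - signBit);
}

// A signed 64-bit compare can be narrowed to 32 bits only if both operands
// have more than 32 sign bits, i.e. each one is unchanged by truncating to
// 32 bits and sign-extending back.
bool canShrinkSignedCompare(int64_t a, int64_t b) {
  return numSignBits(a) > 32 && numSignBits(b) > 32;
}
```

In this model, a value sign-extended from 32 bits always has at least 33 sign bits, so the count alone covers the sign-extension case. A value sign-extended from 40 bits can have as few as 25, which an opcode-only check would not notice. A value whose high 32 bits are all ones but whose bit 31 is clear has exactly 32 sign bits, so a `> 32` test rejects it.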
  // Try to shrink signed i64 compares if the input has enough one bits.
  // Or the input is sign extended from a 32-bit value.
  // TODO: Should we peek through freeze?
  // TODO: Is SIGN_EXTEND_INREG needed here?
Remove these unnecessary TODOs.
@@ -956,3 +956,15 @@ define i1 @fold_test_and_with_chain(ptr %x, ptr %y, i32 %z) {
  store i32 %z, ptr %y
  ret i1 %c
}

define i1 @sext_mask(i32 %a) {
You need extra tests to better exercise this fold.
For the cmp with constant, something like these (I just copy+pasted - please add different CondCodes and constants):
define i1 @sext_i9_mask(i9 %a) {
  %a64 = sext i9 %a to i64
  %v1 = icmp slt i64 %a64, -523
  ret i1 %v1
}
define i1 @sext_i32_mask(i32 %a) {
  %a64 = sext i32 %a to i64
  %v1 = icmp slt i64 %a64, -523
  ret i1 %v1
}
define i1 @i40(i40 %a) {
  %a64 = sext i40 %a to i64
  %v1 = icmp slt i64 %a64, -523
  ret i1 %v1
}
You should also add some tests with comparison of 2 (sign/zero extended) variables - possibly different source sizes.
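To see which of the suggested source widths should allow the fold, here is a small standalone sketch that compares the 64-bit result of `slt` against the result after truncating both sides to 32 bits, using a handful of boundary bit patterns. It does not use LLVM, and the helper names `sextTo64`, `slt32AfterTrunc`, and `shrinkAgrees` are hypothetical names for this example. The sampled patterns are an assumption chosen to hit the boundaries, not an exhaustive check.

```cpp
#include <cstdint>

// Sign-extend the low `width` bits of `bits` to 64 bits (1 <= width <= 64).
// Assumes two's-complement conversion (guaranteed since C++20).
int64_t sextTo64(uint64_t bits, unsigned width) {
  if (width == 64)
    return static_cast<int64_t>(bits);
  uint64_t mask = (uint64_t{1} << width) - 1;
  uint64_t signBit = uint64_t{1} << (width - 1);
  return static_cast<int64_t>(((bits & mask) ^ signBit) - signBit);
}

// Signed less-than after truncating both operands to 32 bits.
bool slt32AfterTrunc(int64_t a, int64_t b) {
  return static_cast<int32_t>(static_cast<uint32_t>(a)) <
         static_cast<int32_t>(static_cast<uint32_t>(b));
}

// Returns true if the narrowed compare matches the 64-bit compare for a few
// boundary patterns of a `width`-bit value sign-extended to 64 bits.
bool shrinkAgrees(unsigned width, int64_t rhs) {
  const uint64_t mask =
      width == 64 ? ~uint64_t{0} : (uint64_t{1} << width) - 1;
  const uint64_t signBit = uint64_t{1} << (width - 1);
  // 0, 1, most negative, most positive, -1, most negative + 1.
  const uint64_t patterns[] = {0, 1, signBit, signBit - 1, mask, signBit | 1};
  for (uint64_t p : patterns) {
    int64_t a = sextTo64(p, width);
    if ((a < rhs) != slt32AfterTrunc(a, rhs))
      return false;
  }
  return true;
}
```

With the constant -523 from the tests above, the i9 and i32 sources agree on every sampled pattern. The i40 source disagrees on its most negative value, whose low 32 bits are all zero, so the fold should leave that case as a 64-bit compare.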