Skip to content

[GISel] Set more MIFlags when translating GEPs #151708

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

ritter-x2a
Copy link
Member

The IRTranslator sets the flags now more consistently with
SelectionDAGBuilder::visitGetElementPtr(). This affects nuw and nusw, as
well as the recently introduced inbounds MIFlag (see PR #150900).

This PR also adds more tests to AArch64/GlobalISel/irtranslator-gep-flags.ll
to cover all points in IRTranslator::translateGetElementPtr that set flags.

For SWDEV-516125.

The IRTranslator sets the flags now more consistently with
`SelectionDAGBuilder::visitGetElementPtr()`. This affects `nuw` and `nusw`, as
well as the recently introduced `inbounds` MIFlag (see PR #150900).

This PR also adds more tests to `AArch64/GlobalISel/irtranslator-gep-flags.ll`
to cover all points in `IRTranslator::translateGetElementPtr` that set flags.

For SWDEV-516125.
Copy link
Member Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@llvmbot
Copy link
Member

llvmbot commented Aug 1, 2025

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-globalisel

Author: Fabian Ritter (ritter-x2a)

Changes

The IRTranslator sets the flags now more consistently with
SelectionDAGBuilder::visitGetElementPtr(). This affects nuw and nusw, as
well as the recently introduced inbounds MIFlag (see PR #150900).

This PR also adds more tests to AArch64/GlobalISel/irtranslator-gep-flags.ll
to cover all points in IRTranslator::translateGetElementPtr that set flags.

For SWDEV-516125.


Patch is 23.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151708.diff

6 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+30-10)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-switch.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-gep-flags.ll (+178-15)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/translate-gep.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/function-returns.ll (+2-2)
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index fd38c308003b7..3f93c7f70d881 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -1592,9 +1592,19 @@ bool IRTranslator::translateGetElementPtr(const User &U,
   Type *OffsetIRTy = DL->getIndexType(PtrIRTy);
   LLT OffsetTy = getLLTForType(*OffsetIRTy, *DL);
 
-  uint32_t Flags = 0;
+  uint32_t PtrAddFlags = 0;
+  // Each PtrAdd generated to implement the GEP inherits its nuw, nusw, inbounds
+  // flags.
   if (const Instruction *I = dyn_cast<Instruction>(&U))
-    Flags = MachineInstr::copyFlagsFromInstruction(*I);
+    PtrAddFlags = MachineInstr::copyFlagsFromInstruction(*I);
+
+  auto PtrAddFlagsWithConst = [&](int64_t Offset) {
+    // For nusw/inbounds GEP with an offset that is nonnegative when interpreted
+    // as signed, assume there is no unsigned overflow.
+    if (Offset >= 0 && (PtrAddFlags & MachineInstr::MIFlag::NoUSWrap))
+      return PtrAddFlags | MachineInstr::MIFlag::NoUWrap;
+    return PtrAddFlags;
+  };
 
   // Normalize Vector GEP - all scalar operands should be converted to the
   // splat vector.
@@ -1644,7 +1654,9 @@ bool IRTranslator::translateGetElementPtr(const User &U,
 
       if (Offset != 0) {
         auto OffsetMIB = MIRBuilder.buildConstant({OffsetTy}, Offset);
-        BaseReg = MIRBuilder.buildPtrAdd(PtrTy, BaseReg, OffsetMIB.getReg(0))
+        BaseReg = MIRBuilder
+                      .buildPtrAdd(PtrTy, BaseReg, OffsetMIB.getReg(0),
+                                   PtrAddFlagsWithConst(Offset))
                       .getReg(0);
         Offset = 0;
       }
@@ -1668,12 +1680,23 @@ bool IRTranslator::translateGetElementPtr(const User &U,
       if (ElementSize != 1) {
         auto ElementSizeMIB = MIRBuilder.buildConstant(
             getLLTForType(*OffsetIRTy, *DL), ElementSize);
+
+        // The multiplication is NUW if the GEP is NUW and NSW if the GEP is
+        // NUSW.
+        uint32_t ScaleFlags = PtrAddFlags & MachineInstr::MIFlag::NoUWrap;
+        if (PtrAddFlags & MachineInstr::MIFlag::NoUSWrap)
+          ScaleFlags |= MachineInstr::MIFlag::NoSWrap;
+
         GepOffsetReg =
-            MIRBuilder.buildMul(OffsetTy, IdxReg, ElementSizeMIB).getReg(0);
-      } else
+            MIRBuilder.buildMul(OffsetTy, IdxReg, ElementSizeMIB, ScaleFlags)
+                .getReg(0);
+      } else {
         GepOffsetReg = IdxReg;
+      }
 
-      BaseReg = MIRBuilder.buildPtrAdd(PtrTy, BaseReg, GepOffsetReg).getReg(0);
+      BaseReg =
+          MIRBuilder.buildPtrAdd(PtrTy, BaseReg, GepOffsetReg, PtrAddFlags)
+              .getReg(0);
     }
   }
 
@@ -1681,11 +1704,8 @@ bool IRTranslator::translateGetElementPtr(const User &U,
     auto OffsetMIB =
         MIRBuilder.buildConstant(OffsetTy, Offset);
 
-    if (Offset >= 0 && cast<GEPOperator>(U).isInBounds())
-      Flags |= MachineInstr::MIFlag::NoUWrap;
-
     MIRBuilder.buildPtrAdd(getOrCreateVReg(U), BaseReg, OffsetMIB.getReg(0),
-                           Flags);
+                           PtrAddFlagsWithConst(Offset));
     return true;
   }
 
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll
index 639b6fd6cacc0..da171edb03808 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll
@@ -13,12 +13,12 @@ define i32 @cse_gep(ptr %ptr, i32 %idx) {
   ; O0-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
   ; O0-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
   ; O0-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
-  ; O0-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; O0-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL]](s64)
+  ; O0-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; O0-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL]](s64)
   ; O0-NEXT:   [[COPY2:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
   ; O0-NEXT:   [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY2]](p0) :: (load (s32) from %ir.gep1)
-  ; O0-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; O0-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL1]](s64)
+  ; O0-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; O0-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL1]](s64)
   ; O0-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
   ; O0-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw nusw inbounds G_PTR_ADD [[PTR_ADD1]], [[C1]](s64)
   ; O0-NEXT:   [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load (s32) from %ir.gep2)
@@ -34,8 +34,8 @@ define i32 @cse_gep(ptr %ptr, i32 %idx) {
   ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
   ; O3-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
   ; O3-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
-  ; O3-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; O3-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL]](s64)
+  ; O3-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; O3-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL]](s64)
   ; O3-NEXT:   [[COPY2:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
   ; O3-NEXT:   [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY2]](p0) :: (load (s32) from %ir.gep1)
   ; O3-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-switch.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-switch.ll
index 79b2e2eabe89c..02a8a4f618156 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-switch.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-switch.ll
@@ -792,8 +792,8 @@ define void @jt_multiple_jump_tables(ptr %arg, i32 %arg1, ptr %arg2) {
   ; CHECK-NEXT: bb.56.bb57:
   ; CHECK-NEXT:   [[PHI:%[0-9]+]]:_(s64) = G_PHI [[C56]](s64), %bb.1, [[C57]](s64), %bb.2, [[C58]](s64), %bb.3, [[C59]](s64), %bb.4, [[C60]](s64), %bb.5, [[C61]](s64), %bb.6, [[C62]](s64), %bb.7, [[C63]](s64), %bb.8, [[C64]](s64), %bb.9, [[C65]](s64), %bb.10, [[C66]](s64), %bb.11, [[C67]](s64), %bb.12, [[C68]](s64), %bb.13, [[C69]](s64), %bb.14, [[C70]](s64), %bb.15, [[C71]](s64), %bb.16, [[C72]](s64), %bb.17, [[C73]](s64), %bb.18, [[C74]](s64), %bb.19, [[C75]](s64), %bb.20, [[C76]](s64), %bb.21, [[C77]](s64), %bb.22, [[C78]](s64), %bb.23, [[C79]](s64), %bb.24, [[C80]](s64), %bb.25, [[C81]](s64), %bb.26, [[C82]](s64), %bb.27, [[C83]](s64), %bb.28, [[C84]](s64), %bb.29, [[C85]](s64), %bb.30, [[C86]](s64), %bb.31, [[C87]](s64), %bb.32, [[C88]](s64), %bb.33, [[C89]](s64), %bb.34, [[C90]](s64), %bb.35, [[C91]](s64), %bb.36, [[C92]](s64), %bb.37, [[C93]](s64), %bb.38, [[C94]](s64), %bb.39, [[C95]](s64), %bb.40, [[C96]](s64), %bb.41, [[C97]](s64), %bb.42, [[C98]](s64), %bb.43, [[C99]](s64), %bb.44, [[C100]](s64), %bb.45, [[C101]](s64), %bb.46, [[C102]](s64), %bb.47, [[C103]](s64), %bb.48, [[C104]](s64), %bb.49, [[C105]](s64), %bb.50, [[C106]](s64), %bb.51, [[C107]](s64), %bb.52, [[C108]](s64), %bb.53, [[C109]](s64), %bb.54, [[C110]](s64), %bb.55
   ; CHECK-NEXT:   [[C111:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
-  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[PHI]], [[C111]]
-  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[GV]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[PHI]], [[C111]]
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[GV]], [[MUL]](s64)
   ; CHECK-NEXT:   [[C112:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
   ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nuw nusw inbounds G_PTR_ADD [[PTR_ADD]], [[C112]](s64)
   ; CHECK-NEXT:   [[LOAD:%[0-9]+]]:_(p0) = G_LOAD [[PTR_ADD1]](p0) :: (load (p0) from %ir.tmp59)
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-gep-flags.ll b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-gep-flags.ll
index 8a6f2664f1e88..b7cf9b31e345a 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-gep-flags.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-gep-flags.ll
@@ -10,12 +10,12 @@ define i32 @gep_nusw_nuw(ptr %ptr, i32 %idx) {
   ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
   ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
   ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
-  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL]](s64)
   ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
   ; CHECK-NEXT:   [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY2]](p0) :: (load (s32) from %ir.gep1)
-  ; CHECK-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL1]](s64)
+  ; CHECK-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = nuw nsw G_MUL [[SEXT]], [[C]]
+  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nuw nusw G_PTR_ADD [[COPY]], [[MUL1]](s64)
   ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
   ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw nusw G_PTR_ADD [[PTR_ADD1]], [[C1]](s64)
   ; CHECK-NEXT:   [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load (s32) from %ir.gep2)
@@ -40,12 +40,12 @@ define i32 @gep_nuw(ptr %ptr, i32 %idx) {
   ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
   ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
   ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
-  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL]](s64)
   ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
   ; CHECK-NEXT:   [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY2]](p0) :: (load (s32) from %ir.gep1)
-  ; CHECK-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL1]](s64)
+  ; CHECK-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = nuw G_MUL [[SEXT]], [[C]]
+  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nuw G_PTR_ADD [[COPY]], [[MUL1]](s64)
   ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
   ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw G_PTR_ADD [[PTR_ADD1]], [[C1]](s64)
   ; CHECK-NEXT:   [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load (s32) from %ir.gep2)
@@ -70,14 +70,14 @@ define i32 @gep_nusw(ptr %ptr, i32 %idx) {
   ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
   ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
   ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
-  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL]](s64)
   ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
   ; CHECK-NEXT:   [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY2]](p0) :: (load (s32) from %ir.gep1)
-  ; CHECK-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL1]](s64)
+  ; CHECK-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nusw G_PTR_ADD [[COPY]], [[MUL1]](s64)
   ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
-  ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nusw G_PTR_ADD [[PTR_ADD1]], [[C1]](s64)
+  ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw nusw G_PTR_ADD [[PTR_ADD1]], [[C1]](s64)
   ; CHECK-NEXT:   [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load (s32) from %ir.gep2)
   ; CHECK-NEXT:   [[ADD:%[0-9]+]]:_(s32) = G_ADD [[LOAD]], [[LOAD1]]
   ; CHECK-NEXT:   $w0 = COPY [[ADD]](s32)
@@ -100,8 +100,8 @@ define i32 @gep_none(ptr %ptr, i32 %idx) {
   ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
   ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
   ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
-  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
-  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL]](s64)
   ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
   ; CHECK-NEXT:   [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY2]](p0) :: (load (s32) from %ir.gep1)
   ; CHECK-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C]]
@@ -120,3 +120,166 @@ define i32 @gep_none(ptr %ptr, i32 %idx) {
   %res = add i32 %v1, %v2
   ret i32 %res
  }
+
+define i32 @gep_nusw_negative(ptr %ptr, i32 %idx) {
+  ; CHECK-LABEL: name: gep_nusw_negative
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w1, $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
+  ; CHECK-NEXT:   [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY2]](p0) :: (load (s32) from %ir.gep1)
+  ; CHECK-NEXT:   [[MUL1:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C]]
+  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nusw G_PTR_ADD [[COPY]], [[MUL1]](s64)
+  ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 -4
+  ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nusw G_PTR_ADD [[PTR_ADD1]], [[C1]](s64)
+  ; CHECK-NEXT:   [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load (s32) from %ir.gep2)
+  ; CHECK-NEXT:   [[ADD:%[0-9]+]]:_(s32) = G_ADD [[LOAD]], [[LOAD1]]
+  ; CHECK-NEXT:   $w0 = COPY [[ADD]](s32)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $w0
+  %sidx = sext i32 %idx to i64
+  %gep1 = getelementptr inbounds [4 x i32], ptr %ptr, i64 %sidx, i64 0
+  %v1 = load i32, ptr %gep1
+  %gep2 = getelementptr nusw [4 x i32], ptr %ptr, i64 %sidx, i64 -1
+  %v2 = load i32, ptr %gep2
+  %res = add i32 %v1, %v2
+  ret i32 %res
+ }
+
+define ptr @gep_many_indices(ptr %ptr, i32 %idx) {
+  ; CHECK-LABEL: name: gep_many_indices
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w1, $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 108
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
+  ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[SEXT]], [[C1]]
+  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[PTR_ADD]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 -4
+  ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[PTR_ADD1]], [[C2]](s64)
+  ; CHECK-NEXT:   $x0 = COPY [[PTR_ADD2]](p0)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  %sidx = sext i32 %idx to i64
+  %gep = getelementptr {i32, [4 x [3 x i32]]}, ptr %ptr, i64 2, i32 1, i64 %sidx, i64 -1
+  ret ptr %gep
+ }
+
+define ptr @gep_nuw_many_indices(ptr %ptr, i32 %idx) {
+  ; CHECK-LABEL: name: gep_nuw_many_indices
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w1, $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 108
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nuw G_PTR_ADD [[COPY]], [[C]](s64)
+  ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nuw G_MUL [[SEXT]], [[C1]]
+  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nuw G_PTR_ADD [[PTR_ADD]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 -4
+  ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw G_PTR_ADD [[PTR_ADD1]], [[C2]](s64)
+  ; CHECK-NEXT:   $x0 = COPY [[PTR_ADD2]](p0)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  %sidx = sext i32 %idx to i64
+  %gep = getelementptr nuw {i32, [4 x [3 x i32]]}, ptr %ptr, i64 2, i32 1, i64 %sidx, i64 -1
+  ret ptr %gep
+ }
+
+define ptr @gep_nusw_many_indices(ptr %ptr, i32 %idx) {
+  ; CHECK-LABEL: name: gep_nusw_many_indices
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w1, $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 108
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nuw nusw G_PTR_ADD [[COPY]], [[C]](s64)
+  ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C1]]
+  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nusw G_PTR_ADD [[PTR_ADD]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 -4
+  ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nusw G_PTR_ADD [[PTR_ADD1]], [[C2]](s64)
+  ; CHECK-NEXT:   $x0 = COPY [[PTR_ADD2]](p0)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  %sidx = sext i32 %idx to i64
+  %gep = getelementptr nusw {i32, [4 x [3 x i32]]}, ptr %ptr, i64 2, i32 1, i64 %sidx, i64 -1
+  ret ptr %gep
+ }
+
+define ptr @gep_inbounds_many_indices(ptr %ptr, i32 %idx) {
+  ; CHECK-LABEL: name: gep_inbounds_many_indices
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w1, $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 108
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nuw nusw inbounds G_PTR_ADD [[COPY]], [[C]](s64)
+  ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nsw G_MUL [[SEXT]], [[C1]]
+  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[PTR_ADD]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 -4
+  ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[PTR_ADD1]], [[C2]](s64)
+  ; CHECK-NEXT:   $x0 = COPY [[PTR_ADD2]](p0)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  %sidx = sext i32 %idx to i64
+  %gep = getelementptr inbounds {i32, [4 x [3 x i32]]}, ptr %ptr, i64 2, i32 1, i64 %sidx, i64 -1
+  ret ptr %gep
+ }
+
+define ptr @gep_nuw_nusw_many_indices(ptr %ptr, i32 %idx) {
+  ; CHECK-LABEL: name: gep_nuw_nusw_many_indices
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w1, $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[COPY1]](s32)
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 108
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nuw nusw G_PTR_ADD [[COPY]], [[C]](s64)
+  ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 12
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = nuw nsw G_MUL [[SEXT]], [[C1]]
+  ; CHECK-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nuw nusw G_PTR_ADD [[PTR_ADD]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 -4
+  ; CHECK-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw nusw G_PTR_ADD [[PTR_ADD1]], [[C2]](s64)
+  ; CHECK-NEXT:   $x0 = COPY [[PTR_ADD2]](p0)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  %sidx = sext i32 %idx to i64
+  %gep = getelementptr nuw nusw {i32, [4 x [3 x i32]]}, ptr %ptr, i64 2, i32 1, i64 %sidx, i64 -1
+  ret ptr %gep
+ }
+
+define ptr @gep_nuw_inbounds_many_indices(ptr %ptr, i32 %idx) {
+  ; CHECK-LABEL: name: gep_nuw_inbounds_ma...
[truncated]

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member Author

ritter-x2a commented Aug 4, 2025

Merge activity

  • Aug 4, 11:24 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Aug 4, 11:25 AM UTC: @ritter-x2a merged this pull request with Graphite.

@ritter-x2a ritter-x2a merged commit 95191d5 into main Aug 4, 2025
15 checks passed
@ritter-x2a ritter-x2a deleted the users/ritter-x2a/08-01-_gisel_set_more_miflags_when_translating_geps branch August 4, 2025 11:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants