Skip to content

[mlir][memref] Allow out-of-bounds semantics for memref.subview #152164

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Groverkss
Copy link
Member

out-of-bounds behavior for memref.subview was disallowed by #131876 , but it adds multiple problems to the problem definition:

  1. The op definition introduces undefined behavior to the operation. This means the operation is not Pure anymore. This is already a bug:
0<= %arg0 < 128
if (%arg0 < 64) {
  %subview = memref.subview %mem[%arg0][64][1] : memref<64xf32>
  memref.store %arg1, %subview[0]
} 

Because memref.subview is undefined if the subview is hoisted out, this introduces UB when hoisted out, which it will because it is marked as Pure.

  1. The op has ir that verifies, but will not verify after canonicalization:
%i = arith.constant 10 : index
memref.subview %mem[%i][10][1] : memref<10xf32>

This ir will verify, but after canonicalization it will not verify. This makes the operation definition inconsistent.

There are more issues with having this out of bounds verifier, but these two issues above are already enough to warrant reverting this behavior.

@llvmbot
Copy link
Member

llvmbot commented Aug 5, 2025

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-memref

Author: Kunwar Grover (Groverkss)

Changes

out-of-bounds behavior for memref.subview was disallowed by #131876 , but it adds multiple problems to the problem definition:

  1. The op definition introduces undefined behavior to the operation. This means the operation is not Pure anymore. This is already a bug:
0&lt;= %arg0 &lt; 128
if (%arg0 &lt; 64) {
  %subview = memref.subview %mem[%arg0][64][1] : memref&lt;64xf32&gt;
  memref.store %arg1, %subview[0]
} 

Because memref.subview is undefined if the subview is hoisted out, this introduces UB when hoisted out, which it will because it is marked as Pure.

  1. The op has ir that verifies, but will not verify after canonicalization:
%i = arith.constant 10 : index
memref.subview %mem[%i][10][1] : memref&lt;10xf32&gt;

This ir will verify, but after canonicalization it will not verify. This makes the operation definition inconsistent.

There are more issues with having this out of bounds verifier, but these two issues above are already enough to warrant reverting this behavior.


Full diff: https://github.com/llvm/llvm-project/pull/152164.diff

4 Files Affected:

  • (modified) mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td (+3-5)
  • (modified) mlir/include/mlir/Interfaces/ViewLikeInterface.h (+10-7)
  • (modified) mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp (+4-12)
  • (modified) mlir/test/Dialect/MemRef/invalid.mlir (-16)
diff --git a/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td b/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
index 9321089ab55fa..3c34b7b9e00e9 100644
--- a/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
+++ b/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
@@ -2047,11 +2047,9 @@ def SubViewOp : MemRef_OpWithOffsetSizesAndStrides<"subview", [
     result_offset = src_offset + dot_product(offset_operands, src_strides)
     ```
 
-    The offset, size and stride operands must be in-bounds with respect to the
-    source memref. When possible, the static operation verifier will detect
-    out-of-bounds subviews. Subviews that cannot be confirmed to be in-bounds
-    or out-of-bounds based on compile-time information are valid. However,
-    performing an out-of-bounds subview at runtime is undefined behavior.
+    The operation does not guarantee if the created subview is in-bounds. It is
+    the responsibility of the user to guarantee there are no out-of-bounds
+    accesses into the subview.
 
     Example 1:
 
diff --git a/mlir/include/mlir/Interfaces/ViewLikeInterface.h b/mlir/include/mlir/Interfaces/ViewLikeInterface.h
index db9c37fc3dc99..d175acb3dfd29 100644
--- a/mlir/include/mlir/Interfaces/ViewLikeInterface.h
+++ b/mlir/include/mlir/Interfaces/ViewLikeInterface.h
@@ -76,7 +76,8 @@ SliceBoundsVerificationResult verifyInBoundsSlice(
 /// returns the new result type of the op, based on the new offsets, sizes and
 /// strides. `CastOpFunc` is used to generate a cast op if the result type of
 /// the op has changed.
-template <typename OpType, typename ResultTypeFn, typename CastOpFunc>
+template <typename OpType, typename ResultTypeFn, typename CastOpFunc,
+          bool CheckInBounds = true>
 class OpWithOffsetSizesAndStridesConstantArgumentFolder final
     : public OpRewritePattern<OpType> {
 public:
@@ -94,12 +95,14 @@ class OpWithOffsetSizesAndStridesConstantArgumentFolder final
         failed(foldDynamicIndexList(mixedStrides)))
       return failure();
 
-    // Pattern does not apply if the produced op would not verify.
-    SliceBoundsVerificationResult sliceResult = verifyInBoundsSlice(
-        cast<ShapedType>(op.getSource().getType()).getShape(), mixedOffsets,
-        mixedSizes, mixedStrides);
-    if (!sliceResult.isValid)
-      return failure();
+    if (CheckInBounds) {
+      // Pattern does not apply if the produced op would not verify.
+      SliceBoundsVerificationResult sliceResult = verifyInBoundsSlice(
+          cast<ShapedType>(op.getSource().getType()).getShape(), mixedOffsets,
+          mixedSizes, mixedStrides);
+      if (!sliceResult.isValid)
+        return failure();
+    }
 
     // Compute the new result type.
     auto resultType =
diff --git a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
index 74b968c27a62a..6e5ca3678ec59 100644
--- a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
+++ b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
@@ -2977,14 +2977,6 @@ LogicalResult SubViewOp::verify() {
     return produceSubViewErrorMsg(SliceVerificationResult::LayoutMismatch,
                                   *this, expectedType);
 
-  // Verify that offsets, sizes, strides do not run out-of-bounds with respect
-  // to the base memref.
-  SliceBoundsVerificationResult boundsResult =
-      verifyInBoundsSlice(baseType.getShape(), staticOffsets, staticSizes,
-                          staticStrides, /*generateErrorMessage=*/true);
-  if (!boundsResult.isValid)
-    return getOperation()->emitError(boundsResult.errorMessage);
-
   return success();
 }
 
@@ -3253,10 +3245,10 @@ struct SubViewCanonicalizer {
 
 void SubViewOp::getCanonicalizationPatterns(RewritePatternSet &results,
                                             MLIRContext *context) {
-  results
-      .add<OpWithOffsetSizesAndStridesConstantArgumentFolder<
-               SubViewOp, SubViewReturnTypeCanonicalizer, SubViewCanonicalizer>,
-           SubViewOpMemRefCastFolder, TrivialSubViewOpFolder>(context);
+  results.add<OpWithOffsetSizesAndStridesConstantArgumentFolder<
+                  SubViewOp, SubViewReturnTypeCanonicalizer,
+                  SubViewCanonicalizer, /*CheckInBounds=*/false>,
+              SubViewOpMemRefCastFolder, TrivialSubViewOpFolder>(context);
 }
 
 OpFoldResult SubViewOp::fold(FoldAdaptor adaptor) {
diff --git a/mlir/test/Dialect/MemRef/invalid.mlir b/mlir/test/Dialect/MemRef/invalid.mlir
index b4476036d6513..e7a48979d11d0 100644
--- a/mlir/test/Dialect/MemRef/invalid.mlir
+++ b/mlir/test/Dialect/MemRef/invalid.mlir
@@ -745,22 +745,6 @@ func.func @invalid_subview(%arg0 : index, %arg1 : index, %arg2 : index) {
 
 // -----
 
-func.func @invalid_subview(%arg0: memref<10xf32>) {
-  // expected-error@+1 {{offset 0 is out-of-bounds: 10 >= 10}}
-  %0 = memref.subview %arg0 [10][1][1] : memref<10xf32> to memref<1xf32, strided<[1], offset: 10>>
-  return
-}
-
-// -----
-
-func.func @invalid_subview(%arg0: memref<9xf32>) {
-  // expected-error@+1 {{slice along dimension 0 runs out-of-bounds: 9 >= 9}}
-  %0 = memref.subview %arg0 [3][4][2] : memref<9xf32> to memref<4xf32, strided<[2], offset: 3>>
-  return
-}
-
-// -----
-
 func.func @invalid_rank_reducing_subview(%arg0 : index, %arg1 : index, %arg2 : index) {
   %0 = memref.alloc() : memref<8x16x4xf32>
   // expected-error@+1 {{expected result type to be 'memref<8x16x4xf32, strided<[64, 4, 1]>>' or a rank-reduced version. (mismatch of result sizes)}}

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request reverts the out-of-bounds verification for memref.subview operations that was introduced in a previous change. The change addresses two critical issues: 1) the out-of-bounds verification made the operation impure by introducing undefined behavior, contradicting its Pure trait, and 2) it created inconsistencies where valid IR could become invalid after canonicalization.

Key changes:

  • Remove out-of-bounds verification from SubViewOp::verify()
  • Update canonicalization patterns to skip bounds checking
  • Modify operation documentation to clarify that bounds checking is the user's responsibility

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
mlir/test/Dialect/MemRef/invalid.mlir Removes test cases that expected out-of-bounds errors
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp Removes bounds verification logic and updates canonicalization
mlir/include/mlir/Interfaces/ViewLikeInterface.h Adds template parameter to make bounds checking optional
mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td Updates documentation to remove bounds guarantee

mixedSizes, mixedStrides);
if (!sliceResult.isValid)
return failure();
if (CheckInBounds) {
Copy link
Preview

Copilot AI Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conditional bounds checking introduces complexity where the template parameter controls verification behavior. Consider documenting why bounds checking is optional and when each mode should be used to improve code maintainability.

Copilot uses AI. Check for mistakes.

Comment on lines +2050 to +2052
The operation does not guarantee if the created subview is in-bounds. It is
the responsibility of the user to guarantee there are no out-of-bounds
accesses into the subview.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we use poison?

Suggested change
The operation does not guarantee if the created subview is in-bounds. It is
the responsibility of the user to guarantee there are no out-of-bounds
accesses into the subview.
It is legal to use this operation to create a subview that is out-of-bounds, however
the returned value is poison and access to the subview will trigger undefined-behavior.

Copy link
Member

@matthias-springer matthias-springer Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Groverkss Can you describe the use case that you have in mind? If the result is poisoned (which would make sense to me), loading from it will also be poison undefined behavior (even if the out-of-bounds element will be masked out). I suspect you want to use memref.subview just as an op that does index computations

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The returned value is partially poisoned. The entire memref is not poisoned. It is not undefined behavior to access part of the subview that is in bounds.

Copy link
Member

@matthias-springer matthias-springer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The op has ir that verifies, but will not verify after canonicalization: [...]
This ir will verify, but after canonicalization it will not verify. This makes the operation definition inconsistent.

This should not be the case. This case is accounted for in OpWithOffsetSizesAndStridesConstantArgumentFolder:

    // Pattern does not apply if the produced op would not verify.
    SliceBoundsVerificationResult sliceResult = verifyInBoundsSlice(
        cast<ShapedType>(op.getSource().getType()).getShape(), mixedOffsets,
        mixedSizes, mixedStrides);

This check should be relaxed for memref.subview, so that the canonicalizer can kick in.
edit: I saw that you already did that.

@@ -745,22 +745,6 @@ func.func @invalid_subview(%arg0 : index, %arg1 : index, %arg2 : index) {

// -----

func.func @invalid_subview(%arg0: memref<10xf32>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move these to ops.mlir?

out-of-bounds subviews. Subviews that cannot be confirmed to be in-bounds
or out-of-bounds based on compile-time information are valid. However,
performing an out-of-bounds subview at runtime is undefined behavior.
The operation does not guarantee if the created subview is in-bounds. It is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clarify what exactly out-of-bounds means? Something along the lines of:

Subviews are allowed to run out-of-bounds. Both the offset and "offset + size" are allowed to be larger than the respective source dimension size. They cannot be negative.

Copy link
Member

@matthias-springer matthias-springer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also delete this (MemRef/Transforms/RuntimeOpVerification.cpp):

struct SubViewOpInterface
    : public RuntimeVerifiableOpInterface::ExternalModel<SubViewOpInterface,
                                                         SubViewOp> {
  void generateRuntimeVerification(Operation *op, OpBuilder &builder,
                                   Location loc) const {

This file should stay in sync with the op semantics.

@rengolin
Copy link
Member

rengolin commented Aug 6, 2025

The previous PR adds this to the doc:

Subviews that cannot be confirmed to be in-bounds or out-of-bounds based on compile-time information are valid.

So the point below is confusing:

The op has ir that verifies, but will not verify after canonicalization

Do you mean constant propagation, instead?

Regardless, I believe this is in line with the intention of the comment above: "confirmed at compile-time".

My question is what do we mean by "compile-time"? Is it at any point in the pipeline (and therefore the behaviour recently added is "ok")? Or at egress (and then terms like canonicalization make more sense)?

The former creates the problem @Groverkss refers to. The latter needs a point in time to do that kind of verification, and we don't have a trigger that does that on exit (whatever that means).

Operation verification doesn't intentionally walk the producers, but if the values are constant attributes, it can verify bounds.

What would be the case we'd want to have an out-of-bounds all constant indexing into a static shape to be valid?

@matthias-springer
Copy link
Member

matthias-springer commented Aug 6, 2025

My question is what do we mean by "compile-time"? Is it at any point in the pipeline (and therefore the behaviour recently added is "ok")? Or at egress (and then terms like canonicalization make more sense)?

I think I wrote that sentence. By "compile-time information" I meant "looking at the attributes + type of the memref.subview operation".

What would be the case we'd want to have an out-of-bounds all constant indexing into a static shape to be valid?

I have asked myself the same question.

In the current design, whether an op is considered valid is different depending on whether it can be detected based on (a) looking at attributes and types or (b) looking at attributes, types, values and other operations. That situation being a bit odd is one argument. I didn't buy it at first because I was used to it (we have many MLIR ops that work like that), but I'm starting to warm up to the idea... Canonicalizers + folders will require fewer special cases.

Instead, we could have an additional folder that folds such memref.subview (out-of-bounds and all constant indexing) to ub.poison. (The dynamic -> static folder/pattern would not have to care about out-of-bounds.)

@Groverkss
Copy link
Member Author

My question is what do we mean by "compile-time"? Is it at any point in the pipeline (and therefore the behaviour recently added is "ok")? Or at egress (and then terms like canonicalization make more sense)?

I think I wrote that sentence. By "compile-time information" I meant "looking at the attributes + type of the memref.subview operation".

What would be the case we'd want to have an out-of-bounds all constant indexing into a static shape to be valid?

I have asked myself the same question.

In the current design, whether an op is considered valid is different depending on whether it can be detected based on (a) looking at attributes and types or (b) looking at attributes, types, values and other operations. That situation being a bit odd is one argument. I didn't buy it at first because I was used to it (we have many MLIR ops that work like that), but I'm starting to warm up to the idea... Canonicalizers + folders will require fewer special cases.

Instead, we could have an additional folder that folds such memref.subview (out-of-bounds and all constant indexing) to ub.poison. (The dynamic -> static folder/pattern would not have to care about out-of-bounds.)

I don't know if it's correct to fold it to poison. It is only correct to fold it to poison if the entire subview is out of bounds. I don't think there is a notion of a slice being poison. For example:

%subview = memref.subview %sub[3][2][1] : memref<4xf32> -> memref<2xf32>
memref.load %subview[0] : f32

this is valid, we never access any out of bounds memory and there is no undefined behavior here.

%subview = memref.subview %sub[3][2][1] : memref<4xf32> -> memref<2xf32>
memref.load %subview[1] : f32

this is invalid, and there is undefined behavior here.

We can fold a subview to poison, if the entire subview is out-of-bounds.

@matthias-springer
Copy link
Member

matthias-springer commented Aug 6, 2025

this is valid, we never access any out of bounds memory and there is no undefined behavior here.

That's why I asked above what your use case is.

The reason why I'm feeling uneasy about memrefs running out-of-bounds is that we have operations such as vector.transfer_read that take a look at the size of a memref to determine if masking is needed. That will no longer work if memref.subview is allowed to produce a memref that runs out-of-bounds. What does it even mean for a memref to have a certain size if part of the memref points to invalid/unallocated memory? (In other words: why does the memref have a size?)

I would start with a smaller PR that just turns the immediate UB into poison. (Including partly out-of-bounds subviews.) I think that step is less controversial. Then discuss turning partly out-of-bounds subviews into non-poison separately. I think this needs more discussion.

@Groverkss
Copy link
Member Author

The reason why I'm feeling uneasy about memrefs running out-of-bounds is that we have operations such as vector.transfer_read that take a look at the size of a memref to determine if masking is needed.

vector.transfer_read masking does not use the source memref size to determine if masking is needed. A mask needs to be provided by the user. I think you are talking about the in_bounds attribute, which I agree would work weirdly with memrefs that can be out-of-bounds.

@matthias-springer
Copy link
Member

matthias-springer commented Aug 6, 2025

Yes, I was referring to the in_bounds attribute.

What does it even mean for a memref to have a certain size if part of the memref points to invalid/unallocated memory? (In other words: why does the memref have a size?)

This is the more fundamental question that I'd like to answer.

@rengolin
Copy link
Member

rengolin commented Aug 6, 2025

vector.transfer_read masking does not use the source memref size to determine if masking is needed. A mask needs to be provided by the user. I think you are talking about the in_bounds attribute, which I agree would work weirdly with memrefs that can be out-of-bounds.

Don't forget about scalable vector extensions, which get the mask from the first-fault. @banach-space

What does it even mean for a memref to have a certain size if part of the memref points to invalid/unallocated memory? (In other words: why does the memref have a size?)

This is the more fundamental question that I'd like to answer.

The value of higher-level abstraction is that you can trust it when lowering/vectorizing because you know the information is complete. If we add a clause where UB is only triggered if you actually access the out-of-bounds memory (ie. needs to look at a DAG of ops), then we're back into LLVM IR level, and the lower part of the stack gets confused.

The type of validation that would be required to assess UB here would be the second case @matthias-springer mentions:

(b) looking at attributes, types, values and other operations.

The basic validation (a) only looks at the op itself. (b) needs to be a separate pattern (that either fails or converts the value to ub.poison. Which is what I think Matthias proposed.

Alternatively, we can reinforce the semantics of attributes like in-bounds and partial-out-of-bounds and totally-out-of-bounds and convert them to poison or not, and help the vector.transfer_read check static bounds to a potential out-of-bounds memref slice without needing a full code scan pass.

@rengolin rengolin closed this Aug 6, 2025
@rengolin rengolin reopened this Aug 6, 2025
@rengolin
Copy link
Member

rengolin commented Aug 6, 2025

(sorry, my hand closed the PR without the brain agreeing to it) 😆

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants