Description
Last month (2025-05-18), Kleiman updated AssumeAlignmentOp so that it has an AnyMemRef result.
This makes the following change:
%2 = hal.interface.binding.subspan layout ... : memref<4096x4096xf16, #hal.descriptor_type<storage_buffer>>
memref.assume_alignment %2, 64 : memref<4096x4096xf16, #hal.descriptor_type<storage_buffer>>
use %2
changed to:
%2 = hal.interface.binding.subspan layout ... : memref<4096x4096xf16, #hal.descriptor_type<storage_buffer>>
%assume_align = memref.assume_alignment %2, 64 : memref<4096x4096xf16, #hal.descriptor_type<storage_buffer>>
use %assume_align
Problem:
This affects the linalg hoisting optimization, because memref.assume_alignment now implements the ViewLikeOpInterface, and ops implementing that interface are excluded by linalg hoisting.
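As far as I can tell, the exclusion means that a transfer whose memref operand is produced by any op implementing ViewLikeOpInterface is not considered a hoisting candidate. A rough C++ sketch of the kind of check I mean (my paraphrase, not the verbatim upstream code):

#include "mlir/IR/Value.h"
#include "mlir/Interfaces/ViewLikeInterface.h"

// Paraphrase of the exclusion: a transfer_read/transfer_write pair is only
// a hoisting candidate when its memref operand is NOT defined by a
// view-like op. Since memref.assume_alignment now implements
// ViewLikeOpInterface, its result fails this check.
static bool isHoistingCandidate(mlir::Value memrefOperand) {
  return !memrefOperand.getDefiningOp<mlir::ViewLikeOpInterface>();
}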
For example, in the following MLIR, the
"%1 = vector.transfer_read %assume_align_0[%c0, %c0] ..." and
"vector.transfer_write %3, %assume_align_0[%c0, %c0]"
read from and write to the same location, so we can hoist them out of the loop:
%m0 = hal.interface.binding.subspan layout ...: memref<4096x4096xf16>
%m1 = hal.interface.binding.subspan layout ...: memref<4096x4096xf16>
%assume_align_0 = memref.assume_alignment %m0, 64 : memref<4096x4096xf16>
%assume_align_1 = memref.assume_alignment %m1, 64 : memref<4096x4096xf16>
scf.for %arg0 = %c256 to %c4096 step %c256 {
  %1 = vector.transfer_read %assume_align_0[%c0, %c0], %cst_0 {in_bounds = [true, true]} : memref<4096x4096xf16>, vector<16x16xf16>
  %2 = vector.transfer_read %assume_align_1[%arg0, %arg0], %cst_0 {in_bounds = [true, true]} : memref<4096x4096xf16>, vector<16x16xf16>
  %3 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d2, d1)>, affine_map<(d0, d1, d2) -> (d0, d1)>], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %2, %2, %1 : vector<16x16xf16>, vector<16x16xf16> into vector<16x16xf16>
  vector.transfer_write %3, %assume_align_0[%c0, %c0] {in_bounds = [true, true]} : vector<16x16xf16>, memref<4096x4096xf16>
}
But because the transfer_read/transfer_write read from and write to the result of an assume_alignment operation, linalg hoisting skips this optimization.
(I don't fully understand why linalg hoisting does this; I am a beginner in MLIR.)
But assume_alignment only marks the memref's alignment, so linalg hoisting should check its memref operand rather than the op itself, as sketched below.
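A minimal sketch of what I have in mind (a hypothetical helper, not existing upstream code): look through assume_alignment to the underlying memref before applying the view-like exclusion, since the op does not create a new view:

#include "mlir/Dialect/MemRef/IR/MemRef.h"
#include "mlir/IR/Value.h"

// Hypothetical helper: memref.assume_alignment only annotates alignment,
// so for aliasing purposes its result can be treated as the incoming memref.
static mlir::Value stripAssumeAlignment(mlir::Value v) {
  while (auto assume = v.getDefiningOp<mlir::memref::AssumeAlignmentOp>())
    v = assume.getMemref();
  return v;
}

The exclusion would then be applied to the stripped value rather than to the assume_alignment result itself.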
With such a change, we expect the MLIR above to be optimized to:
%m0 = hal.interface.binding.subspan layout ...: memref<4096x4096xf16>
%m1 = hal.interface.binding.subspan layout ...: memref<4096x4096xf16>
%assume_align_0 = memref.assume_alignment %m0, 64 : memref<4096x4096xf16>
%assume_align_1 = memref.assume_alignment %m1, 64 : memref<4096x4096xf16>
%0 = vector.transfer_read %assume_align_0[%c0, %c0], %cst_0 {in_bounds = [true, true]} : memref<4096x4096xf16>, vector<16x16xf16> // hoisted out of the loop
%1 = scf.for %arg0 = %c256 to %c4096 step %c256 iter_args(%arg1 = %0) -> (vector<16x16xf16>) {
  %2 = vector.transfer_read %assume_align_1[%arg0, %arg0], %cst_0 {in_bounds = [true, true]} : memref<4096x4096xf16>, vector<16x16xf16>
  %3 = vector.contract {indexing_maps = [#map, #map1, #map2], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %2, %2, %arg1 : vector<16x16xf16>, vector<16x16xf16> into vector<16x16xf16>
  scf.yield %3 : vector<16x16xf16>
}
vector.transfer_write %1, %assume_align_0[%c0, %c0] {in_bounds = [true, true]} : vector<16x16xf16>, memref<4096x4096xf16> // hoisted out of the loop
For a detailed example, please refer to the example.
(I don't know how to write hal.interface.binding.subspan for mlir-opt, so in the example I use memref.alloc() instead.)
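For reference, this is roughly how I run the hoisting in isolation with mlir-opt via the transform dialect (treat it as a sketch; I am not sure it is the canonical way to invoke the pass):

// Run with: mlir-opt --transform-interpreter example.mlir
module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%root: !transform.any_op {transform.readonly}) {
    %f = transform.structured.match ops{["func.func"]} in %root : (!transform.any_op) -> !transform.any_op
    %hoisted = transform.structured.hoist_redundant_vector_transfers %f : (!transform.any_op) -> !transform.any_op
    transform.yield
  }
}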