[flang][acc] Ensure all acc.loop get a default parallelism determination mode #143623
This PR updates the flang lowering to explicitly implement the OpenACC rules:
- As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop construct with no auto or seq clause is treated as if it has the independent clause when it is an orphaned loop construct or its parent compute construct is a parallel construct.
- As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent compute construct is a kernels construct, a loop construct with no independent or seq clause is treated as if it has the auto clause.
- Loops in serial regions are `seq` if they have no other parallelism marking such as gang, worker, vector.

For now the `acc.loop` verifier has not yet been updated to enforce this.
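To make the three rules easier to review, here is a minimal standalone sketch of the decision they describe. It does not use MLIR: `Parent`, `ParMode`, `LoopClauses`, and `defaultParMode` are hypothetical names introduced only for illustration, and the boolean fields stand in for the device-type attribute lists and gang/worker/vector queries that the actual lowering inspects.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for the enclosing compute construct; these are
// NOT the real mlir::acc operation types.
enum class Parent { Orphaned, Parallel, Kernels, Serial };
enum class ParMode { Seq, Independent, Auto };

// Assumed summary of what is already present on the loop for the default
// (device_type none) case.
struct LoopClauses {
  bool hasDefaultSeq = false;
  bool hasDefaultIndependent = false;
  bool hasDefaultAuto = false;
  bool hasGangWorkerOrVector = false;
};

// Returns the mode to add to the loop, or std::nullopt when nothing is added.
std::optional<ParMode> defaultParMode(Parent parent, const LoopClauses &loop) {
  // An explicit default mode always wins.
  if (loop.hasDefaultSeq || loop.hasDefaultIndependent || loop.hasDefaultAuto)
    return std::nullopt;

  // OpenACC 3.3 section 2.9.6: orphaned loops and loops in a parallel
  // construct are treated as independent.
  if (parent == Parent::Orphaned || parent == Parent::Parallel)
    return ParMode::Independent;

  // Serial construct: seq, unless gang/worker/vector is already set,
  // in which case no mode is added.
  if (parent == Parent::Serial) {
    if (loop.hasGangWorkerOrVector)
      return std::nullopt;
    return ParMode::Seq;
  }

  // OpenACC 3.3 section 2.9.7: loops in a kernels construct are auto.
  assert(parent == Parent::Kernels && "Expected kernels construct");
  return ParMode::Auto;
}
```

For example, under these assumptions `defaultParMode(Parent::Kernels, LoopClauses{})` yields `ParMode::Auto`, while a serial-region loop with `hasGangWorkerOrVector` set yields no mode at all.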
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-mlir-openacc
Author: Razvan Lupusoru (razvanlupusoru)
Changes
This PR updates the flang lowering to explicitly implement the OpenACC rules (listed in the PR description above). For now the `acc.loop` verifier has not yet been updated to enforce this.
Patch is 29.61 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143623.diff
7 Files Affected:
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 02dba22c29c7f..11e99d100f31c 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -2150,6 +2150,60 @@ privatizeIv(Fortran::lower::AbstractConverter &converter,
   ivPrivate.push_back(privateValue);
 }
 
+static void determineDefaultLoopParMode(
+    Fortran::lower::AbstractConverter &converter, mlir::acc::LoopOp &loopOp,
+    llvm::SmallVector<mlir::Attribute> &seqDeviceTypes,
+    llvm::SmallVector<mlir::Attribute> &independentDeviceTypes,
+    llvm::SmallVector<mlir::Attribute> &autoDeviceTypes) {
+  auto hasDeviceNone = [](mlir::Attribute attr) -> bool {
+    return mlir::dyn_cast<mlir::acc::DeviceTypeAttr>(attr).getValue() ==
+           mlir::acc::DeviceType::None;
+  };
+  bool hasDefaultSeq = llvm::any_of(seqDeviceTypes, hasDeviceNone);
+  bool hasDefaultIndependent =
+      llvm::any_of(independentDeviceTypes, hasDeviceNone);
+  bool hasDefaultAuto = llvm::any_of(autoDeviceTypes, hasDeviceNone);
+  if (hasDefaultSeq || hasDefaultIndependent || hasDefaultAuto)
+    return; // Default loop par mode is already specified.
+
+  mlir::Region *currentRegion =
+      converter.getFirOpBuilder().getBlock()->getParent();
+  mlir::Operation *parentOp = mlir::acc::getEnclosingComputeOp(*currentRegion);
+  const bool isOrphanedLoop = !parentOp;
+  if (isOrphanedLoop ||
+      mlir::isa_and_present<mlir::acc::ParallelOp>(parentOp)) {
+    // As per OpenACC 3.3 standard section 2.9.6 independent clause:
+    // A loop construct with no auto or seq clause is treated as if it has the
+    // independent clause when it is an orphaned loop construct or its parent
+    // compute construct is a parallel construct.
+    independentDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get(
+        converter.getFirOpBuilder().getContext(), mlir::acc::DeviceType::None));
+  } else if (mlir::isa_and_present<mlir::acc::SerialOp>(parentOp)) {
+    // Serial construct implies `seq` clause on loop. However, this
+    // conflicts with parallelism assignment if already set. Therefore check
+    // that first.
+    bool hasDefaultGangWorkerOrVector =
+        loopOp.hasVector() || loopOp.getVectorValue() || loopOp.hasWorker() ||
+        loopOp.getWorkerValue() || loopOp.hasGang() ||
+        loopOp.getGangValue(mlir::acc::GangArgType::Num) ||
+        loopOp.getGangValue(mlir::acc::GangArgType::Dim) ||
+        loopOp.getGangValue(mlir::acc::GangArgType::Static);
+    if (!hasDefaultGangWorkerOrVector)
+      seqDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get(
+          converter.getFirOpBuilder().getContext(),
+          mlir::acc::DeviceType::None));
+  } else {
+    // As per OpenACC 3.3 standard section 2.9.7 auto clause:
+    // When the parent compute construct is a kernels construct, a loop
+    // construct with no independent or seq clause is treated as if it has the
+    // auto clause.
+    assert(mlir::isa_and_present<mlir::acc::KernelsOp>(parentOp) &&
+           "Expected kernels construct");
+    autoDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get(
+        converter.getFirOpBuilder().getContext(), mlir::acc::DeviceType::None));
+  }
+}
+
 static mlir::acc::LoopOp createLoopOp(
     Fortran::lower::AbstractConverter &converter,
     mlir::Location currentLocation,
@@ -2482,6 +2536,9 @@ static mlir::acc::LoopOp createLoopOp(
   loopOp.setTileOperandsSegmentsAttr(
       builder.getDenseI32ArrayAttr(tileOperandsSegments));
 
+  // Determine the loop's default par mode - either seq, independent, or auto.
+  determineDefaultLoopParMode(converter, loopOp, seqDeviceTypes,
+                              independentDeviceTypes, autoDeviceTypes);
   if (!seqDeviceTypes.empty())
     loopOp.setSeqAttr(builder.getArrayAttr(seqDeviceTypes));
   if (!independentDeviceTypes.empty())
diff --git a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 b/flang/test/Lower/OpenACC/acc-kernels-loop.f90
index 8608b0ad98ce6..4e968144399a8 100644
--- a/flang/test/Lower/OpenACC/acc-kernels-loop.f90
+++ b/flang/test/Lower/OpenACC/acc-kernels-loop.f90
@@ -47,7 +47,7 @@ subroutine acc_kernels_loop
! CHECK: acc.kernels {
! CHECK: acc.loop private{{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>]{{.*}}}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -59,7 +59,7 @@ subroutine acc_kernels_loop
! CHECK: acc.kernels combined(loop) {
! CHECK: acc.loop combined(kernels) private{{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>]{{.*}}}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -490,7 +490,7 @@ subroutine acc_kernels_loop
! CHECK: acc.kernels {{.*}} {
! CHECK: acc.loop {{.*}} gang {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -503,7 +503,7 @@ subroutine acc_kernels_loop
! CHECK: [[GANGNUM1:%.*]] = arith.constant 8 : i32
! CHECK: acc.loop {{.*}} gang({num=[[GANGNUM1]] : i32}) {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -516,7 +516,7 @@ subroutine acc_kernels_loop
! CHECK: [[GANGNUM2:%.*]] = fir.load %{{.*}} : !fir.ref<i32>
! CHECK: acc.loop {{.*}} gang({num=[[GANGNUM2]] : i32}) {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -528,7 +528,7 @@ subroutine acc_kernels_loop
! CHECK: acc.kernels {{.*}} {
! CHECK: acc.loop {{.*}} gang({num=%{{.*}} : i32, static=%{{.*}} : i32})
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -540,7 +540,7 @@ subroutine acc_kernels_loop
! CHECK: acc.kernels {{.*}} {
! CHECK: acc.loop {{.*}} vector {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -553,7 +553,7 @@ subroutine acc_kernels_loop
! CHECK: [[CONSTANT128:%.*]] = arith.constant 128 : i32
! CHECK: acc.loop {{.*}} vector([[CONSTANT128]] : i32) {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -566,7 +566,7 @@ subroutine acc_kernels_loop
! CHECK: [[VECTORLENGTH:%.*]] = fir.load %{{.*}} : !fir.ref<i32>
! CHECK: acc.loop {{.*}} vector([[VECTORLENGTH]] : i32) {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -578,7 +578,7 @@ subroutine acc_kernels_loop
! CHECK: acc.kernels {{.*}} {
! CHECK: acc.loop {{.*}} worker {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -591,7 +591,7 @@ subroutine acc_kernels_loop
! CHECK: [[WORKER128:%.*]] = arith.constant 128 : i32
! CHECK: acc.loop {{.*}} worker([[WORKER128]] : i32) {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -605,7 +605,7 @@ subroutine acc_kernels_loop
! CHECK: acc.kernels {{.*}} {
! CHECK: acc.loop {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {collapse = [2], collapseDeviceType = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true, true>}
+! CHECK-NEXT: } attributes {{{.*}}collapse = [2], collapseDeviceType = [#acc.device_type<none>]{{.*}}}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
@@ -621,9 +621,9 @@ subroutine acc_kernels_loop
! CHECK: acc.loop {{.*}} {
! CHECK: acc.loop {{.*}} {
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>]{{.*}}}
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {auto_ = [#acc.device_type<none>]{{.*}}}
! CHECK: acc.terminator
! CHECK-NEXT: }{{$}}
diff --git a/flang/test/Lower/OpenACC/acc-loop.f90 b/flang/test/Lower/OpenACC/acc-loop.f90
index 0246f60705898..eca7fb30da8fa 100644
--- a/flang/test/Lower/OpenACC/acc-loop.f90
+++ b/flang/test/Lower/OpenACC/acc-loop.f90
@@ -29,7 +29,7 @@ program acc_loop
! CHECK: acc.loop private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}{{$}}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}{{$}}
!$acc loop seq
DO i = 1, n
@@ -65,7 +65,7 @@ program acc_loop
! CHECK: acc.loop gang private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop gang(num: 8)
DO i = 1, n
@@ -75,7 +75,7 @@ program acc_loop
! CHECK: [[GANGNUM1:%.*]] = arith.constant 8 : i32
! CHECK: acc.loop gang({num=[[GANGNUM1]] : i32}) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop gang(num: gangNum)
DO i = 1, n
@@ -85,7 +85,7 @@ program acc_loop
! CHECK: [[GANGNUM2:%.*]] = fir.load %{{.*}} : !fir.ref<i32>
! CHECK: acc.loop gang({num=[[GANGNUM2]] : i32}) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop gang(num: gangNum, static: gangStatic)
DO i = 1, n
@@ -94,7 +94,7 @@ program acc_loop
! CHECK: acc.loop gang({num=%{{.*}} : i32, static=%{{.*}} : i32}) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop vector
DO i = 1, n
@@ -103,7 +103,7 @@ program acc_loop
! CHECK: acc.loop vector private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop vector(128)
DO i = 1, n
@@ -113,7 +113,7 @@ program acc_loop
! CHECK: [[CONSTANT128:%.*]] = arith.constant 128 : i32
! CHECK: acc.loop vector([[CONSTANT128]] : i32) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop vector(vectorLength)
DO i = 1, n
@@ -123,7 +123,7 @@ program acc_loop
! CHECK: [[VECTORLENGTH:%.*]] = fir.load %{{.*}} : !fir.ref<i32>
! CHECK: acc.loop vector([[VECTORLENGTH]] : i32) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop worker
DO i = 1, n
@@ -132,7 +132,7 @@ program acc_loop
! CHECK: acc.loop worker private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop worker(128)
DO i = 1, n
@@ -142,7 +142,7 @@ program acc_loop
! CHECK: [[WORKER128:%.*]] = arith.constant 128 : i32
! CHECK: acc.loop worker([[WORKER128]] : i32) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop private(c)
DO i = 1, n
@@ -151,7 +151,7 @@ program acc_loop
! CHECK: acc.loop private(@privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
! When the induction variable is explicitly private - only a single private entry should be created.
!$acc loop private(i)
@@ -161,7 +161,7 @@ program acc_loop
! CHECK: acc.loop private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
!$acc loop private(c, d)
DO i = 1, n
@@ -170,7 +170,7 @@ program acc_loop
! CHECK: acc.loop private(@privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
!$acc loop private(c) private(d)
DO i = 1, n
@@ -179,7 +179,7 @@ program acc_loop
! CHECK: acc.loop private(@privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
!$acc loop tile(2)
DO i = 1, n
@@ -189,7 +189,7 @@ program acc_loop
! CHECK: [[TILESIZE:%.*]] = arith.constant 2 : i32
! CHECK: acc.loop {{.*}} tile({[[TILESIZE]] : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
!$acc loop tile(*)
DO i = 1, n
@@ -198,7 +198,7 @@ program acc_loop
! CHECK: [[TILESIZEM1:%.*]] = arith.constant -1 : i32
! CHECK: acc.loop {{.*}} tile({[[TILESIZEM1]] : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
!$acc loop tile(2, 2)
DO i = 1, n
@@ -211,7 +211,7 @@ program acc_loop
! CHECK: [[TILESIZE2:%.*]] = arith.constant 2 : i32
! CHECK: acc.loop {{.*}} tile({[[TILESIZE1]] : i32, [[TILESIZE2]] : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
!$acc loop tile(tileSize)
DO i = 1, n
@@ -220,7 +220,7 @@ program acc_loop
! CHECK: acc.loop {{.*}} tile({%{{.*}} : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
!$acc loop tile(tileSize, tileSize)
DO i = 1, n
@@ -231,7 +231,7 @@ program acc_loop
! CHECK: acc.loop {{.*}} tile({%{{.*}} : i32, %{{.*}} : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
!$acc loop collapse(2)
DO i = 1, n
@@ -244,7 +244,7 @@ program acc_loop
! CHECK: fir.store %arg0 to %{{.*}} : !fir.ref<i32>
! CHECK: fir.store %arg1 to %{{.*}} : !fir.ref<i32>
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {collapse = [2], collapseDeviceType = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true, true>}
+! CHECK-NEXT: } attributes {collapse = [2], collapseDeviceType = [#acc.device_type<none>]{{.*}}}
!$acc loop
DO i = 1, n
@@ -257,9 +257,9 @@ program acc_loop
! CHECK: acc.loop {{.*}} control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.loop {{.*}} control(%arg1 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop reduction(+:reduction_r) reduction(*:reduction_i)
do i = 1, n
@@ -269,7 +269,7 @@ program acc_loop
! CHECK: acc.loop private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) reduction(@reduction_add_ref_f32 -> %{{.*}} : !fir.ref<f32>, @reduction_mul_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop gang(dim: gangDim, static: gangStatic)
DO i = 1, n
@@ -278,7 +278,7 @@ program acc_loop
! CHECK: acc.loop gang({dim=%{{.*}}, static=%{{.*}} : i32}) {{.*}} control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
! CHECK: acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
!$acc loop gang(dim: 1)
DO i = 1, n
@@ -287,7 +287,7 @@ program acc_loop
! CHECK: acc.loop gang({...
[truncated]
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
Why is this gone?
I didn't feel like those attributes were worth checking since they were not relevant for that part of the test. That said, I just restored them and made sure they work with my new changes.
Oh ok. Yeah the diff in GitHub made it look like a removal.
LGTM
…ion mode (llvm#143623)

This PR updates the flang lowering to explicitly implement the OpenACC rules:
- As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop construct with no auto or seq clause is treated as if it has the independent clause when it is an orphaned loop construct or its parent compute construct is a parallel construct.
- As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent compute construct is a kernels construct, a loop construct with no independent or seq clause is treated as if it has the auto clause.
- Loops in serial regions are `seq` if they have no other parallelism marking such as gang, worker, vector.

For now the `acc.loop` verifier has not yet been updated to enforce this.