Skip to content

[flang][acc] Ensure all acc.loop get a default parallelism determination mode #143623

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 11, 2025

Conversation

razvanlupusoru
Copy link
Contributor

This PR updates the flang lowering to explicitly implement the OpenACC rules:

  • As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop construct with no auto or seq clause is treated as if it has the independent clause when it is an orphaned loop construct or its parent compute construct is a parallel construct.
  • As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent compute construct is a kernels construct, a loop construct with no independent or seq clause is treated as if it has the auto clause.
  • Loops in serial regions are seq if they have no other parallelism marking such as gang, worker, vector.

For now the acc.loop verifier has not yet been updated to enforce this.

This PR updates the flang lowering to explicitly implement the OpenACC
rules:
 - As per OpenACC 3.3 standard section 2.9.6 independent clause:
   A loop construct with no auto or seq clause is treated as if it has
   the independent clause when it is an orphaned loop construct or its
   parent compute construct is a parallel construct.
 - As per OpenACC 3.3 standard section 2.9.7 auto clause:
   When the parent compute construct is a kernels construct, a loop
   construct with no independent or seq clause is treated as if it has
   the auto clause.
 - Loops in serial regions are `seq` if they have no other parallelism
   marking such as gang, worker, vector.

For now the `acc.loop` verifier has not yet been updated to enforce
this.
@llvmbot llvmbot added mlir flang Flang issues not falling into any other category mlir:openacc flang:fir-hlfir openacc labels Jun 10, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 10, 2025

@llvm/pr-subscribers-flang-fir-hlfir
@llvm/pr-subscribers-openacc
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-openacc

Author: Razvan Lupusoru (razvanlupusoru)

Changes

This PR updates the flang lowering to explicitly implement the OpenACC rules:

  • As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop construct with no auto or seq clause is treated as if it has the independent clause when it is an orphaned loop construct or its parent compute construct is a parallel construct.
  • As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent compute construct is a kernels construct, a loop construct with no independent or seq clause is treated as if it has the auto clause.
  • Loops in serial regions are seq if they have no other parallelism marking such as gang, worker, vector.

For now the acc.loop verifier has not yet been updated to enforce this.


Patch is 29.61 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143623.diff

7 Files Affected:

  • (modified) flang/lib/Lower/OpenACC.cpp (+57)
  • (modified) flang/test/Lower/OpenACC/acc-kernels-loop.f90 (+14-14)
  • (modified) flang/test/Lower/OpenACC/acc-loop.f90 (+26-26)
  • (modified) flang/test/Lower/OpenACC/acc-parallel-loop.f90 (+14-14)
  • (modified) flang/test/Lower/OpenACC/acc-serial-loop.f90 (+5-5)
  • (modified) mlir/include/mlir/Dialect/OpenACC/OpenACC.h (+6)
  • (modified) mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp (+11)
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 02dba22c29c7f..11e99d100f31c 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -2150,6 +2150,60 @@ privatizeIv(Fortran::lower::AbstractConverter &converter,
   ivPrivate.push_back(privateValue);
 }
 
+static void determineDefaultLoopParMode(
+    Fortran::lower::AbstractConverter &converter, mlir::acc::LoopOp &loopOp,
+    llvm::SmallVector<mlir::Attribute> &seqDeviceTypes,
+    llvm::SmallVector<mlir::Attribute> &independentDeviceTypes,
+    llvm::SmallVector<mlir::Attribute> &autoDeviceTypes) {
+  auto hasDeviceNone = [](mlir::Attribute attr) -> bool {
+    return mlir::dyn_cast<mlir::acc::DeviceTypeAttr>(attr).getValue() ==
+           mlir::acc::DeviceType::None;
+  };
+  bool hasDefaultSeq = llvm::any_of(seqDeviceTypes, hasDeviceNone);
+  bool hasDefaultIndependent =
+      llvm::any_of(independentDeviceTypes, hasDeviceNone);
+  bool hasDefaultAuto = llvm::any_of(autoDeviceTypes, hasDeviceNone);
+  if (hasDefaultSeq || hasDefaultIndependent || hasDefaultAuto)
+    return; // Default loop par mode is already specified.
+
+  mlir::Region *currentRegion =
+      converter.getFirOpBuilder().getBlock()->getParent();
+  mlir::Operation *parentOp = mlir::acc::getEnclosingComputeOp(*currentRegion);
+  const bool isOrphanedLoop = !parentOp;
+  if (isOrphanedLoop ||
+      mlir::isa_and_present<mlir::acc::ParallelOp>(parentOp)) {
+    // As per OpenACC 3.3 standard section 2.9.6 independent clause:
+    // A loop construct with no auto or seq clause is treated as if it has the
+    // independent clause when it is an orphaned loop construct or its parent
+    // compute construct is a parallel construct.
+    independentDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get(
+        converter.getFirOpBuilder().getContext(), mlir::acc::DeviceType::None));
+  } else if (mlir::isa_and_present<mlir::acc::SerialOp>(parentOp)) {
+    // Serial construct implies `seq` clause on loop. However, this
+    // conflicts with parallelism assignment if already set. Therefore check
+    // that first.
+    bool hasDefaultGangWorkerOrVector =
+        loopOp.hasVector() || loopOp.getVectorValue() || loopOp.hasWorker() ||
+        loopOp.getWorkerValue() || loopOp.hasGang() ||
+        loopOp.getGangValue(mlir::acc::GangArgType::Num) ||
+        loopOp.getGangValue(mlir::acc::GangArgType::Dim) ||
+        loopOp.getGangValue(mlir::acc::GangArgType::Static);
+    if (!hasDefaultGangWorkerOrVector)
+      seqDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get(
+          converter.getFirOpBuilder().getContext(),
+          mlir::acc::DeviceType::None));
+  } else {
+    // As per OpenACC 3.3 standard section 2.9.7 auto clause:
+    // When the parent compute construct is a kernels construct, a loop
+    // construct with no independent or seq clause is treated as if it has the
+    // auto clause.
+    assert(mlir::isa_and_present<mlir::acc::KernelsOp>(parentOp) &&
+           "Expected kernels construct");
+    autoDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get(
+        converter.getFirOpBuilder().getContext(), mlir::acc::DeviceType::None));
+  }
+}
+
 static mlir::acc::LoopOp createLoopOp(
     Fortran::lower::AbstractConverter &converter,
     mlir::Location currentLocation,
@@ -2482,6 +2536,9 @@ static mlir::acc::LoopOp createLoopOp(
     loopOp.setTileOperandsSegmentsAttr(
         builder.getDenseI32ArrayAttr(tileOperandsSegments));
 
+  // Determine the loop's default par mode - either seq, independent, or auto.
+  determineDefaultLoopParMode(converter, loopOp, seqDeviceTypes,
+                              independentDeviceTypes, autoDeviceTypes);
   if (!seqDeviceTypes.empty())
     loopOp.setSeqAttr(builder.getArrayAttr(seqDeviceTypes));
   if (!independentDeviceTypes.empty())
diff --git a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 b/flang/test/Lower/OpenACC/acc-kernels-loop.f90
index 8608b0ad98ce6..4e968144399a8 100644
--- a/flang/test/Lower/OpenACC/acc-kernels-loop.f90
+++ b/flang/test/Lower/OpenACC/acc-kernels-loop.f90
@@ -47,7 +47,7 @@ subroutine acc_kernels_loop
 ! CHECK:      acc.kernels {
 ! CHECK:        acc.loop private{{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   }{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>]{{.*}}}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -59,7 +59,7 @@ subroutine acc_kernels_loop
 ! CHECK:      acc.kernels combined(loop) {
 ! CHECK:        acc.loop combined(kernels) private{{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   }{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>]{{.*}}}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -490,7 +490,7 @@ subroutine acc_kernels_loop
 ! CHECK:      acc.kernels {{.*}} {
 ! CHECK:        acc.loop {{.*}} gang {{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   } attributes {inclusiveUpperbound = array<i1: true>}{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -503,7 +503,7 @@ subroutine acc_kernels_loop
 ! CHECK:        [[GANGNUM1:%.*]] = arith.constant 8 : i32
 ! CHECK:        acc.loop {{.*}} gang({num=[[GANGNUM1]] : i32}) {{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   }{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -516,7 +516,7 @@ subroutine acc_kernels_loop
 ! CHECK:        [[GANGNUM2:%.*]] = fir.load %{{.*}} : !fir.ref<i32>
 ! CHECK:        acc.loop {{.*}} gang({num=[[GANGNUM2]] : i32}) {{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   }{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -528,7 +528,7 @@ subroutine acc_kernels_loop
 ! CHECK:      acc.kernels {{.*}} {
 ! CHECK:        acc.loop {{.*}} gang({num=%{{.*}} : i32, static=%{{.*}} : i32})
 ! CHECK:          acc.yield
-! CHECK-NEXT:   }{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -540,7 +540,7 @@ subroutine acc_kernels_loop
 ! CHECK:      acc.kernels {{.*}} {
 ! CHECK:        acc.loop {{.*}} vector {{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   } attributes {inclusiveUpperbound = array<i1: true>}{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -553,7 +553,7 @@ subroutine acc_kernels_loop
 ! CHECK:        [[CONSTANT128:%.*]] = arith.constant 128 : i32
 ! CHECK:        acc.loop {{.*}} vector([[CONSTANT128]] : i32) {{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   }{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -566,7 +566,7 @@ subroutine acc_kernels_loop
 ! CHECK:        [[VECTORLENGTH:%.*]] = fir.load %{{.*}} : !fir.ref<i32>
 ! CHECK:        acc.loop {{.*}} vector([[VECTORLENGTH]] : i32) {{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   }{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -578,7 +578,7 @@ subroutine acc_kernels_loop
 ! CHECK:      acc.kernels {{.*}} {
 ! CHECK:        acc.loop {{.*}} worker {{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   } attributes {inclusiveUpperbound = array<i1: true>}{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -591,7 +591,7 @@ subroutine acc_kernels_loop
 ! CHECK:        [[WORKER128:%.*]] = arith.constant 128 : i32
 ! CHECK:        acc.loop {{.*}} worker([[WORKER128]] : i32) {{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   }{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true>}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -605,7 +605,7 @@ subroutine acc_kernels_loop
 ! CHECK:      acc.kernels {{.*}} {
 ! CHECK:        acc.loop {{.*}} {
 ! CHECK:          acc.yield
-! CHECK-NEXT:   } attributes {collapse = [2], collapseDeviceType = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true, true>}
+! CHECK-NEXT:   } attributes {{{.*}}collapse = [2], collapseDeviceType = [#acc.device_type<none>]{{.*}}}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
@@ -621,9 +621,9 @@ subroutine acc_kernels_loop
 ! CHECK:        acc.loop {{.*}} {
 ! CHECK:            acc.loop {{.*}} {
 ! CHECK:              acc.yield
-! CHECK-NEXT:     }{{$}}
+! CHECK-NEXT:     } attributes {auto_ = [#acc.device_type<none>]{{.*}}}
 ! CHECK:          acc.yield
-! CHECK-NEXT:   }{{$}}
+! CHECK-NEXT:   } attributes {auto_ = [#acc.device_type<none>]{{.*}}}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
 
diff --git a/flang/test/Lower/OpenACC/acc-loop.f90 b/flang/test/Lower/OpenACC/acc-loop.f90
index 0246f60705898..eca7fb30da8fa 100644
--- a/flang/test/Lower/OpenACC/acc-loop.f90
+++ b/flang/test/Lower/OpenACC/acc-loop.f90
@@ -29,7 +29,7 @@ program acc_loop
 
 ! CHECK: acc.loop private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}{{$}}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}{{$}}
 
  !$acc loop seq
   DO i = 1, n
@@ -65,7 +65,7 @@ program acc_loop
 
 ! CHECK:      acc.loop gang private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
   !$acc loop gang(num: 8)
   DO i = 1, n
@@ -75,7 +75,7 @@ program acc_loop
 ! CHECK:      [[GANGNUM1:%.*]] = arith.constant 8 : i32
 ! CHECK:      acc.loop gang({num=[[GANGNUM1]] : i32}) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
   !$acc loop gang(num: gangNum)
   DO i = 1, n
@@ -85,7 +85,7 @@ program acc_loop
 ! CHECK:      [[GANGNUM2:%.*]] = fir.load %{{.*}} : !fir.ref<i32>
 ! CHECK:      acc.loop gang({num=[[GANGNUM2]] : i32}) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
  !$acc loop gang(num: gangNum, static: gangStatic)
   DO i = 1, n
@@ -94,7 +94,7 @@ program acc_loop
 
 ! CHECK: acc.loop gang({num=%{{.*}} : i32, static=%{{.*}} : i32}) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
   !$acc loop vector
   DO i = 1, n
@@ -103,7 +103,7 @@ program acc_loop
 
 ! CHECK:      acc.loop vector private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
   !$acc loop vector(128)
   DO i = 1, n
@@ -113,7 +113,7 @@ program acc_loop
 ! CHECK: [[CONSTANT128:%.*]] = arith.constant 128 : i32
 ! CHECK:      acc.loop vector([[CONSTANT128]] : i32) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: }{{$}}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
   !$acc loop vector(vectorLength)
   DO i = 1, n
@@ -123,7 +123,7 @@ program acc_loop
 ! CHECK:      [[VECTORLENGTH:%.*]] = fir.load %{{.*}} : !fir.ref<i32>
 ! CHECK:      acc.loop vector([[VECTORLENGTH]] : i32) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
 !$acc loop worker
   DO i = 1, n
@@ -132,7 +132,7 @@ program acc_loop
 
 ! CHECK:      acc.loop worker private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
   !$acc loop worker(128)
   DO i = 1, n
@@ -142,7 +142,7 @@ program acc_loop
 ! CHECK: [[WORKER128:%.*]] = arith.constant 128 : i32
 ! CHECK:      acc.loop worker([[WORKER128]] : i32) private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
   !$acc loop private(c)
   DO i = 1, n
@@ -151,7 +151,7 @@ program acc_loop
 
 ! CHECK:      acc.loop private(@privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
 
   ! When the induction variable is explicitly private - only a single private entry should be created.
   !$acc loop private(i)
@@ -161,7 +161,7 @@ program acc_loop
 
 ! CHECK:      acc.loop private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
 
   !$acc loop private(c, d)
   DO i = 1, n
@@ -170,7 +170,7 @@ program acc_loop
 
 ! CHECK:      acc.loop private(@privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
 
   !$acc loop private(c) private(d)
   DO i = 1, n
@@ -179,7 +179,7 @@ program acc_loop
 
 ! CHECK:      acc.loop private(@privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_10x10xf32 -> %{{.*}} : !fir.ref<!fir.array<10x10xf32>>, @privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
 
   !$acc loop tile(2)
   DO i = 1, n
@@ -189,7 +189,7 @@ program acc_loop
 ! CHECK:      [[TILESIZE:%.*]] = arith.constant 2 : i32
 ! CHECK:      acc.loop {{.*}} tile({[[TILESIZE]] : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
 
  !$acc loop tile(*)
   DO i = 1, n
@@ -198,7 +198,7 @@ program acc_loop
 ! CHECK:      [[TILESIZEM1:%.*]] = arith.constant -1 : i32
 ! CHECK:      acc.loop {{.*}} tile({[[TILESIZEM1]] : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
 
   !$acc loop tile(2, 2)
   DO i = 1, n
@@ -211,7 +211,7 @@ program acc_loop
 ! CHECK:      [[TILESIZE2:%.*]] = arith.constant 2 : i32
 ! CHECK:      acc.loop {{.*}} tile({[[TILESIZE1]] : i32, [[TILESIZE2]] : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
 
   !$acc loop tile(tileSize)
   DO i = 1, n
@@ -220,7 +220,7 @@ program acc_loop
 
 ! CHECK:      acc.loop {{.*}} tile({%{{.*}} : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
 
   !$acc loop tile(tileSize, tileSize)
   DO i = 1, n
@@ -231,7 +231,7 @@ program acc_loop
 
 ! CHECK:      acc.loop {{.*}} tile({%{{.*}} : i32, %{{.*}} : i32}) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: }{{$}}
 
   !$acc loop collapse(2)
   DO i = 1, n
@@ -244,7 +244,7 @@ program acc_loop
 ! CHECK:        fir.store %arg0 to %{{.*}} : !fir.ref<i32>
 ! CHECK:        fir.store %arg1 to %{{.*}} : !fir.ref<i32>
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {collapse = [2], collapseDeviceType = [#acc.device_type<none>], inclusiveUpperbound = array<i1: true, true>}
+! CHECK-NEXT: } attributes {collapse = [2], collapseDeviceType = [#acc.device_type<none>]{{.*}}}
 
   !$acc loop
   DO i = 1, n
@@ -257,9 +257,9 @@ program acc_loop
 ! CHECK:      acc.loop {{.*}} control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:          acc.loop {{.*}} control(%arg1 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:            acc.yield
-! CHECK-NEXT:   } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT:   } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
   !$acc loop reduction(+:reduction_r) reduction(*:reduction_i)
   do i = 1, n
@@ -269,7 +269,7 @@ program acc_loop
 
 ! CHECK:      acc.loop private(@privatization_ref_i32 -> %{{.*}} : !fir.ref<i32>) reduction(@reduction_add_ref_f32 -> %{{.*}} : !fir.ref<f32>, @reduction_mul_ref_i32 -> %{{.*}} : !fir.ref<i32>) control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
  !$acc loop gang(dim: gangDim, static: gangStatic)
   DO i = 1, n
@@ -278,7 +278,7 @@ program acc_loop
 
 ! CHECK: acc.loop gang({dim=%{{.*}}, static=%{{.*}} : i32}) {{.*}} control(%arg0 : i32) = (%{{.*}} : i32) to (%{{.*}} : i32) step (%{{.*}} : i32) {
 ! CHECK:        acc.yield
-! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
+! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
 
   !$acc loop gang(dim: 1)
   DO i = 1, n
@@ -287,7 +287,7 @@ program acc_loop
 
 ! CHECK:      acc.loop gang({...
[truncated]

! CHECK-NEXT: } attributes {inclusiveUpperbound = array<i1: true>}
! CHECK-NEXT: }{{$}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this gone?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't feel like those attributes were worth checking since they were not relevant for that part of the test. That said, I just restored them and made sure they work with my new changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok. Yeah the diff in GitHub made it look like a removal.

Copy link
Contributor

@clementval clementval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@razvanlupusoru razvanlupusoru merged commit 775ad3e into llvm:main Jun 11, 2025
7 checks passed
tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025
…ion mode (llvm#143623)

This PR updates the flang lowering to explicitly implement the OpenACC
rules:
- As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop
construct with no auto or seq clause is treated as if it has the
independent clause when it is an orphaned loop construct or its parent
compute construct is a parallel construct.
- As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent
compute construct is a kernels construct, a loop construct with no
independent or seq clause is treated as if it has the auto clause.
- Loops in serial regions are `seq` if they have no other parallelism
marking such as gang, worker, vector.

For now the `acc.loop` verifier has not yet been updated to enforce
this.
akuhlens pushed a commit to akuhlens/llvm-project that referenced this pull request Jun 24, 2025
…ion mode (llvm#143623)

This PR updates the flang lowering to explicitly implement the OpenACC
rules:
- As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop
construct with no auto or seq clause is treated as if it has the
independent clause when it is an orphaned loop construct or its parent
compute construct is a parallel construct.
- As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent
compute construct is a kernels construct, a loop construct with no
independent or seq clause is treated as if it has the auto clause.
- Loops in serial regions are `seq` if they have no other parallelism
marking such as gang, worker, vector.

For now the `acc.loop` verifier has not yet been updated to enforce
this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category mlir:openacc mlir openacc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants