Skip to content

[MLIR][NVVM] Add ptxas-cmd-options to pass flags to the downstream compiler #127457

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 17, 2025

Conversation

grypp
Copy link
Member

@grypp grypp commented Feb 17, 2025

This PR adds cmd-options to the gpu-lower-to-nvvm-pipeline pipeline and the nvvm-attach-target pass, allowing users to pass flags to the downstream compiler, ptxas.

Example:

mlir-opt -gpu-lower-to-nvvm-pipeline="cubin-chip=sm_80 ptxas-cmd-options='-v --register-usage-level=8'"

@llvmbot
Copy link
Member

llvmbot commented Feb 17, 2025

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-gpu

@llvm/pr-subscribers-mlir-llvm

Author: Guray Ozen (grypp)

Changes

This PR adds cmd-options to the gpu-lower-to-nvvm-pipeline pipeline and the nvvm-attach-target pass, allowing users to pass flags to the downstream compiler, ptxas.


Full diff: https://github.com/llvm/llvm-project/pull/127457.diff

10 Files Affected:

  • (modified) mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h (+4)
  • (modified) mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h (+4)
  • (modified) mlir/include/mlir/Dialect/GPU/Transforms/Passes.td (+3)
  • (modified) mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td (+8)
  • (modified) mlir/lib/Dialect/GPU/IR/GPUDialect.cpp (+7-2)
  • (modified) mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp (+1)
  • (modified) mlir/lib/Dialect/GPU/Transforms/NVVMAttachTarget.cpp (+17-1)
  • (modified) mlir/lib/Target/LLVM/NVVM/Target.cpp (+26-2)
  • (added) mlir/test/Dialect/GPU/nvvm-attach-target.mlir (+15)
  • (added) mlir/test/Integration/GPU/CUDA/command-line-arg.mlir (+21)
diff --git a/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h b/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
index c950ef220f692..9a890ae24d8fc 100644
--- a/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
+++ b/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
@@ -108,6 +108,10 @@ class TargetOptions {
   /// Returns the default compilation target: `CompilationTarget::Fatbin`.
   static CompilationTarget getDefaultCompilationTarget();
 
+  /// Returns a tokenization of the command line options.
+  static std::pair<llvm::BumpPtrAllocator, SmallVector<const char *>>
+  tokenizeCmdOptions(const std::string &cmdOptions);
+
 protected:
   /// Derived classes must use this constructor to initialize `typeID` to the
   /// appropiate value: ie. `TargetOptions(TypeID::get<DerivedClass>())`.
diff --git a/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h b/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
index caa0901bb4943..9f3b9f6811776 100644
--- a/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
+++ b/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
@@ -37,6 +37,10 @@ struct GPUToNVVMPipelineOptions
       *this, "cubin-format",
       llvm::cl::desc("Compilation format to use to serialize to cubin."),
       llvm::cl::init("fatbin")};
+  PassOptions::Option<std::string> cmdOptions{
+      *this, "cmd-options",
+      llvm::cl::desc("Command line options to pass downstream compiler."),
+      llvm::cl::init("")};
   PassOptions::Option<int> optLevel{
       *this, "opt-level",
       llvm::cl::desc("Optimization level for NVVM compilation"),
diff --git a/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td b/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
index e055164a1c384..f351cea762d60 100644
--- a/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
+++ b/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
@@ -143,6 +143,9 @@ def GpuNVVMAttachTarget: Pass<"nvvm-attach-target", ""> {
            "Enable flush to zero for denormals.">,
     ListOption<"linkLibs", "l", "std::string",
            "Extra bitcode libraries paths to link to.">,
+    Option<"cmdOptions", "cmd-options", "std::string",
+           /*default=*/ [{""}],
+           "Command line options passed to downstream compiler">,
   ];
 }
 
diff --git a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
index fe15a524ec3b5..6db146f3b2cd7 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -2862,6 +2862,8 @@ def NVVM_TargettAttr : NVVM_Attr<"NVVMTarget", "target"> {
     bool hasFlag(StringRef flag) const;
     bool hasFastMath() const;
     bool hasFtz() const;
+    bool hasCmdOptions() const;
+    std::optional<mlir::NamedAttribute> getCmdOptions() const;
   }];
   let extraClassDefinition = [{
     bool $cppClass::hasFlag(StringRef flag) const {
@@ -2875,6 +2877,12 @@ def NVVM_TargettAttr : NVVM_Attr<"NVVMTarget", "target"> {
     bool $cppClass::hasFtz() const {
       return hasFlag("ftz");
     }
+    bool $cppClass::hasCmdOptions() const {
+      return hasFlag("cmd-options");
+    }
+    std::optional<mlir::NamedAttribute> $cppClass::getCmdOptions() const {
+      return getFlags().getNamed("cmd-options");
+    }
   }];
 }
 
diff --git a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
index d06f10d3137a1..2eebd5ab946e4 100644
--- a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+++ b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
@@ -2563,8 +2563,8 @@ CompilationTarget TargetOptions::getDefaultCompilationTarget() {
   return CompilationTarget::Fatbin;
 }
 
-std::pair<llvm::BumpPtrAllocator, SmallVector<const char *>>
-TargetOptions::tokenizeCmdOptions() const {
+sstd::pair<llvm::BumpPtrAllocator, SmallVector<const char *>>
+TargetOptions::tokenizeCmdOptions(const std::string &cmdOptions) {
   std::pair<llvm::BumpPtrAllocator, SmallVector<const char *>> options;
   llvm::StringSaver stringSaver(options.first);
   StringRef opts = cmdOptions;
@@ -2586,6 +2586,11 @@ TargetOptions::tokenizeCmdOptions() const {
   return options;
 }
 
+std::pair<llvm::BumpPtrAllocator, SmallVector<const char *>>
+TargetOptions::tokenizeCmdOptions() const {
+  return tokenizeCmdOptions(cmdOptions);
+}
+
 MLIR_DEFINE_EXPLICIT_TYPE_ID(::mlir::gpu::TargetOptions)
 
 #include "mlir/Dialect/GPU/IR/GPUOpInterfaces.cpp.inc"
diff --git a/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp b/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp
index 20d7372eef85d..5048bdd96f154 100644
--- a/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp
+++ b/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp
@@ -58,6 +58,7 @@ void buildCommonPassPipeline(
   nvvmTargetOptions.chip = options.cubinChip;
   nvvmTargetOptions.features = options.cubinFeatures;
   nvvmTargetOptions.optLevel = options.optLevel;
+  nvvmTargetOptions.cmdOptions = options.cmdOptions;
   pm.addPass(createGpuNVVMAttachTarget(nvvmTargetOptions));
   pm.addPass(createLowerAffinePass());
   pm.addPass(createArithToLLVMConversionPass());
diff --git a/mlir/lib/Dialect/GPU/Transforms/NVVMAttachTarget.cpp b/mlir/lib/Dialect/GPU/Transforms/NVVMAttachTarget.cpp
index dd705cd338312..c8ae914830889 100644
--- a/mlir/lib/Dialect/GPU/Transforms/NVVMAttachTarget.cpp
+++ b/mlir/lib/Dialect/GPU/Transforms/NVVMAttachTarget.cpp
@@ -45,7 +45,7 @@ struct NVVMAttachTarget
 
 DictionaryAttr NVVMAttachTarget::getFlags(OpBuilder &builder) const {
   UnitAttr unitAttr = builder.getUnitAttr();
-  SmallVector<NamedAttribute, 2> flags;
+  SmallVector<NamedAttribute, 3> flags;
   auto addFlag = [&](StringRef flag) {
     flags.push_back(builder.getNamedAttr(flag, unitAttr));
   };
@@ -53,6 +53,22 @@ DictionaryAttr NVVMAttachTarget::getFlags(OpBuilder &builder) const {
     addFlag("fast");
   if (ftzFlag)
     addFlag("ftz");
+
+  // Tokenize and set the optional command line options.
+  if (!cmdOptions.empty()) {
+    auto options = gpu::TargetOptions::tokenizeCmdOptions(cmdOptions);
+    if (!options.second.empty()) {
+      llvm::SmallVector<mlir::Attribute> nvvmOptionAttrs;
+      for (const char *opt : options.second) {
+        nvvmOptionAttrs.emplace_back(
+            mlir::StringAttr::get(builder.getContext(), StringRef(opt)));
+      }
+      flags.push_back(builder.getNamedAttr(
+          "cmd-options",
+          mlir::ArrayAttr::get(builder.getContext(), nvvmOptionAttrs)));
+    }
+  }
+
   if (!flags.empty())
     return builder.getDictionaryAttr(flags);
   return nullptr;
diff --git a/mlir/lib/Target/LLVM/NVVM/Target.cpp b/mlir/lib/Target/LLVM/NVVM/Target.cpp
index e240a7ae4917f..fa8c597da58b1 100644
--- a/mlir/lib/Target/LLVM/NVVM/Target.cpp
+++ b/mlir/lib/Target/LLVM/NVVM/Target.cpp
@@ -321,6 +321,25 @@ std::optional<std::string> NVPTXSerializer::findTool(StringRef tool) {
   return std::nullopt;
 }
 
+/// Adds optional command-line arguments to existing arguments.
+template <typename T>
+static void setOptionalCommandlineArguments(NVVMTargetAttr target,
+                                            SmallVectorImpl<T> &ptxasArgs) {
+  if (!target.hasCmdOptions())
+    return;
+
+  std::optional<mlir::NamedAttribute> cmdOptions = target.getCmdOptions();
+  for (Attribute attr : cast<ArrayAttr>(cmdOptions->getValue())) {
+    if (auto strAttr = dyn_cast<StringAttr>(attr)) {
+      if constexpr (std::is_same_v<T, StringRef>) {
+        ptxasArgs.push_back(strAttr.getValue());
+      } else if constexpr (std::is_same_v<T, const char *>) {
+        ptxasArgs.push_back(strAttr.getValue().data());
+      }
+    }
+  }
+}
+
 // TODO: clean this method & have a generic tool driver or never emit binaries
 // with this mechanism and let another stage take care of it.
 std::optional<SmallVector<char, 0>>
@@ -359,8 +378,8 @@ NVPTXSerializer::compileToBinary(const std::string &ptxCode) {
     return std::nullopt;
   TmpFile cubinFile;
   if (createFatbin) {
-    Twine cubinFilename = ptxFile->first + ".cubin";
-    cubinFile = TmpFile(cubinFilename.str(), llvm::FileRemover(cubinFilename));
+    std::string cubinFilename = (ptxFile->first + ".cubin").str();
+    cubinFile = TmpFile(cubinFilename, llvm::FileRemover(cubinFilename));
   } else {
     cubinFile.first = binaryFile->first;
   }
@@ -412,6 +431,9 @@ NVPTXSerializer::compileToBinary(const std::string &ptxCode) {
       useFatbin32 = true;
   }
 
+  // Set optional command line arguments
+  setOptionalCommandlineArguments(getTarget(), ptxasArgs);
+
   // Create the `fatbinary` args.
   StringRef chip = getTarget().getChip();
   // Remove the arch prefix to obtain the compute capability.
@@ -562,6 +584,8 @@ NVPTXSerializer::compileToBinaryNVPTX(const std::string &ptxCode) {
   cmdOpts.second.append(
       {"-arch", getTarget().getChip().data(), "--opt-level", optLevel.c_str()});
 
+  // Set optional command line arguments
+  setOptionalCommandlineArguments(getTarget(), cmdOpts.second);
   // Create the compiler handle.
   RETURN_ON_NVPTXCOMPILER_ERROR(
       nvPTXCompilerCreate(&compiler, ptxCode.size(), ptxCode.c_str()));
diff --git a/mlir/test/Dialect/GPU/nvvm-attach-target.mlir b/mlir/test/Dialect/GPU/nvvm-attach-target.mlir
new file mode 100644
index 0000000000000..8e178c33eead4
--- /dev/null
+++ b/mlir/test/Dialect/GPU/nvvm-attach-target.mlir
@@ -0,0 +1,15 @@
+// RUN: mlir-opt %s --nvvm-attach-target="" | FileCheck %s
+// RUN: mlir-opt %s --nvvm-attach-target="cmd-options='-v --register-usage-level=8''" | FileCheck %s -check-prefix=CHECK-OPTIONS
+
+module attributes {gpu.container_module} {
+    // CHECK-LABEL:gpu.module @kernel_module1
+    // CHECK: [#nvvm.target]
+    // CHECK-OPTIONS:#nvvm.target<flags = {"cmd-options" = ["-v", "--register-usage-level=8"]}>
+    gpu.module @kernel_module1 {
+    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr,
+        %arg2: !llvm.ptr, %arg3: i64, %arg4: i64,
+        %arg5: i64) attributes {gpu.kernel} {
+            llvm.return
+        }
+    }
+}
diff --git a/mlir/test/Integration/GPU/CUDA/command-line-arg.mlir b/mlir/test/Integration/GPU/CUDA/command-line-arg.mlir
new file mode 100644
index 0000000000000..f641cc519be77
--- /dev/null
+++ b/mlir/test/Integration/GPU/CUDA/command-line-arg.mlir
@@ -0,0 +1,21 @@
+// RUN: mlir-opt %s \
+// RUN:  | mlir-opt -gpu-lower-to-nvvm-pipeline="cubin-chip=sm_80 cmd-options='-v --register-usage-level=8'" -debug-only=serialize-to-binary \
+// RUN:  2>&1 | FileCheck %s
+
+func.func @host_function(%arg0 : f32, %arg1 : memref<?xf32>) {
+    %cst = arith.constant 1 : index
+    %c0 = arith.constant 0 : index
+    %cst2 = memref.dim %arg1, %c0 : memref<?xf32>
+    
+    gpu.launch blocks(%bx, %by, %bz) in (%grid_x = %cst, %grid_y = %cst, %grid_z = %cst)
+                threads(%tx, %ty, %tz) in (%block_x = %cst2, %block_y = %cst, %block_z = %cst) {
+        memref.store %arg0, %arg1[%tx] : memref<?xf32>
+        gpu.terminator
+    }
+
+    return
+}
+
+// CHECK: ptxas -arch sm_80
+// CHECK-SAME: -v 
+// CHECK-SAME: --register-usage-level=8

@grypp
Copy link
Member Author

grypp commented Feb 17, 2025

cc @MikaOvO

This PR adds `cmd-options` to the `gpu-lower-to-nvvm-pipeline` pipeline and the `nvvm-attach-target` pass, allowing users to pass flags to the downstream compiler, *ptxas*.
@grypp grypp force-pushed the add-ptxas-cmd-args branch from 75767cb to fc93398 Compare February 17, 2025 08:51
@grypp grypp changed the title [MLIR][NVVM] Add cmd-options to pass flags to the downstream compiler [MLIR][NVVM] Add ptxas-cmd-options to pass flags to the downstream compiler Feb 17, 2025
Copy link
Contributor

@durga4github durga4github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@grypp grypp merged commit 837b89f into llvm:main Feb 17, 2025
8 checks passed
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
…compiler (llvm#127457)

This PR adds `cmd-options` to the `gpu-lower-to-nvvm-pipeline` pipeline
and the `nvvm-attach-target` pass, allowing users to pass flags to the
downstream compiler, *ptxas*.

Example:
```
mlir-opt -gpu-lower-to-nvvm-pipeline="cubin-chip=sm_80 ptxas-cmd-options='-v --register-usage-level=8'"
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants