Skip to content

[ElimAvailExtern] Add an option to allow to convert global variables in a specified address space to local #144287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 17, 2025

Conversation

shiltian
Copy link
Contributor

@shiltian shiltian commented Jun 16, 2025

Currently, the EliminateAvailableExternallyPass only converts certain
available externally functions to local if avail-extern-to-local is set or in
contextual profiling mode. For global variables, it only drops their
initializers.

This PR adds an option to allow the pass to convert global variables in a
specified address space to local. The motivation for this change is to correctly
support lowering of LDS variables (__shared__ variables, in more generic
terminology) when ThinLTO is enabled for AMDGPU.

A __shared__ variable is lowered to a hidden global variable in a particular
address space by the frontend, which is roughly same as a static local
variable. To properly lower it in the backend, the compiler needs to check all
its uses. Enabling ThinLTO currently breaks this when a function containing a
__shared__ variable is imported from another module. Even though the global
variable is imported along with its associated function, and the function is
privatized by the EliminateAvailableExternallyPass, the global variable itself
is not.

It's safe to privatize such global variables, because they're local to their
associated functions. If the function itself is privatized, its associated
global variables should also be privatized accordingly.

Copy link
Contributor Author

@llvmbot
Copy link
Member

llvmbot commented Jun 16, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Shilei Tian (shiltian)

Changes

Currently, the EliminateAvailableExternallyPass only converts certain
available externally functions to local. For global variables, it only drops
their initializers.

This PR adds an option to allow the pass to convert global variables in a
specified address space to local. The motivation for this change is to correctly
support lowering of LDS variables (__shared__ variables, in more generic
terminology) when ThinLTO is enabled for AMDGPU.

A __shared__ variable is lowered to a hidden global variable in a particular
address space by the frontend, which is roughly same as a static local
variable. To properly lower it in the backend, the compiler needs to check all
its uses. Enabling ThinLTO currently breaks this when a function containing a
__shared__ variable is imported from another module. Even though the global
variable is imported along with its associated function, and the function is
privatized by the EliminateAvailableExternallyPass, the global variable itself
is not.

It's safe to privatize such global variables, because they're local to their
associated functions. If the function itself is privatized, its associated
global variables should also be privatized accordingly.


Full diff: https://github.com/llvm/llvm-project/pull/144287.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/ElimAvailExtern.cpp (+30-1)
  • (added) llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll (+21)
diff --git a/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp b/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp
index 718452fc02764..a015ba8ccfd4a 100644
--- a/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp
+++ b/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp
@@ -35,8 +35,14 @@ static cl::opt<bool> ConvertToLocal(
     cl::desc("Convert available_externally into locals, renaming them "
              "to avoid link-time clashes."));
 
+static cl::opt<unsigned> ConvertGlobalVariableInAddrSpace(
+    "avail-extern-gv-in-addrspace-to-local", cl::Hidden,
+    cl::desc(
+        "Convert available_externally global variables into locals if they are "
+        "in specificed addrspace, renaming them to avoid link-time clashes."));
+
 STATISTIC(NumRemovals, "Number of functions removed");
-STATISTIC(NumConversions, "Number of functions converted");
+STATISTIC(NumConversions, "Number of functions and globalbs converted");
 STATISTIC(NumVariables, "Number of global variables removed");
 
 void deleteFunction(Function &F) {
@@ -88,9 +94,32 @@ static void convertToLocalCopy(Module &M, Function &F) {
   ++NumConversions;
 }
 
+static void convertToLocalCopy(Module &M, GlobalValue &GV) {
+  assert(GV.hasAvailableExternallyLinkage());
+  std::string OrigName = GV.getName().str();
+  std::string NewName = OrigName + ".__uniq" + getUniqueModuleId(&M);
+  GV.setName(NewName);
+  GV.setLinkage(GlobalValue::InternalLinkage);
+  ++NumConversions;
+}
+
 static bool eliminateAvailableExternally(Module &M, bool Convert) {
   bool Changed = false;
 
+  // Convert global variables in specified address space before changing it to
+  // external linkage below.
+  if (ConvertGlobalVariableInAddrSpace.getNumOccurrences()) {
+    for (GlobalVariable &GV : M.globals()) {
+      if (!GV.hasAvailableExternallyLinkage() || GV.use_empty())
+        continue;
+
+      if (GV.getAddressSpace() == ConvertGlobalVariableInAddrSpace)
+        convertToLocalCopy(M, GV);
+
+      Changed = true;
+    }
+  }
+
   // Drop initializers of available externally global variables.
   for (GlobalVariable &GV : M.globals()) {
     if (!GV.hasAvailableExternallyLinkage())
diff --git a/llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll b/llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll
new file mode 100644
index 0000000000000..6995b97e79887
--- /dev/null
+++ b/llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --version 5
+; RUN: opt -S -passes=elim-avail-extern -avail-extern-gv-in-addrspace-to-local=3 %s -o - | FileCheck %s
+
+@shared = internal addrspace(3) global i32 undef, align 4
+@shared.imported = available_externally hidden unnamed_addr addrspace(3) global i32 undef, align 4
+
+;.
+; CHECK: @shared = internal addrspace(3) global i32 undef, align 4
+; CHECK: @shared.imported.__uniq.[[UUID:.*]] = internal unnamed_addr addrspace(3) global i32 undef, align 4
+;.
+define void @foo(i32 %v) {
+; CHECK-LABEL: define void @foo(
+; CHECK-SAME: i32 [[V:%.*]]) {
+; CHECK-NEXT:    store i32 [[V]], ptr addrspace(3) @shared, align 4
+; CHECK-NEXT:    store i32 [[V]], ptr addrspace(3) @shared.imported.__uniq.[[UUID]], align 4
+; CHECK-NEXT:    ret void
+;
+  store i32 %v, ptr addrspace(3) @shared, align 4
+  store i32 %v, ptr addrspace(3) @shared.imported, align 4
+  ret void
+}

@shiltian shiltian force-pushed the users/shiltian/convert-to-local-in-as branch from 3127751 to b633297 Compare June 16, 2025 02:19
Copy link

github-actions bot commented Jun 16, 2025

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll llvm/lib/Transforms/IPO/ElimAvailExtern.cpp

The following files introduce new uses of undef:

  • llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@shiltian shiltian force-pushed the users/shiltian/convert-to-local-in-as branch 2 times, most recently from 1cfc903 to 9c643a2 Compare June 16, 2025 14:56
Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the EliminateAvailableExternallyPass only converts certain available externally functions to local.

This is not the default behavior. This pass mostly exists to drop definitions, not to convert to local. The conversion was added for contextual profiling. So the reference to this behavior in the summary is a bit odd. Do you also want to convert functions to locals? If so, should this be under the same option(s)?

@shiltian
Copy link
Contributor Author

This is not the default behavior. This pass mostly exists to drop definitions, not to convert to local. The conversion was added for contextual profiling. So the reference to this behavior in the summary is a bit odd.

Thanks for the information. I'll correct the description.

Do you also want to convert functions to locals?

Yes but that is controlled by the existing option avail-extern-to-local.

If so, should this be under the same option(s)?

My understanding is avail-extern-to-local is for function and the new option is for global variable. Do you think it's a better idea to check both options for global variables?

@teresajohnson
Copy link
Contributor

This is not the default behavior. This pass mostly exists to drop definitions, not to convert to local. The conversion was added for contextual profiling. So the reference to this behavior in the summary is a bit odd.

Thanks for the information. I'll correct the description.

Do you also want to convert functions to locals?

Yes but that is controlled by the existing option avail-extern-to-local.

I guess there are 2 cases that we convert functions to locals, one is for contextual profiling (sets the Convert bool) and one is if the avail-extern-to-local flag is given.

If so, should this be under the same option(s)?

My understanding is avail-extern-to-local is for function and the new option is for global variable. Do you think it's a better idea to check both options for global variables?

I think it is fine to keep them separate for now, please just update the description.

…in a specified address space to local

Currently, the `EliminateAvailableExternallyPass` only converts certain
available externally functions to local. For global variables, it only drops
their initializers.

This PR adds an option to allow the pass to convert global variables in a
specified address space to local. The motivation for this change is to correctly
support lowering of LDS variables (`__shared__` variables, in more generic
terminology) when ThinLTO is enabled for AMDGPU.

A `__shared__` variable is lowered to a hidden global variable in a particular
address space by the frontend, which is roughly same as a `static` local
variable. To properly lower it in the backend, the compiler needs to check all
its uses. Enabling ThinLTO currently breaks this when a function containing a
`__shared__` variable is imported from another module. Even though the global
variable is imported along with its associated function, and the function is
privatized by the `EliminateAvailableExternallyPass`, the global variable itself
is not.

It's safe to privatize such global variables, because they're _local_ to their
associated functions. If the function itself is privatized, its associated
global variables should also be privatized accordingly.
@shiltian shiltian force-pushed the users/shiltian/convert-to-local-in-as branch from 9c643a2 to 354afb9 Compare June 17, 2025 02:15
@@ -45,6 +52,10 @@ void deleteFunction(Function &F) {
++NumRemovals;
}

static std::string getNewName(Module &M, const GlobalValue &GV) {
return GV.getName().str() + ".__uniq" + getUniqueModuleId(&M);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I meant also pull into the refactoring the setting of the name and the setting of the linkage type - basically reusing convertToLocalCopy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that route but not sure if that would be better. The new name is used at L85 as well. If we are gonna have a function that does both rename and set linkage, it also needs to return the new name for L85, which I find the semantics of it weird.

Copy link
Member

@mtrofin mtrofin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, some nits.

also note the github-actions warning about undef / poison.

@teresajohnson
Copy link
Contributor

I have a couple questions/comments about the description:

A __shared__ variable is lowered to a hidden global variable in a particular address space by the frontend, which is roughly same as a static local variable. To properly lower it in the backend, the compiler needs to check all its uses. Enabling ThinLTO currently breaks this when a function containing a __shared__ variable is imported from another module. > Even though the global variable is imported along with its associated function, and the function is privatized by the EliminateAvailableExternallyPass, the global variable itself is not.

This should maybe be "if the function is privatized by ..." because that's only optional behavior.

It's safe to privatize such global variables, because they're local to their associated functions. If the function itself is privatized, its associated global variables should also be privatized accordingly.

This is only safe if the function is also privatized, correct? Should the new option only take effect if the Convert bool is set (ensuring functions are also privatized)? And possibly assert if not?

@shiltian
Copy link
Contributor Author

also note the github-actions warning about undef / poison.

Yeah, but I think we'd want to keep that undef since it is the initializer for the global variable.

@shiltian
Copy link
Contributor Author

This is only safe if the function is also privatized, correct? Should the new option only take effect if the Convert bool is set (ensuring functions are also privatized)? And possibly assert if not?

At a high level (from a target-independent perspective), these are separate. We want to provide an option to privatize global variables in a specified address space, that's all. For the AMDGPU LDS variable scenario specifically, it should be the driver's responsibility to make sure the correct combination of options is used.

@shiltian shiltian merged commit 15482c8 into main Jun 17, 2025
6 of 7 checks passed
@shiltian shiltian deleted the users/shiltian/convert-to-local-in-as branch June 17, 2025 23:58
@teresajohnson
Copy link
Contributor

This is only safe if the function is also privatized, correct? Should the new option only take effect if the Convert bool is set (ensuring functions are also privatized)? And possibly assert if not?

At a high level (from a target-independent perspective), these are separate. We want to provide an option to privatize global variables in a specified address space, that's all. For the AMDGPU LDS variable scenario specifically, it should be the driver's responsibility to make sure the correct combination of options is used.

Enabling ThinLTO currently breaks this when a function containing a __shared__ variable is imported from another module. Even though the global variable is imported along with its associated function, and the function is privatized by the EliminateAvailableExternallyPass, the global variable itself is not.

It's safe to privatize such global variables, because they're local to their associated functions. If the function itself is privatized, its associated global variables should also be privatized accordingly.

Except that the PR description indicates that this is only for the case that the function was imported and privatized:

Enabling ThinLTO currently breaks this when a function containing a
shared variable is imported from another module. Even though the global
variable is imported along with its associated function, and the function is
privatized by the EliminateAvailableExternallyPass, the global variable itself
is not.

It's safe to privatize such global variables, because they're local to their
associated functions. If the function itself is privatized, its associated
global variables should also be privatized accordingly.

Is it ever safe to do this for an available externally variable not in an available externally function that is being privatized?

(BTW I see that this was already merged, which was a little premature as there was outstanding questions from a reviewer who hadn't yet approved - it's good to at least give it a day or so to see if there is follow on, I didn't have a chance to go back and look at this a second time yesterday)

fschlimb pushed a commit to fschlimb/llvm-project that referenced this pull request Jun 18, 2025
…in a specified address space to local (llvm#144287)

Currently, the `EliminateAvailableExternallyPass` only converts certain
available externally functions to local if `avail-extern-to-local` is
set or in
contextual profiling mode. For global variables, it only drops their
initializers.

This PR adds an option to allow the pass to convert global variables in
a
specified address space to local. The motivation for this change is to
correctly
support lowering of LDS variables (`__shared__` variables, in more
generic
terminology) when ThinLTO is enabled for AMDGPU.

A `__shared__` variable is lowered to a hidden global variable in a
particular
address space by the frontend, which is roughly same as a `static` local
variable. To properly lower it in the backend, the compiler needs to
check all
its uses. Enabling ThinLTO currently breaks this when a function
containing a
`__shared__` variable is imported from another module. Even though the
global
variable is imported along with its associated function, and the
function is
privatized by the `EliminateAvailableExternallyPass`, the global
variable itself
is not.

It's safe to privatize such global variables, because they're _local_ to
their
associated functions. If the function itself is privatized, its
associated
global variables should also be privatized accordingly.
@shiltian
Copy link
Contributor Author

Except that the PR description indicates that this is only for the case that the function was imported and privatized:

The description just indicates the motivation and a use of this option, but it can be definitely used for other purposes but it would be more target dependent.

Is it ever safe to do this for an available externally variable not in an available externally function that is being privatized?

If we are talking about from AMDGPU's perspective, there would be no available externally function that is not being privatized in the first place because no external function call can be supported.

If we think from a target agnostic perspective, I'd say it'd depend on the target about how to interpret the AS of a global variable and there is no hard relation between an externally available function and an externally available global variable.

(BTW I see that this was already merged, which was a little premature as there was outstanding questions from a reviewer who hadn't yet approved - it's good to at least give it a day or so to see if there is follow on, I didn't have a chance to go back and look at this a second time yesterday)

Understood. I thought you would not have new response since I did wait for half a day. I'll take a better practice next time.

@teresajohnson
Copy link
Contributor

Except that the PR description indicates that this is only for the case that the function was imported and privatized:

The description just indicates the motivation and a use of this option, but it can be definitely used for other purposes but it would be more target dependent.

Is it ever safe to do this for an available externally variable not in an available externally function that is being privatized?

If we are talking about from AMDGPU's perspective, there would be no available externally function that is not being privatized in the first place because no external function call can be supported.

Ok. The reason I ask all these questions is that it is fundamentally more dangerous to privatize a copy of a global variable, because multiple copies of a read-write variable may end up reading different values (versus a function which is always read-only). At the least, can you add a comment about the intended use and the risk of using this improperly? Is it possible to guard by the target triple or something indicating that it is compiling for AMDGPU?

If we think from a target agnostic perspective, I'd say it'd depend on the target about how to interpret the AS of a global variable and there is no hard relation between an externally available function and an externally available global variable.

(BTW I see that this was already merged, which was a little premature as there was outstanding questions from a reviewer who hadn't yet approved - it's good to at least give it a day or so to see if there is follow on, I didn't have a chance to go back and look at this a second time yesterday)

Understood. I thought you would not have new response since I did wait for half a day. I'll take a better practice next time.

No worries, thanks.

@shiltian
Copy link
Contributor Author

At the least, can you add a comment about the intended use and the risk of using this improperly? Is it possible to guard by the target triple or something indicating that it is compiling for AMDGPU?

Will do a follow-up PR for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants