forked from llvm/llvm-project
[lldb] Revive TestSimulatorPlatform.py (#142244) #1
Closed
+117,510
−53,528
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
Prepare for removing AVRMCExpr. Adopt the new naming convention (S_ instead of VK_; the relocation specifier was previously named `VariantKind`) used by most other targets. Make AVRMCAsmInfo.h include AVRMCExpr.h and change .cpp files to include AVRMCAsmInfo.h. We will eventually remove AVRMCExpr.h.
The attributor conservatively marks pointers whose loads are eligible to be marked as `!invariant.load`. It does so by identifying: 1. Pointers marked `noalias` and `readonly` 2. Pointers whose underlying objects are all eligible for invariant loads. The attributor then manifests this attribute at non-atomic non-volatile load instructions.
Although taskgroup is a privatizing construct, because of task_reduction clause, a new scope was not being created for it. This could cause an extra privatization of variables when taskgroup was lowered, because its scope would be the same as of the parent privatizing construct. This fixes regressions in tests 1052_0201 and 1052_0205, from Fujitsu testsuite. This issue didn't happen before because implicit symbols were being created in a different way before llvm#142154.
A canonicalization pattern from `spirv.GL.Length` to `spirv.GL.FAbs` for scalar operands is also added.
This patch enhances the PtrReplacer as follows: 1. Users are now collected iteratively to be generous on the stack. In the case of PHIs with incoming values which have not yet been visited, they are pushed back into the stack for reconsideration. 2. Replace users of the pointer root in a reverse-postorder traversal, instead of a simple traversal over the collected users. This reordering ensures that the operands of an instruction are replaced before replacing the instruction itself. 3. During the replacement of PHI, use the same incoming value if it does not have a replacement. This patch specifically fixes the case when an incoming value of a PHI is addrspacecasted.
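Point 2 above relies on a property of reverse post-order: in a def-use graph, a value is visited before its users, except across back edges such as loop PHIs. The following is a minimal Python sketch of that ordering only. The `users` dictionary and the node names are hypothetical stand-ins, not LLVM's data structures or the patch's actual code.

```python
def reverse_postorder(root, users):
    """Return the nodes reachable from root in reverse post-order.

    `users` maps each value to the list of values that use it
    (a hypothetical stand-in for def-use chains).
    """
    visited, postorder = {root}, []
    # Iterative DFS with an explicit stack, so deep chains do not recurse.
    stack = [(root, iter(users.get(root, [])))]
    while stack:
        node, it = stack[-1]
        for user in it:
            if user not in visited:
                visited.add(user)
                stack.append((user, iter(users.get(user, []))))
                break
        else:
            # Every user of `node` has been handled: emit it in post-order.
            postorder.append(node)
            stack.pop()
    return postorder[::-1]


# Hypothetical graph: an alloca feeds a cast and a GEP, both feed a PHI,
# and the PHI feeds a load.
users = {
    "alloca": ["cast", "gep"],
    "cast": ["phi"],
    "gep": ["phi"],
    "phi": ["load"],
}
order = reverse_postorder("alloca", users)
```

In this acyclic example both incoming values of the PHI appear before the PHI itself, which is the property the replacement step depends on.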
Putting the function name in the disassembly instruction messes up the alignment, making it less readable. Put it in the comment instead. This also aligns the opcodes and instructions to the left, matching the CLI.
To migrate away from the legacy XXXMCExpr::printImpl/evaluateAsRelocatableImpl overrides and align with other targets. While the AArch64MCAsmInfoXXX hooks introduce some duplication, they enable better separation for object file formats. Note: While AArch64MCAsmInfoDarwin uses the `@specifier` notation, it might use AArch64MCExpr with specifier VK_ABS. test/tools/llvm-mca/AArch64/Exynos/zero-latency-move.s abuses a parser behavior that :lo12: is also parsed for Mach-O (though it will fail for -filetype=obj).
…en if the element type isn't a legal scalar type. (llvm#144007) This fixes an inconsistency in i64 vector handling between RV32 and RV64. Even if i64 isn't legal as a scalar, we should still be able to split a large i64 vector to get down to a legal vector type. We only need to give up if we need to split a vscale x 1 vector.
) Reverts llvm#137215 This commit caused a failure in the LLVM CI: https://lab.llvm.org/buildbot/#/builders/10/builds/7442
Directly modeled after what we do for vector.reverse, but with restrictions on EVL and mask added.
This contains two closely related changes: 1) Explicitly recurse on the i1 case - "3" happens to be the right magic constant at m1, but is not otherwise correct, and we're better off deferring this to existing logic. 2) Match the lowering for high LMUL shuffles - we've switched to using a linear number of m1 vrgather instead of a single big vrgather. This results in substantially faster (but also larger) code for reverse shuffles larger than m1. Note that fixed vectors need a slide at the end, but scalable ones don't. This will have the effect of biasing the vectorizer towards larger (particularly scalable larger) vector factors. This increases VF for the s112 and s1112 loops from TSVC_2 (in all configurations). We could refine the high LMUL estimates a bit more, but I think getting the linear scaling right is probably close enough for the moment.
This change adds a folder for the VecCmpOp Issue llvm#136487
Removed strcmp, strlen, and memset calls from table.h and replaced them with internal functions. --------- Co-authored-by: Sriya Pratipati <[email protected]>
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates the `llvm/Passes` library and other pass-related headers. These annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build.

## Background
This effort is tracked in llvm#109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`. The following manual adjustments were also applied after running IDS on Linux:
- Remove the redundant declaration of the `initializeKCFIPass` function from llvm/include/llvm/InitializePasses.h because IDS only auto-annotates the first declaration it encounters, and the second un-annotated declaration results in an MSVC warning
- Add `LLVM_ABI` to a number of private `AnalysisKey` fields in classes that extend the `AnalysisInfoMixin` template class.
- Add `LLVM_ABI` to the `ChangeReporter` and `TextChangeReporter` template class definitions in llvm/include/llvm/Passes/StandardInstrumentations.h and remove the extern template instantiations. This is the only way I've found to get everything compiling warning-free when building a DLL because both template classes have methods implemented out-of-line.

## Validation
Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates the `llvm/XRay` library. These annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build.

## Background
This effort is tracked in llvm#109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`. Additionally, I manually added `LLVM_ABI_FRIEND` to friend member functions declared with `LLVM_ABI`.

## Validation
Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
…sor::createPadHighOp` (llvm#144397) Use `ValueRange` instead of `SmallVector` in `tensor::createPadHighOp` for the `dynOutDims` arg.
…3763)

## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates the `llvm/ObjectYAML` library. These annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build.

## Background
This effort is tracked in llvm#109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

These were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`.

## Validation
Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
This change adds support for ComplexType ImaginaryLiteral llvm#141365
…144391) Change in llvm#107278 modified the CMake CACHE variable with values that are not supported for it as documented. This patch renames the derived vars so that they do not conflict with the CACHE variable.
uses the `SendTargetCapabilities` from llvm#142831
This change introduces a ConstantLValueEmitter class, which will be needed for emitting CIR for non-trivial constant pointers. This change introduces the class with most branches reaching an NYI diagnostic. The only path that is currently implemented is the case where an absolute pointer (usually a null pointer) is emitted. This corresponds to the existing handler for emitting l-value constants.
…Signature Elements (llvm#144106) It was pointed out [here](llvm#143198 (comment)) that we may be able to use `llvm::EnumEntry` so that we can re-use the printing logic across enumerations.
- Enables re-use of `printEnum` and `printFlags` methods via templates
- Allows easy definition of `getEnumName` function for enum-to-string conversion, eliminating the need to use a string stream for constructing the Name SmallString
- Also, does a small fix-up of the operands for descriptor table clause to be consistent with other `Build*` methods

For reference, the [test-cases](https://github.com/llvm/llvm-project/blob/main/llvm/unittests/Frontend/HLSLRootSignatureDumpTest.cpp) that must not change expected output.
…lvm#144402) Reverts llvm#144022 This has been failing postcommit CI for two days: https://lab.llvm.org/buildbot/#/builders/63
…o a read of vlenb. (llvm#144571) We know that vlenb is a multiple of RVVBytesPerBlock so we aren't shifting out any non-zero bits.
…#143762) CFIPrograms' most common uses are within debug frames, but it is not their only use. For example, some assembly writers encode them by hand into .cfi_escape directives. This PR extracts printing code for them into its own files, which avoids the need for the main class to depend on DWARFUnit, sections, and similar. One in a series of NFC DebugInfo/DWARF refactoring changes to layer it more cleanly, so that binary CFI parsing can be used from low-level code (such as byte strings created via .cfi_escape) without circular dependencies. The final goal is to make a more limited dwarf library usable from lower-level code. More information can be found at https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665
…llvm#142713) We already evaluate the initializers for all global variables, as required by the standard. Leverage that evaluation instead of trying to separately validate static class members. This has a few benefits: - Improved diagnostics; we now get notes explaining what failed to evaluate. - Improved correctness: is_constant_evaluated is handled correctly. The behavior follows the proposed resolution for CWG1721. Fixes llvm#88462. Fixes llvm#99680.
…3820) Implement Xtensa Interrupt. HighInterrupts, Exception, Debug Options. Also implement small Xtensa Options like PRID, Coprocessor and Timers.
…in a specified address space to local (llvm#144287) Currently, the `EliminateAvailableExternallyPass` only converts certain available externally functions to local if `avail-extern-to-local` is set or in contextual profiling mode. For global variables, it only drops their initializers. This PR adds an option to allow the pass to convert global variables in a specified address space to local.

The motivation for this change is to correctly support lowering of LDS variables (`__shared__` variables, in more generic terminology) when ThinLTO is enabled for AMDGPU. A `__shared__` variable is lowered to a hidden global variable in a particular address space by the frontend, which is roughly the same as a `static` local variable. To properly lower it in the backend, the compiler needs to check all its uses. Enabling ThinLTO currently breaks this when a function containing a `__shared__` variable is imported from another module. Even though the global variable is imported along with its associated function, and the function is privatized by the `EliminateAvailableExternallyPass`, the global variable itself is not.

It's safe to privatize such global variables, because they're _local_ to their associated functions. If the function itself is privatized, its associated global variables should also be privatized accordingly.
…ter' (llvm#92467) Clarify that these functions are no-ops when linking to LLVM as a shared object.
## Purpose
This patch makes minor changes to LLVM and Clang so that LLVM can be built as a Windows DLL with `clang-cl`. These changes were not required for building a Windows DLL with MSVC.

## Background
The Windows DLL effort is tracked in llvm#109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

## Overview
Specific changes made in this patch:
- Remove `constexpr` fields that reference DLL exported symbols. These symbols cannot be resolved at compile time when building a Windows DLL using `clang-cl`, so they cannot be `constexpr`. Instead, they are made `const` and initialized in the implementation file rather than at declaration in the header.
- Annotate symbols now defined out-of-line with `LLVM_ABI` so they are exported when building as a shared library.
- Explicitly add default copy assignment operator for `ELFFile` to resolve a compiler warning.

## Validation
Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
On July 14, 2024 I landed a change to update progress reporting when loading kernel/firmware binaries (llvm#98845). In DynamicLoader::LoadBinaryWithUUIDAndAddress I removed code that was setting the ModuleSpec to the provided name, if the name provided is that of a file on disk. With this code missing, if a filepath name is passed in, this code will fail to find that binary on the local disk. Nothing in the PR or its intention would lead to this change; it was unintentional.
Test cleanup: 1) separate layout.mlir from ops.mlir for layout related test 2) remove lane layout for ops working at work item scope. 3) remove redundant test in create_tdesc/update_tdesc/prefetch. 4) remove "test_" from all test function name.
…ement and fptrunc users (llvm#141758) Currently we only support D16 folding for `image sample` instructions with a single user: a `fptrunc` to half. However, we can actually support D16 folding for image.sample instructions with multiple users, as long as each user follows the pattern of extractelement followed by fptrunc to half. For example:
```
%sample = call <4 x float> @llvm.amdgcn.image.sample
%e0 = extractelement <4 x float> %sample, i32 0
%h0 = fptrunc float %e0 to half
%e1 = extractelement <4 x float> %sample, i32 1
%h1 = fptrunc float %e1 to half
%e2 = extractelement <4 x float> %sample, i32 2
%h2 = fptrunc float %e2 to half
```
This change enables D16 folding for such cases and avoids generating `v_cvt_f16_f32_e32` instructions.
… alias. The motivation for this is that it causes the jump table entry's symbol to have an st_size equal to the jump table entry size, instead of being equal to the size of the entire jump table, which is incorrect and can lead to unexpected behavior in binary analysis tools that rely on the size field such as Bloaty. Reviewers: fmayer Reviewed By: fmayer Pull Request: llvm#144462
…tension (llvm#144320) The spec can be found at: https://github.com/andestech/andes-v5-isa/releases/tag/ast-v5_4_0-release. This patch only supports assembler. The instructions are similar to `Zvfbfmin` and the only difference with `Zvfbfmin` is that `XAndesVBFHCvt` doesn't have mask variant.
…mory ranges (llvm#136040) Recently I was debugging a Minidump with a few thousand ranges, and came across the (now deleted) comment:
```
// I don't have a sense of how frequently this is called or how many memory
// ranges a Minidump typically has, so I'm not sure if searching for the
// appropriate range linearly each time is stupid. Perhaps we should build
// an index for faster lookups.
```
Blaming this comment, it's 9 years old! Much overdue for this simple fix with a range data vector. I had to add a default constructor to Range in order to implement the RangeDataVector, but otherwise this is just a replacement of the lookup logic.
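The idea of replacing a linear scan with a sorted range index can be sketched as follows. This is a Python illustration only, not LLDB's `RangeDataVector`. The `RangeIndex` class and its `find` method are hypothetical names, and the sketch assumes the ranges do not overlap.

```python
import bisect


class RangeIndex:
    """Sorted, non-overlapping [base, base + size) ranges with attached data."""

    def __init__(self, entries):
        # entries: iterable of (base, size, data) tuples, in any order.
        self._entries = sorted(entries, key=lambda e: e[0])
        self._bases = [e[0] for e in self._entries]

    def find(self, addr):
        """Return the data of the range containing addr, or None."""
        # Index of the last range whose base is <= addr.
        i = bisect.bisect_right(self._bases, addr) - 1
        if i < 0:
            return None
        base, size, data = self._entries[i]
        return data if addr < base + size else None


index = RangeIndex([(0x3000, 0x100, "c"), (0x1000, 0x200, "a"), (0x2000, 0x80, "b")])
```

Each lookup is a binary search over the sorted base addresses instead of a walk over every range, which matters once a dump holds thousands of ranges.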
These instructions count leading/trailing ones in the register. Currently these are only generated when we have `Zbb` enabled (along with `Xqcibm`) since it contains the `CTTZ/CTLZ` instructions.
Some of these incorrectly call the l suffixed version of libm functions and others assert.
…vm#144382) This wasn't setting the correct libcall names, which default to the l suffixed libm names.
Previously, the loop would only add labels in the set of sections that were closest to Target. It is a set of sections because multiple sections can have the same address, so all of their symbols would be added to the set of candidate symbols. The following changes make it such that we loop down from the sections closest to the Target and populate the set of candidate symbols with symbols from the first set of sections that do contain symbols.
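A rough Python sketch of the new search order follows. The `candidate_symbols` function and its data model are hypothetical and not taken from the patch. The sketch also assumes that "closest" means the nearest section address at or below the target.

```python
def candidate_symbols(sections_by_addr, target):
    """Collect symbols from the nearest address at or below target that has any.

    `sections_by_addr` maps a section start address to a list of sections,
    each modelled as a list of symbol names (hypothetical model).
    """
    # Walk section addresses from the closest one at or below target downwards.
    eligible = sorted((a for a in sections_by_addr if a <= target), reverse=True)
    for addr in eligible:
        # Several sections can share one address, so pool all their symbols.
        symbols = [s for section in sections_by_addr[addr] for s in section]
        if symbols:
            # The first group of same-address sections with symbols wins.
            return symbols
    return []


sections = {
    0x100: [["a"], ["b"]],  # two sections at the same address, both with symbols
    0x200: [[], []],        # closest to the target below, but without symbols
}
found = candidate_symbols(sections, 0x210)
```

With the previous behaviour as described, only the empty sections at `0x200` would have been consulted; here the search falls through to `0x100`.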
[lldb] Revive TestSimulatorPlatform.py (llvm#142244)
This test was incorrectly disabled and bitrotted since then. This PR
fixes up the test and re-enables it.
[llvm-rc] Add support for multiplication and division in expressions (llvm#143373)
This is supported by GNU windres. MS rc.exe does accept these
expressions, but doesn't evaluate them correctly; it only returns the
left hand side.
This fixes one aspect of
llvm#143157.
[LV] Remove unused LoopBypassBlocks from ILV (NFC).
After recent refactorings to move parts of skeleton creation,
LoopBypassBlocks isn't used any more. Remove it.
[Clang] [Cygwin] va_list must be treated like normal Windows (llvm#143115)
Handling of va_list on Cygwin environment must be matched to normal
Windows environment.
The existing test `test/CodeGen/ms_abi.c` seems relevant, but it contains
`__attribute__((sysv_abi))`, which is not supported on Cygwin. The new test
is based on the `__attribute__((ms_abi))` portion of that test.
Co-authored-by: jeremyd2019 [email protected]
[SeparateConstOffsetFromGEP] Decompose constant xor operand if possible (llvm#135788)
Try to transform XOR(A, B+C) into XOR(A,C) + B, where XOR(A,C) becomes
the base for memory operations. This transformation is valid under the
following conditions:
Check 1 - B and C are disjoint.
Check 2 - XOR(A,C) and B are disjoint.
This transformation is particularly beneficial for GEPs because
disjoint OR operations often map better to addressing modes than XOR.
This can enable further optimizations in the GEP offset folding pipeline.
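The two checks can be justified with bit arithmetic: when B and C share no set bits, B + C equals B XOR C, so XOR(A, B+C) = XOR(XOR(A,C), B); when XOR(A,C) and B share no set bits either, that final XOR equals an addition. The sketch below confirms the identity by brute force over small operands. `decompose_xor` is a hypothetical helper written for this illustration, not the pass's code.

```python
def decompose_xor(a, b, c):
    """Return (base, offset) with a ^ (b + c) == base + offset, or None."""
    if b & c:        # Check 1: B and C must be disjoint.
        return None
    base = a ^ c
    if base & b:     # Check 2: XOR(A, C) and B must be disjoint.
        return None
    return base, b


# Exhaustively confirm the identity for all 4-bit operands.
checked = 0
for a in range(16):
    for b in range(16):
        for c in range(16):
            result = decompose_xor(a, b, c)
            if result is not None:
                base, offset = result
                assert a ^ (b + c) == base + offset
                checked += 1
```

For example, with A = 4, B = 8, C = 1 both checks pass and 4 XOR 9 = 13 = (4 XOR 1) + 8.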
[BOLT] Expose external entry count for functions (llvm#141674)
Record the number of function invocations from external code - code
outside the binary, which may include JIT code and DSOs. Accounting
external entry counts improves the fidelity of call graph flow
conservation analysis.
Test Plan: updated shrinkwrapping.test
[flang][runtime] Replace recursion with iterative work queue (llvm#137727)
Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure these
recursive (often mutually so) routines in the Fortran runtime with new
implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, initializeClone,
Finalize, and Destroy.
Default derived type I/O is also recursive, but already disabled. It can
be added to this new framework later if the overall approach succeeds.
Note that derived type FINAL subroutine calls, defined assignments, and
defined I/O procedures all perform callbacks into user code, which may
well reenter the runtime library. This kind of recursion is not handled
by this change, although it may be possible to do so in the future using
thread-local work queues.
The effects of this restructuring on CPU performance are yet to be
measured.
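The general technique of trading recursion for an explicit work list with resumable tickets can be sketched in a few lines. This is a Python illustration of the idea only. The `Ticket` class, `finalize_all`, and the component tree are hypothetical and do not mirror the Fortran runtime's actual types.

```python
class Ticket:
    """One unit of work: handle a node's components first, then the node."""

    def __init__(self, node):
        self.node = node
        self.next_child = 0  # Resume point after the ticket is suspended.


def finalize_all(root, children, visit):
    """Post-order walk of a component tree without recursion.

    `children` maps a node to its component nodes (a hypothetical model of
    a derived type with nested components); `visit` runs once per node.
    """
    work = [Ticket(root)]
    while work:
        ticket = work[-1]
        kids = children.get(ticket.node, [])
        if ticket.next_child < len(kids):
            # Suspend this ticket and queue work for its next component.
            child = kids[ticket.next_child]
            ticket.next_child += 1
            work.append(Ticket(child))
        else:
            # All components are done: complete this ticket.
            visit(ticket.node)
            work.pop()


children = {"outer": ["inner", "scalar"], "inner": ["leaf"]}
order = []
finalize_all("outer", children, order.append)
```

Because the pending state lives in heap-allocated tickets rather than call frames, the stack depth of `finalize_all` stays constant regardless of how deeply the components nest.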
[flang][NFC] Clean up code in two new functions (llvm#142037)
Two recently-added functions in Semantics/tools.h need some cleaning up
to conform to the coding style of the project. One of them should
actually be in Parser/tools.{h,cpp}, the other doesn't need to be
defined in the header.
[flang] Ensure overrides of special procedures (llvm#142465)
When a derived type declares a generic procedure binding of interest to
the runtime library, such as for ASSIGNMENT(=), it overrides any binding
that might have been present for the parent type.
Fixes llvm#142414.
[IR] Simplify scalable vector handling in ShuffleVectorInst::getShuffleMask. NFC (llvm#143596)
Combine the scalable vector UndefValue check with the earlier
ConstantAggregateZero handling for fixed and scalable vectors.
Assert that the rest of the code is only reached for fixed vectors.
Use append instead of resize since we know the size is increasing.
[IR2Vec] Exposing Embedding as an data type wrapped around std::vector (llvm#143197)
Currently `Embedding` is `std::vector<double>`. This PR makes it a data type wrapped around `std::vector<double>` to overload basic arithmetic operators and expose comparison operations. It simplifies the usage here and in the passes where operations on `Embedding` would be performed.
(Tracking issue - llvm#141817)
[RISCV][TTI] Allow partial reduce with mismatched extends (llvm#143608)
This depends on the recently added partial_reduce_sumla node for lowering,
but at this point, we have all the parts.
[lldb] Fix `target stop-hook add` help output
The help output for `target stop-hook add` references the non-existing
option `--one-line-command`. The correct option is `--one-liner`.
This commit fixes the help text.
rdar://152730660
[HWASAN] Disable LSan test on Android (llvm#143625)
Android HWASan does not support LSan.
[flang][cuda] Fix CUDA generic resolution for VALUE arguments in device procedures (llvm#140952)
For actual arguments that have VALUE attribute inside device routines, treat them as if they have device attribute.
Disable prctl test when building for arm or riscv. (llvm#143627)
I'm setting up a buildbot for arm32 using qemu and qemu doesn't support
PR_GET_THP_DISABLE.
Disable the test for now while we figure out what to do about that.
Also disable for riscv because we may do the same for riscv buildbots.
Revert "[SeparateConstOffsetFromGEP] Decompose constant xor operand if possible (llvm#135788)"
This reverts commit 13ccce2.
The tests are on non-canonical IR, and the change adds an extra unrelated
pre-processing step to the pass. I'm assuming this is a workaround
for the known-bits recursion depth limit in instcombine.
[CIR] Upstream support for calling constructors (llvm#143579)
This change adds support for calling C++ constructors. The support for
actually defining a constructor is still missing and will be added in a
later change.
Revert "[CI] Migrate to runtimes build" (llvm#143612)
Reverts llvm#142696
See llvm#143610 for details; I
believe this PR causes CI builders to build LLVM in a way that's been
broken for a while. To keep CI green, if this is the correct culprit,
those tests should be fixed or skipped
[TySan][CMake] Depend on tysan for check-tysan in runtimes build (llvm#143597)
The runtimes build expects libclang_rt.tysan.a to be available, but the
check-tysan target does not actually depend on it when built using a
runtimes build with LLVM_ENABLE_RUNTIMES pointing at ./llvm. This means
we get test failures when running check-compiler-rt due to the missing
static archive.
This patch also makes check-tysan depend on tysan when we are using the
runtimes build.
This is causing premerge failures currently since we recently migrated
to the runtimes build.
[PGO][Offload] Fix offload coverage mapping (llvm#143490)
This pull request fixes coverage mapping on GPU targets.
from public globals was causing issues in cases where multiple
device-code object files are linked together.
[RISCV][NFC] Factor out VLEN in the SiFive7 scheduling model (llvm#143629)
In preparation for reusing SiFive7Model for sifive-x390, which has a VLEN
of 1024, it's better (and less chaotic) to factor out the VLEN parameter
from various places first. The plan is to do a major overhaul of this
file in which all the `WriteRes` are going to be encapsulated in a big
`multiclass`, where VLEN is one of its template arguments, so that we
can instantiate different scheduling models with different VLEN.
Before that happens, a placeholder defvar `SiFive7VLEN` is used instead
in this patch.
NFC.
Co-authored-by: Michael Maitland [email protected]
Revert "[SelectionDAG] Make `(a & x) | (~a & y) -> (a & (x ^ y)) ^ y` available for all targets" (llvm#143648)
[NFC] get rid of `undef` in avx512vl-intrinsics.ll test (llvm#143641)
[AMDGPU][True16] remove AsmVOP3OpSel (llvm#143465)
This is NFC. Clean up the AsmVOP3OpSel field, and use Vop3Base instead.
[flang][runtime] Fix build bot flang-runtime-cuda-gcc errors (llvm#143650)
Adjust default parent class accessibility to attempt to work around what
appears to be old GCC's interpretation.
[RISCV][NFC] Improve test coverage for xtheadcondmov and xmipscmov (llvm#143567)
Co-authored-by: Harsh Chandel [email protected]
[flang][cuda] Add option to disable warp function in semantic (llvm#143640)
These functions are not available in some lower compute capabilities.
Add option in the language feature to enforce the semantic check on
these.
[RISCV] Select signed bitfield insert for XAndesPerf (llvm#143356)
This patch is similar to llvm#142737
The XAndesPerf extension includes the signed bitfield extraction
instruction `NDS.BFOS`, which can extract the bits from 0 to Len - 1,
place them starting at bit Lsb, zero-fill the bits from 0 to Lsb - 1,
and sign-extend the result.
When Lsb == Msb, it is a special case where the Lsb will be set to 0
instead of being equal to the Msb.
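The bit manipulation described above can be modelled in a few lines. This is a sketch of the prose description only, not of the instruction's encoding. The function name, the `length` parameter (standing in for Len) and the 32-bit default register width are assumptions, and the Lsb == Msb special case is not modelled.

```python
def signed_bitfield_insert(value, lsb, length, xlen=32):
    """Model the described behaviour on an xlen-bit register (illustrative)."""
    mask = (1 << xlen) - 1
    field = value & ((1 << length) - 1)   # bits 0 .. length-1 of the source
    result = field << lsb                 # placed at bit lsb; lower bits are zero
    sign = (field >> (length - 1)) & 1    # top bit of the extracted field
    if sign:
        # Sign-extend: set every bit above the inserted field.
        result |= mask & ~((1 << (lsb + length)) - 1)
    return result & mask


negative = signed_bitfield_insert(0b101, lsb=4, length=3)
positive = signed_bitfield_insert(0b011, lsb=4, length=3)
```

Here `0b101` has its top field bit set, so all bits above bit 6 are filled with ones, while `0b011` leaves them clear.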
[Clang][NFC] Move UntypedParameters instead of copy (llvm#143646)
Static analysis flagged that UntypedParameters could be moved instead of
copied. This would avoid copying a large object.
[libc++] Add missing C++20 [time.point.arithmetic] (llvm#143165)
This was part of https://wg21.link/p0355r7, but apparently never
implemented.
Co-authored-by: MarcoFalke *~=`'#}+{/-|&$^[email protected]
Co-authored-by: Hristo Hristov [email protected]
[X86] Add test coverage showing failure to merge "zero input passthrough" behaviour for BSR instructions on x86_64 targets
[X86] combineConcatVectorOps - ensure we're only concatenating v2f64 generic shuffles into vXf64 vshufpd
Identified while triaging llvm#143606 - we can't concat v4f64 lhs/rhs subvecs and then expect the v2f64 operands to be in the correct place for VSHUFPD
Test coverage will follow
[test][AArch64] Adjust vector insertion lit tests (llvm#143101)
The test cases test_insert_v16i8_insert_2_undef_base and
test_insert_v16i8_insert_2_undef_base_different_valeus in
CodeGen/AArch64/arm64-vector-insertion.ll were leaving element 8 in the
vector as "undef" without any real explanation. It looked like a
typo, as the input IR looked like this:
%v.8 = insertelement <16 x i8> %v.7, i8 %a, i32 8
%v.10 = insertelement <16 x i8> %v.7, i8 %a, i32 10
leaving %v.8 as unused.
This patch cleans up the tests a bit by adding separate test cases
to validate what happens when the insert at index 8 is skipped, while
amending the original test cases to use %v.8 instead of %v.7 when
creating %v.10.
[BOLT][AArch64] Fix adr-relaxation.s test (llvm#143151)
On some AArch64 machines the splitting was inconsistent.
This causes cold `foo` to have a `mov` instruction before adrp.
This patch removes the `mov` instruction right above .L2, making
splitting deterministic.
[CI] Add mention of LLVM Developer Policy in email-check message (NFC) (llvm#143300)
For now, it may be hard for people to find the relevant facts in a long
Discourse discussion, so a link to the official document may be enough to
convince them to change their email from private to public.
[X86] Add test coverage showing failure to merge "zero input passthrough" behaviour for BSF instructions on x86_64 targets
[X86] add test coverage for llvm#143606
[X86] bmi-select-distrib.ll - remove unused check prefixes and pull out PR comments above tests. NFC
Revert "[AArch64][GlobalISel] Expand 64bit extracts to 128bit to allow more patterns (llvm#142904)"
This reverts commit 61cdba6 due to verifier
issues.
[coro][NFC] Move switch basic block to beginning of coroutine (llvm#143626)
This makes the code flow when reading the LLVM IR of a split coroutine a
bit more natural. It does not change anything from an end-user
perspective but makes debugging the CoroSplit pass slightly easier.
Reland "[SelectionDAG] Make `(a & x) | (~a & y) -> (a & (x ^ y)) ^ y` available for all targets" (llvm#143651)
[flang] Enable delayed localization by default for `do concurrent` (llvm#142567)
This PR aims to make it easier and more self-contained to revert the
switch/flag if we discover any problems with enabling it by default.
[OpenMP 6.0] Codegen for Reduction over private variables with reduction clause (llvm#134709)
Codegen support for reduction over private variable with reduction
clause. Section 7.6.10 in OpenMP 6.0 spec.
the private copies created by the clause.
its original list item is updated by merging its value with that of the
shared copy and then broadcast to all threads.
Sample Test Case from OpenMP 6.0 Example
Expected Codegen:
Co-authored-by: Chandra Ghale [email protected]
[flang][OpenMP] Map basic `local` specifiers to `private` clauses (llvm#142735)
Starts the effort to map `do concurrent` locality specifiers to OpenMP
clauses. This PR adds support for basic specifiers (no `init` or `copy`
regions yet).
[MemCpyOpt] handle memcpy from memset in more cases (llvm#140954)
This aims to reduce the divergence between the initial checks in this
function and processMemCpyMemCpyDependence (in particular, adding
handling of offsets), with the goal to eventually reduce duplication
there and improve this pass in other ways.
[AArch64][Clang] Update new Neon vector element types. (llvm#142760)
This updates the element types used in the new __Int8x8_t types added in
llvm#126945, mostly to allow C++ name mangling in ItaniumMangling
mangleAArch64VectorBase to work correctly. Char is replaced by
SignedCharTy or UnsignedCharTy as required and Float16Ty is better using
HalfTy to match the vector types. Same for Long types.
[mlir][async][nfc] Fix typo in async op description (llvm#143621)
[flang][Driver] Enable support for -mmacos-version-min= (llvm#143508)
So far as I can tell this option is driver-only so we can just re-use
what already exists for clang. I've added a unit test based on clang's
unit test to demonstrate that the option is handled.
Still TODO is to ensure that flang-rt is built with the same macOS
minimum version as compiler-rt. At the moment, setting the flang minimum
version to older than the macOS version on which flang was built will
lead to link warnings, because flang-rt is built for the version of macOS
on which flang was built rather than the oldest supported version (as
compiler-rt is).
[C++20][Modules] Fix false compilation error with constexpr (llvm#143168)
Use declaresSameEntity when evaluating constexpr to avoid resetting
computed union value due to using different instances of the merged
field decl.
[libunwind] Remove checks for -nostdlib++ (llvm#143162)
libunwind uses a C linker, so it's never even trying to link against any
C++ libraries. This removes the code which tries to drop C++ libraries,
which makes the CMake configuration simpler and allows for upgrading
GCC.
[LLVM][SROA] Teach SROA how to "bitcast" between fixed and scalable vectors. (llvm#130973)
For functions whose vscale_range is limited to a single value we can size
scalable vectors. This aids SROA by allowing scalable vector load and
store operations to be considered for replacement whereby bitcasts
through memory can be replaced by vector insert or extract operations.
LLVM Buildbot failure on openmp runtime test (llvm#143674)
The error looks to be missing includes for complex number support on some
systems. Removing the test for now.
Relevant PR :
PR-134709
Co-authored-by: Chandra Ghale [email protected]
[DebugInfo][RemoveDIs] Remove scoped-dbg-format-setter (llvm#143450)
This was a utility for flipping between intrinsic and debug record mode
-- we don't need it any more. The "IsNewDbgInfoFormat" should be true
everywhere.
[AArch64] Consider negated powers of 2 when calculating throughput cost (llvm#143013)
Division by a negated power of 2 has similar codegen to division by the
power of 2 itself (identical in the case of remainder). In the case of
sdiv, the lowering just negates the result at the end anyway, so there is
nothing dissimilar at all.
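The arithmetic behind this claim can be checked with a small model. The sketch below is an illustration, not the cost-model code: `sdiv` and `srem` mimic truncating signed division on unbounded Python integers (fixed-width overflow cases are ignored), and `is_negated_power_of_2` is a hypothetical helper name.

```python
def sdiv(a: int, b: int) -> int:
    """Signed division truncating toward zero, like LLVM's sdiv."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b > 0) else -q


def srem(a: int, b: int) -> int:
    """Signed remainder paired with sdiv; the result takes the dividend's sign."""
    return a - b * sdiv(a, b)


def is_negated_power_of_2(c: int) -> bool:
    """True for -1, -2, -4, ... (hypothetical helper, not an LLVM API)."""
    return c < 0 and ((-c) & (-c - 1)) == 0
```

For any `x` and `k`, `sdiv(x, -(1 << k))` equals `-sdiv(x, 1 << k)`, and `srem(x, -(1 << k))` equals `srem(x, 1 << k)`. So the negated divisor costs at most one extra negate for division and nothing extra for remainder.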
[clang][AArch64] test -cc1 -print-enabled-extensions (llvm#143570)
This adds tests that document how -cc1 and -print-enabled-extensions
interact. The current behaviour looks wrong, and is caused by the fact
that --print-enabled-extensions uses the MC subtarget feature API to
determine the list of extensions to print, whereas the frontend uses the
TargetParser API. The latter does no dependency expansion for the
-target-feature flags but the MC API does.
This doesn't fix anything but at least it documents the current
behaviour, and will serve as a pre-commit test for any future fixes.
[ConstantFolding] Fold sqrt poison -> poison (llvm#141821)
I noticed this when a sqrt produced by VectorCombine with a poison
operand wasn't getting folded away to poison.
Most intrinsics in general could probably be folded to poison if one of
their arguments is poison too. Are there any exceptions to this we need
to be aware of?
[doc] Use ISO nomenclature for 1024 byte units (llvm#133148)
Increase specificity by using the correct unit sizes. KBytes is an
abbreviation for kB, 1000 bytes, and the hardware industry as well as
several operating systems have now switched to using 1000 byte kBs.
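A quick illustration of why the unit name matters: the same byte count prints differently under the binary and decimal prefixes. This is only a sketch, and the helper names are made up for the example.

```python
KIB = 1024  # kibibyte (KiB), binary prefix
KB = 1000   # kilobyte (kB), decimal SI prefix


def format_kib(num_bytes: int) -> str:
    """Render a byte count in KiB (1024-byte units)."""
    return f"{num_bytes / KIB:g} KiB"


def format_kb(num_bytes: int) -> str:
    """Render a byte count in kB (1000-byte units)."""
    return f"{num_bytes / KB:g} kB"
```

For a 65536-byte buffer, `format_kib` gives `64 KiB` while `format_kb` gives `65.536 kB`.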
If this change is acceptable, sometimes GitHub mangles merges to use the
original email of the account. $dayjob asks contributions have my work
email. Thanks!
[mlir][vector] Fix attaching write effects on transfer_write's base (llvm#142940)
This fixes an issue with `TransferWriteOp`'s implementation of the
`MemoryEffectOpInterface` where the write effect was attached to the
stored value rather than the base.
This had the effect that when asking for the memory effects for the
input memref buffer using `getEffectsOnValue(...)`, the function would
return no-effects (as the effect would have been attached to the stored
value rather than the input buffer).
[flang][OpenMP] Extend locality spec to OMP clauses (`init` and `dealloc` regions) (llvm#142795)
Extends support for locality specifiers to OpenMP translation by adding
support for translating localizers that have `init` and `dealloc`
regions.
[debuginfo][coro] Fix linkage name for clones of coro functions (llvm#141889)
So far, the `DW_AT_linkage_name` of the coroutine `resume`, `destroy`,
`cleanup` and `noalloc` function clones were incorrectly set to the
original function name instead of the updated function names.
With this commit, we now update the `DW_AT_linkage_name` to the correct
name. This has multiple benefits:
output of `llvm-dwarf-dump` when coroutines are involved.
of the function you are in. E.g., GDB now prints "Breakpoint 1.2,
coro_func(int) [clone .resume] (v=43) at ..." instead of "Breakpoint
1.2, coro_func(int) (v=43) at ...".
`info line coro_func` command now allows you to distinguish the
multiple different clones of the function.
In Swift, the linkage names of the clones were already updated. The
comment right above the relevant code in `CoroSplit.cpp` already hinted
that the linkage name should probably also be updated in C++. This
comment was added in commit 6ce76ff, and back then the
corresponding `DW_AT_specification` (i.e., `SP->getDeclaration()`) was
not updated, yet, which led to problems for C++. In the meantime, commit
ca1a5b3 added code to also update `SP->getDeclaration`, as such
there is no reason anymore to not update the linkage name for C++.
Note that most test cases used inconsistent function names for the LLVM
function vs. the DISubprogram linkage name. clang would never emit such
LLVM IR. This confused me initially, and hence I fixed it while updating
the test case.
Drive-by fix: The change in `CGVTables.cpp` is purely stylistic, NFC.
When looking for other usages of `replaceWithDistinct`, I got initially
confused because `CGVTables.cpp` was calling a static function via an
object instance.
MSP430: Add tests for fcmp (llvm#142706)
The existing coverage is thin. libcalls.ll seems to be the main fcmp
test, and it doesn't cover all the condition types, and runs with -O0.
Test all conditions for f32 and f64
[RISCV][FPEnv] Lowering of fpenv intrinsics (llvm#141498)
The change implements custom lowering of `get_fpenv`, `set_fpenv` and
`reset_fpenv` for RISCV target.
[lldb] Show coro_frame in `std::coroutine_handle` pretty printer (llvm#141516)
This commit adjusts the pretty printer for `std::coroutine_handle` based
on recent personal experiences with debugging C++20 coroutines:
`coro_frame` member. This member exposes the complete
coroutine frame contents, including the suspension point id and all
internal variables which the compiler decided to persist into the
coroutine frame. While this data is highly compiler-specific, inspecting
it can help identify the internal state of suspended coroutines.
`promise` and `coro_frame` members, even if
devirtualization failed and we could not infer the promise type / the
coro_frame type. Having them available as `void*` pointers can still be
useful to identify, e.g., which two coroutine handles have the same
frame / promise pointers.
MSP430: Stop using setCmpLibcallCC (llvm#142708)
This appears to only be useful for the eq/ne cases, and only for
ARM libcalls. This is setting it to the default values, and there's
no change in the new fcmp test output.
MSP430: Partially move runtime libcall config out of TargetLowering (llvm#142709)
RuntimeLibcalls needs to be correct outside of codegen contexts.
[HLSL][SPIR-V] Change SPV AS map for groupshared (llvm#143519)
The previous mapping was setting the hlsl_groupshared AS to 0, which
translated to either Generic or Function.
This changes it to 3, which translates to Workgroup.
Related to llvm#142804
[HLSL][SPIR-V] Handle SV_Position builtin in PS (llvm#141759)
This commit is using the same mechanism as vk::ext_builtin_input to
implement the SV_Position semantic input.
The HLSL signature is not yet ready for DXIL, hence this commit only
implements the SPIR-V side.
This is incomplete as it doesn't allow the semantic on hull/domain and
other shaders, but it's a first step to validate the overall
input/output
semantic logic.
Fixes llvm#136969
[libc++] Fix constraints in `__countr_zero` and `__popcount`
Currently these two functions are constrained on `is_unsigned`, which is
more permissive than what is required by the standard for their public
counterparts. This fixes the constraints to match the public functions
by using `__libcpp_is_unsigned_integer` instead.
[libc++] Refactor signed/unsigned integer traits (llvm#142750)
This patch does a few things:
`__libcpp_is_signed_integer` and `__libcpp_is_unsigned_integer` are
refactored to be variable templates instead of class templates.
`<__type_traits/integer_traits.h>`.
`__libcpp_signed_integer`, `__libcpp_unsigned_integer` and
`__libcpp_integer` are moved into the same header.
`__signed_integer`, `__unsigned_integer` and
`__signed_or_unsigned_integer` respectively.
[libc++][NFC] Move __libcpp_is_integral into the else branch (llvm#142556)
This makes it clear that `__libcpp_is_integral` is an implementation
detail of `is_integral` if we don't have `__is_integral` and not its own
utility.
[gn build] Port 3c56437
[DebugInfo][RemoveDIs] Use autoupgrader to convert old debug-info (llvm#143452)
By chance, two things have prevented the autoupgrade path being
exercised much so far:
block.
In practice, this appears to mean this code path hasn't seen the various
invalid inputs that can come its way. This commit does a number of
things:
debug-intrinsics, and that must be tolerated until the Verifier runs,
is,
Plus a few new tests for other intrinsic-to-debug-record failure modes
I found. There are also two edge cases:
record modes at will; I've deleted coverage and some assertions to
tolerate this as intrinsic support is now Gone (TM),
because the autoupgrader upgrades in the opposite order to the basic
block conversion routines... which doesn't change the record order, but
does change the use list order in Metadata! This should (TM) have no
consequence to the correctness of LLVM, but will change the order of
various records and the order of DWARF record output too.
I tried to reduce this patch to a smaller collection of changes, but
they're all intertwined, sorry.
[mlir][spirv] Add lowering of multiple math trig/hypb functions (llvm#143604)
Add Math to SPIRV lowering for tan, asin, acos, sinh, cosh, asinh, acosh
and atanh. This completes the lowering of all trigonometric and
hyperbolic functions from math to SPIRV.
[flang][OpenMP] Consider previous DSA for static duration variables (llvm#143601)
Symbols that have a pre-existing DSA set in the enclosing context should
not be made shared based on them being static duration variables.
Suggested-by: Leandro Lupori [email protected]
Signed-off-by: Kajetan Puchalski [email protected]
[flang][runtime] Another try to fix build failure (llvm#143702)
Tweak accessibility to try to get code past whatever gcc is being used
by the flang-runtime-cuda-gcc build bot.
[mlir][spirv] Include `SPIRV_AnyImage` in `SPIRV_Type` (llvm#143676)
This change is triggered by encountering the following error:
[Clang] default-movable should be based on the first declaration (llvm#143661)
When the definition of a special member function was defaulted we would
not consider it user-provided, even when the first declaration was not
defaulted.
Fixes llvm#143599
[DebugInfo][RemoveDIs] Remove some debug intrinsic-only codepaths (llvm#143451)
These are opportunistic deletions as more places that make use of the
IsNewDbgInfoFormat flag are removed. It should (TM)(R) all be dead code
now that `IsNewDbgInfoFormat` should be true everywhere.
FastISel: we don't need to do debug-aware instruction counting any more,
because there are no debug instructions,
Autoupgrade: you can no longer avoid autoupgrading of intrinsics to
records
DIBuilder: Delete the code for creating debug intrinsics (!)
LoopUtils: No need to handle debug instructions, they don't exist
[flang] Add David Truby as maintainer for Flang on Windows (llvm#142619)
Revert "[DebugInfo][RemoveDIs] Remove some debug intrinsic-only codepaths (llvm#143451)"
This reverts commit c71a2e6.
/me squints -- this is hitting an assertion I thought had been deleted,
will revert and investigate for a bit.
[mlir][spirv] Truncate Literal String size at max number words (llvm#142916)
If not truncated the SPIRV serialization would not fail but instead
produce an invalid SPIR-V module.
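As background, a SPIR-V literal string is stored as NUL-terminated UTF-8 packed into 32-bit little-endian words, and an instruction's word count is a 16-bit field. The sketch below shows one way a truncating encoder could look. It is not the MLIR serializer's code: `encode_literal_string` and its `max_words` budget are assumptions made for illustration, and a real encoder would also have to leave room for the instruction's other operands.

```python
import struct

MAX_WORD_COUNT = 65535  # an instruction's word count is a 16-bit field


def encode_literal_string(text: str, max_words: int = MAX_WORD_COUNT) -> list:
    """Pack text as NUL-terminated UTF-8 in 32-bit little-endian words.

    If the string would need more than max_words words, the payload is cut
    so that the terminator still fits (hypothetical policy for this sketch).
    """
    data = text.encode("utf-8")
    max_payload = max_words * 4 - 1  # reserve one byte for the NUL
    data = data[:max_payload]
    data += b"\0" * (4 - len(data) % 4)  # terminator plus padding to a word
    return list(struct.unpack(f"<{len(data) // 4}I", data))
```

Cutting at a byte boundary can split a multi-byte UTF-8 character. This sketch does not handle that case.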
Signed-off-by: Davide Grohmann [email protected]
[X86][BreakFalseDeps] Using reverse order for undef register selection (llvm#137569)
BreakFalseDeps picks the best register for undef operands if
instructions have a false dependency. The problem is that if the
instruction is close to the beginning of the function,
ReachingDefAnalysis is overly optimistic about the unused registers,
which results in collisions with registers just defined in the caller.
This patch changes the selection of the undef register to reverse order,
which reduces the probability of register collisions between caller and
callee. It brings improvement in some of our internal benchmarks with
negligible effect on other benchmarks.
[AArch64] Expand llvm.histogram intrinsic to support umax, umin, and uadd.sat operations (llvm#138447)
This patch extends the llvm.histogram intrinsic to support additional
update operations beyond the existing add. Specifically, the new
supported operations are:
umax: unsigned maximum
umin: unsigned minimum
uadd.sat: unsigned saturated addition
Based on the discussion from:
https://discourse.llvm.org/t/rfc-expanding-the-experimental-histogram-intrinsic/84673
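The update operations listed above can be modelled lane by lane. The sketch below is a scalar reference model, not the intrinsic's actual signature: bucket pointers are replaced by list indices, and the function name, parameter order, and `bits` parameter are assumptions made for illustration.

```python
def histogram_update(buckets, indices, inc, mask, op="add", bits=32):
    """Apply `op` with operand `inc` to buckets[i] for each active lane, in place."""
    limit = (1 << bits) - 1
    ops = {
        "add": lambda old: (old + inc) & limit,         # wraps on overflow
        "umax": lambda old: max(old, inc),              # unsigned maximum
        "umin": lambda old: min(old, inc),              # unsigned minimum
        "uadd.sat": lambda old: min(old + inc, limit),  # clamps at the max value
    }
    update = ops[op]
    for idx, active in zip(indices, mask):
        if active:
            buckets[idx] = update(buckets[idx])
    return buckets
```

With 8-bit buckets, two active lanes hitting a bucket that holds 250 with `inc=4` leave it at 255 under `uadd.sat`, whereas plain `add` wraps around to 2.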
[flang][acc] Ensure all acc.loop get a default parallelism determination mode (llvm#143623)
This PR updates the flang lowering to explicitly implement the OpenACC
rules:
construct with no auto or seq clause is treated as if it has the
independent clause when it is an orphaned loop construct or its parent
compute construct is a parallel construct.
compute construct is a kernels construct, a loop construct with no
independent or seq clause is treated as if it has the auto clause.
`seq` if they have no other parallelism marking such as gang, worker,
vector.
For now the `acc.loop` verifier has not yet been updated to enforce
this.
[HLSL][Driver] Make vk1.3 the default. (llvm#143384)
The HLSL driver currently defaults the triple to an unversioned os and
subarch when targeting SPIR-V. This means the SPIR-V backend decides the
default value. That is not a great option because a change in the backend
could cause a change in Clang.
Now that we want to choose the default we need to consider the best
option. DXC currently defaults to Vulkan1.0. We are planning on not
supporting Vulkan1.0 in the Clang HLSL compiler because newer versions
of Vulkan are commonly supported on nearly all hardware, so users do not
use it.
Since we have to change from DXC anyway, we are using VK1.3. It has been
out long enough to be commonly available, and the initial implementation
of SPIR-V features for HLSL assumes Vulkan 1.3.
Co-authored-by: Nathan Gauër [email protected]
[BasicAA][ValueTracking] Use MaxLookupSearchDepth constant (NFC)
Use MaxLookupSearchDepth in all places limiting an underlying
object walk, instead of hardcoding 6 in various places.
Revert runtime work queue patch, it breaks some tests that need investigation (llvm#143713)
Revert "[flang][runtime] Another try to fix build failure"
This reverts commit 13869ca.
Revert "[flang][runtime] Fix build bot flang-runtime-cuda-gcc errors
(llvm#143650)"
This reverts commit d75e284.
Revert "[flang][runtime] Replace recursion with iterative work queue
(llvm#137727)"
This reverts commit 163c67a.
[mlir][spirv] Add definition for GL Exp2 (llvm#143678)
[Clang][ByteCode][NFC] Move APInt into pushInteger since it is being passed by value (llvm#143578)
Static analysis flagged that we could move APInt instead of copy, indeed
it has a move constructor and so we should move into values for APInt.
[flang][OpenMP] Overhaul implementation of ATOMIC construct (llvm#137852)
The parser will accept a wide variety of illegal attempts at forming an
ATOMIC construct, leaving it to the semantic analysis to diagnose any
issues. This consolidates the analysis into one place and allows us to
produce more informative diagnostics.
The parser's outcome will be a parser::OpenMPAtomicConstruct object
holding the directive, parser::Body, and an optional end-directive. The
prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have
been removed. READ, WRITE, etc. are now proper clauses.
The semantic analysis consistently operates on "evaluation"
representations, mainly evaluate::Expr (as SomeExpr) and
evaluate::Assignment. The results of the semantic analysis are stored in
a mutable member of the OpenMPAtomicConstruct node. This follows a
precedent of having a `typedExpr` member in parser::Expr, for example.
This allows the lowering code to avoid duplicated handling of AST nodes.
Using a BLOCK construct containing multiple statements for an ATOMIC
construct that requires multiple statements is now allowed. In fact, any
nesting of such BLOCK constructs is allowed.
This implementation will parse, and perform semantic checks for both
conditional-update and conditional-update-capture, although no MLIR will
be generated for those. Instead, a TODO error will be issued prior to
lowering.
The allowed forms of the ATOMIC construct were based on the OpenMP 6.0
spec.
[flang][Driver] Guard check for pic/pie settings without driver flags (llvm#143530)
The default relocation model for clang depends on the cmake flag
CLANG_DEFAULT_PIE_ON_LINUX. By default it is set to ON, but when it's
OFF, the default relocation model will be "static".
The outcome of the test running clang without any PIC/PIE flags will
depend on the cmake flag, so make sure it only runs when the flag is ON.
[PowerPC][AIX] xfail atan-intrinsic to unblock bot (llvm#143723)
Testcase from llvm#143416 is
causing the AIX bot to be red. XFAIL for now till issue can be resolved.
[LTO] Fix used before initialised warning (llvm#143705)
For whatever reason I can't reproduce this locally but I can on Compiler
Explorer (https://godbolt.org/z/nfv4b83q6) and on our flang gcc bot
(https://lab.llvm.org/buildbot/#/builders/130/builds/13683/steps/5/logs/stdio).
In file included from ../llvm-project/llvm/include/llvm/LTO/LTO.h:33,
from
../llvm-project/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:29:
../llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In
constructor ‘llvm::FunctionImporter::ImportListsTy::ImportListsTy()’:
../llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:275:33:
warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is
used uninitialized [-Wuninitialized]
275 | ImportListsTy() : EmptyList(ImportIDs) {}
| ^~~~~~~~~
../llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In
constructor
‘llvm::FunctionImporter::ImportListsTy::ImportListsTy(size_t)’:
../llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:276:44:
warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is
used uninitialized [-Wuninitialized]
276 | ImportListsTy(size_t Size) : EmptyList(ImportIDs), ListsImpl(Size)
{}
| ^~~~~~~~~
ImportIDs was being used during construction of EmptyList, before
ImportIDs itself had been constructed.
[flang] Fix warnings
This patch fixes:
flang/lib/Lower/OpenMP/OpenMP.cpp:3904:9: error: unused variable
'action0' [-Werror,-Wunused-variable]
flang/lib/Lower/OpenMP/OpenMP.cpp:3905:9: error: unused variable
'action1' [-Werror,-Wunused-variable]
[NFC][PowerPC] Rename xxevalPattern to adhere to naming convention. (llvm#143675)
Rename class `xxevalPattern` to adhere to the naming convention listed in
the coding guideline and used for all other classes in the td file.
[libc++] Make forward_list constexpr as part of P3372R3 (llvm#129435)
Fixes llvm#128658
[llvm] annotate interfaces in llvm/TargetParser for DLL export (llvm#143616)
Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/TargetParser`
library. These annotations currently have no meaningful impact on the
LLVM build; however, they are a prerequisite to support an LLVM Windows
DLL (shared library) build.
Background
This effort is tracked in llvm#109483. Additional context is provided in
this discourse, and documentation for `LLVM_ABI` and related annotations
is found in the LLVM repo here.
Most of these changes were generated automatically using the Interface
Definition Scanner (IDS) tool, followed by formatting with
`git clang-format`.
Additionally, I manually removed the redundant declaration of
`getCanonicalArchName` from
llvm/include/llvm/TargetParser/ARMTargetParser.h because IDS only
auto-annotates the first declaration it encounters, and the second
un-annotated declaration results in an MSVC warning.
Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
[llvm] annotate interfaces in llvm/SandboxIR for DLL export (llvm#142863)
Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/SandboxIR` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.
Background
This effort is tracked in llvm#109483. Additional context is provided in
this discourse, and documentation for `LLVM_ABI` and related annotations
is found in the LLVM repo here.
The bulk of these changes were generated automatically using the
Interface Definition Scanner (IDS) tool, followed by formatting with
`git clang-format`.
The following manual adjustments were also applied after running IDS on
Linux:
`GlobalWithNodeAPI::LLVMGVToGV::operator()` template
function instantiations that were previously added for the dylib build.
Instead, directly annotate the `LLVMGVToGV::operator()` method with
`LLVM_ABI`. This is done so the DLL build works with both MSVC and
clang-cl.
`#include "llvm/SandboxIR/Value.h"` in `Tracker.h` so that
the symbol is available for exported templates in this file. These
templates get fully instantiated on DLL export, so they require the full
definition of `Value`.
`GlobalWithNodeAPI` template types in `Constants.h` and annotate them
with `LLVM_TEMPLATE_ABI`.
`LLVM_EXPORT_TEMPLATE` to `GlobalWithNodeAPI` template
instantiations in `Constants.cpp`.
Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
[llvm] annotate interfaces in llvm/TextAPI for DLL export (llvm#143447)
Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/TextAPI` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.
Background
This effort is tracked in llvm#109483. Additional context is provided in
this discourse, and documentation for `LLVM_ABI` and related annotations
is found in the LLVM repo here.
These changes were generated automatically using the Interface
Definition Scanner (IDS) tool, followed by formatting with
`git clang-format`.
Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
[TableGen] Simplify computeUberWeights. NFC. (llvm#143716)
Using RegUnitIterator made the code more complicated than having two
nested loops over each register and each register's regunits.
[CIR] Upstream minimal builtin function call support (llvm#142981)
This patch adds all bits required to implement builtin function calls to
ClangIR. It doesn't actually implement any of the builtins except those
that fold to a constant ahead of CodeGen
(`__builtin_is_constant_evaluated()` being one example).
[clang][analyzer] Correct SMT Layer for _BitInt cases refutations (llvm#143310)
Since _BitInt was added later, ASTContext did not comprehend getting a
type by bitwidth that's not a power of 2, and the SMT layer also did not
comprehend this. This led to unexpected crashes using Z3 refutation
during randomized testing. The assertion and a redacted, summarized
crash stack are shown here.
clang:
clang::ento::SMTConv::fromBinOp(std::shared_ptr&, llvm::SMTExpr const* const&, clang::BinaryOperatorKind, llvm::SMTExpr const* const&, bool) SMTConstraintManager.cpp clang::ASTContext&, llvm::SMTExpr const* const&, clang::QualType, clang::BinaryOperatorKind, llvm::SMTExpr const* const&, clang::QualType, clang::QualType*) SMTConstraintManager.cpp clang::ASTContext&, clang::ento::SymExpr const*, llvm::APSInt const&, llvm::APSInt const&, bool) SMTConstraintManager.cpp clang::ento::ExplodedNode const*, clang::ento::PathSensitiveBugReport&)../../clang/include/clang/StaticAnalyzer/Core/PathSensitive/SMTConv.h:103:
static llvm::SMTExprRef
clang::ento::SMTConv::fromBinOp(llvm::SMTSolverRef &,
const llvm::SMTExprRef &, const BinaryOperator::Opcode, const
llvm::SMTExprRef &, bool):
Assertion `*Solver->getSort(LHS) == *Solver->getSort(RHS) && "AST's must
have the same sort!"' failed.
...
Co-authored-by: Vince Bridgers [email protected]
[MLIR][Transform] apply_registered_pass op's options as a dict (llvm#143159)
Improve ApplyRegisteredPassOp's support for taking options by taking
them as a dict (vs a list of string-valued key-value pairs).
Values of options are provided as either static attributes or as params
(which pass in attributes at interpreter runtime). In either case, the
keys and value attributes are converted to strings and a single
options-string, in the format used on the commandline, is constructed to
pass to the `addToPipeline`-pass API.
Reapply 76197ea after removing an assertion
Specifically this is the assertion in BasicBlock.cpp. Now that we're not
examining or setting that flag consistently (because it'll be deleted in
about an hour) there's no need to keep this assertion.
Original commit title:
[DebugInfo][RemoveDIs] Remove some debug intrinsic-only codepaths (llvm#143451)
[libc][NFC] Remove template from GPU allocator reference counter
Summary:
We don't need this to be generic, precommit for
llvm#143607
[DLCov][NFC] Annotate intentionally-blank DebugLocs in existing code (llvm#136192)
Following the work in PR llvm#107279, this patch applies the annotative
DebugLocs, which indicate that a particular instruction is intentionally
missing a location for a given reason, to existing sites in the compiler
where their conditions apply. This is NFC in ordinary LLVM builds (each
function `DebugLoc::getFoo()` is inlined as `DebugLoc()`), but marks the
instruction in coverage-tracking builds so that it will be ignored by
Debugify, allowing only real errors to be reported. From a developer
standpoint, it also communicates the intentionality and reason for a
missing DebugLoc.
Some notes for reviewers:
`I->dropLocation()` and `I->setDebugLoc(DebugLoc::getDropped())` is that
the former may decide to keep some debug info alive, while the latter
will always be empty; in this patch, I always used the latter (even if
the former could technically be correct), because the former could
result in some (barely) different output, and I'd prefer to keep this
patch purely NFC.
`DebugLoc::getUnknown()`, with
the exception of the vectorizers - in summary, they are a huge cause of
dropped source locations, and I don't have the time or the domain
knowledge currently to solve that, so I've plastered it all over them as
a form of "fixme".
[libc] Add NULL macro definitions to header files (llvm#142764)
By the C standard, <locale.h>, <stddef.h>, <stdio.h>, <stdlib.h>,
<string.h>, <time.h>, and <wchar.h> require NULL to be defined.
[X86] Don't emit ENDBR for asm goto branch targets (llvm#143439)
Similarly to llvm#141562, which disabled BTI generation for ARM asm goto
branch targets, drop unnecessary ENDBRs from IsInlineAsmBrIndirectTarget
machine basic blocks.
[lldb][nfc] Factor out code checking if Variable is in scope (llvm#143572)
This is useful for checking whether a variable is in scope inside a
specific block.
[CIR] Upstream splat op for VectorType (llvm#139827)
This change adds support for splat op for VectorType
Issue llvm#136487
[flang] silence bogus error with BIND(C) variable in hermetic module (llvm#143737)
The global name semantic check was firing in a bogus way when BIND(C)
variables are in a hermetic module.
Do not raise the error if one of the symbols with the conflicting global
name is a "hermetic variant" of the other.
Squelch an unused-function warning
After removing some debug-intrinsic creation code, this function is now
unused (and un-necessary)
[Clang][Tooling][NFC] Use move to avoid copies of large objects (llvm#143603)
Static analysis flagged these cases in which we can use std::move and
avoid copies of large objects.
[IR] Fix warnings (llvm#143752)
This patch fixes:
llvm/lib/IR/DIBuilder.cpp:1072:18: error: unused function
'getDeclareIntrin' [-Werror,-Wunused-function]
llvm/include/llvm/IR/DIBuilder.h:51:15: error: private field
'DeclareFn' is not used [-Werror,-Wunused-private-field]
llvm/include/llvm/IR/DIBuilder.h:52:15: error: private field
'ValueFn' is not used [-Werror,-Wunused-private-field]
llvm/include/llvm/IR/DIBuilder.h:53:15: error: private field
'LabelFn' is not used [-Werror,-Wunused-private-field]
llvm/include/llvm/IR/DIBuilder.h:54:15: error: private field
'AssignFn' is not used [-Werror,-Wunused-private-field]
[GISelValueTracking] Add test case for G_PTRTOINT
While we can only reason about the index/address, the G_PTRTOINT
operation returns all representation bits, so we can't assume the
remaining ones are all zeroes. This behaviour was clarified as part of
the discussion in https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54.
The LangRef semantics of ptrtoint being a full representation bitcast
were documented in llvm#139349.
Prior to 77c8d21 we were incorrectly
assuming known zeroes beyond the index size even if the input was
completely unknown. This commit adds a test case for G_PTRTOINT which
was omitted from that change.
See llvm#139598
Reviewed By: arsenm
Pull Request: llvm#139608
[OpenMP][Offload] Update the Logic for Configuring Auto Zero-Copy (llvm#143638)
Summary:
Currently the Auto Zero-Copy is enabled by checking every initialized
device to ensure that no dGPU is attached to an APU. However, an APU is
designed to comprise a homogeneous set of GPUs, therefore, it should be
sufficient to check any device for configuring Auto Zero-Copy. In this
PR, it checks the first initialized device in the list.
The changes in this PR are to clearly reflect the design and logic of
enabling the feature, which further improves readability.
[SPIRV] FIX print the symbolic operand for opcode for the operation OpSpecConstantOp (llvm#135756)
The current implementation outputs the opcode as an immediate, but
spirv-tools requires the name of the operation without "Op" for the
instruction OpSpecConstantOp.
That is, if the opcode is OpBitcast, the instruction must be
%1 = OpSpecConstantOp %6 Bitcast %17
instead of
%1 = OpBitcast %6 124 %17
Refer to this commit for more info.
Co-authored-by: Dmitry Sidorov [email protected]
Co-authored-by: Ebin-McW [email protected]
[libc++] Upgrade to GCC 15 (llvm#138293)
[RISCV] Guard the alternative static chain register use on ILP32E/LP64E (llvm#142715)
Asserts the use of t3(x28) as the static chain register when branch control flow protection is enabled with ILP32E/LP64E, because such register is not present within the ABI.
[NFC][PowerPC] Pre-commit test case for exploitation of xxeval for the pattern ternary(A,X,or(B,C)) (llvm#143693)
Pre-commit test case for exploitation of `xxeval` for ternary operations
of the pattern `ternary(A,X,or(B,C))`. Exploitation of `xxeval` to be
added later.
Co-authored-by: Tony Varghese [email protected]
Update BUILD.bazel
Add missing dependency after llvm#142916.
[libc++] Simplify the implementation of __next_prime a bit (llvm#143512)
Make clang/test/Frontend/aarch64-print-enabled-extensions-cc1.c write output file to temp dir
[libc++] Remove static_assert from hash.cpp that fires unconditionally
[Clang][OpenMP] Fix mapping of arrays of structs with members with mappers (llvm#142511)
This builds upon llvm#101101 from @jyu2-git, which used compiler-generated
mappers when mapping an array-section of structs with members that have
user-defined default mappers.
Now we do the same when mapping arrays of structs.
[OpenACC][CIR] Add parallelism determ. to all acc.loops (llvm#143751)
PR llvm#143720 adds a requirement to the ACC dialect that every acc.loop
must have a seq, independent, or auto attribute for the 'default'
device_type. The standard has rules for how this can be intuited:
orphan/parallel/parallel loop: independent
kernels/kernels loop: auto
serial/serial loop: seq, unless there is a gang/worker/vector, at which
point it should be 'auto'.
This patch implements all of these rules as a 'cleanup' step on the IR
generation for combined/loop operations. Note that the test impact is
much less since I inadvertently have my 'operation' terminating curly
matching the end curly from 'attribute' instead of the front of the
line, so I've added sufficient tests to ensure I captured the above.
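The defaulting rules listed above can be sketched as a small decision function. This is only an illustration: the `Context` and `Mode` enums and the `implicitMode` name are hypothetical and do not correspond to the actual CIR or ACC dialect types. Combined constructs such as `parallel loop` are treated like their compute construct.

```cpp
#include <cassert>

// Hypothetical enums for illustration; the real dialect types differ.
enum class Context { Orphan, Parallel, Kernels, Serial };
enum class Mode { Seq, Independent, Auto };

// Picks the implicit parallelism determination for a loop that carries no
// explicit seq, independent, or auto clause.
inline Mode implicitMode(Context Ctx, bool HasGangWorkerOrVector) {
  switch (Ctx) {
  case Context::Orphan:
  case Context::Parallel:
    return Mode::Independent;
  case Context::Kernels:
    return Mode::Auto;
  case Context::Serial:
    // A serial loop is seq unless a gang/worker/vector clause is present.
    return HasGangWorkerOrVector ? Mode::Auto : Mode::Seq;
  }
  assert(false && "unhandled context");
  return Mode::Seq;
}
```

For example, `implicitMode(Context::Serial, true)` yields `Mode::Auto`, while the same loop without a gang, worker, or vector clause yields `Mode::Seq`.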
[bazel] Port fe7bf4b
[libc] Reduce direct use of errno in src/stdlib and src/__support tests. (llvm#143767)
those API have been migrated to return error in a struct instead.
strdup from <string.h> to use ErrnoCheckingTest harness.
[SystemZ][z/OS] Refactor AutoConvert.h to remove large MVS guard (llvm#143174)
This AutoConvert.h header frequently gets mislabeled as an unused
include because it is guarded by MVS internally and every usage is also
guarded. This refactors the change to remove this guard and instead make
these functions a noop on other non-z/OS platforms.
[acc] acc.loop verifier now requires parallelism determination flag (llvm#143720)
The OpenACC specification for `acc loop` describes that a loop's
parallelism determination mode is either auto, independent, or seq. The
rules are as follows.
construct with no auto or seq clause is treated as if it has the
independent clause when it is an orphaned loop construct or its parent
compute construct is a parallel construct.
compute construct is a kernels construct, a loop construct with no
independent or seq clause is treated as if it has the auto clause.
guaranteed to be parallel. Specifically noted in 2.9.7 auto clause: If
not, or if it is unable to make a determination, it must treat the auto
clause as if it is a seq clause, and it must ignore any gang, worker, or
vector clauses on the loop construct.
The verifier for `acc.loop` was updated to enforce this marking because
the context in which a loop appears is not trivially determined once IR
transformations begin. For example, orphaned loops are implicitly
`independent`, but after inlining into an `acc.kernels` region they
would be implicitly considered `auto`. Thus now the verifier requires
that a frontend specifically generates acc dialect with this marking
since it knows the context.
[NVPTX] Misc table-gen cleanup (NFC) (llvm#142877)
[VPlan] Always verify VPCanonicalIVPHIRecipe placement (NFC).
Loop regions are dissolved since dcef154, remove the
check for VerifyLate and corresponding TODO.
[SandboxVectorizer] Use llvm::find (NFC) (llvm#143724)
llvm::find allows us to pass a range.
[Format] Use llvm::min_element (NFC) (llvm#143725)
llvm::min_element allows us to pass a range.
[lld] Use std::tie to implement comparison operators (NFC) (llvm#143726)
std::tie facilitates lexicographical comparisons through std::tuple's
built-in operator< and operator>.
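A minimal sketch of the idiom, using a hypothetical `SectionKey` struct rather than any type from lld: `std::tie` builds tuples of references, and the tuple comparison is lexicographic over the fields in the order given.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical record for illustration; not a type from lld.
struct SectionKey {
  int Priority;
  std::string Name;
  unsigned Offset;
};

// Compares Priority first, then Name, then Offset.
inline bool operator<(const SectionKey &A, const SectionKey &B) {
  return std::tie(A.Priority, A.Name, A.Offset) <
         std::tie(B.Priority, B.Name, B.Offset);
}

inline bool operator>(const SectionKey &A, const SectionKey &B) {
  return B < A;
}

// Small usage check: the keys differ only in the last field.
inline void checkOrdering() {
  SectionKey A{1, "text", 0}, B{1, "text", 8};
  assert(A < B);
  assert(B > A);
}
```

Compared with a hand-written chain of `if (A.x != B.x) return A.x < B.x;` statements, the field order appears once per operand, which makes it harder to miss a field.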
[llvm] Use std::tie to implement operator< (NFC) (llvm#143728)
std::tie facilitates lexicographical comparisons through std::tuple's
built-in operator<.
[mlir] Simplify calls to *Map::{insert,try_emplace} (NFC) (llvm#143729)
This patch simplifies code by removing the values from
insert/try_emplace. Note that default values inserted by try_emplace
are immediately overridden in all these cases.
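The sketch below shows the same simplification with `std::map` standing in for the LLVM and MLIR map types, which is an assumption made to keep the example self-contained; `recordId` is a hypothetical helper. Calling `try_emplace` with only the key value-initializes the mapped value, so an explicit placeholder is redundant when it is overwritten right away.

```cpp
#include <cassert>
#include <map>
#include <string>

// Records Id under Name; returns false if Name was already present.
inline bool recordId(std::map<std::string, int> &Ids, const std::string &Name,
                     int Id) {
  // Before: Ids.try_emplace(Name, 0) with an explicit placeholder value.
  // After: omit the value; the mapped int is value-initialized to 0.
  auto [It, Inserted] = Ids.try_emplace(Name);
  if (Inserted) {
    assert(It->second == 0);
    It->second = Id; // The default is overwritten immediately.
  }
  return Inserted;
}
```

An existing entry is left untouched, so a second call with the same name returns `false` and keeps the first id.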
[llvm] Add a tool to check mustache compliance against the public spec (llvm#142813)
This is a CLI tool that tests the conformance of LLVM's mustache
implementation against the public Mustache spec, hosted at
https://github.com/mustache/spec. This is a revised version of the
patches in llvm#111487.
Co-authored-by: Peter Chou [email protected]
[SelectionDAG] Add ISD::VSELECT to SelectionDAG::canCreateUndefOrPoison. (llvm#143760)
[LV] Use GeneratedRTChecks to check if safety checks were added (NFC).
Directly check via GeneratedRTChecks if any checks have been added,
instead of needing to go through ILV. This simplifies the code and
enables further refactoring in follow-up patches.
[bazel] port 5dafe9d
[libc] Character converter skeleton class (llvm#143619)
Made CharacterConverter class skeleton
[lldb][RPC] Upstream LLDB to RPC conversion Python script (llvm#138028)
As part of upstreaming LLDB RPC, this commit adds a python script that
is used by LLDB RPC to modify the public lldb header files for use with
RPC.
https://discourse.llvm.org/t/rfc-upstreaming-lldb-rpc/85804
[flang] Don't duplicate hermetic module file dependencies (llvm#143605)
When emitting the modules on which a module depends under the
-fhermetic-module-files option, eliminate duplicates by name rather
than by symbol addresses. This way, when a dependent module is in the
symbol table more than once due to the use of a nested hermetic module,
it doesn't get emitted multiple times to the new module file.
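To illustrate why deduplicating by name differs from deduplicating by address, here is a small sketch. The `ModuleSymbol` struct and `uniqueByName` function are hypothetical and much simpler than flang's real symbol table; they only show that two distinct objects with the same name collapse to one entry.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a module symbol; not flang's actual type.
struct ModuleSymbol {
  std::string Name;
};

// Keeps the first dependency seen for each module name, so a module that
// appears as two distinct symbol objects is still emitted once.
inline std::vector<const ModuleSymbol *>
uniqueByName(const std::vector<const ModuleSymbol *> &Deps) {
  std::set<std::string> Seen;
  std::vector<const ModuleSymbol *> Out;
  for (const ModuleSymbol *M : Deps) {
    assert(M && "dependency must not be null");
    if (Seen.insert(M->Name).second)
      Out.push_back(M);
  }
  return Out;
}
```

A set keyed on the pointers would keep both copies of a module that was reached through a nested hermetic module file, since the two copies live at different addresses.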
[libc] Switched calls to inline_memcpy to __builtin_memcpy for wide char utilities (llvm#143011)
Switched calls to inline_memcpy to __builtin_memcpy for wide char
utilities
Removed unnecessary wctype_utils dependencies from the cmake file
[MLIR][Transform] apply_registered_op fixes: arg order & python options auto-conversion (llvm#143779)
[libc] Move libc_errno.h to libc/src/__support and make LIBC_ERRNO_MODE_SYSTEM to be header-only. (llvm#143187)
This is the first step in preparation for:
https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
[libc][obvious] Changed incorrect type (llvm#143780)
After changing mbstate_t to mbstate we forgot to change the
character_converter files to reflect it.
Co-authored-by: Sriya Pratipati [email protected]
[GlobalOpt] Bail out on non-ConstExprs in isSimpleEnoughtToCommit. (llvm#143400)
Bail out for non ConstantExpr constants in
isSimpleEnoughValueToCommitHelper to prevent crash for non-ConstantExpr
constants
PR: llvm#143400
[Clang][NFC] Move HeadingAndSpellings to avoid copying (llvm#143611)
Static analysis flagged that we could move HeadingAndSpellings and avoid
a copy of a large object.
[Clang] fix missing source location for errors in macro-expanded (llvm#143460)
Fixes llvm#143216
This patch fixes diagnostic locations for tokens from macro expansions.
Workaround MSVC Linker Issue when Cross-Compiling for ARM64EC (llvm#143659)
This MR presents a temporary workaround for the issue described at
llvm#143575. While an upstream MSVC bug is reported, it makes sense to
apply a workaround in LLVM code to quickly unblock anyone affected.
[Clang] [NFC] Move diagnostics emitting code from `DiagnosticIDs` into `DiagnosticsEngine` (llvm#143517)
It makes more sense for this functionality to be all in one place rather
than split up across two files; at least it caused me a bit of a headache
to try and find all places where we were actually forwarding the
diagnostic to the `DiagnosticConsumer`. Moreover, moving these functions
into `DiagnosticsEngine` simplifies the code quite a bit since we access
members of `DiagnosticsEngine` more frequently than those of
`DiagnosticIDs`. There was also a duplicated code snippet that I've
moved out into a new function.
[mlir] Fix ComposeExpandOfCollapseOp for dynamic case (llvm#142663)
Changes `findCollapsingReassociation` to return nullopt in all cases
where source shape has `>=2` dynamic dims. `expand(collapse)` can
reshape to any valid output shape but a collapse can only collapse
contiguous dimensions. When there are `>=2` dynamic dimensions it is
impossible to determine if it can be simplified to a collapse or if it
is performing a more advanced reassociation.
This problem was uncovered by llvm#137963
Signed-off-by: Ian Wood [email protected]
[LOH] Don't emit AdrpAddStr when register could be clobbered (llvm#142849)
llvm@b783aa8 added a check to ensure an `AdrpAddLdr` LOH isn't created
when there is an instruction between the `add` and `ldr`.
https://github.com/llvm/llvm-project/blob/50c5704dc000cc0af41a511aa44db03233edf0af/llvm/lib/Target/AArch64/AArch64CollectLOH.cpp#L419-L431
We need a similar check for `AdrpAddStr`. Although this technically
isn't implemented in LLD, it could be in the future.
https://github.com/llvm/llvm-project/blob/50c5704dc000cc0af41a511aa44db03233edf0af/lld/MachO/Arch/ARM64.cpp#L699-L702
[mlir][generate-test-checks] Do not emit the autogenerated note if it exists (llvm#143750)
Prior to this PR, the script removed the already existing autogenerated
note if we came across a line that was equal to the note. But the
default note is multiple lines, so there would never be a match.
Instead, check to see if the current line is a substring of the
autogenerated note.
Co-authored-by: Michael Maitland [email protected]
[mlir][generate-test-checks] Emit attributes with rest of CHECK lines (llvm#143759)
Prior to this patch, generating test checks in place put the ATTR
definitions at the very top of the file, above the RUN lines and
autogenerated note. All CHECK lines should be below the RUN lines and
autogenerated note.
This change ensures that the attribute definitions are emitted with the
rest of the CHECK lines.
Co-authored-by: Michael Maitland [email protected]
[ConstantFolding] Add folding for [de]interleave2, insert and extract (llvm#141301)
The change adds folding for 4 vector intrinsics: `interleave2`,
`deinterleave2`, `vector_extract` and `vector_insert`. For the last 2
intrinsics the change does not use `ShuffleVector` fold mechanism as
it's much simpler to construct result vector explicitly.
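For readers unfamiliar with the first two intrinsics, the sketch below models their element-wise behaviour on plain `std::vector<int>` values. It is an illustration of what a constant folder has to compute, not the ConstantFolding implementation; the function names mirror the intrinsics but are otherwise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// interleave2: Result[2*i] = A[i] and Result[2*i + 1] = B[i].
inline std::vector<int> interleave2(const std::vector<int> &A,
                                    const std::vector<int> &B) {
  assert(A.size() == B.size() && "operands must have equal length");
  std::vector<int> Result;
  Result.reserve(A.size() * 2);
  for (std::size_t I = 0; I < A.size(); ++I) {
    Result.push_back(A[I]);
    Result.push_back(B[I]);
  }
  return Result;
}

// deinterleave2: even-indexed elements go to the first result, odd-indexed
// elements to the second.
inline std::pair<std::vector<int>, std::vector<int>>
deinterleave2(const std::vector<int> &V) {
  assert(V.size() % 2 == 0 && "input must have an even length");
  std::vector<int> Even, Odd;
  for (std::size_t I = 0; I < V.size(); I += 2) {
    Even.push_back(V[I]);
    Odd.push_back(V[I + 1]);
  }
  return {Even, Odd};
}
```

Applying `deinterleave2` to the output of `interleave2` gives back the two original operands.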
[libc] Perform bitfield zero initialization wave-parallel (llvm#143607)
Summary:
We need to set the bitfield memory to zero because the system does not
guarantee zeroed out memory. Even if fresh pages are zero, the system
allows re-use so we would need a `kfd` level API to skip this step.
Because we can't, this patch updates the logic to perform the zero
initialization wave-parallel. This reduces the amount of time it takes
to allocate a fresh by up to a tenth.
This has the unfortunate side effect that