
LLVM and SPIRV-LLVM-Translator pulldown (WW32 2025) #19716


Merged
merged 1,292 commits into from
Aug 8, 2025

Conversation

iclsrc
Collaborator

@iclsrc iclsrc commented Aug 5, 2025

ergawy and others added 30 commits July 14, 2025 12:18
…8597)

Extends reduction support for `do concurrent`, in particular, for
associating names. Consider the following input:
```fortran
subroutine dc_associate_reduce
  integer :: i
  real, allocatable, dimension(:) :: x

  associate(x_associate => x)
  do concurrent (i = 1:10) reduce(+: x_associate)
  end do
  end associate
end subroutine
```

The declaration of `x_associate` is emitted as follows:
```mlir
%13:2 = hlfir.declare %10(%12) {uniq_name = "...."} : (!fir.heap<!fir.array<?xf32>>, !fir.shapeshift<1>) -> (!fir.box<!fir.array<?xf32>>, !fir.heap<!fir.array<?xf32>>)
```
where the HLFIR base type is an array descriptor (i.e. the
allocatable/heap attribute is dropped as stipulated by the spec; section
11.1.3.3).

The problem here is that `declare_reduction` ops accept only reference
types. This restriction is already partially handled for
`fir::BaseBoxType`'s by allocating a stack slot for the descriptor and
storing the box in that stack allocation. We have to modify this a
little bit for `associate` since the HLFIR and FIR base types are
different (unlike most scenarios).
This PR adds a summary and synthetic children for `std::unique_ptr` from
MSVC's STL
([NatVis](https://github.com/microsoft/STL/blob/313964b78a8fd5a52e7965e13781f735bcce13c5/stl/debugger/STL.natvis#L285-L303)).

As with libc++, the deleter is only shown if it's non-empty.

Tested both the shared_ptr and unique_ptr tests on Windows.
Towards #24834.
Make the option visible, improve the help text, and add a release note.
At this time (immediately prior to llvm21 branching) we haven't
instrumented coroutine generation to identify the "key" instructions of
things like co_return and similar. This will lead to worse stepping
behaviours, as there won't be any key instruction for those lines.

This patch removes the key-instructions flag from the DISubprograms for
coroutines, which will cause AsmPrinter to use the "old" / existing
linetable stepping behaviour, avoiding a regression until we can
instrument these constructs.

(I'm going to post on discourse about whether this is a good idea or not
in a moment)
…(#148600)

Ran my python script from
llvm/llvm-project#97043 over the repo again and
there was 1 duplicate test-case that has been introduced since I last
did this.

This patch renames that test.
Update the Neoverse V2 scheduler to reflect the correct latencies, and
update the relevant mca tests to match.
Add a couple of patterns to generate the Xqciac QC_SHLADD shift left and
add immediate instruction.
Add the missing `spirv::ImageFetchOp` operation to the SPIR-V MLIR
dialect ODS with appropriate testing, including negative testing of the
verifiers.

Signed-off-by: Jack Frankland <[email protected]>
…#148552)

Fix a false positive warning which was introduced by #146234.
`f128` intrinsic functions from libm sometimes lower to `long double`
library calls when they instead need to be `f128` versions. Add a
generic test demonstrating current behavior.
When `EnumRec::isTyped()` is true, include the
`EnumValueRec::getTaggedType()` in the documentation.
In the transform dialect tutorial chapter 1, there were some errors that
prevented the example from running. This PR fixes them.

---------

Co-authored-by: Renato Golin <[email protected]>
Update test with all zero constant input values which get folded during
IR construction to actually use different input values, which require
materializing build vectors.
…tion (#148614)

This is done by other backends at the start of this function, for
example AArch64Disassembler::getInstruction. Not setting it means you
hit asserts in MCDisassembler::tryAddingSymbolicOperand and
MCDisassembler::tryAddingPcLoadReferenceComment when there is a
symbolizer set.

This happened to me while debugging a SystemZ program using LLDB.

As the only good way to hit this path is from C++, I've copied X86's
disassembler unit tests and added just enough to hit an assert if the
comment stream is not set.
Preserve the argument-clause for `warn-unused-result` when under clang::
scope.
We are not touching gnu:: scope for now as it's an error for GCC to have
that string. Personally I think it would be ok to relax it here too as
we are not introducing breakage to currently passing code, but feedback
is to go slowly about it.
Split out the calls to __builtin_verbose_trap into a separate header.
This is just a refactoring to make the code a bit more structured.
… (#148565)

If I understand correctly there was a point where we used to need this
before it was implied by Zvl*b.

Now that it is implied, and we use -mattr=+v in pretty much every test,
we can remove it.

In unroll-in-loop-vectorizer.ll we can force a VF of 1 instead by using
-force-vector-width=1, and in scalable-basics.ll the two RUN lines were
the same so I merged them.
Allows the sdiv->mul by constant combine to expand in the general case.
Previously this only occurred in the exact case. This is part of
the resolution to issue #118090.
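For readers unfamiliar with the transformation, here is a minimal sketch of how a general (non-exact) signed division by a constant can be rewritten as a multiply, using the well-known magic-number method from Hacker's Delight. It is an illustration only and not the code from this patch: the function name `sdiv7_by_mul`, the choice of divisor 7, and the 32-bit width are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): computes n / 7 for a 32-bit
// signed n without a divide instruction. The multiplier 0x92492493 and
// the shift of 2 are the published Hacker's Delight constants for d = 7.
inline std::int32_t sdiv7_by_mul(std::int32_t n) {
  // 0x92492493 reinterpreted as a signed 32-bit value.
  const std::int64_t magic = -1840700269;
  // High 32 bits of the 64-bit product. This relies on >> being an
  // arithmetic shift for negative values, which C++20 guarantees and
  // mainstream compilers implement for earlier standards as well.
  std::int64_t q = (magic * n) >> 32;
  // The multiplier wrapped to a negative value, so add n back to correct.
  q += n;
  // Post-shift by the magic shift amount.
  q >>= 2;
  // Add 1 for negative n so the result rounds toward zero, as '/' does.
  q += static_cast<std::uint32_t>(n) >> 31;
  return static_cast<std::int32_t>(q);
}
```

The exact case is simpler because the dividend is known to be a multiple of the divisor, so no rounding correction is needed. The general case needs the add-back and sign fix-up steps shown here.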
Our setup runs tests with bazel in such a way that the work tree is
read-only, which was causing this test to fail because it couldn't write
the .o file. This fixes the failure, which has existed since 15c3793
introduced the test.
Add `OffloadDeviceTest::getPlatformBackend()` and use it to skip event
tests which currently fail on AMDGPU due to:

```
OL_ERRC_UNIMPLEMENTED: synchronize event not implemented
```
This patch contains fixes for various nits mentioned in #147200:

- This patch removes the `bit.` prefix in the op mnemonic. The operation
names now directly correspond to the builtin function names except for
`bswap` which is represented by `cir.byte_swap` for more clarity.

- Since all bit operations are `SameOperandsAndResultType`, this patch
updates their assembly format and avoids spelling out the operand type
twice.
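As a reminder of what kind of builtin functions the bit operations above are named after, the sketch below wraps three GCC/Clang builtins in plain C++. The commit message only names `bswap` explicitly, so `__builtin_popcount` and `__builtin_clz` are assumed examples, and the wrapper names are hypothetical. The sketch makes no claim about the exact CIR op spellings beyond what the message states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrappers around GCC/Clang builtins. Each takes and
// returns plain integers; the sketch assumes a 32-bit unsigned int.

// Reverses the byte order of a 32-bit value.
inline std::uint32_t byte_swap32(std::uint32_t v) {
  return __builtin_bswap32(v);
}

// Counts the number of set bits.
inline int pop_count32(std::uint32_t v) {
  return __builtin_popcount(v);
}

// Counts leading zero bits. The builtin is undefined for v == 0,
// so callers must pass a non-zero value.
inline int count_leading_zeros32(std::uint32_t v) {
  return __builtin_clz(v);
}
```

For example, `byte_swap32(0x11223344u)` yields `0x44332211u`, which is the operation the message says CIR spells as `cir.byte_swap`.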
In #125921, the changes requested by P2372R3 were completed and tested
together with corresponding `chrono` types. But that PR didn't mention
P2372R3. The `__cpp_lib_format` FTM was even bumped by an earlier PR
#98275.

This PR confirms that P2372R3 was completed in LLVM 21 (together with P1361R2).
Closes #100043
To allow all C++ features in constexpr contexts we need to track
constexpr initializers of variables. The mentioned commit moved some
code to handle consteval better but we need the code where it used to be
since it is not only consteval that we care about.
@jsji
Contributor

jsji commented Aug 6, 2025

sycl-e2e is failing for CUDA in HandleVirtRegUse.

@intel/llvm-reviewers-cuda Can we get someone to have a look? Thanks!

Simple reproducer is:

bin/clang++  -Werror -fsycl -fsycl-targets=nvptx64-nvidia-cuda ../sycl/test-e2e/Basic/built-ins/math_raw_ptr.cpp -nogpulib

#11 0x00007f902051b929 llvm::LiveVariables::HandleVirtRegUse(llvm::Register, llvm::MachineBasicBlock*, llvm::MachineInstr&) 
#12 0x00007f902051c61c llvm::LiveVariables::runOnInstr(llvm::MachineInstr&, llvm::SmallVectorImpl<llvm::Register>&, unsigned int) 

@jsji
Contributor

jsji commented Aug 8, 2025

Never mind, I had a look myself and found the fix. :)

We may still need to keep CopyToReg even after folding uses into vector
loads, since the original register may be used in other blocks.

Partially reverts 1fdbe69
@jsji
Contributor

jsji commented Aug 8, 2025

This is ready for review.

Contributor

@sarnex sarnex left a comment

reviewed remaining changes

@sarnex
Contributor

sarnex commented Aug 8, 2025

/merge

@bb-sycl
Contributor

bb-sycl commented Aug 8, 2025

Fri 08 Aug 2025 02:27:51 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl
Contributor

bb-sycl commented Aug 8, 2025

Fri 08 Aug 2025 02:38:34 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.

@bb-sycl bb-sycl merged commit 4946b5d into sycl Aug 8, 2025
57 of 58 checks passed
Labels
disable-lint Skip linter check step and proceed with build jobs