Skip to content

[AutoBump] Merge with fixes of f4943464 (Jan 18) (2) Needs onnx-mlir and torch-mlir bump #534

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1,226 commits into from
Jun 30, 2025

Conversation

jorickert
Copy link

No description provided.

frasercrmck and others added 30 commits January 27, 2025 16:37
By using the vector reduction buitins we can avoid scalarization.
Targets that don't support vector reductions will scalarize later on
anyway. The vector reduction builtins should be well-enough supported by
the middle-end to be a generic solution.

This produces conceptually equivalent code: all vector elements are
OR'd/AND'd together and the final scalar is bit-shifted and masked to
produce the final result.

The 'normalize' builtin uses 'all' so its code has similarly improved in
places.
When a hermetic module file is read, use a new scope to hold its
dependent modules so that they don't conflict with any modules in the
global scope.
…lvm#124283)

To finalise the "RemoveDIs" work removing debug intrinsics, we're
updating call sites that insert instructions to use iterators instead.
This set of changes are those where it's not immediately obvious that
just calling getIterator to fetch an iterator is correct, and one or two
places where more than one line needs to change.

Overall the same rule holds though: iterators generated for the start of
a block such as getFirstNonPHIIt need to be passed into insert/move
methods without being unwrapped/rewrapped, everything else can use
getIterator.
This one appears to have been omitted when other ATOMIC_xxx intrinsic
procedures were defined. There's already tests for it, but they
apparently work even when ATOMIC_ADD must be interpreted as an external
procedure with an implicit interface. Extend the tests with INTRINSIC
NONE(EXTERNAL, TYPE) statements to ensure that they require the
intrinsic interpretation.
The event variable in an EVENT POST/WAIT statement can be a coarray
reference, and need not be an entire coarray.

Variables and potential subobject components with EVENT_TYPE/LOCK_TYPE
must be coarrays, unless they are potential subobjects nested within
coarrays or pointers.
…1) (llvm#116836)

SVE2.2 introduces instructions with predicated forms with zeroing of
the inactive lanes. This allows in some cases to save a `movprfx` or
a `mov` instruction when emitting code for `_x` or `_z` variants of
intrinsics.

This patch adds support for emitting the zeroing forms of certain
`RBIT`, `REVB`, `REVH`, `REVW`, and `REVD` instructions.
When a character component reference is applied to a constant array of
derived type, ensure that the length of the resulting character array is
properly defined.

Fixes llvm#123362.
…iagnostic (llvm#124364)

... pointing out the previous declaration.
It is valid to jump to a CHANGE TEAM statement from anywhere in the
containing executable part, and valid to jump to an END TEAM statement
from within the construct.
…vm#123836)

The check for a coarray actual argument being passed to a procedure with
an implicit interface was incorrect, yielding false positives for
coindexed objects. Fix.
Catch and report multiple initializations of the same procedure pointer
rather than assuming that control wouldn't reach a given point in name
resolution in that case.

Fixes llvm#123538.
An assertion in module file generation didn't allow for a case that has
arisen in a test; remove it, extend commentary, and add a regression
test.

Fixes llvm#123534.
…3983)

When a character array constructor does not have an explicit type with a
constant length, the compiler can still fold it if all of its elements
are constants. These array constructors will have been wrapped up in the
internal %SET_LENGTH operation, which will determine the final length of
the folded value, so use the maximum length of the constant elements as
the length of the folded array constructor.

Fixes llvm#123766.
…4158)

The compiler was interpreting 'Q' as an exponent letter in a literal
real constant as meaning real(kind=10) on x86-64, which is the legacy
80387 80-bit extended precision floating-point type. It turns out that
'Q' means kind=16 with all other compilers, even for x86-64 targets.
Change to conform.
An assumed-length character dummy argument is interoperable only if it
is neither a pointer nor allocatable.
…4204)

When an allocatable or pointer was being associated as a storage
sequence with a dummy argument, the checks were using the actual storage
size of the allocatable or pointer's descriptor, not the size of the
storage that it references.

Fixes llvm#123807.
…ject lifecycle. (llvm#122783)

This moves the ownership of the threads that forward stdout/stderr to
the DAP object itself to ensure that the threads are joined and that the
forwarding is cleaned up when the DAP connection is disconnected.

This is part of a larger refactor to allow lldb-dap to run in a
listening mode and accept multiple connections.

This reverts the previous revert and now that the underlying Windows
issue was fixed by 3ea2b54.
…124208)

When ASYNCHRONOUS='NO' appears in a data transfer statement control item
list, don't crash if it isn't appropriate for the kind of I/O under way
(such as child I/O).

Fixes llvm#124135.
…lvm#124323)

GetShape() needed to be called with a FoldingContext in order to
properly construct an extent expression for the shape of an array
constructor whose elements (nested in an implied DO loop) were not
scalars.

Fixes llvm#124191.
A complex literal constant can have one BOZ component, since the type
and value of the literal can be determined by converting the BOZ value
to the type of the other component. But a complex literal constant with
two BOZ components doesn't have a well-defined type. The error message
was confusing in the case; emit a better one.

Fixes llvm#124201.
Previously, it called `::operator new` which may throw `std::bad_alloc`,
regardless of whether LLVM itself was built with exception handling, and
this can cause safety issues if outside code has destructors that will
call back into LLVM. Now we use `::operator new(..., nothrow)` and call
`llvm::report_bad_alloc_error` when allocation fails, which will abort
when LLVM is built without exceptions.

Ref: llvm#85281
…lvm#124233)

The LiveDebugValues pass and the instruction selector (which calls
salvageCopySSA) need to be consistent on what they consider a copy
instruction. With llvm#75184, the
definition of what a copy instruction is was narrowed for AArch64 to
exclude a w->x ORR and treat it as a zero-extend rather than a copy

However, to make sure LiveDebugValues still treats a w->x ORR as a copy,
the new function, isCopyLikeInstr was created. We need to make sure that
salvageCopySSA also calls that function.

This patch addresses this mismatch.
The patch adds the following intrinsics:

    bfloat16x8_t vcvt1_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    bfloat16x8_t vcvt1_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    bfloat16x8_t vcvt2_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    bfloat16x8_t vcvt2_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
    bfloat16x8_t vcvt1_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    bfloat16x8_t vcvt2_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
    float16x8_t vcvt1_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    float16x8_t vcvt1_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    float16x8_t vcvt2_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    float16x8_t vcvt2_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
    float16x8_t vcvt1_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    float16x8_t vcvt2_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
mfloat8x8_t vcvt_mf8_f32_fpm(float32x4_t vn, float32x4_t vm, fpm_t fpm)
mfloat8x16_t vcvt_high_mf8_f32_fpm(mfloat8x8_t vd, float32x4_t vn,
float32x4_t vm, fpm_t fpm)
    
mfloat8x8_t vcvt_mf8_f16_fpm(float16x4_t vn, float16x4_t vm, fpm_t fpm)
mfloat8x16_t vcvtq_mf8_f16_fpm(float16x8_t vn, float16x8_t vm, fpm_t
fpm)

Co-Authored-By: Caroline Concatto <[email protected]>
This patch adds the following pieces to the locale base API:
- __setlocale (for std::setlocale)
- __lconv_t (for std::lconv)
- _LIBCPP_FOO_MASK and _LIBCPP_LC_ALL

This should be sufficient to implement all of the platform-agnostic
localization support in libc++ without relying directly on any public
API names from the C library. This makes it possible to port libc++ to
platforms that don't provide the usual locale APIs.
Summary:
This introduces libc cache files and adds one for building the GPU
support. The cache files will set defaults for these arguments which can
be overridden if the user needs to. They also serve as documentation for
how the builid is expected to look.
If the preload condition is a constant, ExprBuilder::create returns an
integer of the native integer while an i1 is expected. Cast the result
to i1 if that happens.

Fixes llvm#123932
jorickert added 20 commits June 26, 2025 10:10
[AutoBump] Merge with fixes of eb206e9 (Jan 24) (21)
[AutoBump] Merge with b4e81fd (Jan 24) (20)
[AutoBump] Merge with fixes of 8388040 (Jan 23) (19)
[AutoBump] Merge with 08195f3 (Jan 23) (18)
[AutoBump] Merge with fixes of 7e622b6 (Jan 22) (17)
[AutoBump] Merge with 046b064 (Jan 20) (3) [Only tested LLVM]
[AutoBump] Merge with fixes of 5c6db8c (Jan 20) (4)
[AutoBump] Merge with a6bb8a7 (Jan 20) (5)
[AutoBump] Merge with fixes of 5ce271e (Jan 20) (8)
[AutoBump] Merge with fixes of 7a77f14 (Jan 20) (6)
[AutoBump] Merge with 57466db (Jan 20) (9)
[AutoBump] Merge with d70f54f (Jan 20) (7)
[AutoBump] Merge with 2a8c12b (Jan 21) (11)
[AutoBump] Merge with bd56950 (Jan 22) (13)
[AutoBump] Merge with 3057d0f (Jan 22) (16)
[AutoBump] Merge with fixes of 7986e0c (Jan 22) (15)
[AutoBump] Merge with fixes of 977d744 (Jan 20) (10) (May need downstream changes)
[AutoBump] Merge with fixes of 67b9d3f (Jan 21) (12)
[AutoBump] Merge with fixes of 729f958 (Jan 22) (14)
[AutoBump] Merge with 5a8fe9e (Jan 28) (22) [Torch-mlir sync point]
@jorickert jorickert changed the title [AutoBump] Merge with fixes of f4943464 (Jan 18) (2) Needs onnx-mlir bump [AutoBump] Merge with fixes of f4943464 (Jan 18) (2) Needs onnx-mlir and torch-mlir bump Jun 26, 2025
The `VectorTransformsOptions` on the `ConvertVectorToLLVMPass` is
currently represented as a struct, which makes it not serialisable. This
means a pass pipeline that contains this pass cannot be represented as
textual form, which breaks reproducer generation and options such as
`--dump-pass-pipeline`.

This PR expands the `VectorTransformsOptions` struct into the two
options that are actually used by the Pass' patterns:
`vector-contract-lowering` and `vector-transpose-lowering` . The other
options present in VectorTransformOptions are not used by any patterns
in this pass.

Additionally, I have changed some interfaces to only take these specific
options over the full options struct as, again, the vector contract and
transpose lowering patterns only need one of their respective options.

Finally, I have added a simple lit test that just prints the pass
pipeline using `--dump-pass-pipeline` to ensure the options on this pass
remain serialisable.

Fixes llvm#129046
@jorickert jorickert merged commit 42131ee into feature/fused-ops Jun 30, 2025
35 of 36 checks passed
@jorickert jorickert deleted the bump_to_f4943464 branch June 30, 2025 13:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.