[AutoBump] Merge with fixes of f4943464 (Jan 18) (2) Needs onnx-mlir and torch-mlir bump #534

jorickert · 2025-04-25T09:59:04Z

No description provided.

By using the vector reduction builtins we can avoid scalarization. Targets that don't support vector reductions will scalarize later on anyway. The vector reduction builtins should be well-enough supported by the middle-end to be a generic solution. This produces conceptually equivalent code: all vector elements are OR'd/AND'd together and the final scalar is bit-shifted and masked to produce the final result. The 'normalize' builtin uses 'all' so its code has similarly improved in places.
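The reduce-then-shift-and-mask shape described above can be sketched in plain C++. This is only an illustration, not the actual libclc source: the `Int4` type and the `reduce_or`, `reduce_and`, `any_lane`, and `all_lanes` names are hypothetical, and the sketch assumes the OpenCL-style convention that `any`/`all` test the most significant bit of each 32-bit lane.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4-lane vector of 32-bit integers (stand-in for a vector type).
using Int4 = std::array<std::int32_t, 4>;

// OR all lanes together into one scalar, as a reduce-or builtin would.
inline std::uint32_t reduce_or(const Int4 &v) {
  std::uint32_t acc = 0u;
  for (std::int32_t lane : v)
    acc |= static_cast<std::uint32_t>(lane);
  return acc;
}

// AND all lanes together into one scalar, as a reduce-and builtin would.
inline std::uint32_t reduce_and(const Int4 &v) {
  std::uint32_t acc = ~0u;
  for (std::int32_t lane : v)
    acc &= static_cast<std::uint32_t>(lane);
  return acc;
}

// Assumed semantics: 1 if the sign bit of at least one lane is set.
inline int any_lane(const Int4 &v) {
  return static_cast<int>((reduce_or(v) >> 31) & 1u);
}

// Assumed semantics: 1 if the sign bit of every lane is set.
inline int all_lanes(const Int4 &v) {
  return static_cast<int>((reduce_and(v) >> 31) & 1u);
}
```

The per-lane work collapses into a single reduction, so the shift and mask are applied once to the scalar rather than once per element.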

When a hermetic module file is read, use a new scope to hold its dependent modules so that they don't conflict with any modules in the global scope.

…lvm#124283) To finalise the "RemoveDIs" work removing debug intrinsics, we're updating call sites that insert instructions to use iterators instead. These changes cover the call sites where it's not immediately obvious that just calling getIterator to fetch an iterator is correct, plus one or two places where more than one line needs to change. Overall the same rule holds though: iterators generated for the start of a block, such as getFirstNonPHIIt, need to be passed into insert/move methods without being unwrapped/rewrapped; everything else can use getIterator.

This one appears to have been omitted when other ATOMIC_xxx intrinsic procedures were defined. There are already tests for it, but they apparently work even when ATOMIC_ADD must be interpreted as an external procedure with an implicit interface. Extend the tests with IMPLICIT NONE(EXTERNAL, TYPE) statements to ensure that they require the intrinsic interpretation.

The event variable in an EVENT POST/WAIT statement can be a coarray reference, and need not be an entire coarray. Variables and potential subobject components with EVENT_TYPE/LOCK_TYPE must be coarrays, unless they are potential subobjects nested within coarrays or pointers.

…1) (llvm#116836) SVE2.2 introduces instructions with predicated forms that zero the inactive lanes. In some cases this saves a `movprfx` or a `mov` instruction when emitting code for the `_x` or `_z` variants of intrinsics. This patch adds support for emitting the zeroing forms of certain `RBIT`, `REVB`, `REVH`, `REVW`, and `REVD` instructions.

When a character component reference is applied to a constant array of derived type, ensure that the length of the resulting character array is properly defined. Fixes llvm#123362.

…iagnostic (llvm#124364) ... pointing out the previous declaration.

It is valid to jump to a CHANGE TEAM statement from anywhere in the containing executable part, and valid to jump to an END TEAM statement from within the construct.

…vm#123836) The check for a coarray actual argument being passed to a procedure with an implicit interface was incorrect, yielding false positives for coindexed objects. Fix.

Catch and report multiple initializations of the same procedure pointer rather than assuming that control wouldn't reach a given point in name resolution in that case. Fixes llvm#123538.

An assertion in module file generation didn't allow for a case that has arisen in a test; remove it, extend commentary, and add a regression test. Fixes llvm#123534.

…3983) When a character array constructor does not have an explicit type with a constant length, the compiler can still fold it if all of its elements are constants. These array constructors will have been wrapped up in the internal %SET_LENGTH operation, which will determine the final length of the folded value, so use the maximum length of the constant elements as the length of the folded array constructor. Fixes llvm#123766.
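The length rule described above can be illustrated with a small C++ sketch. It is not flang's folding code: `fold_char_array` is a hypothetical helper, and blank-padding shorter elements to the common length is an assumption borrowed from ordinary Fortran character semantics.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of folding a character array constructor whose
// elements are all constants but may differ in length.
inline std::vector<std::string>
fold_char_array(const std::vector<std::string> &elements) {
  // The folded array's length is the maximum length of its constant elements.
  std::size_t length = 0;
  for (const std::string &e : elements)
    length = std::max(length, e.size());

  // Assumption: shorter elements are padded with blanks to that length.
  std::vector<std::string> folded;
  folded.reserve(elements.size());
  for (std::string e : elements) {
    e.resize(length, ' ');
    folded.push_back(e);
  }
  return folded;
}
```

In the compiler itself the enclosing %SET_LENGTH operation still decides the final length; the sketch only shows how a common length can be derived from constant elements.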

…4158) The compiler was interpreting 'Q' as an exponent letter in a literal real constant as meaning real(kind=10) on x86-64, which is the legacy 80387 80-bit extended precision floating-point type. It turns out that 'Q' means kind=16 with all other compilers, even for x86-64 targets. Change to conform.

An assumed-length character dummy argument is interoperable only if it is neither a pointer nor allocatable.

…4204) When an allocatable or pointer was being associated as a storage sequence with a dummy argument, the checks were using the actual storage size of the allocatable or pointer's descriptor, not the size of the storage that it references. Fixes llvm#123807.

…ject lifecycle. (llvm#122783) This moves the ownership of the threads that forward stdout/stderr to the DAP object itself to ensure that the threads are joined and that the forwarding is cleaned up when the DAP connection is disconnected. This is part of a larger refactor to allow lldb-dap to run in a listening mode and accept multiple connections. This reverts the previous revert, now that the underlying Windows issue has been fixed by 3ea2b54.

…124208) When ASYNCHRONOUS='NO' appears in a data transfer statement control item list, don't crash if it isn't appropriate for the kind of I/O under way (such as child I/O). Fixes llvm#124135.

…lvm#124323) GetShape() needed to be called with a FoldingContext in order to properly construct an extent expression for the shape of an array constructor whose elements (nested in an implied DO loop) were not scalars. Fixes llvm#124191.

A complex literal constant can have one BOZ component, since the type and value of the literal can be determined by converting the BOZ value to the type of the other component. But a complex literal constant with two BOZ components doesn't have a well-defined type. The error message was confusing in that case; emit a better one. Fixes llvm#124201.

Previously, it called `::operator new` which may throw `std::bad_alloc`, regardless of whether LLVM itself was built with exception handling, and this can cause safety issues if outside code has destructors that will call back into LLVM. Now we use `::operator new(..., nothrow)` and call `llvm::report_bad_alloc_error` when allocation fails, which will abort when LLVM is built without exceptions. Ref: llvm#85281
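A minimal sketch of the non-throwing allocation pattern described above, using only the standard library. `report_alloc_failure`, `allocate_or_abort`, and `deallocate` are hypothetical names; `report_alloc_failure` merely stands in for `llvm::report_bad_alloc_error` and unconditionally aborts, which is a simplification.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a fatal out-of-memory handler.
[[noreturn]] inline void report_alloc_failure(const char *reason) {
  std::fprintf(stderr, "fatal error: %s\n", reason);
  std::abort();
}

// Allocate without throwing std::bad_alloc; a failed allocation is
// routed to the handler instead of unwinding through callers.
inline void *allocate_or_abort(std::size_t size) {
  void *ptr = ::operator new(size, std::nothrow);
  if (ptr == nullptr)
    report_alloc_failure("allocation failed");
  return ptr;
}

// Memory obtained from the nothrow form is released with plain operator delete.
inline void deallocate(void *ptr) noexcept { ::operator delete(ptr); }
```

Because the failure path never throws, destructors in outside code are not run during unwinding and so cannot call back into the library.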

…lvm#124233) The LiveDebugValues pass and the instruction selector (which calls salvageCopySSA) need to be consistent on what they consider a copy instruction. With llvm#75184, the definition of a copy instruction was narrowed for AArch64 to exclude a w->x ORR and treat it as a zero-extend rather than a copy. However, to make sure LiveDebugValues still treats a w->x ORR as a copy, the new function isCopyLikeInstr was created. We need to make sure that salvageCopySSA also calls that function. This patch addresses this mismatch.

…4438) Fixes llvm#123144.

The patch adds the following intrinsics:

- bfloat16x8_t vcvt1_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
- bfloat16x8_t vcvt1_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
- bfloat16x8_t vcvt2_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
- bfloat16x8_t vcvt2_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
- bfloat16x8_t vcvt1_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
- bfloat16x8_t vcvt2_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
- float16x8_t vcvt1_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
- float16x8_t vcvt1_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
- float16x8_t vcvt2_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
- float16x8_t vcvt2_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
- float16x8_t vcvt1_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
- float16x8_t vcvt2_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
- mfloat8x8_t vcvt_mf8_f32_fpm(float32x4_t vn, float32x4_t vm, fpm_t fpm)
- mfloat8x16_t vcvt_high_mf8_f32_fpm(mfloat8x8_t vd, float32x4_t vn, float32x4_t vm, fpm_t fpm)
- mfloat8x8_t vcvt_mf8_f16_fpm(float16x4_t vn, float16x4_t vm, fpm_t fpm)
- mfloat8x16_t vcvtq_mf8_f16_fpm(float16x8_t vn, float16x8_t vm, fpm_t fpm)

Co-Authored-By: Caroline Concatto <[email protected]>

This patch adds the following pieces to the locale base API:

- __setlocale (for std::setlocale)
- __lconv_t (for std::lconv)
- _LIBCPP_FOO_MASK and _LIBCPP_LC_ALL

This should be sufficient to implement all of the platform-agnostic localization support in libc++ without relying directly on any public API names from the C library. This makes it possible to port libc++ to platforms that don't provide the usual locale APIs.

Summary: This introduces libc cache files and adds one for building the GPU support. The cache files set defaults for these arguments, which the user can override if needed. They also serve as documentation for how the build is expected to look.

If the preload condition is a constant, ExprBuilder::create returns an integer of the native integer while an i1 is expected. Cast the result to i1 if that happens. Fixes llvm#123932

This reverts commit 7986e0c. Reason for revert: This shape inference is not following the TOSA spec and is causing problems when lowering from ONNX to TOSA.

[AutoBump] Merge with fixes of eb206e9 (Jan 24) (21)

[AutoBump] Merge with b4e81fd (Jan 24) (20)

[AutoBump] Merge with fixes of 8388040 (Jan 23) (19)

[AutoBump] Merge with 08195f3 (Jan 23) (18)

[AutoBump] Merge with fixes of 7e622b6 (Jan 22) (17)

[AutoBump] Merge with 046b064 (Jan 20) (3) [Only tested LLVM]

[AutoBump] Merge with fixes of 5c6db8c (Jan 20) (4)

[AutoBump] Merge with a6bb8a7 (Jan 20) (5)

[AutoBump] Merge with fixes of 5ce271e (Jan 20) (8)

[AutoBump] Merge with fixes of 7a77f14 (Jan 20) (6)

[AutoBump] Merge with 57466db (Jan 20) (9)

[AutoBump] Merge with d70f54f (Jan 20) (7)

[AutoBump] Merge with 2a8c12b (Jan 21) (11)

[AutoBump] Merge with bd56950 (Jan 22) (13)

[AutoBump] Merge with 3057d0f (Jan 22) (16)

[AutoBump] Merge with fixes of 7986e0c (Jan 22) (15)

[AutoBump] Merge with fixes of 977d744 (Jan 20) (10) (May need downstream changes)

[AutoBump] Merge with fixes of 67b9d3f (Jan 21) (12)

[AutoBump] Merge with fixes of 729f958 (Jan 22) (14)

[AutoBump] Merge with 5a8fe9e (Jan 28) (22) [Torch-mlir sync point]

The `VectorTransformsOptions` on the `ConvertVectorToLLVMPass` is currently represented as a struct, which makes it not serialisable. This means a pass pipeline that contains this pass cannot be represented in textual form, which breaks reproducer generation and options such as `--dump-pass-pipeline`. This PR expands the `VectorTransformsOptions` struct into the two options that are actually used by the pass's patterns: `vector-contract-lowering` and `vector-transpose-lowering`. The other options present in `VectorTransformsOptions` are not used by any patterns in this pass. Additionally, I have changed some interfaces to take only these specific options rather than the full options struct since, again, the vector contract and transpose lowering patterns each need only their respective option. Finally, I have added a simple lit test that just prints the pass pipeline using `--dump-pass-pipeline` to ensure the options on this pass remain serialisable. Fixes llvm#129046
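To illustrate why flat options help, here is a small C++ sketch that renders a pass and its scalar options as text. It is not MLIR's implementation: `FlatOption` and `render_pass` are hypothetical, the `name{key=value ...}` layout is an assumption about the textual pipeline form, and any option values used with it are placeholders to check against the pass documentation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flat option: a name and a value that both print as text.
struct FlatOption {
  std::string name;
  std::string value;
};

// Render a pass and its options as "pass{name=value name=value}".
// Each option is a scalar with a textual spelling, so the whole pass
// round-trips through a string, which an opaque struct member cannot do.
inline std::string render_pass(const std::string &pass,
                               const std::vector<FlatOption> &options) {
  std::string out = pass;
  if (options.empty())
    return out;
  out += "{";
  for (std::size_t i = 0; i < options.size(); ++i) {
    if (i != 0)
      out += " ";
    out += options[i].name + "=" + options[i].value;
  }
  out += "}";
  return out;
}
```

With two flat options the pass prints as a single brace-delimited entry, which is the property the new lit test guards.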

frasercrmck and others added 30 commits January 27, 2025 16:37

[flang] Safer hermetic module file reading (llvm#121002)

038b42b

When a hermetic module file is read, use a new scope to hold its dependent modules so that they don't conflict with any modules in the global scope.

[flang] Fix failure to fold character array (llvm#123418)

73f9034

When a character component reference is applied to a constant array of derived type, ensure that the length of the resulting character array is properly defined. Fixes llvm#123362.

[clang][Sema][FMV] Add a note to the 'cannot become multiversioned' d…

73db9ee

…iagnostic (llvm#124364) ... pointing out the previous declaration.

[flang] Accept CHANGE TEAM/END TEAM as branch target (llvm#123822)

210e675

It is valid to jump to a CHANGE TEAM statement from anywhere in the containing executable part, and valid to jump to an END TEAM statement from within the construct.

[flang] Fix check for coarray actual passed to implicit interface (ll…

b16c989

…vm#123836) The check for a coarray actual argument being passed to a procedure with an implicit interface was incorrect, yielding false positives for coindexed objects. Fix.

[flang] Fix crash on erroneous program (llvm#123843)

3ac0078

Catch and report multiple initializations of the same procedure pointer rather than assuming that control wouldn't reach a given point in name resolution in that case. Fixes llvm#123538.

[flang] Fix crash in module file generation (llvm#123859)

f5ddb10

An assertion in module file generation didn't allow for a case that has arisen in a test; remove it, extend commentary, and add a regression test. Fixes llvm#123534.

[flang] Catch assumed-length interoperability error (llvm#124179)

c596aae

An assumed-length character dummy argument is interoperable only if it is neither a pointer nor allocatable.

[flang][runtime] Don't crash on ASYNCHRONOUS='NO' in child I/O (llvm#…

fee393e

…124208) When ASYNCHRONOUS='NO' appears in a data transfer statement control item list, don't crash if it isn't appropriate for the kind of I/O under way (such as child I/O). Fixes llvm#124135.

[X86] combineCMov - pull out repeated getValueType calls. NFC.

e7de603

[clang-format] Treat f<N | M>(a) as template function call (llvm#12…

1e89355

…4438) Fixes llvm#123144.

[Clang] fix test on 32 bits target after 561132e (llvm#124593)

19f0524

[bazel] Remove obsolete mlir-cpu-runner alias

658f850

[Polly] Ensure i1 preload condition

610e33a

If the preload condition is a constant, ExprBuilder::create returns an integer of the native integer while an i1 is expected. Cast the result to i1 if that happens. Fixes llvm#123932

jorickert added 5 commits May 21, 2025 04:02

[AutoBump] Merge with b4e81fd (Jan 24)

9dfe3ec

[AutoBump] Merge with fixes of eb206e9 (Jan 24)

f106a87

[AutoBump] Merge with 5a8fe9e (Jan 28)

4ed6347

Update conv_acc helper function with changes in torch-mlir

ec5f5e6

Revert "[TOSA] bug fix infer shape for slice (llvm#113497)"

590a988

This reverts commit 7986e0c. Reason for revert: This shape inference is not following the TOSA spec and is causing problems when lowering from ONNX to TOSA.

mgehre-amd approved these changes Jun 23, 2025

jorickert added 20 commits June 26, 2025 10:10

Merge pull request #559 from Xilinx/bump_to_eb206e9e

c48c70f

[AutoBump] Merge with fixes of eb206e9 (Jan 24) (21)

Merge pull request #558 from Xilinx/bump_to_b4e81fd1

19ec244

[AutoBump] Merge with b4e81fd (Jan 24) (20)

Merge pull request #557 from Xilinx/bump_to_8388040f

9a3a74a

[AutoBump] Merge with fixes of 8388040 (Jan 23) (19)

Merge pull request #556 from Xilinx/bump_to_08195f31

91fab1b

[AutoBump] Merge with 08195f3 (Jan 23) (18)

Merge pull request #555 from Xilinx/bump_to_7e622b61

789c55b

[AutoBump] Merge with fixes of 7e622b6 (Jan 22) (17)

Merge pull request #541 from Xilinx/bump_to_046b064d

cd2ef1c

[AutoBump] Merge with 046b064 (Jan 20) (3) [Only tested LLVM]

Merge pull request #542 from Xilinx/bump_to_5c6db8c9

d4dfd92

[AutoBump] Merge with fixes of 5c6db8c (Jan 20) (4)

Merge pull request #543 from Xilinx/bump_to_a6bb8a70

0123940

[AutoBump] Merge with a6bb8a7 (Jan 20) (5)

Merge pull request #546 from Xilinx/bump_to_5ce271ef

540639b

[AutoBump] Merge with fixes of 5ce271e (Jan 20) (8)

Merge pull request #544 from Xilinx/bump_to_7a77f14c

a40b4d1

[AutoBump] Merge with fixes of 7a77f14 (Jan 20) (6)

Merge pull request #547 from Xilinx/bump_to_57466db7

cd1fe65

[AutoBump] Merge with 57466db (Jan 20) (9)

Merge pull request #545 from Xilinx/bump_to_d70f54f2

4103212

[AutoBump] Merge with d70f54f (Jan 20) (7)

Merge pull request #549 from Xilinx/bump_to_2a8c12b2

7e72031

[AutoBump] Merge with 2a8c12b (Jan 21) (11)

Merge pull request #551 from Xilinx/bump_to_bd56950b

eddfddc

[AutoBump] Merge with bd56950 (Jan 22) (13)

Merge pull request #554 from Xilinx/bump_to_3057d0f1

da007dd

[AutoBump] Merge with 3057d0f (Jan 22) (16)

Merge pull request #553 from Xilinx/bump_to_7986e0ca

3c0cc53

[AutoBump] Merge with fixes of 7986e0c (Jan 22) (15)

Merge pull request #548 from Xilinx/bump_to_977d744b

815db3c

[AutoBump] Merge with fixes of 977d744 (Jan 20) (10) (May need downstream changes)

Merge pull request #550 from Xilinx/bump_to_67b9d3ff

0163da3

[AutoBump] Merge with fixes of 67b9d3f (Jan 21) (12)

Merge pull request #552 from Xilinx/bump_to_729f958c

6373ac4

[AutoBump] Merge with fixes of 729f958 (Jan 22) (14)

Merge pull request #560 from Xilinx/bump_to_5a8fe9e9

44f8844

[AutoBump] Merge with 5a8fe9e (Jan 28) (22) [Torch-mlir sync point]

jorickert changed the title ~~[AutoBump] Merge with fixes of f4943464 (Jan 18) (2) Needs onnx-mlir bump~~ [AutoBump] Merge with fixes of f4943464 (Jan 18) (2) Needs onnx-mlir and torch-mlir bump Jun 26, 2025

jorickert merged commit 42131ee into feature/fused-ops Jun 30, 2025
35 of 36 checks passed

jorickert deleted the bump_to_f4943464 branch June 30, 2025 13:35
