[ELF][LTO] Add baseline test for invalid relocations against runtime calls #127286

@arichardson arichardson commented Feb 15, 2025

This can happen when compiler-rt for ARM is built with LTO and the
program uses 64-bit division.
The 64-bit division function in compiler-rt (__aeabi_ldivmod) is written
in assembly and calls the C function __divmoddi4, which works fine in
non-LTO links. However, when linking with LTO, the call inside
__aeabi_ldivmod is resolved to a branch to address zero, so the program
crashes as soon as it performs a 64-bit division.

Linking with -pie produces an error instead of a branch to address zero.
Surprisingly, merely declaring the __aeabi_ldivmod function (without
calling it) in the input IR also avoids this issue.
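
For readers less familiar with the libcall involved, here is a minimal C sketch of the situation described above. It is illustrative only and not part of the patch: `div_plus_two` mirrors the `_start` function in the test's `main-implicit.ll`, and `model_divmoddi4` is a hypothetical host-side model of the `__divmoddi4` contract (quotient returned, remainder stored through a pointer), not the compiler-rt implementation. The assumption is that a 32-bit ARM EABI backend lowers the `/` to a call to `__aeabi_ldivmod` during instruction selection, as the description says.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the __divmoddi4 contract: return the quotient and
 * store the remainder through `rem`. This is NOT the compiler-rt code. */
static int64_t model_divmoddi4(int64_t a, int64_t b, int64_t *rem) {
    assert(b != 0); /* division by zero is undefined behaviour */
    *rem = a % b;
    return a / b;
}

/* Mirrors _start in main-implicit.ll: a plain 64-bit signed division.
 * No libcall appears in the source or in the IR; on a 32-bit ARM EABI
 * target the call to __aeabi_ldivmod is assumed to be introduced later,
 * by the backend. */
int64_t div_plus_two(int64_t num, int64_t denom) {
    return num / denom + 2;
}
```

Because the source never names `__aeabi_ldivmod`, the linker only learns that `aeabi_ldivmod.o` is needed after LTO code generation, which is the ordering problem this test records.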

Reported as #127284

Co-authored-by: Fangrui Song [email protected]

Created using spr 1.3.6-beta.1
llvmbot commented Feb 15, 2025

@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lld

Author: Alexander Richardson (arichardson)


Full diff: https://github.com/llvm/llvm-project/pull/127286.diff

1 Files Affected:

  • (added) lld/test/ELF/lto/arm-rtlibcall.ll (+122)
diff --git a/lld/test/ELF/lto/arm-rtlibcall.ll b/lld/test/ELF/lto/arm-rtlibcall.ll
new file mode 100644
index 0000000000000..89b0a1eb0e90e
--- /dev/null
+++ b/lld/test/ELF/lto/arm-rtlibcall.ll
@@ -0,0 +1,122 @@
+; REQUIRES: arm
+;; https://github.com/llvm/llvm-project/issues/127284
+;; Test for LTO optimizing out references to symbols that are pulled in by
+;; compiler-generated libcalls (post LTO).
+;; The problem here is that the call to __aeabi_ldivmod is generated post-LTO,
+;; during ISel, so aeabi_ldivmod.o is only marked as required afterwards. By
+;; that time we have already decided that the callees of __aeabi_ldivmod are
+;; not needed and have marked them as absolute symbols with value zero.
+; RUN: rm -rf %t && split-file %s %t && cd %t
+; RUN: llvm-as divmoddi4.ll -o divmoddi4.bc
+; RUN: llvm-mc -filetype=obj -triple=armv7-none-unknown-eabi aeabi_ldivmod.s -o aeabi_ldivmod.o
+;; With an explicit __aeabi_ldivmod call in the input IR this works as expected:
+; RUN: llvm-as main-explicit.ll -o main-explicit-ldivmod.bc
+; RUN: ld.lld main-explicit-ldivmod.bc --start-lib aeabi_ldivmod.o divmoddi4.bc --end-lib -o test.exe -Bstatic
+; RUN: llvm-objdump -d -r -t test.exe | FileCheck %s --check-prefix=GOOD-DUMP
+; GOOD-DUMP-LABEL: SYMBOL TABLE:
+; GOOD-DUMP: [[#]] g     F .text	[[#]] _start
+; GOOD-DUMP: [[#]] g     F .text	00000024 __aeabi_ldivmod
+; GOOD-DUMP: [[#]] g     F .text	[[#]] __divmoddi4
+; GOOD-DUMP-LABEL: <__aeabi_ldivmod>:
+; GOOD-DUMP:       bl	0x20140 <__divmoddi4>   @ imm = #0x28
+
+;; But if the call is generated by ISel, we end up with an invalid reference:
+; RUN: llvm-as main-implicit.ll -o main-implicit-ldivmod.bc
+; RUN: ld.lld main-implicit-ldivmod.bc --start-lib aeabi_ldivmod.o divmoddi4.bc --end-lib -o test.exe -Bstatic
+; RUN: llvm-objdump -d -r -t test.exe | FileCheck %s --check-prefix=BAD-DUMP
+;; We jump to address zero here and __divmoddi4 ends up being an absolute symbol:
+; BAD-DUMP-LABEL: SYMBOL TABLE:
+; BAD-DUMP: [[#]] g     F .text	[[#]] _start
+; BAD-DUMP: [[#]] g     F .text	00000024 __aeabi_ldivmod
+; BAD-DUMP: [[#]] g       *ABS*	00000000 __divmoddi4
+; BAD-DUMP-LABEL: <__aeabi_ldivmod>:
+; BAD-DUMP:       bl	0x0 <__divmoddi4>   @ imm = #-0x200fc
+;; Linking with -pie complains about the invalid relocation (and even points back to the source files)
+; RUN: not ld.lld main-implicit-ldivmod.bc --start-lib aeabi_ldivmod.o divmoddi4.bc --end-lib -o test.exe --no-undefined -pie --no-dynamic-linker 2>&1 | FileCheck %s --check-prefix=PIE-ERROR
+; PIE-ERROR: ld.lld: error: relocation R_ARM_CALL cannot refer to absolute symbol: __divmoddi4
+; PIE-ERROR-NEXT: >>> defined in divmoddi4.bc
+; PIE-ERROR-NEXT: >>> referenced by aeabi_ldivmod.o:(__aeabi_ldivmod)
+
+;; Interestingly, just declaring __aeabi_ldivmod is sufficient to not run into this issue.
+; RUN: llvm-as main-declared.ll -o main-declared-ldivmod.bc
+; RUN: ld.lld main-declared-ldivmod.bc --start-lib aeabi_ldivmod.o divmoddi4.bc --end-lib -o test.exe -Bstatic
+; RUN: llvm-objdump -d -r -t test.exe | FileCheck %s --check-prefix=GOOD-DUMP
+
+;--- divmoddi4.ll
+target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "armv7-none-unknown-eabi"
+
+; Adding it to llvm.used does not appear to have any effect!
+; @llvm.used = appending global [1 x ptr] [ptr @__divmoddi4], section "llvm.metadata"
+
+; Stub version of the real __divmoddi4
+define i64 @__divmoddi4(i64 %a, i64 %b, ptr writeonly %rem) #0 align 32 {
+entry:
+  %sub = sub i64 %a, %b
+  store i64 0, ptr %rem, align 8
+  ret i64 %sub
+}
+
+attributes #0 = { mustprogress nofree noinline norecurse nosync nounwind willreturn memory(argmem: write) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a15" }
+
+;--- aeabi_ldivmod.s
+.syntax unified
+.p2align 2
+.arm
+.globl __aeabi_ldivmod
+.type __aeabi_ldivmod,%function
+__aeabi_ldivmod:
+        push {r6, lr}
+        sub sp, sp, #16
+        add r6, sp, #8
+        str r6, [sp]
+        bl __divmoddi4
+        ldr r2, [sp, #8]
+        ldr r3, [sp, #12]
+        add sp, sp, #16
+        pop {r6, pc}
+.size __aeabi_ldivmod, . - __aeabi_ldivmod
+
+;--- main-implicit.ll
+target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "armv7-none-unknown-eabi"
+
+define dso_local i64 @_start(i64 %num, i64 %denom) local_unnamed_addr #0 {
+entry:
+  %div = sdiv i64 %num, %denom
+  %ret = add i64 %div, 2
+  ret i64 %ret
+}
+
+attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a15" }
+
+;--- main-explicit.ll
+target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "armv7-none-unknown-eabi"
+
+declare { i64, i64 } @__aeabi_ldivmod(i64, i64)
+
+define dso_local noundef i64 @_start(i64 noundef %num, i64 noundef %denom) local_unnamed_addr #0 {
+entry:
+  %quotrem = call { i64, i64 } @__aeabi_ldivmod(i64 %num, i64 %denom)
+  %div = extractvalue { i64, i64 } %quotrem, 0
+  %ret = add i64 %div, 2
+  ret i64 %ret
+}
+
+attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a15" }
+
+;--- main-declared.ll
+target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "armv7-none-unknown-eabi"
+
+declare { i64, i64 } @__aeabi_ldivmod(i64, i64)
+
+define dso_local i64 @_start(i64 %num, i64 %denom) local_unnamed_addr #0 {
+entry:
+  %div = sdiv i64 %num, %denom
+  %ret = add i64 %div, 2
+  ret i64 %ret
+}
+
+attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a15" }
\ No newline at end of file

@arichardson
Member Author

Thanks for all the suggestions, updated now.

@arichardson arichardson merged commit 7f275e0 into main Feb 18, 2025
8 checks passed
@arichardson arichardson deleted the users/arichardson/spr/elflto-add-baseline-test-for-invalid-relocations-against-runtime-calls branch February 18, 2025 20:00
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 18, 2025

Reported as llvm/llvm-project#127284

Co-authored-by: Fangrui Song <[email protected]>

Reviewed By: MaskRay

Pull Request: llvm/llvm-project#127286
wldfngrs pushed a commit to wldfngrs/llvm-project that referenced this pull request Feb 19, 2025