Skip to content

[RISCV][WIP] Branch to Absolute Address #133555

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

lenary
Copy link
Member

@lenary lenary commented Mar 29, 2025

Today, if you feed e.g. beq a0, a1, 60 into LLVM's assembler, you get back something like this: (i.e., the 60 is directly used in the encoding).

00000000 <.text>:
       0: <encoding> beq a0, a1, 60 <.text+60>

If you feed the same into binutils, you get back something like this (i.e., the 60 is treated like an absolute address):

00000000 <.text>:
       0: <encoding> bne a0, a1, 8 <.text+0x8>
       4: <encoding> j 4 <.text+0x4>
             R_RISCV_JAL *ABS*+0x3c

Craig reminded me of this recently and Ana mentioned that there had been previous discussions about this, but did not recall the resolution.


There are drawbacks to how binutils works. It cannot really ever say a specific immediate is "wrong", because only the linker resolves these, whereas in LLVM we can catch some issues like that a bit earlier, but obviously we're catching issues in something subtly different.


This PR is a pile of hacks to make LLVM's assembler work like binutils. None of this code is particularly nice, but I hope some of it might be a good place to start discussing whether we want to make this change? (I didn't find a previous discussion of this in the issue tracker, but maybe I didn't look hard enough)

There are almost certainly nicer ways to do this, but this is the way I managed today. There are almost certainly some bugs here too, especially in the printing code.

These are some prospective hacks hacks to make `beq r1, r2, <addr>` work
as it does in binutils - that is, `<addr>` is treated as absolute,
rather than relative to the current instruction.

None of this code is particularly nice, but I hope some of it might be a
good place to start discussing whether we want to make this change?
@lenary lenary requested review from asb, MaskRay, topperc and wangpc-pp March 29, 2025 02:42
Copy link

github-actions bot commented Mar 29, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@asb
Copy link
Contributor

asb commented Mar 29, 2025

Craig reminded me of this recently and Ana mentioned that there had been previous discussions about this, but did not recall the resolution.

My recollection is there were no objections to making the change, but nobody was hugely motivated to do so. Looks like you've solved the latter problem!

IIRC there's inconsistency both within LLVM for branch to absolute address, and even across LLVM and binutils for the same targets. But matching binutils and updating tests to use . + for the old behaviour makes sense to me.

My expectation is that making the change shouldn't be disruptive externally because the main users who rely on the current behaviour would be us with the MC tests, and potentially other external test suites (e.g. using the assembler to help with instruction decoder tests). For the latter, for binutils compatibility they would have had to be using . + anyway. If anyone thinks the impact may be broader, please do speak up!

@MaskRay
Copy link
Member

MaskRay commented Mar 29, 2025

GNU Assembler's x86 and riscv ports treat call constant as branching to an absolute address, which makes sense to me. Other ports (including aarch64,powerpc,systemz) should ideally be fixed.

I agree that we have a pile of hacks to support the same mnemonic with different operand forms. Ideally they could be simplified.

% echo 'call 60; call 60' | ~/Dev/binutils-gdb/out/debug/gas/as-new - && fob -dr a.out

a.out:  file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
       0: e8 00 00 00 00                callq   0x5 <.text+0x5>
                0000000000000001:  R_X86_64_PC32        *ABS*+0x38
       5: e8 00 00 00 00                callq   0xa <.text+0xa>
                0000000000000006:  R_X86_64_PC32        *ABS*+0x38

% echo 'bl 60; bl 60' | ~/Dev/binutils-gdb/out/aarch64/gas/as-new - && ~/Dev/binutils-gdb/out/aarch64/binutils/objdump -dr

a.out:     file format elf64-littleaarch64


Disassembly of section .text:

0000000000000000 <.text>:
   0:   9400000f        bl      3c <.text+0x3c>
   4:   9400000f        bl      40 <.text+0x40>

% echo 'bl 60; bl 60' | ~/Dev/binutils-gdb/out/ppc64le/gas/as-new - && ~/Dev/binutils-gdb/out/ppc64le/binutils/objdump -dr

a.out:     file format elf64-powerpcle


Disassembly of section .text:

0000000000000000 <.text>:
   0:   3d 00 00 48     bl      0x3c
   4:   3d 00 00 48     bl      0x40

% echo 'brasl %r14, 60; brasl %r14, 60' | ~/Dev/binutils-gdb/out/s390x/gas/as-new - && ~/Dev/binutils-gdb/out/s390x/binutils/objdump -dr

a.out:     file format elf64-s390

Disassembly of section .text:

0000000000000000 <.text>:
   0:   c0 e5 00 00 00 1e       brasl   %r14,0x3c
   6:   c0 e5 00 00 00 1e       brasl   %r14,0x42

% echo 'call 60; call 60' | ~/Dev/binutils-gdb/out/riscv64/gas/as-new -mno-relax - && ~/Dev/binutils-gdb/out/riscv64/binutils/objdump -dr

a.out:     file format elf64-littleriscv


Disassembly of section .text:

0000000000000000 <.text>:
   0:   00000097                auipc   ra,0x0
                        0: R_RISCV_CALL_PLT     *ABS*+0x3c
   4:   000080e7                jalr    ra # 0 <.text>
   8:   00000097                auipc   ra,0x0
                        8: R_RISCV_CALL_PLT     *ABS*+0x3c
   c:   000080e7                jalr    ra # 8 <.text+0x8>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants