[AADWARF32] Allocate a register number for CPSR #334
The document says of CPSR, along with the VFP (and FPA) control
registers, "It is considered unlikely that these will be needed for
producing a stack back-trace in a debugger."
However, CPSR _can_ be required for producing a correct stack
backtrace. This occurs due to conditional return sequences, for
example
```
function:
PUSH {r4, r5, r6, lr}
SUB sp, sp, #64
// ... do stuff ...
CMP this, that // we will return early if they are equal
ADDEQ sp, sp, #64
POPEQ {r4, r5, r6, pc}
// ... now, if we didn't return, continue using our stack frame
```
In between ADDEQ and POPEQ, the state of the stack depends on the
flags in CPSR. If the Z flag is set, then the ADDEQ has happened, and
the POPEQ is about to; if Z is clear, neither one has happened. In
this example the function has no frame pointer, so the CFA is defined
as an offset from sp, and _what_ offset depends on whether we just
added 64 to sp.
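To make the dependency concrete, here is a minimal Python sketch of the CFA calculation at the point between ADDEQ and POPEQ. It assumes the 16 bytes pushed by the four-register PUSH and the 64-byte local area from the example above; the function name `cfa_between_addeq_and_popeq` is hypothetical and exists only for illustration.

```python
Z_FLAG = 1 << 30        # Z is bit 30 of CPSR

PUSHED_BYTES = 4 * 4    # PUSH {r4, r5, r6, lr}
LOCAL_BYTES = 64        # SUB sp, sp, #64


def cfa_between_addeq_and_popeq(sp, cpsr):
    """Return the CFA when stopped between ADDEQ and POPEQ."""
    if cpsr & Z_FLAG:
        # ADDEQ has executed: only the pushed registers remain below the CFA.
        return sp + PUSHED_BYTES
    # ADDEQ was skipped: the 64-byte local area is still allocated.
    return sp + PUSHED_BYTES + LOCAL_BYTES
```

In both cases the result is the value sp had on entry to the function; only the offset needed to get there differs.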
This style of conditional return has always been possible, since it
depends only on the earliest features of the Arm instruction set. The
accepted idiom in my experience has always been to write stack frame
information that is valid for one case but not the other. Introducing
a DWARF register number for CPSR makes it possible to write stack
frame information that is valid in both cases. For example, you could
write an expression along these lines, which uses the 4 bits at the
top of CPSR to decide which bit of the constant `bitmask` to test, and
then each Arm condition code is representable as a different bitmask:
```
DW_OP_breg13 #0 // fetch sp
DW_OP_constu #bitmask // bit mask for the particular condition
DW_OP_bregx #CPSR // fetch CPSR
DW_OP_const1u #28
DW_OP_shr // make (CPSR >> 28), just the NZCV bits
DW_OP_shr // shift bit mask right by the NZCV value
DW_OP_const1u #1
DW_OP_and // AND with 1 to isolate low bit
DW_OP_bra #label-offset-else // if nonzero, branch over the next constant
DW_OP_constu #offset2 // load one possible value to add to sp
DW_OP_skip #label-offset-end // and skip over the other constant
label-offset-else:
DW_OP_constu #offset1 // load the other value to add to sp
label-offset-end:
DW_OP_minus // add either offset2 or offset1 to sp
```
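The claim that each condition code maps to a different 16-bit mask can be checked with a small Python sketch. It assumes the standard CPSR layout (N, Z, C, V in bits 31 to 28) and the standard Arm condition definitions; the helper names `condition_bitmask` and `condition_holds` are hypothetical and not part of any tool or of this proposal.

```python
# Each predicate takes the four flags (n, z, c, v) as booleans.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z,
    "NE": lambda n, z, c, v: not z,
    "CS": lambda n, z, c, v: c,
    "CC": lambda n, z, c, v: not c,
    "MI": lambda n, z, c, v: n,
    "PL": lambda n, z, c, v: not n,
    "VS": lambda n, z, c, v: v,
    "VC": lambda n, z, c, v: not v,
    "HI": lambda n, z, c, v: c and not z,
    "LS": lambda n, z, c, v: (not c) or z,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: (not z) and n == v,
    "LE": lambda n, z, c, v: z or n != v,
}


def condition_bitmask(cond):
    """Build the mask whose bit k is set when cond passes for NZCV value k."""
    mask = 0
    for nzcv in range(16):
        n = bool(nzcv & 8)
        z = bool(nzcv & 4)
        c = bool(nzcv & 2)
        v = bool(nzcv & 1)
        if CONDITIONS[cond](n, z, c, v):
            mask |= 1 << nzcv
    return mask


def condition_holds(cond, cpsr):
    """Mirror the DWARF expression: (bitmask >> (CPSR >> 28)) & 1."""
    return ((condition_bitmask(cond) >> (cpsr >> 28)) & 1) == 1
```

Under these assumptions, `condition_bitmask("EQ")` is `0xF0F0`, which would be the `#bitmask` operand for the ADDEQ/POPEQ case.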
Of course, debuggers will take time to catch up. But a more
interesting use case for being able to precisely describe stack
situations like this is automatic static checkers for handwritten
assembly language with handwritten call frame directives, which verify
the semantics of the instructions against the call frame updates next
to them. A debugger might almost never stop at the difficult location
above, but a static checker will traverse it every time, and needs a
way to avoid getting confused.
walkerkd
left a comment
Last comment on the example code should probably be:
"subtract either offset2 or offset1 from sp"
|
Adding the register seems reasonable. We're not expecting to be adding large numbers of new registers in AArch32. Although out of scope for AADWARF32, I'm guessing you will need additional CFI directive(s) to construct the DWARF expression? |
|
Since DWARF 3 there have been the CFI instructions DW_CFA_expression, DW_CFA_val_expression and DW_CFA_def_cfa_expression, which can take a DWARF expression like the one mentioned in the original comment. However, the Linux .eh_frame specification currently only defines support for DW_CFA_expression (it uses the DWARF 2 specification as the base standard with some DWARF 3 extensions): https://refspecs.linuxbase.org/LSB_4.0.0/LSB-Core-generic/LSB-Core-generic/dwarfext.html#DWARFEHENCODING |
|
I think that's fine, because I don't know of any reason why It only comes up in But in |
|
I see no value for this in stack unwinding, since the CPSR isn't preserved over a call and, in general, there's no way of knowing what result a comparison will produce at some future time. A frame that was popped in a different way depending on the condition codes would be unusual in the extreme, and I don't think DWARF could describe that. On the other hand, this might be useful in some expression cases for debuggers, but perhaps we would need to be clear that it does not form part of the set of registers described by unwinding. |
I hadn't thought that far, but yes, I suppose so. I don't see any In fact even if you could write something using an ordinary expression syntax, along the lines of then it would still look pretty horrible! For practical use you'd want a custom Arm-specific directive that let you directly write a condition code like GT or LO and would construct the right expression. |
|
@rearnsha raises a valid point here in that as the CPSR is not caller preserved, any expression using the CPSR register would be unevaluatable in any but the top stack frames as the CPSR register location would be "undefined". |
|
Ah, yes, I see what you mean. Should I add some text along the lines of "CPSR is only for use in expressions, when the stack layout depends on it, and you aren't allowed to write a CFI rule to say where it's stored, because it never is"? Mind you, this is surely also very like r0–r3: those too aren't supposed to have CFI rules saying where the caller's version lives, but you can use them in CFI anyway, e.g. you could (if you really wanted to) make the CFA based on r3. |
|
It's correct that r0-r3 (and IP) can't be used in unwinding (with the AAPCS conventions); the same applies to d0-d7 and d16-d31. |
|
Surely the only ABI variants we need to care about for these purposes are ones specified in this collection of ABI documents itself. People making up their own variants outside this specification (if any) shouldn't be surprised when this spec doesn't allocate them the DWARF register numbers they need 🙂 So we can add a register number for FPSR if and when we introduce an ABI variant that specifies it as callee-saved. But one for CPSR is useful now, even though we don't specify it as callee-saved, because in the deepest stack frame it's possible for the stack layout to depend on it. I can't see how FPSR, or other status bits such as the MVE predicate register (or even the Q flag), could easily have the same property, because only the NZCV flags in CPSR can be used to conditionalize instructions that change the state of the stack (pops, pushes, modifications to sp). To do conditional stack modification based on any other flags, you'd have to transfer those flags into the CPSR, or test them in some other way that ends up in a conditional branch (like putting them into an integer register and doing a TBNZ), and then the stack layout is back to depending on PC and/or CPSR again. |
So at the point of the comparison, only the FPSR contains the correct value. |
|
But at the point where only the FPSR contains the result of the comparison, the stack layout doesn't depend on it. The stack layout is independent of the comparison result until after the Directly after the |
|
And when I reach The CPSR only tells me what it's going to be in one instruction time. But I don't need dwarf information to tell me that, I could work it out from the instruction and the current register set itself. So what's your point exactly? which could be compiled to (assuming now the compiler can describe |
No – not only that. At the point in between the If Z is set, then it was also set before the
In that case, how would you prefer that a situation like this be represented in |