Jit/NoJit builds #631

Open
@markshannon

Not all platforms will support a JIT compiler, and even for those that do, building without the JIT is useful for fast build time and for testing.

The optimizer design allows us to jump from tier1 code into optimized (tier2) code at arbitrary points, and to exit from tier2 code back to tier1 code, but it does so with calls. That is a problem for several reasons:

  • Calls are very lopsided. We can pass an arbitrary amount of data from caller to callee, but only one datum (the return value) comes back.
  • Within the optimized (tier2) code we are also going to want to jump from one piece of optimized code to others (e.g. trace stitching) which will blow the stack if we use calls.
  • Non-tail calls are expensive, at least compared to jumps, due to spilling.

So we need a transfer mechanism that allows us to pass as much information as we need, ideally in registers, and that won't blow the stack.

JIT build

For a JIT build, we can use a custom calling convention and use tail calls everywhere. We need this for the JIT itself, so it makes sense to build the interpreter to use the same conventions.
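
As a rough illustration of the tail-call style, here is a minimal sketch in which every instruction handler has the same signature, carries the interpreter state in its arguments, and finishes by tail-calling the next handler. The `Instr` type, the opcodes, and the `MUSTTAIL` macro are hypothetical names for this sketch, not CPython's. The custom calling convention is left out entirely, and the tail call is only guaranteed where the compiler supports the `musttail` attribute; elsewhere the macro expands to nothing and the call is merely eligible for tail-call optimization.

```c
/* Guarantee the tail call where the compiler offers musttail. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

/* Hypothetical toy instruction format, not _Py_CODEUNIT. */
typedef struct { int opcode; int oparg; } Instr;
enum { OP_ADD, OP_RETURN };

/* All handlers share one signature, so state travels in arguments
   (ideally registers) rather than through a single return value. */
typedef int (*Handler)(const Instr *next_instr, int acc);

static int op_add(const Instr *next_instr, int acc);
static int op_return(const Instr *next_instr, int acc);

static const Handler handlers[] = { op_add, op_return };

static int
op_add(const Instr *next_instr, int acc)
{
    acc += next_instr->oparg;
    next_instr++;
    /* Transfer control to the next handler without growing the stack. */
    MUSTTAIL return handlers[next_instr->opcode](next_instr, acc);
}

static int
op_return(const Instr *next_instr, int acc)
{
    (void)next_instr;
    return acc;
}

/* Entry point: an ordinary call into the first handler. */
static int
run_tail(const Instr *code)
{
    return handlers[code->opcode](code, 0);
}
```

Running the program `{OP_ADD, 2}, {OP_ADD, 3}, {OP_RETURN, 0}` through `run_tail` chains three handlers and yields 5.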

Non-JIT build

For the non-JIT build, we should implement the tier 1 and tier 2 interpreters in a single giant function.
Transfers of control should be implemented as gotos, with information passed in (C) local variables.
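
To make the single-function idea concrete, here is a toy sketch, assuming made-up instruction formats (`T1Instr`, `T2Instr`) and opcodes that are not CPython's. Both interpreters are switch statements inside one function, transfers between them are plain gotos, and the shared state (`acc` and the two instruction pointers) lives in C locals.

```c
/* Hypothetical toy instruction formats; not CPython's real ones. */
typedef struct { int opcode; int oparg; } T1Instr;
typedef struct { int opcode; int operand; } T2Instr;

enum { T1_ADD, T1_ENTER_TIER2, T1_RETURN };
enum { T2_ADD, T2_EXIT_TO_TIER1 };

/* Both tiers live in one function, so every transfer is a goto and
   all shared state stays in local variables. */
static int
run(const T1Instr *t1, const T2Instr *t2)
{
    int acc = 0;

tier1_dispatch:
    switch (t1->opcode) {
    case T1_ADD:
        acc += t1->oparg;
        t1++;
        goto tier1_dispatch;
    case T1_ENTER_TIER2:
        t1++;                   /* tier 1 resumes after this instruction */
        goto tier2_dispatch;
    case T1_RETURN:
        return acc;
    }
    return -1;                  /* unknown tier 1 opcode */

tier2_dispatch:
    switch (t2->opcode) {
    case T2_ADD:
        acc += t2->operand;
        t2++;
        goto tier2_dispatch;
    case T2_EXIT_TO_TIER1:
        goto tier1_dispatch;    /* acc and t1 are still live locals */
    }
    return -1;                  /* unknown tier 2 opcode */
}
```

Because nothing is called, the stack depth stays constant however many times control bounces between the tiers.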

Types of transfer

  • Tier 1 dispatch
  • Tier 2 dispatch
  • Enter tier 2 from tier 1 for specialization (https://dl.acm.org/doi/10.1145/3617651.3622979) (not patchable)
  • Enter tier 2 from tier 1 when entering an executor (ENTER_EXECUTOR) (patchable)
  • Resuming tier 1 execution from tier 2
  • Jumping from one tier 2 executor to another (not patchable)
  • Jumping from one tier 2 executor to another (patchable)

Maybe others?

What does this look like in bytecodes.c?

My preferred approach would be that each of the above transfers is expressed as a macro-like expression that is understood by the code generator and replaced with the relevant C code. Using actual C macros tends to get confusing.

Implementing this in the interpreter.

Code examples assume no computed gotos. Those are left as an exercise for the reader 🙂

_Py_CODEUNIT *next_instr becomes union { _Py_CODEUNIT * tier1; PyUopInstruction *tier2; } next_instr

  • We already have a macro for tier 1 dispatch, DISPATCH(), although it is mostly implicit
  • Tier 2 dispatch is entirely implicit; I'm just listing it for completeness
  • Entering tier 2 from tier 1 for specialization is a jump: goto tier2_dispatch; tier2_dispatch: switch (next_instr.tier2->opcode) {
  • Enter tier 2 from tier 1 when entering an executor requires increfing the executor, then doing the above jump.
  • Resuming tier 1 from tier 2 is also a jump: goto tier1_dispatch; tier1_dispatch: switch (next_instr.tier1->op.code) {
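
The transfers listed above can be sketched with the `next_instr` union as follows. This is a toy under stated assumptions: `CodeUnit`, `UopInstruction` and `Executor` are simplified stand-ins for the real types, the opcodes are invented, and the `resume_at` local and the decref on exit are assumptions of this sketch rather than anything specified in this issue.

```c
typedef struct { int code; int arg; } CodeUnit;             /* stand-in for _Py_CODEUNIT */
typedef struct { int opcode; int operand; } UopInstruction; /* stand-in for PyUopInstruction */
typedef struct { int refcnt; const UopInstruction *trace; } Executor; /* hypothetical */

enum { LOAD, ENTER_EXECUTOR, RETURN_VALUE };  /* toy tier 1 opcodes */
enum { UOP_DOUBLE, UOP_EXIT };                /* toy tier 2 opcodes */

static int
run_with_union(const CodeUnit *code, Executor *executor)
{
    /* One variable holds whichever instruction pointer is active. */
    union {
        const CodeUnit *tier1;
        const UopInstruction *tier2;
    } next_instr;
    const CodeUnit *resume_at = code;  /* assumption: where tier 1 continues */
    int acc = 0;

    next_instr.tier1 = code;

tier1_dispatch:
    switch (next_instr.tier1->code) {
    case LOAD:
        acc = next_instr.tier1->arg;
        next_instr.tier1++;
        goto tier1_dispatch;
    case ENTER_EXECUTOR:
        executor->refcnt++;            /* incref the executor, then jump */
        resume_at = next_instr.tier1 + 1;
        next_instr.tier2 = executor->trace;
        goto tier2_dispatch;
    case RETURN_VALUE:
        return acc;
    }
    return -1;                         /* unknown tier 1 opcode */

tier2_dispatch:
    switch (next_instr.tier2->opcode) {
    case UOP_DOUBLE:
        acc *= 2;
        next_instr.tier2++;
        goto tier2_dispatch;
    case UOP_EXIT:
        executor->refcnt--;            /* assumption: released when leaving */
        next_instr.tier1 = resume_at;
        goto tier1_dispatch;
    }
    return -1;                         /* unknown tier 2 opcode */
}
```

Since `next_instr` is a union of two pointers, the active member is selected with `.` and then dereferenced with `->`.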

Patchable jumps need to pass their own address to the next piece of code.
We can pass this in a register for JIT code; for the interpreter, we can pass it in memory.
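
One way to picture why a patchable jump must pass its own address is the sketch below. The `Trace`, `PatchableJump`, stub and `came_from` names are hypothetical, and the linking policy is an assumption of this sketch. The jump records its own address in an ordinary variable before transferring control. If the target turns out to be a stub, the stub uses that address to re-point the jump at the real trace, so later runs skip the stub.

```c
#include <stddef.h>

typedef struct Trace Trace;

/* A jump whose destination can be rewritten after the fact. */
typedef struct PatchableJump { const Trace *target; } PatchableJump;

enum { KIND_WORK, KIND_STUB };

struct Trace {
    int kind;
    int value;              /* KIND_WORK: added to the accumulator */
    PatchableJump *exit;    /* jump taken at the end, or NULL to stop */
    const Trace *patch_to;  /* KIND_STUB: trace the incoming jump should reach */
};

static int stub_hits;       /* counts how often a stub actually ran */

static int
run_linked(const Trace *trace)
{
    PatchableJump *came_from = NULL;  /* address of the jump just taken */
    int acc = 0;

enter_trace:
    if (trace->kind == KIND_STUB) {
        stub_hits++;
        if (came_from != NULL) {
            came_from->target = trace->patch_to;  /* patch the jump that led here */
        }
        trace = trace->patch_to;
        goto enter_trace;
    }
    acc += trace->value;
    if (trace->exit == NULL) {
        return acc;
    }
    came_from = trace->exit;          /* pass the jump's own address along */
    trace = came_from->target;
    goto enter_trace;
}
```

In this sketch the first run through a stubbed jump pays for the patching once; subsequent runs go directly to the linked trace.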
