Jit/NoJit builds

Not all platforms will support a JIT compiler, and even for those that do, building without the JIT is useful for fast build time and for testing.

The [optimizer design](https://github.com/faster-cpython/ideas/discussions/380) allows us to jump from tier1 code into optimized (tier2) code at arbitrary points, and to exit from tier2 code back to tier1 code, but it does so with calls, which is a problem for several reasons:
* Calls are very lopsided. We can pass an arbitrary amount of data from caller to callee, but only one datum (the return value) back.
* Within the optimized (tier2) code we are also going to want to jump from one piece of optimized code to others (e.g. trace stitching), which will blow the stack if we use calls.
* Non-tail calls are expensive, at least compared to jumps, due to spilling.

So we need a transfer mechanism that allows us to pass as much information as we need, ideally in registers, and that won't blow the stack.

### JIT build

For a JIT build, we can use a custom calling convention and use tail calls everywhere. We need this for the JIT itself, so it makes sense to build the interpreter to use the same conventions.
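
As a rough illustration of "tail calls everywhere", here is a minimal sketch in which each handler passes the interpreter state as arguments and tail-calls the next handler. Everything named `Toy*`/`toy_*` is hypothetical and invented for this example. It uses Clang's `musttail` statement attribute where the compiler provides it and falls back to an ordinary call otherwise; the custom calling convention itself is not shown.

```c
#include <assert.h>

/* Use the musttail statement attribute when the compiler has it. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define TOY_MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef TOY_MUSTTAIL
#  define TOY_MUSTTAIL /* fall back to an ordinary call */
#endif

/* Hypothetical toy instruction format, not CPython's. */
typedef struct ToyInstr { int opcode; int arg; } ToyInstr;

/* Every handler has the same signature, so state (pc, acc) can stay
 * in argument registers from one handler to the next. */
typedef long (*ToyHandler)(const ToyInstr *pc, long acc);

static long toy_add(const ToyInstr *pc, long acc);
static long toy_halt(const ToyInstr *pc, long acc);

enum { TOY_ADD, TOY_HALT };
static const ToyHandler toy_handlers[] = { toy_add, toy_halt };

static long toy_add(const ToyInstr *pc, long acc)
{
    acc += pc->arg;
    pc++;
    /* Transfer control to the next handler without growing the stack. */
    TOY_MUSTTAIL return toy_handlers[pc->opcode](pc, acc);
}

static long toy_halt(const ToyInstr *pc, long acc)
{
    (void)pc;
    return acc;
}

static long toy_run(const ToyInstr *code)
{
    assert(code != 0);
    return toy_handlers[code->opcode](code, 0);
}
```

Because caller and callee share one signature, any amount of state can be threaded through the arguments, which addresses the "lopsided" problem of ordinary calls.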

### Non-JIT build

For the non-JIT build, we should implement the tier 1 and tier 2 interpreters in a single giant function.
Transfer of control should be implemented as `goto`s, with information passed in (C) local variables.

### Types of transfer

* Tier 1 dispatch
* Tier 2 dispatch
* Enter tier 2 from tier 1 for specialization (https://dl.acm.org/doi/10.1145/3617651.3622979) (not patchable)
* Enter tier 2 from tier 1 when entering an executor (`ENTER_EXECUTOR`) (patchable)
* Resuming tier 1 execution from tier 2
* Jumping from one tier 2 executor to another (not patchable)
* Jumping from one tier 2 executor to another (patchable)

Maybe others?

### What does this look like in `bytecodes.c`

My preferred approach would be that each of the above transfers is expressed as a macro-like expression that is understood by the code generator and replaced with the relevant C code. Using actual C macros tends to get confusing.

### Implementing this in the interpreter.

Code examples assume no computed gotos. Those are left as an exercise for the reader 🙂 

`_Py_CODEUNIT *next_instr` becomes `union { _Py_CODEUNIT *tier1; PyUopInstruction *tier2; } next_instr`

* We already have a macro for tier 1 dispatch, `DISPATCH()`, although it is mostly implicit.
* Tier 2 dispatch is entirely implicit; I'm just listing it for completeness.
* Entering tier 2 from tier 1 for specialization is `goto tier2_dispatch; tier2_dispatch: switch (next_instr.tier2->opcode) {`
* Entering tier 2 from tier 1 when entering an executor requires increfing the executor, then doing the above jump.
* Resuming tier 1 from tier 2 is also a jump: `goto tier1_dispatch; tier1_dispatch: switch (next_instr.tier1->op.code) {`
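
To make the transfers listed above concrete, here is a minimal sketch of a single function hosting both tiers. The `Toy*` types and the `T1_*`/`T2_*` opcodes are hypothetical stand-ins for `_Py_CODEUNIT`, `PyUopInstruction` and the executor object, and decrementing the refcount on exit is an assumption of this sketch, not something decided above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; the real layouts differ. */
typedef struct { int opcode; int arg; } ToyT1Instr;
typedef struct { int opcode; int arg; } ToyT2Instr;
typedef struct { long refcnt; const ToyT2Instr *trace; } ToyExecutor;

enum { T1_ADD, T1_ENTER_EXECUTOR, T1_RETURN };
enum { T2_ADD, T2_EXIT_TO_TIER1 };

/* Both tiers share one function, so switching tiers is a goto and all
 * shared state (acc, next_instr, resume_at) lives in C locals. */
static long toy_eval(const ToyT1Instr *code, ToyExecutor *executor)
{
    union { const ToyT1Instr *tier1; const ToyT2Instr *tier2; } next_instr;
    const ToyT1Instr *resume_at = NULL;  /* where tier 1 continues */
    long acc = 0;

    assert(code != NULL && executor != NULL);
    next_instr.tier1 = code;

tier1_dispatch:
    switch (next_instr.tier1->opcode) {
    case T1_ADD:
        acc += next_instr.tier1->arg;
        next_instr.tier1++;
        goto tier1_dispatch;
    case T1_ENTER_EXECUTOR:
        executor->refcnt++;              /* "incref" the executor */
        resume_at = next_instr.tier1 + 1;
        next_instr.tier2 = executor->trace;
        goto tier2_dispatch;
    case T1_RETURN:
    default:
        return acc;
    }

tier2_dispatch:
    switch (next_instr.tier2->opcode) {
    case T2_ADD:
        acc += next_instr.tier2->arg;
        next_instr.tier2++;
        goto tier2_dispatch;
    case T2_EXIT_TO_TIER1:
    default:
        executor->refcnt--;              /* assumption: release on exit */
        next_instr.tier1 = resume_at;
        goto tier1_dispatch;
    }
}
```

No C stack frame is pushed on either transfer, so stitching many traces together cannot overflow the stack.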

Patchable jumps need to pass their own address to the next piece of code.
For JIT code we can pass this in a register; for the interpreter we can pass it in memory.
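
One way the "pass it in memory" variant might look is sketched below. The jump records the address of its own exit slot in a per-thread field before transferring control, so the code it lands in can later patch that slot. All names (`ToyExit`, `ToyThreadState`, `previous_exit`, and the two functions) are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types for illustration only. */
typedef struct ToyCode { int id; } ToyCode;
typedef struct ToyExit { const ToyCode *target; } ToyExit;
typedef struct ToyThreadState { ToyExit *previous_exit; } ToyThreadState;

/* Take a patchable jump: store the address of the exit we came
 * through in memory, then return the code to run next. */
static const ToyCode *toy_take_exit(ToyThreadState *ts, ToyExit *side_exit)
{
    assert(ts != NULL && side_exit != NULL);
    ts->previous_exit = side_exit;
    return side_exit->target;
}

/* Run by the code that was jumped to: redirect the exit that led
 * here so that the next jump through it goes to new_target. */
static void toy_patch_previous_exit(ToyThreadState *ts,
                                    const ToyCode *new_target)
{
    assert(ts != NULL && ts->previous_exit != NULL);
    ts->previous_exit->target = new_target;
}
```

In a JIT build the same address would travel in a register rather than through `ts->previous_exit`.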
