Description
To support trace stitching we need data for exits embedded into trace objects.
First of all we need an exit struct.
https://github.com/faster-cpython/ideas/blob/main/3.13/engine.md#making-cold-exits-small lists the fields we need:
typedef struct _exit_data {
    _PyExecutorObject *executor; // The executor to jump to
    uint32_t target;             // Offset in the code object of the target code unit
    int16_t hotness;             // How hot this exit is. Start at a negative number and count up?
} _PyExitData;
Since all executors are now micro-ops based, we can get rid of the distinction between _PyExecutorObject and _PyUOpExecutorObject.
The new combined executor struct would look something like:
typedef struct _PyExecutorObject {
    PyObject_VAR_HEAD
    _PyVMData vm_data;
    union {
        void *jit_code; // Will be a function pointer
        _PyUOpInstruction *trace;
    };
    _PyExitData exits[1];
} _PyExecutorObject;
We will pre-allocate an array of executors for cold exits, one for each possible index into the exits array. Since there cannot be more exits than the maximum number of uops, the size of the array will be _Py_UOP_MAX_TRACE_LENGTH, currently 512.
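One way to picture the pre-allocated array is a static table whose entry `i` holds a one-instruction trace with `oparg == i`. This is only a sketch under assumptions: the `sketch_` types, the opcode number, and the init function are all hypothetical, and real executors would carry an object header and VM data as well.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_TRACE_LENGTH 512  /* assumed to mirror _Py_UOP_MAX_TRACE_LENGTH */
#define SKETCH_COLD_EXIT 1           /* hypothetical opcode number for _COLD_EXIT */

/* Simplified micro-op: just an opcode and its argument. */
typedef struct {
    uint16_t opcode;
    uint16_t oparg;
} sketch_uop;

/* Simplified cold-exit executor: a trace of exactly one micro-op. */
typedef struct {
    sketch_uop trace[1];
} sketch_cold_executor;

/* Statically allocated: one cold-exit executor per possible exit index. */
static sketch_cold_executor sketch_cold_exits[SKETCH_MAX_TRACE_LENGTH];

/* Fill in the table once at start-up. */
static void
sketch_init_cold_exits(void)
{
    for (int i = 0; i < SKETCH_MAX_TRACE_LENGTH; i++) {
        sketch_cold_exits[i].trace[0].opcode = SKETCH_COLD_EXIT;
        sketch_cold_exits[i].trace[0].oparg = (uint16_t)i;
    }
}

/* The cold-exit executor that exit number exit_index initially points at. */
static sketch_cold_executor *
sketch_cold_exit_for(int exit_index)
{
    assert(0 <= exit_index && exit_index < SKETCH_MAX_TRACE_LENGTH);
    return &sketch_cold_exits[exit_index];
}
```

Because the table is shared by all traces, an exit that has not warmed up yet costs no per-trace memory beyond its own exit data.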
Cold exits will be implemented as a single micro-op _COLD_EXIT with oparg set to the index in the array.
_COLD_EXIT needs to do the following:
- Find the exit: exit = &current_executor->exits[oparg]
- Increment exit->hotness
- If exit->hotness == 0 then trigger generation of a new executor to replace exit->executor.
- Otherwise drop into tier 1: next_instr = PyCode_CODE(code) + exit->target; ENTER_TIER1();
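The steps above can be sketched as a plain C function. This is an illustration under assumptions rather than the interpreter code: all `sketch_` names are hypothetical, the optimizer is passed in as a callback, and the function returns a description of what to do next instead of jumping. Entering the new executor immediately after generating it, and falling back to tier 1 when generation fails, are both assumptions not stated in the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct sketch_executor sketch_executor;

typedef struct {
    sketch_executor *executor;  /* executor to jump to */
    uint32_t target;            /* offset of the target code unit */
    int16_t hotness;            /* starts negative, counts up */
} sketch_exit_data;

struct sketch_executor {
    int id;                     /* only here to tell executors apart */
    sketch_exit_data exits[4];  /* fixed size to keep the sketch short */
};

typedef enum { SKETCH_ENTER_TIER1, SKETCH_ENTER_EXECUTOR } sketch_action;

typedef struct {
    sketch_action action;
    uint32_t tier1_offset;      /* meaningful for SKETCH_ENTER_TIER1 */
    sketch_executor *next;      /* meaningful for SKETCH_ENTER_EXECUTOR */
} sketch_result;

/* Hypothetical optimizer hook: returns a new executor, or NULL on failure. */
typedef sketch_executor *(*sketch_optimize_fn)(uint32_t target);

static sketch_result
sketch_cold_exit(sketch_executor *current, uint16_t oparg,
                 sketch_optimize_fn optimize)
{
    /* Step 1: find the exit. */
    sketch_exit_data *exit_data = &current->exits[oparg];
    /* Step 2: bump the counter. */
    exit_data->hotness++;
    /* Step 3: on reaching zero, try to replace the exit's executor. */
    if (exit_data->hotness == 0) {
        sketch_executor *fresh = optimize(exit_data->target);
        if (fresh != NULL) {
            exit_data->executor = fresh;
            return (sketch_result){ .action = SKETCH_ENTER_EXECUTOR,
                                    .next = fresh };
        }
    }
    /* Step 4: otherwise resume tier 1 at the exit's target. */
    return (sketch_result){ .action = SKETCH_ENTER_TIER1,
                            .tier1_offset = exit_data->target };
}

/* Stub optimizer for demonstration: always "succeeds" with one executor. */
static sketch_executor sketch_stub_target = { .id = 99 };

static sketch_executor *
sketch_stub_optimize(uint32_t target)
{
    (void)target;
    return &sketch_stub_target;
}
```

With an exit whose hotness starts at -2, the first call returns `SKETCH_ENTER_TIER1` and the second call installs the new executor and returns `SKETCH_ENTER_EXECUTOR`.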
The JIT will need to compile the code for these cold exits on startup, but we can statically allocate the data structures.