bpo-26219: per opcode cache for LOAD_GLOBAL #12884
Conversation
This patch implements opcache only for LOAD_GLOBAL to ease review. Based on Yury's opcache3.patch.
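For readers skimming the thread, a minimal sketch of the mechanism under review may help. The names below (_PyOpcache_LoadGlobal, the version-tag fields, the optimized flag) are illustrative assumptions rather than the patch itself: the cache entry records the resolved global together with the PEP 509 ma_version_tag of the globals and builtins dicts, and LOAD_GLOBAL can skip both dict lookups as long as neither tag has changed.

    /* Illustrative cache entry for LOAD_GLOBAL (names are assumptions). */
    typedef struct {
        PyObject *ptr;          /* cached value (borrowed reference) */
        uint64_t globals_ver;   /* f_globals->ma_version_tag when cached */
        uint64_t builtins_ver;  /* f_builtins->ma_version_tag when cached */
    } _PyOpcache_LoadGlobal;

    /* Sketch of the fast path inside the LOAD_GLOBAL case of the eval loop. */
    if (co_opcache != NULL && co_opcache->optimized > 0) {
        _PyOpcache_LoadGlobal *lg = &co_opcache->u.lg;
        if (lg->globals_ver == ((PyDictObject *)f->f_globals)->ma_version_tag
            && lg->builtins_ver == ((PyDictObject *)f->f_builtins)->ma_version_tag)
        {
            PyObject *v = lg->ptr;   /* cache hit: no dict lookup at all */
            Py_INCREF(v);
            PUSH(v);
            DISPATCH();
        }
    }
    /* On a miss, fall through to the normal lookup and refill the entry
       and version tags so the next execution can take the fast path. */

On a miss the regular globals/builtins lookup runs and the entry is (re)filled, so the extra cost is paid only once per invalidation.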
Interesting infrastructure. I've a few comments below.
// TODO: LOAD_METHOD, LOAD_ATTR
if (opcode == LOAD_GLOBAL) {
    co->co_opt_opcodemap[i] = ++opts;
    if (opts > 254) {
Perhaps use 16-bit indices instead?
To maximize "performance benefit / memory usage", I prefer using 1 byte per instruction instead of two bytes.
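For a sense of the tradeoff (illustrative numbers, not measurements): co_opcache_map holds one index per instruction, so its size scales with the bytecode length regardless of how many LOAD_GLOBALs actually get a cache entry.

    /* Rough cost model for a code object with 1000 instructions:
     *   co_opcache_map with 8-bit indices:  1000 * 1 byte  = 1000 bytes
     *   co_opcache_map with 16-bit indices: 1000 * 2 bytes = 2000 bytes
     * Index 0 means "no cache entry", and the `opts > 254` check above
     * stops assigning new slots, so 8-bit indices cap the number of
     * cached opcodes per code object at 254 -- rarely a limit in practice. */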
Include/code.h
// to cache object: co_opcache[co_opcache_map[next_instr/sizeof(_Py_CODEUNIT)]]
    unsigned char *co_opcache_map;
    _PyOpcache *co_opcache;
    int32_t co_opcache_flag;   // used to determine when to create the cache
(1) It looks like this is a usage count, rather than a flag. Maybe call it something like co_run_count?
(2) I don't think you need a full 32 bit int -- which means that it could fit into the same word as the following field. (and still leave room for some actual flags or something later, even on 32 bit)
Yes -- currently. I don't think the current algorithm for deciding when to enable the cache
is stable; it may be reconsidered several times. So I don't think we need a name that describes
the concrete usage of the variable for now.
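A sketch of the warm-up idea being discussed, with an assumed threshold name and value and an assumed allocator helper (the actual policy, as noted above, may still change): the counter is bumped each time the code object is evaluated, and the cache buffers are only allocated once the code has proven to be hot.

    #define OPCACHE_MIN_RUNS 1024               /* illustrative threshold */

    /* At the top of the eval loop, before dispatching any bytecode: */
    if (co->co_opcache_map == NULL && co->co_opcache_flag >= 0) {
        if (++co->co_opcache_flag >= OPCACHE_MIN_RUNS) {
            if (_PyCode_InitOpcache(co) < 0) {  /* assumed allocator helper */
                co->co_opcache_flag = -1;       /* allocation failed: disable */
            }
        }
    }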
@@ -28,6 +28,9 @@ PyAPI_FUNC(void) _PyEval_SignalAsyncExc(
PyAPI_FUNC(void) _PyEval_ReInitThreads(
    _PyRuntimeState *runtime);

/* Private function */
void _PyEval_Fini(void);
If it is really private, why include it in the header, instead of just putting the prototype in the .c file?
Also, why call it a generic ...Eval_Fini as opposed to ...code_stats?
Because other private functions are put in "internal" headers too.
Include/internal/pycore_code.h
#endif

typedef struct {
    PyObject *ptr; /* Cached pointer (stealed reference) */
stolen reference? I'm more used to seeing that around something that can be called ... do you just mean that the object pointed to SHOULD not get an incref for this? (And is that to avoid circular references?)
Ah, I was wrong. It is a borrowed reference, not a stolen one.
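To make the terminology concrete (a generic illustration, not code from the patch; `lg` is the illustrative _PyOpcache_LoadGlobal entry from the sketch near the top of the thread): a borrowed reference is stored without Py_INCREF, so the cache must never be the only owner of the object, and a fresh strong reference is taken at the point of use.

    /* Filling the entry: store a borrowed reference (no Py_INCREF).
       Safe only because the value is also kept alive by the globals or
       builtins dict, and the cached version tags detect any change there. */
    lg->ptr = v;

    /* Using the entry: take a new strong reference before pushing. */
    PyObject *cached = lg->ptr;
    Py_INCREF(cached);
    PUSH(cached);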
co->co_opcache_flag = 0;
co->co_opcache_size = 0;
co->co_opcache_map = NULL;
co->co_opcache = NULL;
Purely as a style question ... why did you change the order of the fields? (as opposed to initializing in definition order)
I didn't change the order here... I only changed the order of the struct fields, for compactness and
readability. I didn't change it here because (a) I just missed it and (b) the order here matters
less than the order in the struct.
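For context on the compactness point, a generic 64-bit layout illustration (assumed field names, not the actual code.h struct): grouping same-sized fields avoids alignment padding.

    struct scattered {      /* 8 + 4 + 4 (pad) + 8 + 4 + 4 (pad) = 32 bytes */
        unsigned char *map;
        int32_t flag;
        void *cache;
        int32_t size;
    };

    struct grouped {        /* 8 + 8 + 4 + 4 = 24 bytes */
        unsigned char *map;
        void *cache;
        int32_t flag;
        int32_t size;
    };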
This patch implements a per-opcode cache mechanism and uses it only for the LOAD_GLOBAL opcode. Based on Yury's opcache3.patch in bpo-26219.
https://bugs.python.org/issue26219