Make PyObject and other runtime structures optionally fully opaque #6

@steve-s

Full opaqueness in the stable ABI would be one of the most useful features for alternative Pythons and, I believe, also for any larger changes inside the CPython VM. The most problematic part is the exposure of the PyObject head:

struct {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
};

There are several problems with making PyObject fully opaque:

  • PyObject_HEAD is embedded in user defined types, but only the right size of it is necessary for that
  • macros such as Py_INCREF, Py_TYPE directly access the fields
  • user code may directly access the fields
  • without direct access to those performance critical fields, i.e., when Py_INCREF is a function call, the extensions may be slower

Plan for gradual migration to optional opaqueness in stable ABI (stable NI):

  1. Optional function call for tagged pointers, but keep the fast direct access version for regular pointers:
    • Keep the definition of PyObject, but allow the implementation to tag some pointers (the least significant bit of the pointer is set)
    • Change Py_INCREF, Py_TYPE macros such that they work as before unless the pointer is tagged in which case they call to runtime
    • No pointers will be really tagged at this point (direct access to ob_type continues to work)
  2. Migrate extensions to avoid direct access, but use macros/inline functions, nothing changes from performance perspective
    • check of the tag should be extremely fast, it would happen just before the pointer is about to be dereferenced anyway
    • note: here we are talking only about extensions that target or want to target stable ABI
  3. Make PyObject opaque
    • the macros would cast it to non-public struct that gives them access to the fields (so the original PyObject struct is still part of the ABI, but not part of API)
    • PyObject_HEAD would be changed to be a struct of the right size, but would not expose any useful fields (e.g., byte array of the right size)
  4. Users cannot access the fields anymore, Python runtimes can opt in to function calls by tagging the pointer
    • (memory pointed to by the) tagged pointers do not need to follow any particular memory layout
    • the tagged pointers do not actually have to be pointers at all
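
The tag check from steps 1 and 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: every `sketch_` name is hypothetical, the struct is a stand-in for the PyObject head rather than CPython's real header, and the handle table is an invented example of what a runtime might keep behind tagged values. `sketch_incref_tagged` plays the role of the proposed Py_IncRef_Tagged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the PyObject head; not CPython's real definition. */
typedef struct sketch_object {
    ptrdiff_t ob_refcnt;   /* Py_ssize_t in CPython */
    const void *ob_type;   /* PyTypeObject * in CPython */
} sketch_object;

/* Assumption: a set least significant bit marks a runtime-managed value. */
#define SKETCH_TAG_BIT ((uintptr_t)1)

static inline int sketch_is_tagged(const void *p) {
    return ((uintptr_t)p & SKETCH_TAG_BIT) != 0;
}

/* Hypothetical runtime state: tagged values are indices into this table. */
#define SKETCH_HANDLE_COUNT 8
static ptrdiff_t sketch_handle_refcnt[SKETCH_HANDLE_COUNT];

/* Build a tagged value from a table index; it is never dereferenced. */
static inline void *sketch_make_handle(size_t index) {
    return (void *)(((uintptr_t)index << 1) | SKETCH_TAG_BIT);
}

/* Slow path: a call into the runtime, analogous to Py_IncRef_Tagged. */
static void sketch_incref_tagged(void *handle) {
    size_t index = (size_t)((uintptr_t)handle >> 1);
    assert(index < SKETCH_HANDLE_COUNT);
    sketch_handle_refcnt[index]++;
}

/* Fast path stays a direct field access; only tagged values pay for a call. */
static inline void sketch_incref(void *op) {
    if (sketch_is_tagged(op)) {
        sketch_incref_tagged(op);
    } else {
        ((sketch_object *)op)->ob_refcnt++;
    }
}
```

Real object pointers are aligned to more than one byte, so their least significant bit is always clear and they take the direct branch. A runtime that hands out tagged values never has them dereferenced by the extension, which is why they need no particular layout behind them.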

In summary:

  • optional pointer tagging will become part of the ABI (almost fully backward compatible change modulo code that tags pointers internally)
  • new functions, such as Py_IncRef_Tagged would become part of the ABI
  • PyObject layout will not be part of API, but will stay as part of ABI, Py_INCREF stays as fast as before
  • none of this would apply when user does not choose to build for stable ABI (i.e., tagged pointers would not be expected in non-stable builds)
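
The split between API and ABI in step 3 might be sketched as follows. This is a minimal illustration, not CPython code: all `sketch_` names are hypothetical, and the byte-array head is just one way to get "a struct of the right size". User types embed the opaque head, while the accessor functions (standing in for the macros) cast to a non-public struct with the real layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Non-public layout used only by the accessors: part of the ABI, not the API. */
typedef struct sketch_private_head {
    ptrdiff_t ob_refcnt;
    const void *ob_type;
} sketch_private_head;

/* Public head: right size and alignment, but no usable fields. */
typedef struct sketch_opaque_head {
    _Alignas(sketch_private_head)
    unsigned char opaque[sizeof(sketch_private_head)];
} sketch_opaque_head;

_Static_assert(sizeof(sketch_opaque_head) == sizeof(sketch_private_head),
               "opaque head must have the same size as the real head");

/* A user-defined type embeds the opaque head, as with PyObject_HEAD. */
typedef struct sketch_point {
    sketch_opaque_head head;
    int x;
} sketch_point;

/* Stand-ins for the macros: they cast to the private struct internally. */
static inline ptrdiff_t sketch_refcnt(void *op) {
    return ((sketch_private_head *)op)->ob_refcnt;
}

static inline void sketch_incref(void *op) {
    ((sketch_private_head *)op)->ob_refcnt++;
}

/* Heap-allocate and initialise a point with a reference count of 1. */
static inline sketch_point *sketch_point_new(int x) {
    sketch_point *p = malloc(sizeof *p);
    if (p == NULL) {
        return NULL;
    }
    sketch_private_head *head = (sketch_private_head *)p;
    head->ob_refcnt = 1;
    head->ob_type = NULL;  /* no type objects in this sketch */
    p->x = x;
    return p;
}
```

Extension code written against the public struct cannot name `ob_refcnt` or `ob_type` at all, yet the in-memory layout of existing objects is unchanged, so the direct-access fast path keeps its cost.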
