-
Notifications
You must be signed in to change notification settings - Fork 15.6k
[mlir][Python] create MLIRPythonSupport #171775
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
12e0568 to
7c6e126
Compare
3edcf16 to
4bd036a
Compare
4bd036a to
07b7463
Compare
5a4f9ad to
6a00e1d
Compare
dadb1f1 to
bc44575
Compare
c2f601b to
3b24d83
Compare
3b24d83 to
7b521da
Compare
jpienaar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cmake change seems fine, it is non-trivial and does depend on specifics nanobind side a bit. But not sure that's avoidable/can be approved.
| } | ||
|
|
||
| if (argTypes.size() != argLocs.size()) | ||
| throw nanobind::value_error(("Expected " + Twine(argTypes.size()) + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would formatv approach look here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah this doesn't work because you start getting complaints like
ld64.lld: error: undefined symbol: typeinfo for llvm::support::detail::format_adapter
It would all be much nicer if we could just configure |
f22a225 to
ea8c091
Compare
ea8c091 to
4cc1cc1
Compare
What
This PR adds a shared library
MLIRPythonSupportwhich contains all of the CRTP classes ikePyConcreteValue,PyConcreteType,PyConcreteAttribute, as well as other useful code likeDefaulting*and etc enabling their reuse in downstream projects. Downstream projects can now doinstead of using the discordant alternative
mlir_type_subclass/mlir_attr_subclass(same goes forPyConcreteValue/mlir_value_subclass).Why
This PR is mostly code motion (along with CMake) but before I describe the changes I want to state the goals/benefits:
Dialect*extensions here) are a two-tier system;a. core extensions enjoy first class support as far as type inference1, type stub generation, and ease of implementation, while dialect extensions have poorer support, incorrect type stub generation much more tedious (boilerplate) implementation;
b. Crucially, this two-tiered system is reflected in the fact that the two sets of types/attributes are not in the same Python object hierarchy. To wit:
isinstance(..., Type)andisinstance(..., Attribute)are not supported for the dialect extensions2;c. Since these types are not exposed in public headers, downstream users (dialect extensions or not) cannot write functions that overload on e.g.
PyFloat8*Type- that's quite a useful feature!python->cpporcpp->python) requires work in addition to nanobind's own casting/construction pipeline;a. When going from
python->cpp, we extract the capsule object from the Python object and then extract from the capsule theMlir*opaque struct/ptr. This side isn't so onerous;b. When going from
cpp->pythonwe call long-hand call PythonimportAPIs and construct the Python object using_CAPICreate. Note, there at least 2attrcalls incurred in addition to_CAPICreate; this is already much more efficiently handled by nanobind itself!_mlirmodule when they are created (the types themselves, not even instances of those types). This blocks type stub generation for dialect extensions (i.e., the reason we currently only generate type stubs for_mlir).How
Prior this was not done/possible because of "ODR" issues but I have resolved those issues; the basic idea for how we solve this is "move things we want to share into shared libraries":
PyConcreteValue,PyConcreteType,PyConcreteAttribute) intoMLIRPythonSupport;IRModule.h(renamed toIRCore.h) becausePyConcreteValue,PyConcreteType,PyConcreteAttributedepend on them. This makes for a bigger PR than one would hope for but ultimately I think we should give people access to these classes to use as they see fit (specifically inherit from, but also liberally use in bindings signatures instead of the opaqueMlir*struct wrappers).MLIRPythonSupportbut we leave thebindcalls in the_mlirextension. Users should not need to bind/re-bind these bindings (and they couldn't anyway - nanobind will warn if you try to bind the same class twice).MLIR_BINDINGS_PYTHON_DOMAINwhich is determined by a compile time define (and tied toMLIR_BINDINGS_PYTHON_NB_DOMAIN). This is necessary in order to prevent conflicts on both symbol name and typeid (necessary for nanobind to not double register binded types) between multiple bindings libraries (e.g.,torch-mlir, andjax). Note nanobind doesn't supportmodule_locallike pybind11. It does supportNB_DOMAINbut that is not sufficient for disambiguating typeids across projects (to wit: we currently defineNB_DOMAINand it was still necessary to move everything to a nested namespace);MLIRPythonSupport).Testing
Three tests are added here
PythonTestModuleNanobindis ported to usePyConcreteType<PyTestType>instead ofmlir_type_subclassandPyConcreteAttribute<PyTestAttr>instead ofmlir_atrr_subclass, verifying this works for non-core extensions in-tree;StandaloneExtensionNanobindis ported to usestruct PyCustomType : mlir::python::MLIR_BINDINGS_PYTHON_DOMAIN::PyConcreteType<PyCustomType>instead ofmlir_type_subclassverifying this works for non-core extensions out-of-tree;StandaloneExtensionNanobind'ssmoketestis extended to also load another bindings package (namelymlir) verifyingMLIR_BINDINGS_PYTHON_DOMAINsuccessfully disambiguates symbols and typeids.I have also tested this downstream: llvm/eudsl#287 as well run the following builder bots:
mlir-gcc: https://lab.llvm.org/buildbot/#/builders/116/builds/22692
I have also tested against IREE: iree-org/iree#21916
Integration
It is highly recommended to set the CMake var
MLIR_BINDINGS_PYTHON_NB_DOMAIN(which will also determineMLIR_BINDINGS_PYTHON_DOMAIN) to something unique for each downstream. This can also be passed explicitly toadd_mlir_python_modulesif your project builds multiple bindings packages. I added aWARNINGto this effect inAddMLIRPython.cmake.Notes
This is the first PR of a few - the follow-ups will move more "support" types into respective headers (e.g.,
RankedTensorTypeintoIRTypes.hso thatTestIntegerRankedTensorTypecan be ported).Currently this PR has all the impls (members, methods, etc) inDone.IRCore.hfor easier review (so that the diff registers as much as possible as code motion). After review I will move all the impls intoIRCore.cpp.Footnotes
Python values being typed correctly when exiting from cpp; ↩
The workaround we implemented was a class method for the dialect bindings called
Class.isinstance(...); ↩Specifically when the modules are imported using
importlib, which occurs with nanobind's stubgen; ↩