Maintenance of MLJModels has become increasingly burdensome for several reasons. Perhaps the biggest problem is that its centralised approach to providing model API implementations ("glue code") means:
- Testing takes a long time. Tolerable, just.
- [extras] must include a very large number of packages which invariably cause fatal version conflicts with the packages in [deps] during CI. (As I understand it, during a test, the [deps] are essentially pinned when [extras] are loaded.) Less tolerable.
- With the existing package manager, we have no way to specify bounds on the algorithm-providing packages (the ones in [extras]). The latest release compatible with [deps] always gets loaded. If just one (of these many) packages makes a breaking change to the "glue code", then MLJModels CI fails.
While JuliaLang/Pkg.jl#1285 may help with the second and third issues, I don't think it is close to being resolved. We have also observed elsewhere that code loading using Requires.jl can be slower than otherwise. (And there is #243.)
While the plan has always been for all algorithm-providing packages to implement their MLJ interfaces natively, this is not going to happen quickly. In the meantime, it would be good to address the issues above.
In discussions of the core team, @tlienart has suggested the following remedy: Move the glue code for each package X into its own repository Xglue, with its own testing, and make X an ordinary dependency of Xglue. (The package Xglue would be purely a "utility" package and essentially invisible to general users.)
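To make the proposal concrete, here is a rough sketch of what the `Project.toml` of such a glue package might look like. Everything in it is a placeholder: the names `X` and `Xglue`, the zeroed UUIDs, and the version bounds are hypothetical, and a real glue package would also need to depend on whichever package defines the model interface (omitted here).

```toml
# Hypothetical Project.toml for a glue package "Xglue".
# Names, UUIDs and version bounds are placeholders, not real packages.
name = "Xglue"
uuid = "00000000-0000-0000-0000-000000000001"
version = "0.1.0"

[deps]
# The algorithm-providing package is an ordinary dependency,
# rather than an [extras] entry of MLJModels.
X = "00000000-0000-0000-0000-000000000002"

[compat]
# Because X is in [deps], it can be bounded: "0.5" admits 0.5.x only,
# so a breaking 0.6 release of X is not picked up automatically.
X = "0.5"
julia = "1"

[extras]
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
test = ["Test"]
```

With this layout the test environment of each glue package only has to resolve one algorithm-providing package, instead of all of them at once as in the current MLJModels [extras].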
Such migrations could be performed incrementally and I believe each migration would trigger only a patch release, basically because the model registry (which tracks where to find glue code) is part of MLJModels (and so is opaque to MLJ).
I think this is a good idea and propose beginning this disintegration; see TODO list.
cc @DilumAluthge