
Commit 61f12f9

Merge pull request #1120 from JuliaAI/dev
For a 0.20.4 release
2 parents: 1a1d10f + 6a57430


42 files changed: +451, -9621 lines

ORGANIZATION.md

Lines changed: 63 additions & 80 deletions
@@ -8,102 +8,85 @@ connections do not currently exist but are planned/proposed.*
 Repositories of some possible interest outside of MLJ, or beyond
 its conventional use, are marked with a ⟂ symbol:
 
-* [MLJ.jl](https://github.com/JuliaAI/MLJ.jl) is the
-  general user's point-of-entry for choosing, loading, composing,
-  evaluating and tuning machine learning models. It pulls in most code
-  from other repositories described below. MLJ also hosts the [MLJ
-  manual](src/docs) which documents functionality across the
-  repositories, with the exception of ScientificTypesBase, and
-  MLJScientific types which host their own documentation. (The MLJ
-  manual and MLJTutorials do provide overviews of scientific types.)
-
-* [MLJModelInterface.jl](https://github.com/JuliaAI/MLJModelInterface.jl)
-  is a lightweight package imported by packages implementing MLJ's
-  interface for their machine learning models. It's only dependencies
-  are ScientificTypesBase.jl (which depends only on the standard
-  library module `Random`) and
-  [StatisticalTraits.jl](https://github.com/JuliaAI/StatisticalTraits.jl)
-  (which depends only on ScientificTypesBase.jl).
+* [MLJ.jl](https://github.com/JuliaAI/MLJ.jl) is the general user's point-of-entry for
+  choosing, loading, composing, evaluating and tuning machine learning models. It pulls in
+  most code from other repositories described below. MLJ also hosts the [MLJ
+  manual](src/docs) which documents functionality across the repositories, although some
+  pages point to documentation hosted locally by a particular package.
+
+
+* [MLJModelInterface.jl](https://github.com/JuliaAI/MLJModelInterface.jl) is a lightweight
+  package imported by packages implementing MLJ's interface for their machine learning
+  models. It's only dependencies are ScientificTypesBase.jl (which depends only on the
+  standard library module `Random`) and
+  [StatisticalTraits.jl](https://github.com/JuliaAI/StatisticalTraits.jl) (which depends
+  only on ScientificTypesBase.jl).
 
-* (⟂)
-  [MLJBase.jl](https://github.com/JuliaAI/MLJBase.jl) is
-  a large repository with two main purposes: (i) to give "dummy"
-  methods defined in MLJModelInterface their intended functionality
-  (which depends on third party packages, such as
+* (⟂) [MLJBase.jl](https://github.com/JuliaAI/MLJBase.jl) is a large repository with two
+  main purposes: (i) to give "dummy" methods defined in MLJModelInterface their intended
+  functionality (which depends on third party packages, such as
   [Tables.jl](https://github.com/JuliaData/Tables.jl),
-  [Distributions.jl](https://github.com/JuliaStats/Distributions.jl)
-  and
-  [CategoricalArrays.jl](https://github.com/JuliaData/CategoricalArrays.jl));
-  and (ii) provide functionality essential to the MLJ user that has
-  not been relegated to its own "satellite" repository for some
-  reason. See the [MLJBase.jl
-  readme](https://github.com/JuliaAI/MLJBase.jl) for a
-  detailed description of MLJBase's contents.
+  [Distributions.jl](https://github.com/JuliaStats/Distributions.jl) and
+  [CategoricalArrays.jl](https://github.com/JuliaData/CategoricalArrays.jl)); and (ii)
+  provide functionality essential to the MLJ user that has not been relegated to its own
+  "satellite" repository for some reason. See the [MLJBase.jl
+  readme](https://github.com/JuliaAI/MLJBase.jl) for a detailed description of MLJBase's
+  contents.
 
-* [StatisticalMeasures.jl](https://github.com/JuliaAI/StatisticalMeasures.jl) provifes
+* [StatisticalMeasures.jl](https://github.com/JuliaAI/StatisticalMeasures.jl) provides
   performance measures (metrics) such as losses and scores.
 
-* [MLJModels.jl](https://github.com/JuliaAI/MLJModels.jl)
-  hosts the *MLJ model registry*, which contains metadata on all the
-  models the MLJ user can search and load from MLJ. Moreover, it
-  provides the functionality for **loading model code** from MLJ on
-  demand. Finally, it furnishes some commonly used transformers for
-  data pre-processing, such as `ContinuousEncoder` and `Standardizer`.
+* [MLJModels.jl](https://github.com/JuliaAI/MLJModels.jl) hosts the *MLJ model registry*,
+  which contains metadata on all the models the MLJ user can search and load from
+  MLJ. Moreover, it provides the functionality for **loading model code** from MLJ on
+  demand. Finally, it furnishes some commonly used transformers for data pre-processing,
+  such as `ContinuousEncoder` and `Standardizer`.
 
-* [MLJTuning.jl](https://github.com/JuliaAI/MLJTuning.jl)
-  provides MLJ's `TunedModel` wrapper for hyper-parameter
-  optimization, including the extendable API for tuning strategies,
-  and selected in-house implementations, such as `Grid` and
-  `RandomSearch`.
+* [MLJTuning.jl](https://github.com/JuliaAI/MLJTuning.jl) provides MLJ's `TunedModel`
+  wrapper for hyper-parameter optimization, including the extendable API for tuning
+  strategies, and selected in-house implementations, such as `Grid` and `RandomSearch`.
 
-* [MLJEnsembles.jl](https://github.com/JuliaAI/MLJEnsembles.jl)
-  provides MLJ's `EnsembleModel` wrapper, for creating homogenous
-  model ensembles.
+* [MLJEnsembles.jl](https://github.com/JuliaAI/MLJEnsembles.jl) provides MLJ's
+  `EnsembleModel` wrapper, for creating homogeneous model ensembles.
 
-* [MLJIteration.jl](https://github.com/JuliaAI/MLJIteration.jl)
-  provides the `IteratedModel` wrapper for controlling iterative
-  models (snapshots, early stopping criteria, etc)
+* [MLJIteration.jl](https://github.com/JuliaAI/MLJIteration.jl) provides the
+  `IteratedModel` wrapper for controlling iterative models (snapshots, early stopping
+  criteria, etc)
 
-* (⟂)
-  [OpenML.jl](https://github.com/JuliaAI/OpenML.jl) provides
-  integration with the [OpenML](https://www.openml.org) data science
-  exchange platform
+* [MLJFlow.jl](https://github.com/JuliaAI/MLJFlow.jl) provides integration with the
+  platform-agnostic machine learning tracking tool [MLflow](https://mlflow.org).
 
-* (⟂)
-  [MLJLinearModels.jl](https://github.com/JuliaAI/MLJLinearModels.jl)
-  is an experimental package for a wide range of julia-native penalized linear models
-  such as Lasso, Elastic-Net, Robust regression, LAD regression,
-  etc.
+* (⟂) [OpenML.jl](https://github.com/JuliaAI/OpenML.jl) provides integration with the
+  [OpenML](https://www.openml.org) data science exchange platform
+
+* (⟂) [MLJLinearModels.jl](https://github.com/JuliaAI/MLJLinearModels.jl) provides a wide
+  range of julia-native penalized linear models such as Lasso, Elastic-Net, Robust
+  regression, LAD regression, etc.
 
-* [MLJFlux.jl](https://github.com/FluxML/MLJFlux.jl) an experimental
-  package for gradient-descent models, such as traditional
-  neural-networks, built with
+* [MLJFlux.jl](https://github.com/FluxML/MLJFlux.jl) an experimental package for
+  gradient-descent models, such as traditional neural-networks, built with
   [Flux.jl](https://github.com/FluxML/Flux.jl), in MLJ.
 
-* (⟂)
-  [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl)
-  is an ultra lightweight package providing "scientific" types,
-  such as `Continuous`, `OrderedFactor`, `Image` and `Table`. It's
-  purpose is to formalize conventions around the scientific
-  interpretation of ordinary machine types, such as `Float32` and
+* (⟂) [ScientificTypesBase.jl](https://github.com/JuliaAI/ScientificTypesBase.jl) is an
+  ultra lightweight package providing "scientific" types, such as `Continuous`,
+  `OrderedFactor`, `Image` and `Table`. It's purpose is to formalize conventions around
+  the scientific interpretation of ordinary machine types, such as `Float32` and
   `DataFrame`.
 
-* (⟂)
-  [ScientificTypes.jl](https://github.com/JuliaAI/ScientificTypes.jl)
-  articulates the particular convention for the scientific interpretation of
-  data that MLJ adopts
+* (⟂) [ScientificTypes.jl](https://github.com/JuliaAI/ScientificTypes.jl) articulates the
+  particular convention for the scientific interpretation of data that MLJ adopts
 
-* (⟂)
-  [StatisticalTraits.jl](https://github.com/JuliaAI/StatisticalTraits.jl)
-  An ultra lightweight package defining fall-back implementations for
-  a collection of traits possessed by statistical objects, principally
-  models and measures (metrics).
+* (⟂) [StatisticalTraits.jl](https://github.com/JuliaAI/StatisticalTraits.jl) An ultra
+  lightweight package defining fall-back implementations for a collection of traits
+  possessed by statistical objects, principally models and measures (metrics).
 
-* (⟂)
-  [DataScienceTutorials](https://github.com/JuliaAI/DataScienceTutorials.jl)
-  collects tutorials on how to use MLJ, which are deployed
+* (⟂) [DataScienceTutorials](https://github.com/JuliaAI/DataScienceTutorials.jl) collects
+  tutorials on how to use MLJ, which are deployed
   [here](https://JuliaAI.github.io/DataScienceTutorials.jl/)
 
-* [MLJTestIntegration](https://github.com/JuliaAI/MLJTestIntegration.jl)
-  provides tests for implementations of the MLJ model interface, and
-  integration tests for the entire MLJ ecosystem
+* [MLJTestInterface](https://github.com/JuliaAI/MLJTestInterface.jl) provides tests for
+  implementations of the MLJ model interface
+
+* [MLJTestIntegration](https://github.com/JuliaAI/MLJTestIntegration.jl) provides tests
+  for the entire MLJ ecosystem. (Called when you run `ENV["MLJ_TEST_INTEGRATION"]="true";
+  Pkg.test("MLJ")`.
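Most of the packages listed in the revised text are re-exported through MLJ.jl itself. As a rough illustration of how those pieces fit together in user code, here is a minimal sketch; it assumes MLJDecisionTreeInterface.jl (one of the test dependencies in Project.toml below) is available in the active environment, the hyper-parameter values are arbitrary, and keyword names may differ slightly across wrapper versions:

```julia
using MLJ  # pulls in MLJBase, MLJModels, MLJTuning, MLJEnsembles, MLJIteration, ...

# MLJModels: load registered model code on demand
Tree = @load DecisionTreeClassifier pkg=DecisionTree  # returns the model type
tree = Tree()

# MLJTuning: wrap the model for hyper-parameter optimization
r = range(tree, :max_depth, lower=1, upper=6)
tuned_tree = TunedModel(model=tree, tuning=Grid(), range=r,
                        resampling=CV(nfolds=3), measure=log_loss)

# MLJEnsembles: wrap the model in a homogeneous ensemble
forest = EnsembleModel(model=tree, n=50)

# MLJBase + StatisticalMeasures: bind to data and estimate performance
X, y = @load_iris
evaluate(tuned_tree, X, y, resampling=Holdout(), measure=log_loss)
```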

Project.toml

Lines changed: 4 additions & 3 deletions
@@ -1,7 +1,7 @@
 name = "MLJ"
 uuid = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
 authors = ["Anthony D. Blaom <[email protected]>"]
-version = "0.20.3"
+version = "0.20.4"
 
 [deps]
 CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
@@ -34,7 +34,7 @@ Distributions = "0.21,0.22,0.23, 0.24, 0.25"
 MLJBalancing = "0.1"
 MLJBase = "1"
 MLJEnsembles = "0.4"
-MLJFlow = "0.4"
+MLJFlow = "0.4.2"
 MLJIteration = "0.6"
 MLJModels = "0.16"
 MLJTestIntegration = "0.5.0"
@@ -84,8 +84,9 @@ PartitionedLS = "19f41c5e-8610-11e9-2f2a-0d67e7c5027f"
 SIRUS = "cdeec39e-fb35-4959-aadb-a1dd5dede958"
 SelfOrganizingMaps = "ba4b7379-301a-4be0-bee6-171e4e152787"
 StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3"
+Suppressor = "fd094767-a336-5f1f-9728-57cf17d0bbfb"
 SymbolicRegression = "8254be44-1295-4e6a-a16d-46603ac705cb"
 Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
 
 [targets]
-test = ["BetaML", "CatBoost", "EvoLinear", "EvoTrees", "Imbalance", "InteractiveUtils", "LightGBM", "MLJClusteringInterface", "MLJDecisionTreeInterface", "MLJFlux", "MLJGLMInterface", "MLJLIBSVMInterface", "MLJLinearModels", "MLJMultivariateStatsInterface", "MLJNaiveBayesInterface", "MLJScikitLearnInterface", "MLJTSVDInterface", "MLJTestInterface", "MLJTestIntegration", "MLJText", "MLJXGBoostInterface", "Markdown", "NearestNeighborModels", "OneRule", "OutlierDetectionNeighbors", "OutlierDetectionPython", "ParallelKMeans", "PartialLeastSquaresRegressor", "PartitionedLS", "SelfOrganizingMaps", "SIRUS", "SymbolicRegression", "StableRNGs", "Test"]
+test = ["BetaML", "CatBoost", "EvoLinear", "EvoTrees", "Imbalance", "InteractiveUtils", "LightGBM", "MLJClusteringInterface", "MLJDecisionTreeInterface", "MLJFlux", "MLJGLMInterface", "MLJLIBSVMInterface", "MLJLinearModels", "MLJMultivariateStatsInterface", "MLJNaiveBayesInterface", "MLJScikitLearnInterface", "MLJTSVDInterface", "MLJTestInterface", "MLJTestIntegration", "MLJText", "MLJXGBoostInterface", "Markdown", "NearestNeighborModels", "OneRule", "OutlierDetectionNeighbors", "OutlierDetectionPython", "ParallelKMeans", "PartialLeastSquaresRegressor", "PartitionedLS", "SelfOrganizingMaps", "SIRUS", "SymbolicRegression", "StableRNGs", "Suppressor","Test"]
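The new `Suppressor` entry joins the `[targets]` test list, and MLJTestIntegration remains a test dependency. A hedged sketch of how the two test modes described in ORGANIZATION.md above might be invoked from the REPL (that the integration tests are skipped unless the environment variable is set is an assumption drawn from that file's wording):

```julia
using Pkg

# ordinary test run of MLJ (integration tests assumed to be gated off by default)
Pkg.test("MLJ")

# full ecosystem integration run, per ORGANIZATION.md
ENV["MLJ_TEST_INTEGRATION"] = "true"
Pkg.test("MLJ")
```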

README.md

Lines changed: 2 additions & 1 deletion
@@ -42,14 +42,15 @@ framework?** Start [here](https://JuliaAI.github.io/MLJ.jl/dev/quick_start_guide
 
 MLJ was initially created as a Tools, Practices and Systems project at
 the [Alan Turing Institute](https://www.turing.ac.uk/)
-in 2019. Current funding is provided by a [New Zealand Strategic
+in 2019. Funding has also been provided by a [New Zealand Strategic
 Science Investment
 Fund](https://www.mbie.govt.nz/science-and-technology/science-and-innovation/funding-information-and-opportunities/investment-funds/strategic-science-investment-fund/ssif-funded-programmes/university-of-auckland/)
 awarded to the University of Auckland.
 
 MLJ has been developed with the support of the following organizations:
 
 <div align="center">
+    <img src="material/DFKI.png" width = 100/>
     <img src="material/Turing_logo.png" width = 100/>
     <img src="material/UoA_logo.png" width = 100/>
     <img src="material/IQVIA_logo.png" width = 100/>

docs/src/about_mlj.md

File mode changed: 100755 → 100644
Lines changed: 11 additions & 13 deletions
@@ -1,6 +1,6 @@
 # About MLJ
 
-MLJ (Machine Learning in Julia) is a toolbox written in Julia 
+MLJ (Machine Learning in Julia) is a toolbox written in Julia
 providing a common interface and meta-algorithms for selecting,
 tuning, evaluating, composing and comparing [over 180 machine learning
 models](@ref model_list) written in Julia and other languages. In
@@ -22,8 +22,7 @@ The first code snippet below creates a new Julia environment
 [Installation](@ref) for more on creating a Julia environment for use
 with MLJ.
 
-Julia installation instructions are
-[here](https://julialang.org/downloads/).
+Julia installation instructions are [here](https://julialang.org/downloads/).
 
 ```julia
 using Pkg
@@ -44,7 +43,7 @@ Loading and instantiating a gradient tree-boosting model:
 using MLJ
 Booster = @load EvoTreeRegressor # loads code defining a model type
 booster = Booster(max_depth=2) # specify hyper-parameter at construction
-booster.nrounds=50 # or mutate afterwards
+booster.nrounds = 50 # or mutate afterwards
 ```
 
 This model is an example of an iterative model. As it stands, the
@@ -92,7 +91,7 @@ it "self-tuning":
 ```julia
 self_tuning_pipe = TunedModel(model=pipe,
                               tuning=RandomSearch(),
-                              ranges = max_depth_range,
+                              ranges=max_depth_range,
                               resampling=CV(nfolds=3, rng=456),
                               measure=l1,
                               acceleration=CPUThreads(),
@@ -105,12 +104,12 @@ Loading a selection of features and labels from the Ames
 House Price dataset:
 
 ```julia
-X, y = @load_reduced_ames;
+X, y = @load_reduced_ames
 ```
 Evaluating the "self-tuning" pipeline model's performance using 5-fold
 cross-validation (implies multiple layers of nested resampling):
 
-```julia
+```julia-repl
 julia> evaluate(self_tuning_pipe, X, y,
                 measures=[l1, l2],
                 resampling=CV(nfolds=5, rng=123),
@@ -155,8 +154,7 @@ Extract:
 
 * Consistent interface to handle probabilistic predictions.
 
-* Extensible [tuning
-  interface](https://github.com/JuliaAI/MLJTuning.jl),
+* Extensible [tuning interface](https://github.com/JuliaAI/MLJTuning.jl),
   to support a growing number of optimization strategies, and designed
   to play well with model composition.
 
@@ -229,19 +227,19 @@ installed in a new
 [environment](https://julialang.github.io/Pkg.jl/v1/environments/) to
 avoid package conflicts. You can do this with
 
-```julia
+```julia-repl
 julia> using Pkg; Pkg.activate("my_MLJ_env", shared=true)
 ```
 
 Installing MLJ is also done with the package manager:
 
-```julia
+```julia-repl
 julia> Pkg.add("MLJ")
 ```
 
 **Optional:** To test your installation, run
 
-```julia
+```julia-repl
 julia> Pkg.test("MLJ")
 ```
 
@@ -252,7 +250,7 @@ environment to make model-specific code available. This
 happens automatically when you use MLJ's interactive load command
 `@iload`, as in
 
-```julia
+```julia-repl
 julia> Tree = @iload DecisionTreeClassifier # load type
 julia> tree = Tree() # instance
 ```
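The hunks above reference `pipe` and `max_depth_range`, which are defined earlier in about_mlj.md and not shown in this diff. A rough sketch of how such objects are typically constructed with MLJ follows; the component name `evo_tree_regressor` in the nested range is an assumption about how the pipeline names its boosting stage:

```julia
using MLJ
Booster = @load EvoTreeRegressor   # as in the snippet above
booster = Booster(max_depth=2)
booster.nrounds = 50

# a preprocessing + boosting pipeline
pipe = ContinuousEncoder() |> booster

# nested hyper-parameter range used by the TunedModel wrapper shown in the diff;
# the field path below is assumed, not taken from the file
max_depth_range = range(pipe, :(evo_tree_regressor.max_depth), lower=1, upper=10)
```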

docs/src/adding_models_for_general_use.md

File mode changed: 100755 → 100644
Lines changed: 1 addition & 1 deletion
@@ -5,4 +5,4 @@ suitable for addition to the MLJ Model Registry, consult the [MLJModelInterface.
 documentation](https://juliaai.github.io/MLJModelInterface.jl/dev/).
 
 For quick-and-dirty user-defined models see [Simple User Defined
-Models](simple_user_defined_models.md).
+Models](simple_user_defined_models.md).
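The page being edited points implementers to the MLJModelInterface documentation. For flavor only, here is a minimal hedged sketch of the kind of `fit`/`predict` methods that interface asks a deterministic model to supply; the toy `MeanPredictor` type is made up, and real implementations also declare traits and metadata as described in the linked docs:

```julia
import MLJModelInterface as MMI

# toy deterministic regressor: always predicts the training-target mean
mutable struct MeanPredictor <: MMI.Deterministic end

function MMI.fit(::MeanPredictor, verbosity, X, y)
    fitresult = sum(y) / length(y)   # the learned "parameters"
    cache = nothing
    report = (; nobs=length(y))
    return fitresult, cache, report
end

MMI.predict(::MeanPredictor, fitresult, Xnew) = fill(fitresult, MMI.nrows(Xnew))
```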

docs/src/api.md

File mode changed: 100755 → 100644
