Description
In line with #416, I propose we move `UnivariateFinite` out to a new package called CategoricalDistributions.jl.
If the current host of MLJBase.jl is okay with this (I need to check this, @vollmersj), it might make sense for the package to live at JuliaData (host of CategoricalArrays.jl) or JuliaStats (host of Distributions.jl). I wonder what the curators of those organisations think of the idea?
@nalimilan @bkamins @andreasnoack @devmotion @matbesancon
Recall that `UnivariateFinite` consists of the following:
- A composite type `UnivariateFinite{S,V,R,P<:Real}` for encoding the probability distribution associated with a finite labelled set of points, as opposed to the distribution `Categorical` from Distributions.jl, whose sample space is always a collection of integers. The sample space of a `UnivariateFinite` instance is a `CategoricalPool` object from CategoricalArrays.jl.
- Implementation of relevant parts of the Distributions.jl API, including `rand`, `pdf`, `logpdf`, `support`, `params`, `mode`, and `fit` (which fits to a `CategoricalVector`).
- A wrapper `UnivariateFiniteArray` for arrays of such objects (sharing a common sample space / pool). This type, implementing the `AbstractArray` API, is optimised for fast indexing, and for broadcasting of `pdf` and `logpdf` (which turned out to be essential in our applications to machine learning).
- A fairly elaborate constructor for `UnivariateFiniteArray` objects from matrices of probabilities. See this docstring.
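To make the list above concrete for readers unfamiliar with the type, here is a minimal, dependency-free sketch of the idea. It is not the MLJBase implementation: `LabelledFinite` and `LabelledFiniteVector` are hypothetical stand-ins, plain vectors replace the `CategoricalPool`, and a dedicated `pdf` method on the array replaces the broadcasting optimisation.

```julia
using Random

# Hypothetical stand-in for a distribution over a finite set of labels.
struct LabelledFinite{L,P<:Real}
    labels::Vector{L}   # sample space, in a fixed order
    probs::Vector{P}    # probs[i] is the probability of labels[i]
    function LabelledFinite(labels::Vector{L}, probs::Vector{P}) where {L,P<:Real}
        length(labels) == length(probs) || throw(ArgumentError("length mismatch"))
        allunique(labels) || throw(ArgumentError("labels must be unique"))
        all(>=(0), probs) && isapprox(sum(probs), 1) ||
            throw(ArgumentError("probabilities must be non-negative and sum to one"))
        new{L,P}(labels, probs)
    end
end

support(d::LabelledFinite) = d.labels

# Probability of a label; anything outside the sample space gets zero mass.
function pdf(d::LabelledFinite, x)
    i = findfirst(isequal(x), d.labels)
    return i === nothing ? zero(eltype(d.probs)) : d.probs[i]
end

logpdf(d::LabelledFinite, x) = log(pdf(d, x))

mode(d::LabelledFinite) = d.labels[argmax(d.probs)]

# Inverse-CDF sampling: returns a label, not an integer code.
function Base.rand(rng::AbstractRNG, d::LabelledFinite)
    u = rand(rng)
    c = zero(u)
    for (label, p) in zip(d.labels, d.probs)
        c += p
        u < c && return label
    end
    return last(d.labels)  # guard against floating-point round-off
end

# Hypothetical array of distributions sharing one label set:
# row i of `probs` holds the probabilities of the i-th distribution.
struct LabelledFiniteVector{L,P<:Real} <: AbstractVector{LabelledFinite{L,P}}
    labels::Vector{L}
    probs::Matrix{P}
end

Base.size(v::LabelledFiniteVector) = (size(v.probs, 1),)
Base.getindex(v::LabelledFiniteVector, i::Int) =
    LabelledFinite(v.labels, v.probs[i, :])

# Evaluate one label across all distributions by reading a single column,
# instead of constructing each element and calling `pdf` on it.
function pdf(v::LabelledFiniteVector, x)
    j = findfirst(isequal(x), v.labels)
    return j === nothing ? zeros(eltype(v.probs), length(v)) : v.probs[:, j]
end

d = LabelledFinite(["no", "yes", "maybe"], [0.2, 0.5, 0.3])
v = LabelledFiniteVector(["no", "yes"], [0.1 0.9; 0.7 0.3])
```

With these definitions, `pdf(d, "yes")` returns `0.5`, `mode(d)` returns `"yes"`, and `pdf(v, "yes")` returns the second column of the probability matrix. Storing the probabilities as one matrix is what makes the array-level `pdf` cheap in this sketch.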
Technical note. I'm hoping this migration will be fairly painless, but there is one issue to be aware of: currently the `UnivariateFinite` constructor stub lives in MLJModelInterface, but the type and all real functionality live in MLJBase (which depends on MLJModelInterface). The reason for this was to keep MLJModelInterface (the sole dependency of third-party packages implementing MLJ's model API) super lightweight. So this needs sorting out.
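For anyone not familiar with the stub arrangement, the general pattern can be sketched as below. This is only an illustration of the idea, not necessarily how MLJModelInterface and MLJBase actually wire things up; the module names `LightInterface` and `HeavyImplementation`, the function `finite_distribution`, and the type `FiniteDist` are all hypothetical.

```julia
# Hypothetical lightweight interface package: declares names only.
module LightInterface
export finite_distribution
# A function with no methods: a stub that other packages can extend.
function finite_distribution end
end

# Hypothetical heavier package: owns the concrete type and the real method.
module HeavyImplementation
import ..LightInterface

struct FiniteDist
    labels::Vector{String}
    probs::Vector{Float64}
end

# Attach the actual functionality to the stub declared in LightInterface.
LightInterface.finite_distribution(labels::Vector{String}, probs::Vector{Float64}) =
    FiniteDist(labels, probs)
end

# A third-party package only needs the lightweight module in scope;
# the call works once the heavy module has been loaded somewhere.
using .LightInterface
d = finite_distribution(["no", "yes"], [0.3, 0.7])
```

The point of the split is that code depending only on the lightweight module never pays the load cost of the implementation. The open question in this issue is which package should own the stub once the type moves.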