@@ -500,11 +500,14 @@ function tag_with_docstring(model_name::Symbol, description::String, bottom_matt
  Note that if you pass complex data `::Complex{L}`, then the loss
  type will automatically be set to `L`.
  - `selection_method::Function`: Function to select an expression from
- the Pareto frontier for use in `predict`. See `SymbolicRegression.MLJInterfaceModule.choose_best`
- for an example. This function should return a single integer specifying
- the index of the expression to use. By default, `choose_best` maximizes
+ the Pareto frontier for use in `predict`.
+ See `SymbolicRegression.MLJInterfaceModule.choose_best` for an example.
+ This function should return a single integer specifying
+ the index of the expression to use. By default, this maximizes
  the score (a pound-for-pound rating) of expressions reaching the threshold
- of 1.5x the minimum loss. To fix the index at `5`, you could just write `Returns(5)`.
+ of 1.5x the minimum loss. To override this at prediction time, you can pass
+ a named tuple with keys `data` and `idx` to `predict`. See the Operations
+ section for details.
  - `dimensions_type::AbstractDimensions`: The type of dimensions to use when storing
  the units of the data. By default this is `DynamicQuantities.SymbolicDimensions`.
  """
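The default selection rule described in this hunk can be sketched in plain Julia. This is only an illustration under assumptions: the function name `choose_best_sketch`, its keyword names `losses` and `scores`, and the numbers are hypothetical. See `SymbolicRegression.MLJInterfaceModule.choose_best` for the real implementation.

```julia
# Hypothetical sketch of the documented default rule: among expressions whose
# loss is within 1.5x the minimum loss, return the index of the highest score.
function choose_best_sketch(; losses::AbstractVector{<:Real}, scores::AbstractVector{<:Real})
    threshold = 1.5 * minimum(losses)
    # Expressions above the threshold are excluded by masking their score with -Inf.
    masked = [losses[i] <= threshold ? float(scores[i]) : -Inf for i in eachindex(losses)]
    return argmax(masked)
end

# Made-up Pareto frontier, ordered by increasing complexity.
losses = [10.0, 4.0, 1.2, 1.0]
scores = [0.0, 0.9, 1.2, 0.2]

idx = choose_best_sketch(; losses, scores)
println(idx)  # prints 3: only entries 3 and 4 pass the threshold, and entry 3 scores higher
```

The function returns a single integer index, which is the contract the docstring states for `selection_method`.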
@@ -515,7 +518,7 @@ function tag_with_docstring(model_name::Symbol, description::String, bottom_matt
  - `predict(mach, Xnew)`: Return predictions of the target given features `Xnew`, which
  should have same scitype as `X` above. The expression used for prediction is defined
  by the `selection_method` function, which can be seen by viewing `report(mach).best_idx`.
- - `predict(mach, (; data=Xnew, idx=i))`: Return predictions of the target given features
+ - `predict(mach, (data=Xnew, idx=i))`: Return predictions of the target given features
  `Xnew`, which should have same scitype as `X` above. By passing a named tuple with keys
  `data` and `idx`, you are able to specify the equation you wish to evaluate in `idx`.

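The two spellings in this hunk, `(; data=Xnew, idx=i)` and `(data=Xnew, idx=i)`, construct the same `NamedTuple` in Julia, so the change is cosmetic. The sketch below checks that and shows one way a function could pick an expression by `idx`. The names `predict_sketch`, `equations`, and `best_idx` are hypothetical stand-ins, not the package's API.

```julia
Xnew = [1.0 2.0; 3.0 4.0]

# Both spellings build a NamedTuple with keys `data` and `idx`.
a = (; data=Xnew, idx=2)
b = (data=Xnew, idx=2)
println(a == b)  # prints true

# Toy "expressions": plain functions of a feature matrix (rows are samples here).
equations = [X -> X[:, 1], X -> X[:, 1] .+ X[:, 2], X -> X[:, 1] .* X[:, 2]]
best_idx = 3  # stand-in for the index chosen by a selection method

# Plain data: use the default index.
predict_sketch(X::AbstractMatrix) = equations[best_idx](X)
# Named tuple with keys `data` and `idx`: use the caller's index instead.
predict_sketch(nt::NamedTuple{(:data, :idx)}) = equations[nt.idx](nt.data)

println(predict_sketch(Xnew))                   # prints [2.0, 12.0]
println(predict_sketch((data=Xnew, idx=2)))     # prints [3.0, 7.0]
```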
@@ -578,7 +581,8 @@ eval(
  Note that unlike other regressors, symbolic regression stores a list of
  trained models. The model chosen from this list is determined by the
  `selection_method` keyword argument, which by default balances accuracy
- and complexity.
+ and complexity. You can override this at prediction time by passing a named
+ tuple with keys `data` and `idx`.

  """ ,
  r" ^ " => " " ,
@@ -590,7 +594,8 @@ eval(
  The fields of `fitted_params(mach)` are:

  - `best_idx::Int`: The index of the best expression in the Pareto frontier,
- as determined by the `selection_method` function.
+ as determined by the `selection_method` function. Override in `predict` by passing
+ a named tuple with keys `data` and `idx`.
  - `equations::Vector{Node{T}}`: The expressions discovered by the search, represented
  in a dominating Pareto frontier (i.e., the best expressions found for
  each complexity). `T` is equal to the element type
@@ -701,7 +706,8 @@ eval(
  Note that unlike other regressors, symbolic regression stores a list of lists of
  trained models. The model chosen from each of these lists is determined by the
  `selection_method` keyword argument, which by default balances accuracy
- and complexity.
+ and complexity. You can override this at prediction time by passing a named
+ tuple with keys `data` and `idx`.

  """ ,
  r" ^ " => " " ,
@@ -713,7 +719,8 @@ eval(
  The fields of `fitted_params(mach)` are:

  - `best_idx::Vector{Int}`: The index of the best expression in each Pareto frontier,
- as determined by the `selection_method` function.
+ as determined by the `selection_method` function. Override in `predict` by passing
+ a named tuple with keys `data` and `idx`.
  - `equations::Vector{Vector{Node{T}}}`: The expressions discovered by the search, represented
  in a dominating Pareto frontier (i.e., the best expressions found for
  each complexity). The outer vector is indexed by target variable, and the inner
@@ -727,7 +734,8 @@ eval(
  The fields of `report(mach)` are:

  - `best_idx::Vector{Int}`: The index of the best expression in each Pareto frontier,
- as determined by the `selection_method` function.
+ as determined by the `selection_method` function. Override in `predict` by passing
+ a named tuple with keys `data` and `idx`.
  - `equations::Vector{Vector{Node{T}}}`: The expressions discovered by the search, represented
  in a dominating Pareto frontier (i.e., the best expressions found for
  each complexity). The outer vector is indexed by target variable, and the inner