docs/ml-guide.md (54 additions, 0 deletions)

There are now several algorithms in the Pipelines API which are not in the lower-level MLlib API:

* [Feature Extraction, Transformation, and Selection](ml-features.html)
* [Ensembles](ml-ensembles.html)

## Linear Methods with Elastic Net Regularization
[Elastic net](http://users.stat.umn.edu/~zouxx019/Papers/elasticnet.pdf) is a hybrid of L1 and L2 regularization. Mathematically, it is defined as a linear combination of the L1-norm and the L2-norm:
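As a sketch, one standard way to write the penalty (the notation $\mathbf{w}$ and the exact factoring of $\lambda$ below are illustrative assumptions rather than this guide's own formulation) is

$$
\alpha \left( \lambda \|\mathbf{w}\|_1 \right) + (1 - \alpha) \left( \frac{\lambda}{2} \|\mathbf{w}\|_2^2 \right), \qquad \alpha \in [0, 1], \ \lambda \geq 0,
$$

where $\mathbf{w}$ is the weight vector, $\lambda$ controls the overall regularization strength, and $\alpha$ mixes the L1 and L2 terms.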
By setting $\alpha$ properly, it contains both L1 and L2 regularization as special cases: in the form above, $\alpha = 1$ gives pure L1 (lasso) regularization and $\alpha = 0$ gives pure L2 (ridge) regularization. We implement both linear regression and logistic regression with elastic net regularization.
**Examples**

<div class="codetabs">
<div data-lang="scala" markdown="1">
The following code snippet illustrates how to load a sample dataset, train a logistic regression model with elastic net regularization on the training data, and make predictions with the resulting model to compute the training error.
{% highlight scala %}
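// A minimal sketch rather than the official example: the data path, the use of
// spark-shell's `sc` and `sqlContext`, and the chosen parameter values are all
// assumptions made for illustration.
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.mllib.util.MLUtils

import sqlContext.implicits._  // enables .toDF() on an RDD of LabeledPoints

// Load training data in LIBSVM format and convert it to a DataFrame.
val training = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt").toDF()

// Configure logistic regression with elastic net regularization:
// regParam is the overall regularization strength and elasticNetParam is the
// L1/L2 mixing parameter (1.0 = pure L1, 0.0 = pure L2).
val lr = new LogisticRegression()
  .setMaxIter(100)
  .setRegParam(0.3)
  .setElasticNetParam(0.8)

// Fit the model on the training data.
val lrModel = lr.fit(training)

// Predict on the training set and compute the training error.
val predictions = lrModel.transform(training)
val trainingError = predictions.filter("prediction != label").count().toDouble / training.count()
println(s"Training error: $trainingError")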
{% endhighlight %}
</div>
<div data-lang="java" markdown="1">
All of MLlib's methods use Java-friendly types, so you can import and call them the same way you do in Scala. The only caveat is that the methods take Scala RDD objects, while the Spark Java API uses a separate `JavaRDD` class. You can convert a Java RDD to a Scala one by calling `.rdd()` on your `JavaRDD` object. A self-contained application example that is equivalent to the provided example in Scala is given below:
{% highlight java %}
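// A minimal, self-contained sketch rather than the official example: the class
// name, data path, and parameter values are assumptions made for illustration.
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.util.MLUtils;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class ElasticNetLogisticRegressionExample {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("ElasticNetLogisticRegressionExample");
    JavaSparkContext jsc = new JavaSparkContext(conf);
    SQLContext sqlContext = new SQLContext(jsc);

    // Load training data in LIBSVM format and convert it to a DataFrame.
    DataFrame training = sqlContext.createDataFrame(
      MLUtils.loadLibSVMFile(jsc.sc(), "data/mllib/sample_libsvm_data.txt").toJavaRDD(),
      LabeledPoint.class);

    // Configure logistic regression with elastic net regularization.
    LogisticRegression lr = new LogisticRegression()
      .setMaxIter(100)
      .setRegParam(0.3)
      .setElasticNetParam(0.8);

    // Fit the model on the training data.
    LogisticRegressionModel lrModel = lr.fit(training);

    // Predict on the training set and compute the training error.
    DataFrame predictions = lrModel.transform(training);
    double trainingError =
      (double) predictions.filter("prediction != label").count() / training.count();
    System.out.println("Training error: " + trainingError);

    jsc.stop();
  }
}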
{% endhighlight %}
</div>
<div data-lang="python" markdown="1">
The following example shows how to load a sample dataset, build a logistic regression model, and make predictions with the resulting model to compute the training error.

Note that the Python API does not yet support model save/load but will in the future.
{% highlight python %}
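# A minimal sketch rather than the official example: the data path and parameter
# values are assumptions, and `sc` is taken to be an existing SparkContext with a
# SQLContext already created (as in the pyspark shell).
from pyspark.ml.classification import LogisticRegression
from pyspark.mllib.util import MLUtils

# Load training data in LIBSVM format and convert it to a DataFrame.
training = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt").toDF()

# Configure logistic regression with elastic net regularization:
# regParam is the overall regularization strength and elasticNetParam is the
# L1/L2 mixing parameter (1.0 = pure L1, 0.0 = pure L2).
lr = LogisticRegression(maxIter=100, regParam=0.3, elasticNetParam=0.8)

# Fit the model on the training data.
lrModel = lr.fit(training)

# Predict on the training set and compute the training error.
predictions = lrModel.transform(training)
training_error = predictions.filter("prediction != label").count() / float(training.count())
print("Training error: %g" % training_error)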
{% endhighlight %}
</div>
</div>
### Optimization
The optimization algorithm underlying the implementation is called [Orthant-Wise Limited-memory Quasi-Newton](http://research-srv.microsoft.com/en-us/um/people/jfgao/paper/icml07scalable.pdf) (OWL-QN). It is an extension of L-BFGS that can effectively handle L1 regularization and elastic net.
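As a sketch of the kind of objective such a solver targets (the notation reuses the penalty form shown earlier and is an illustrative assumption, not the guide's own formulation): a smooth loss term plus the elastic net penalty, whose L1 part is non-differentiable at zero,

$$
\min_{\mathbf{w}} \; \frac{1}{n} \sum_{i=1}^{n} L(\mathbf{w}; \mathbf{x}_i, y_i) + \alpha \left( \lambda \|\mathbf{w}\|_1 \right) + (1 - \alpha) \left( \frac{\lambda}{2} \|\mathbf{w}\|_2^2 \right).
$$

Plain L-BFGS assumes a smooth objective, which is why the orthant-wise extension is needed to handle the $\|\mathbf{w}\|_1$ term.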
L2-regularized problems are generally easier to solve than L1-regularized problems due to smoothness. However, L1 regularization can help promote sparsity in the weights, leading to smaller and more interpretable models, the latter of which can be useful for feature selection. [Elastic net](http://users.stat.umn.edu/~zouxx019/Papers/elasticnet.pdf) is a combination of L1 and L2 regularization. It is not recommended to train models without any regularization, especially when the number of training examples is small.