[SPARK-9471] [ML] Multilayer Perceptron #7621
Conversation
Test build #38237 has finished for PR 7621 at commit
Test build #38239 has finished for PR 7621 at commit
Test build #38242 has finished for PR 7621 at commit
 * Implements Layer instantiation.
 *
 */
private[ann] trait Layer extends Serializable {
I think it should be public. Some people may want to customize it.
@mengxr suggested making it private in this release, so that we can still modify it if needed, and making it public in the next one.
OK, I see. This is understandable.
I'm not sure I understand the benefit of separating Layer and LayerModel this way. Can we have a unified Layer, just by moving the getInstance methods to LayerModel and renaming it to Layer?
We need a separate lightweight instance for holding the Layer properties, so we can pass it easily through the network to the executors on each iteration. We don't want to pass LayerModels because they contain the weights, and the weights are transmitted by the GradientDescent or LBFGS broadcast/treeAggregate routines. Does that answer your question?
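To make the split concrete, here is a minimal sketch pieced together from the signatures visible in this diff. The getInstance signatures and anything not shown in the diff are assumptions for illustration only, not the exact code in this PR.

```scala
import breeze.linalg.{DenseMatrix => BDM}
import org.apache.spark.mllib.linalg.Vector

// Lightweight, serializable layer description: topology/properties only,
// no weights, so it is cheap to ship to executors on every iteration.
trait Layer extends Serializable {
  // Builds the heavyweight counterpart from a slice of the flat weight vector.
  def getInstance(weights: Vector, position: Int): LayerModel
  // Builds a counterpart with randomly initialized weights.
  def getInstance(seed: Long): LayerModel
}

// Heavyweight counterpart that owns the weights and does the math. The weights
// themselves travel via the GradientDescent/LBFGS broadcast/treeAggregate
// routines, so LayerModel is not shipped around on every iteration.
trait LayerModel extends Serializable {
  def eval(data: BDM[Double]): BDM[Double]
  def prevDelta(nextDelta: BDM[Double], input: BDM[Double]): BDM[Double]
}
```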
layers: Array[Int],
weights: Vector)
extends PredictionModel[Vector, MultilayerPerceptronClassifierModel]
with Serializable {
Support model save/load?
Do you know if there exists a generic model loader/saver in Spark ML? The only thing I can think of is sc.parallelize(Seq(model), 1).saveAsObjectFile("model"), which honestly does not look good.
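For reference, the workaround mentioned above could be wrapped into a small helper like the sketch below; the object and method names are illustrative and not an existing Spark API.

```scala
import scala.reflect.ClassTag
import org.apache.spark.SparkContext

object ModelIO {
  // Serializes the model as a single-element object file and reads it back.
  // This is the ad-hoc workaround from the comment above, not a proper
  // ML persistence API.
  def saveModel[M: ClassTag](sc: SparkContext, model: M, path: String): Unit =
    sc.parallelize(Seq(model), 1).saveAsObjectFile(path)

  def loadModel[M: ClassTag](sc: SparkContext, path: String): M =
    sc.objectFile[M](path).first()
}
```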
Let's do that in a follow-up PR to keep this PR minimal.
OK, I see.
Test build #38358 has finished for PR 7621 at commit
 * @return delta
 */
def prevDelta(nextDelta: BDM[Double], input: BDM[Double]): BDM[Double]
It's possible that prevDelta needs to get some properties from the next layer (at least for a CNN). I'm not sure if it's a special (rare) requirement. Of course there are workarounds, yet it would be handy to have a reference to the next layer.
Could you elaborate on which properties are needed from the next layer? @witgo pointed out that weight initialization in AffineLayer depends on the next FunctionalLayer (https://github.com/apache/spark/pull/7621/files#r35435404). The latter can be handled in the ANN factory.
the number of feature maps in the next layer
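Purely to illustrate that request (this is not part of the PR), a back-propagation hook with access to the next layer's model might look like the hypothetical sketch below; all names in it are made up.

```scala
import breeze.linalg.{DenseMatrix => BDM}

// Hypothetical variant: prevDelta also receives the next layer's model,
// so properties such as its number of feature maps are available while
// computing the delta for the previous layer.
trait NextAwareLayerModel extends Serializable {
  def prevDelta(nextDelta: BDM[Double],
                input: BDM[Double],
                nextLayer: NextAwareLayerModel): BDM[Double]
}
```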
@avulanov Could you update the PR title to be more specific, e.g., "[SPARK-2352][ML] Multilayer Perceptron"?
with Logging {

/** @group setParam */
def setInputCol(value: String): this.type = set(inputCol, value)
Do we want to use inputCol and outputCol instead of featuresCol and labelCol?
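For context, the Predictor-style setters the question refers to look roughly like this at the call site, assuming the classifier keeps the standard featuresCol/labelCol parameters; column names and layer sizes are placeholders.

```scala
import org.apache.spark.ml.classification.MultilayerPerceptronClassifier

val trainer = new MultilayerPerceptronClassifier()
  .setLayers(Array(4, 5, 3))    // input, hidden, and output layer sizes
  .setFeaturesCol("features")   // rather than a generic inputCol
  .setLabelCol("label")         // rather than a generic outputCol
  .setMaxIter(100)
```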
Test build #39037 has finished for PR 7621 at commit
…d DataStacker class
@mengxr Thank you for your review! All comments are addressed. Should I change the name of my git branch (it is SPARK-2352-ann)?
Thanks! The branch name doesn't matter :) I will make another pass today.
Test build #39045 has finished for PR 7621 at commit
LGTM
 */
def setSeed(value: Long): this.type = set(seed, value)

setDefault(maxIter -> 100, tol -> 1e-4, layers -> Array(1, 1), blockSize -> 100)
Remove the default value for layers. Shall we change the default block size to a power of 2, e.g., 128?
@avulanov I haven't finished a full pass, but feel free to update the PR to address the current comments.
@mengxr Thanks again! Done.
Test build #39126 has finished for PR 7621 at commit
LGTM. Merged into master. There are some minor issues we can address during the QA period. Thanks @avulanov for the implementation and performance tests, and everyone who helped!!
@avulanov Is there no way to access the per-class probabilities of the MultilayerPerceptronClassificationModel before the predict method decodes them into a label? It would be interesting to have such a feature, but instead the model is a black box with everything private. Any reason for that?
@sbrouil Indeed, this feature is worth implementing. We have already discussed it on the mailing list: https://mail-archives.apache.org/mod_mbox/spark-user/201511.mbox/%3C9D5B00849D2CDA4386BDA89E83F69E6C194CF351@G9W0737.americas.hpqcorp.net%3E
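Purely to illustrate the shape such a feature might take (nothing like this exists in the merged model; all names below are hypothetical):

```scala
import org.apache.spark.mllib.linalg.Vector

// Hypothetical API sketch: expose the network's raw per-class output vector
// instead of immediately decoding it to the index of the largest output.
trait HasClassProbabilities {
  /** Runs the network forward and returns one score per output neuron. */
  def predictProbabilities(features: Vector): Vector
}
```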
Summary
This pull request contains the following feature for ML: a multilayer perceptron classifier.
This implementation is based on our initial pull request with @bgreeven (#1290) and is inspired by very insightful suggestions from @mengxr and @witgo (I would like to thank all the other people from the mentioned thread for the useful discussions). The original code was extensively tested and benchmarked. Since then, I've addressed the two main requirements that prevented the code from being merged into the main branch:
Layer and LayerModel. They are used for constructing the layers of an ANN. New layers can be added by extending the Layer and LayerModel traits. These traits are private in this release in order to leave room to improve them based on community feedback.
Other implementations based on the proposed interface
@mengxr and @dbtsai kindly agreed to perform code review.