SPARK-6548 Adding stddev to DataFrame functions #6297
This reverts commit c40701a.
This reverts commit 3e7d889.
This reverts commit 634d6a7.
This reverts commit 125f3ae.
This reverts commit dfaa971.
This reverts commit ace454d.
This reverts commit 9c84695. Conflicts: docs/running-on-yarn.md
This reverts commit a399aa6. Conflicts: docs/running-on-yarn.md
@@ -292,6 +293,7 @@ class SqlParser extends AbstractSparkSQLParser with DataTypeParser {
    | AVG ~ "(" ~> expression <~ ")" ^^ { case exp => Average(exp) }
    | MIN ~ "(" ~> expression <~ ")" ^^ { case exp => Min(exp) }
    | MAX ~ "(" ~> expression <~ ")" ^^ { case exp => Max(exp) }
    | STDDEV ~ "(" ~> expression <~ ")" ^^ { case exp => Stddev(exp)}
We have changed how these plug in. You'll need to change the FunctionRegistry now.
Please don't test it yet; I need to make a change to accommodate the API change introduced by another JIRA.
Test build #38399 has finished for PR 6297 at commit
@JihongMA Will you have time to implement the function based on the new API? It would be good to merge it before the 1.5 deadline for new features (end of this month).
Test build #41730 has finished for PR 6297 at commit
Test build #41732 has finished for PR 6297 at commit
Test build #41748 has finished for PR 6297 at commit
Test build #42006 has finished for PR 6297 at commit
The R style check failure is caused by the commit for SPARK-8951.
Test build #42062 has finished for PR 6297 at commit
override def inputTypes: Seq[AbstractDataType] = Seq(TypeCollection(NumericType, NullType))

private val resultType = child.dataType match {
  case DecimalType.Fixed(p, s) =>
I think it should always return Double, because Sqrt() only works with Double, and other databases also just return Double/float.
Test build #42366 has finished for PR 6297 at commit
LGTM, merging this into master, thanks!
Adding STDDEV support for DataFrame, using a one-pass online/parallel algorithm to compute the variance. Please review the code change.
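For readers unfamiliar with the one-pass online/parallel approach to variance, here is a minimal standalone sketch in plain Scala. It is not the code from this PR: the `Moments` class and its method names are hypothetical, chosen only for illustration. It assumes Welford's per-value update and the pairwise merge of partial results described by Chan, Golub, and LeVeque. It also assumes the population form of the variance (dividing by n); a sample standard deviation would divide by n - 1 instead.

```scala
// Running summary of a stream of values: count, mean, and the sum of squared
// deviations from the mean (m2). Hypothetical names, not Spark's classes.
final case class Moments(n: Long, mean: Double, m2: Double) {

  // Fold one value into the summary (Welford's online update).
  def add(x: Double): Moments = {
    val n1 = n + 1
    val delta = x - mean
    val mean1 = mean + delta / n1
    Moments(n1, mean1, m2 + delta * (x - mean1))
  }

  // Combine two partial summaries computed on disjoint partitions.
  def merge(that: Moments): Moments =
    if (n == 0) that
    else if (that.n == 0) this
    else {
      val total = n + that.n
      val delta = that.mean - mean
      Moments(
        total,
        mean + delta * that.n / total,
        m2 + that.m2 + delta * delta * n * that.n / total)
    }

  // Population variance (assumption: divide by n); NaN for empty input.
  def variance: Double = if (n == 0) Double.NaN else m2 / n

  // The result is always a Double, since math.sqrt operates on Double.
  def stddev: Double = math.sqrt(variance)
}

object Moments {
  val empty: Moments = Moments(0L, 0.0, 0.0)

  // Single pass over the data, one update per value.
  def of(xs: Seq[Double]): Moments = xs.foldLeft(empty)(_.add(_))
}
```

Each partition can build its own `Moments` in a single pass, and the partial results can then be combined with `merge` in any order, so the data never needs to be scanned twice. For example, `Moments.of(left).merge(Moments.of(right)).stddev` should agree, up to floating-point rounding, with `Moments.of(left ++ right).stddev`.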