-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-4075][SPARK-4434] Fix the URI validation logic for Application Jar name. #3326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test build #23503 has started for PR 3326 at commit
|
Test build #23503 has finished for PR 3326 at commit
|
Test PASSed. |
@@ -24,6 +24,7 @@ class ClientSuite extends FunSuite with Matchers { | |||
test("correctly validates driver jar URL's") { | |||
ClientArguments.isValidJarUrl("http://someHost:8080/foo.jar") should be (true) | |||
ClientArguments.isValidJarUrl("file://some/path/to/a/jarFile.jar") should be (true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not your change, but this should not be a valid jar URL right? My understanding is that one or three slashes are valid, but not two.
Hey @sarutak also are you resubmitting your original change for SPARK-4434? |
O.K, I'll resubmit original change including this test case and fixing to handle double slashed |
In deploy.ClientArguments.isValidJarUrl, the url is checked as follows. def isValidJarUrl(s: String): Boolean = s.matches("(.+):(.+)jar") So, it allows like 'hdfs:file.jar' (no authority). Author: Kousuke Saruta <[email protected]> Closes apache#2925 from sarutak/uri-syntax-check-improvement and squashes the following commits: cf06173 [Kousuke Saruta] Improved URI syntax checking
Test build #23520 has started for PR 3326 at commit
|
Test build #23520 has finished for PR 3326 at commit
|
Test PASSed. |
Thanks, can you rename the title of the PR now that this is not just tests anymore? |
Thanks for pointing out. I've now modified. |
@@ -73,7 +75,7 @@ private[spark] class ClientArguments(args: Array[String]) { | |||
|
|||
if (!ClientArguments.isValidJarUrl(_jarUrl)) { | |||
println(s"Jar url '${_jarUrl}' is not in valid format.") | |||
println(s"Must be a jar file path in URL format (e.g. hdfs://XX.jar, file://XX.jar)") | |||
println(s"Must be a jar file path in URL format (e.g. hdfs://XX.jar, file:///XX.jar)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're changing the file URL might as well change the hdfs one.
Just nits w.r.t. naming of things, otherwise looks ok. |
Test build #23538 has started for PR 3326 at commit
|
Test build #23538 has finished for PR 3326 at commit
|
Test FAILed. |
Test build #23539 has started for PR 3326 at commit
|
Test build #23539 has finished for PR 3326 at commit
|
Test PASSed. |
Test build #23552 has started for PR 3326 at commit
|
ClientArguments.isValidJarUrl("file://some/path/to/a/jarFile.jar") should be (true) | ||
|
||
// file scheme with authority and path is valid. | ||
ClientArguments.isValidJarUrl("file://somehost/path/to/a/jarFile.jar") should be (true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, why is this valid?
I think, double-slashed file scheme is valid in according to the specification of URI. Or, should we invalid double-slashed file scheme ( |
@andrewor14 all the URIs you mention as not valid are actually valid.
So those are all valid URIs, which is why no exception is thrown when you construct the URI object. |
I see. By "valid" here we mean the URI provides some combination of the scheme, the path and the authority. However, when you do two-slashes on
I suppose then it is the user's responsibility to not make this mistake. We can't prevent this because "path" could be a valid authority or hostname here and we won't be able to tell the difference. |
That's because you assume wrong. :-) "//" is the indication that the next component is the authority. You can omit the authority by not specifying "//". "file" is not special. It works just like any other URI (http, hdfs or otherwise). |
Test build #23552 has finished for PR 3326 at commit
|
Test PASSed. |
Ok, I'm merging this into master and 1.2. Not going to merge this into 1.1 because the benefit here is not worth potentially causing another regression. |
… Jar name. This PR adds a regression test for SPARK-4434. Author: Kousuke Saruta <[email protected]> Closes apache#3326 from sarutak/add-triple-slash-testcase and squashes the following commits: 82bc9cc [Kousuke Saruta] Fixed wrong grammar in comment 9149027 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into add-triple-slash-testcase c1c80ca [Kousuke Saruta] Fixed style 4f30210 [Kousuke Saruta] Modified comments 9e09da2 [Kousuke Saruta] Fixed URI validation for jar file d4b99ef [Kousuke Saruta] [SPARK-4075] [Deploy] Jar url validation is not enough for Jar file ac79906 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into add-triple-slash-testcase 6d4f47e [Kousuke Saruta] Added a test case as a regression check for SPARK-4434
… Jar name. This PR adds a regression test for SPARK-4434. Author: Kousuke Saruta <[email protected]> Closes #3326 from sarutak/add-triple-slash-testcase and squashes the following commits: 82bc9cc [Kousuke Saruta] Fixed wrong grammar in comment 9149027 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into add-triple-slash-testcase c1c80ca [Kousuke Saruta] Fixed style 4f30210 [Kousuke Saruta] Modified comments 9e09da2 [Kousuke Saruta] Fixed URI validation for jar file d4b99ef [Kousuke Saruta] [SPARK-4075] [Deploy] Jar url validation is not enough for Jar file ac79906 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into add-triple-slash-testcase 6d4f47e [Kousuke Saruta] Added a test case as a regression check for SPARK-4434 (cherry picked from commit bfebfd8) Signed-off-by: Andrew Or <[email protected]>
Hey @sarutak mind closing this? It's already merged |
Thanks for notification. I close this PR. |
This PR adds a regression test for SPARK-4434.