-
Notifications
You must be signed in to change notification settings - Fork 28.8k
[SPARK-3912][Streaming] Fixed flakyFlumeStreamSuite #2773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
QA tests have started for PR 2773 at commit
|
QA tests have finished for PR 2773 at commit
|
Test PASSed. |
Jenkins, test this please. |
QA tests have started for PR 2773 at commit
|
QA tests have finished for PR 2773 at commit
|
Test PASSed. |
while (outputBuffer.size < input.size && System.currentTimeMillis() - startTime < maxWaitTimeMillis) { | ||
logInfo("output.size = " + outputBuffer.size + ", input.size = " + input.size) | ||
Thread.sleep(100) | ||
eventually(timeout(10 seconds), interval(100 milliseconds)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to retry this test multiple times? The usual case where the test fails is mainly because of bind issues, correct? Since findFreePort (sort of) takes care of that..this does not seem to help.
There is a small race condition that can be taken care of, using eventually though - where the free port is taken before the bind, in which case we can use a new free port, by calling findFreePort inside the eventually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I found that a lot of time the test was failing because of the uncertainty on when the Flume receiver is ready to receive the new connection. Even after the connection gets accepted, sending data does not return Status.OK (and the data that streaming receives has empty strings). I am not sure what is the reason behind this but this seems like a fairly robust way to send all the data once in a single shot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, sounds good!
+1. This looks good. |
@harishreedharan @pwendell
See JIRA for diagnosis of the problem
https://issues.apache.org/jira/browse/SPARK-3912
The solution was to reimplement it.