[SPARK-11078] Ensure spilling tests actually spill #9124
This commit does several things:
- remove noisy warning in GrantEverythingMemoryManager
- remove duplicate code in ExternalSorterSuite
- add a force spill threshold to make it easier to verify spilling
- ensure spilling tests actually spill in ExternalSorterSuite
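To illustrate why a force spill threshold makes spilling easy to verify, here is a minimal, self-contained sketch. It is not the code from this patch: `SpillingBuffer`, `forceSpillThreshold`, and `spillCount` are hypothetical names, and the "spill" is only a move into a second in-memory list standing in for a spill file. The point is that when spilling is triggered by element count rather than estimated memory use, a test can assert a deterministic number of spills.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a spillable collection; not Spark's ExternalSorter.
final class SpillingBuffer[T](forceSpillThreshold: Int) {
  require(forceSpillThreshold > 0, "threshold must be positive")

  private val inMemory = ArrayBuffer.empty[T]
  // Each entry stands in for one spill file written to disk.
  private val spilled = ArrayBuffer.empty[Vector[T]]

  def spillCount: Int = spilled.size

  def insert(elem: T): Unit = {
    inMemory += elem
    // Spill on element count alone, regardless of estimated memory use.
    if (inMemory.size >= forceSpillThreshold) {
      spilled += inMemory.toVector
      inMemory.clear()
    }
  }

  // All elements: spilled batches first, then whatever is still in memory.
  def iterator: Iterator[T] =
    spilled.iterator.flatMap(_.iterator) ++ inMemory.iterator
}

object ForceSpillDemo {
  def main(args: Array[String]): Unit = {
    val buffer = new SpillingBuffer[Int](forceSpillThreshold = 100)
    (1 to 250).foreach(buffer.insert)

    // The test fails loudly if no spill happened, instead of passing vacuously.
    assert(buffer.spillCount == 2, s"expected 2 spills, got ${buffer.spillCount}")
    assert(buffer.iterator.toList == (1 to 250).toList)
  }
}
```

With a low threshold, a small input is enough to exercise the spill path, so the test stays fast while still checking both that a spill occurred and that no data was lost.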
Test build #43747 has finished for PR 9124 at commit
Test build #1904 has finished for PR 9124 at commit
sc = new SparkContext("local-cluster[1,1,1024]", "test", conf)

def createCombiner(i: String): ArrayBuffer[String] = ArrayBuffer[String](i)
def mergeValue(buffer: ArrayBuffer[String], i: String): ArrayBuffer[String] = buffer += i
def mergeCombiners(buffer1: ArrayBuffer[String], buffer2: ArrayBuffer[String])
  : ArrayBuffer[String] = buffer1 ++= buffer2
nit:
def mergeCombiners(
    buffer1: ArrayBuffer[String],
    buffer2: ArrayBuffer[String]): ArrayBuffer[String]
oops I screwed that up
LGTM overall.
OK, merging.
Test build #43812 has finished for PR 9124 at commit
Author: Andrew Or <[email protected]> Closes apache#9124 from andrewor14/spilling-tests.
@andrewor14 looks like we still have some failures of the form...
Do you think it's possible that there's a race here in reporting the metrics, so that things are spilling but not reported to the listener by the time the number of spilled tasks is checked? I could try putting in an |
I see, that's possible. The right thing to do here is to add a |
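The kind of race raised above can be sketched in isolation. This is an illustration only, not the fix proposed in this thread (both comments are cut off in this capture): `AsyncBus`, `postSpill`, and `awaitDrained` are hypothetical names, and the bus is a plain single-thread executor, not Spark's listener bus. It shows why a test that reads a listener-side counter should first wait for queued events to be delivered.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical asynchronous event bus; a stand-in, not Spark's listener bus.
final class AsyncBus {
  private val executor = Executors.newSingleThreadExecutor()
  private val pending = new AtomicInteger(0)

  // Listener-side metric, updated on the bus thread.
  val spilledTasks = new AtomicInteger(0)

  // Returns immediately; the counter is updated later on the bus thread.
  def postSpill(): Unit = {
    pending.incrementAndGet()
    executor.execute(new Runnable {
      override def run(): Unit = {
        try spilledTasks.incrementAndGet()
        finally pending.decrementAndGet()
      }
    })
  }

  // Polls until every posted event has been delivered or the timeout expires.
  def awaitDrained(timeoutMillis: Long): Boolean = {
    val deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis)
    while (pending.get() > 0 && System.nanoTime() < deadline) {
      Thread.sleep(1)
    }
    pending.get() == 0
  }

  def stop(): Unit = executor.shutdown()
}

object ListenerRaceDemo {
  def main(args: Array[String]): Unit = {
    val bus = new AsyncBus
    try {
      (1 to 5).foreach(_ => bus.postSpill())
      // Reading spilledTasks here, without draining, could observe fewer than 5.
      assert(bus.awaitDrained(5000), "events were not delivered in time")
      assert(bus.spilledTasks.get() == 5)
    } finally {
      bus.stop()
    }
  }
}
```

Because the counter is incremented before `pending` is decremented, a drained bus guarantees that every posted event is already reflected in `spilledTasks`.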
#9084 uncovered that many tests that are meant to test spilling don't actually spill. This is a follow-up patch that fixes this, so that our unit tests actually catch potential bugs in spilling. The size of this patch is inflated by the refactoring of ExternalSorterSuite, which had a lot of duplicate code and logic.