-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-5845][Shuffle] Time to cleanup spilled shuffle files not included in shuffle write time #4965
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test build #28438 has started for PR 4965 at commit
|
Test build #28438 has finished for PR 4965 at commit
|
Test FAILed. |
Test build #28449 has started for PR 4965 at commit
|
retest this please |
Test build #28449 has finished for PR 4965 at commit
|
Test FAILed. |
@@ -88,7 +88,13 @@ private[spark] class SortShuffleWriter[K, V, C]( | |||
} finally { | |||
// Clean up our sorter, which may have its own intermediate files | |||
if (sorter != null) { | |||
val startTime = System.nanoTime() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CC @kayousterhout
Just checking, this is meant to be in nanos and not milliseconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Sean - other usages of writeMetrics.incShuffleWriteTime also use nanoTime(). Please see BlockObjectWriter::callWithTiming() and ExternalSorter::writePartitionedFile.
retest this please |
Test build #28475 has started for PR 4965 at commit
|
Test build #28475 has finished for PR 4965 at commit
|
Test PASSed. |
context.taskMetrics().shuffleWriteMetrics.getOrElse({ | ||
metrics : ShuffleWriteMetrics => | ||
metrics.incShuffleWriteTime(System.nanoTime()-startTime) | ||
},Nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. As a matter of style I think it would be better to...
context.taskMetrics.shuffleWriteMetrics.foreach(
_.incShuffleWriteTime(System.nanoTime - startTime))
Which is what ExternalSorter
does. This looks like the correct bit to time.
Test build #28538 has started for PR 4965 at commit
|
Test build #28538 has finished for PR 4965 at commit
|
Test PASSed. |
I've added a timer in the right place to fix this inaccuracy.