[SPARK-5845][Shuffle] Time to cleanup spilled shuffle files not included in shuffle write time #4965

Closed
ilganeli wants to merge 5 commits

ilganeli

I've added a timer around the sorter cleanup in SortShuffleWriter, so the time spent cleaning up spilled shuffle files is now counted in the shuffle write time.

@SparkQA

SparkQA commented Mar 10, 2015

Test build #28438 has started for PR 4965 at commit 9434b50.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 10, 2015

Test build #28438 has finished for PR 4965 at commit 9434b50.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28438/

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28449 has started for PR 4965 at commit b946d08.

  • This patch merges cleanly.

@ilganeli
Author

retest this please

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28449 has finished for PR 4965 at commit b946d08.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28449/

@@ -88,7 +88,13 @@ private[spark] class SortShuffleWriter[K, V, C](
    } finally {
      // Clean up our sorter, which may have its own intermediate files
      if (sorter != null) {
        val startTime = System.nanoTime()
Member

CC @kayousterhout
Just checking, this is meant to be in nanos and not milliseconds?

Author

Hi Sean - other usages of writeMetrics.incShuffleWriteTime also use nanoTime(). Please see BlockObjectWriter::callWithTiming() and ExternalSorter::writePartitionedFile.
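
To illustrate the units question: a minimal sketch of measuring an interval with System.nanoTime() and converting to milliseconds only at the point of display, so every increment to the same counter stays in one unit. The object NanoUnits and its method names are hypothetical and not part of Spark; the only library calls used are System.nanoTime and java.util.concurrent.TimeUnit.

```scala
import java.util.concurrent.TimeUnit

// Hypothetical helper for illustration only; not part of Spark.
object NanoUnits {
  // Runs `block` and returns the elapsed time in nanoseconds.
  def elapsedNanos(block: => Unit): Long = {
    val start = System.nanoTime()
    block
    System.nanoTime() - start
  }

  // Converts a nanosecond count to whole milliseconds (truncating).
  def toMillis(nanos: Long): Long = TimeUnit.NANOSECONDS.toMillis(nanos)
}
```

Mixing a millisecond delta into a counter that other call sites increment in nanoseconds would under-report that portion by a factor of one million, which is why the unit is worth confirming.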

@ilganeli
Author

retest this please

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28475 has started for PR 4965 at commit 3e059b0.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 11, 2015

Test build #28475 has finished for PR 4965 at commit 3e059b0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28475/

@ilganeli changed the title from "[SPARK-5845] Time to cleanup spilled shuffle files not included in shuffle write time" to "[SPARK-5845][Core] Time to cleanup spilled shuffle files not included in shuffle write time" on Mar 12, 2015
@ilganeli changed the title from "[SPARK-5845][Core] Time to cleanup spilled shuffle files not included in shuffle write time" to "[SPARK-5845][Shuffle] Time to cleanup spilled shuffle files not included in shuffle write time" on Mar 12, 2015

        context.taskMetrics().shuffleWriteMetrics.getOrElse({
          metrics : ShuffleWriteMetrics =>
            metrics.incShuffleWriteTime(System.nanoTime()-startTime)
        },Nil)
Member

OK. As a matter of style I think it would be better to...

context.taskMetrics.shuffleWriteMetrics.foreach(
          _.incShuffleWriteTime(System.nanoTime - startTime))

Which is what ExternalSorter does. This looks like the correct bit to time.
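
A self-contained sketch of the suggested style, under stated assumptions: WriteMetricsStub is a hypothetical stand-in for ShuffleWriteMetrics, and timedCleanup is an illustrative helper rather than code from this patch. It shows Option.foreach applying the increment only when metrics are present, and doing nothing when they are absent.

```scala
// Hypothetical stand-in for Spark's ShuffleWriteMetrics; it models only
// the write-time counter discussed in this review.
final class WriteMetricsStub {
  private var writeTimeNanos = 0L
  def incShuffleWriteTime(nanos: Long): Unit = writeTimeNanos += nanos
  def shuffleWriteTime: Long = writeTimeNanos
}

object CleanupTiming {
  // Runs `cleanup` and, if metrics are defined, adds the elapsed
  // nanoseconds to them. With None, the cleanup still runs but
  // nothing is recorded.
  def timedCleanup(metrics: Option[WriteMetricsStub])(cleanup: => Unit): Unit = {
    val startTime = System.nanoTime()
    try cleanup
    finally metrics.foreach(_.incShuffleWriteTime(System.nanoTime() - startTime))
  }
}
```

By contrast, getOrElse returns the contained value or a default. It does not apply a function to the contained value, so the getOrElse form under review would not call incShuffleWriteTime at all; foreach is the form that runs a side effect on an Option.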

@SparkQA

SparkQA commented Mar 12, 2015

Test build #28538 has started for PR 4965 at commit bfabf88.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 13, 2015

Test build #28538 has finished for PR 4965 at commit bfabf88.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28538/
