
[SPARK-18890][CORE] Move task serialization from the TaskSetManager to the CoarseGrainedSchedulerBackend #15505


Closed
wants to merge 14 commits

Conversation

witgo
Contributor

@witgo witgo commented Oct 16, 2016

What changes were proposed in this pull request?

Performance Testing:

The code:

// Create an RDD with 100000 partitions so that per-task scheduling and
// serialization overhead dominates the job's run time.
val rdd = sc.parallelize(0 until 100).repartition(100000)
rdd.localCheckpoint().count()
rdd.sum() // warm-up run
// Time ten runs of a job over the checkpointed partitions.
(1 to 10).foreach { i =>
  val start = System.currentTimeMillis()
  rdd.sum()
  val finish = System.currentTimeMillis()
  println(f"Test $i: ${(finish - start) / 1000D}%1.2f s")
}

and spark-defaults.conf file:

spark.master                                      yarn-client
spark.executor.instances                          20
spark.driver.memory                               64g
spark.executor.memory                             30g
spark.executor.cores                              5
spark.default.parallelism                         100 
spark.sql.shuffle.partitions                      100
spark.serializer                                  org.apache.spark.serializer.KryoSerializer
spark.driver.maxResultSize                        0
spark.ui.enabled                                  false 
spark.driver.extraJavaOptions                     -XX:+UseG1GC -XX:+UseStringDeduplication -XX:G1HeapRegionSize=16M -XX:MetaspaceSize=512M 
spark.executor.extraJavaOptions                   -XX:+UseG1GC -XX:+UseStringDeduplication -XX:G1HeapRegionSize=16M -XX:MetaspaceSize=256M 
spark.cleaner.referenceTracking.blocking          true
spark.cleaner.referenceTracking.blocking.shuffle  true

The test results are as follows:

           SPARK-17931   db0ddce
Run time   9.427 s       9.566 s

How was this patch tested?

Existing tests.

@SparkQA

SparkQA commented Oct 16, 2016

Test build #67033 has finished for PR 15505 at commit b51d00c.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 16, 2016

Test build #67035 has finished for PR 15505 at commit 771949e.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@witgo witgo changed the title [WIP][SPARK-17931]taskScheduler has some unneeded serialization [SPARK-17931]taskScheduler has some unneeded serialization Oct 17, 2016
@SparkQA

SparkQA commented Oct 17, 2016

Test build #67062 has finished for PR 15505 at commit bee165a.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@witgo witgo changed the title [SPARK-17931]taskScheduler has some unneeded serialization [SPARK-17931][CORE] taskScheduler has some unneeded serialization Oct 17, 2016
@SparkQA

SparkQA commented Oct 17, 2016

Test build #67056 has finished for PR 15505 at commit 8a6062d.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wzhfy
Contributor

wzhfy commented Oct 18, 2016

There are many unnecessary changes; can you revert them to minimize the diff? That'll be easier for others to review. :)

@witgo
Contributor Author

witgo commented Oct 18, 2016

@wzhfy
OK, the code has been modified.

@SparkQA

SparkQA commented Oct 18, 2016

Test build #67109 has finished for PR 15505 at commit d956ff5.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 18, 2016

Test build #67110 has finished for PR 15505 at commit ca9da40.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 18, 2016

Test build #67126 has finished for PR 15505 at commit 80eed8f.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 19, 2016

Test build #67159 has finished for PR 15505 at commit 589f3bb.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@witgo
Contributor Author

witgo commented Oct 19, 2016

cc @rxin

@witgo witgo changed the title [SPARK-17931][CORE] taskScheduler has some unneeded serialization [WIP][SPARK-17931][CORE] taskScheduler has some unneeded serialization Oct 19, 2016
@SparkQA

SparkQA commented Oct 19, 2016

Test build #67179 has finished for PR 15505 at commit 84488c4.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 19, 2016

Test build #67184 has finished for PR 15505 at commit 8a6b37e.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 20, 2016

Test build #67242 has finished for PR 15505 at commit 07b0581.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@witgo witgo changed the title [WIP][SPARK-17931][CORE] taskScheduler has some unneeded serialization [SPARK-17931][CORE] taskScheduler has some unneeded serialization Nov 4, 2016
@SparkQA

SparkQA commented Nov 4, 2016

Test build #68141 has finished for PR 15505 at commit ace0114.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Nov 4, 2016

This is pretty big and will miss 2.1.

But cc @kayousterhout / @squito / @JoshRosen

@@ -486,7 +481,7 @@ private[spark] class Executor(
* Download any missing dependencies if we receive a new set of files and JARs from the
* SparkContext. Also adds any new JARs we fetched to the class loader.
*/
-  private def updateDependencies(newFiles: HashMap[String, Long], newJars: HashMap[String, Long]) {
+  private def updateDependencies(newFiles: Map[String, Long], newJars: Map[String, Long]) {
Contributor

Is this change necessary?

Contributor Author

This is not necessary, I will change it back.

Contributor Author

@witgo witgo Nov 7, 2016

@wzhfy These parameters come from sc.addedFiles and sc.addedJars, whose types are mutable.Map[String, Long], so this change is reasonable.

@witgo
Contributor Author

witgo commented Nov 21, 2016

ping @kayousterhout / @squito / @JoshRosen

Contributor

@squito squito left a comment


Just did a very brief review -- the idea here makes a lot of sense. The only big problem I see with the patch now is the tests that have been eliminated, some of those we definitely need to bring back. But I need to do a longer pass to think about how the pieces fit together.

_serializedTask: ByteBuffer)
extends Serializable {
private[spark] class TaskDescription private(
val taskId: Long,
Contributor

nit: double indent (4 spaces) for constructor params

Contributor Author

ok

private var taskProps: Properties) {

def this(taskId: Long,
attemptNumber: Int,
Contributor

nit: each parameter on its own line (even the first one), and double indent all params

Contributor Author

ok

@@ -139,29 +139,6 @@ class TaskSchedulerImplSuite extends SparkFunSuite with LocalSparkContext with B
assert(!failedTaskSet)
}

test("Scheduler does not crash when tasks are not serializable") {
Contributor

doesn't look like there is any replacement for this test, right? We certainly want to keep this check in some form.

Contributor Author

Yes.
For the RDD and closures, there are already related test cases; see DAGSchedulerSuite.scala#L506.
For the task itself, a user can use a custom partition, and the partition instance may not be serializable. There is no test case for this; I will add one.
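A minimal, Spark-free sketch of the failure mode such a test would exercise (all names here are illustrative, not Spark's): a partition-like object that is itself marked Serializable still fails Java serialization when it holds a non-serializable field.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Deliberately not Serializable: stands in for e.g. a file handle or connection.
class NonSerializableHandle

// Partition-like class: marked Serializable itself, but poisoned by its field.
class PartitionLike(val index: Int) extends Serializable {
  val handle = new NonSerializableHandle
}

// Returns true iff the object survives Java serialization.
def serializes(obj: AnyRef): Boolean =
  try {
    new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
    true
  } catch {
    case _: NotSerializableException => false
  }
```

A test along these lines would build a task set over such a partition and assert that the scheduler aborts the task set instead of crashing.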

Contributor Author

The test case has been added.

Contributor

@squito squito Dec 19, 2016

If I understand right, you are saying there are now test cases which cover all of the little individual pieces inside a Task, so we don't need an explicit test for serializing the entire task.

I'd still prefer a test for serializing the entire task (to prevent future regressions), but I guess it's hard to do because now the task actually gets serialized by the SchedulerBackends.

@@ -592,47 +579,6 @@ class TaskSetManagerSuite extends SparkFunSuite with LocalSparkContext with Logg
assert(manager.resourceOffer("execB", "host2", RACK_LOCAL).get.index === 1)
}

test("do not emit warning when serialized task is small") {
Contributor

same thing on replacements for these tests. we definitely want to keep the test on not-serializability. The others should probably stay as well, unless there is some reason why its extremely hard to do.

Contributor Author

For ShuffleMapTask and ResultTask, since they do not contain an RDD instance, they are very small.
Checking the size of the serialized RDD should be more reasonable. I added the corresponding code in the DAGScheduler class, but did not add a test case.
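As a rough illustration of the check being debated here (names and the threshold are assumptions for this sketch; the 100 KB value mirrors Spark's default task-size warning), the logic is just a comparison of the serialized size against a warn threshold:

```scala
import java.nio.ByteBuffer

// Hypothetical sketch of a serialized-task-size warning check; the names
// here are illustrative, not Spark's actual code.
val TaskSizeWarnKb = 100

def taskSizeWarning(taskId: Long, serialized: ByteBuffer): Option[String] = {
  val kb = serialized.limit() / 1024
  if (kb > TaskSizeWarnKb)
    Some(s"Task $taskId is $kb KB, which exceeds the $TaskSizeWarnKb KB warning threshold.")
  else
    None
}
```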

Contributor

This may be a stretch, but Task could be big, right? E.g. if there were a long chain of partitions, and for some reason they had a lot of data? That seems unlikely, but I also have no idea whether it ever happens. It seems easy enough to add the check back in -- is there any reason not to?

Contributor

Yeah I agree with @squito that we should keep this test / check. I've seen many huge task sizes due to a developer mistake, so think we should absolutely keep warning about it.

Contributor Author

Ok, I will add this test case back.

taskMetrics.incMemoryBytesSpilled(10)
override def runTask(tc: TaskContext): Int = 0
}
val taskDesc = new TaskDescription(1L, 0, "s1", "n1", 0,
Contributor

nit: with so many args, all with pretty generic types, it's really helpful to name each one, e.g.

TaskDescription(
  taskId = 1L,
  attemptNumber = 0,
  ...

(here and elsewhere)

Contributor Author

Okay, I'll change it.

@SparkQA

SparkQA commented Mar 1, 2017

Test build #73694 has finished for PR 15505 at commit 335b7b9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 2, 2017

Test build #73721 has finished for PR 15505 at commit b2b1eec.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@witgo
Contributor Author

witgo commented Mar 2, 2017

@kayousterhout It will take me some time to update the test report.

@kayousterhout
Contributor

@witgo OK I'll hold off on doing another pass on the code until you have the test results.

@witgo
Contributor Author

witgo commented Mar 2, 2017

@kayousterhout
Test results have been updated:

           SPARK-17931   db0ddce
Run time   9.427 s       9.566 s

@witgo
Contributor Author

witgo commented Mar 2, 2017

I don't know which PR caused the run time of this test case to drop from 21.764 s to 9.566 s.

@kayousterhout
Contributor

@witgo I don't think the ~1.5% improvement in runtime merits the added complexity of this change. I could be convinced to merge this if it simplified the code or the ability to reason about the code, but unfortunately this makes things somewhat more complicated because of the new logic about aborting task sets.

@mridulm
Contributor

mridulm commented Mar 3, 2017

@kayousterhout I am surprised it is not more, but I agree that the added complexity for such low returns is not worth it.

@witgo
Contributor Author

witgo commented Mar 3, 2017

Yes; maybe multithreaded task serialization would perform better. Let me close the PR.

@witgo witgo closed this Mar 3, 2017
@witgo
Contributor Author

witgo commented Mar 3, 2017

The SPARK-18890_20170303 branch's code is older, but the test case running time is 5.2 s.

@libratiger

libratiger commented May 16, 2017

I agree with Kay that putting in a smaller change first is better, assuming it still has the performance gains. That doesn't preclude any further optimizations that are bigger changes.

I'm a little surprised that serializing tasks has much of an impact, given how little data is getting serialized. But if it really does, I feel like there is a much bigger optimization we're completely missing. Why are we repeating the work of serialization for each task in a taskset? The serialized data is almost exactly the same for every task. They only differ in the partition id (an int) and the preferred locations (which aren't even used by the executor at all).

Task serialization already leverages the idea of having info across all the tasks in the Broadcast for the task binary. We just need to use that same idea for all the rest of the task data that is sent to the executor. Then the only difference between the serialized task data sent to executors is the int for the partitionId. You'd serialize into a bytebuffer once, and then your per-task "serialization" becomes copying the buffer and modifying that int directly.
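A sketch of that idea in plain Scala (java.nio only; the fixed layout and names are assumptions for illustration, not Spark's actual wire format): serialize the shared payload once with a reserved slot for the partition id, then "serialize" each task by copying the template and patching that one int.

```scala
import java.nio.ByteBuffer

// Hypothetical sketch: serialize the common task payload once, reserve a
// fixed 4-byte slot for the partition id, and per task only copy the
// template and overwrite that int.
object TaskTemplate {
  // Layout assumption: bytes 0..3 hold the partition id, the rest is the
  // payload shared by every task in the task set.
  def build(sharedPayload: Array[Byte]): ByteBuffer = {
    val buf = ByteBuffer.allocate(4 + sharedPayload.length)
    buf.putInt(0)          // placeholder for the partition id
    buf.put(sharedPayload) // shared serialized data, written exactly once
    buf.flip()
    buf
  }

  // Per-task "serialization": duplicate the template and patch the int in place.
  def forPartition(template: ByteBuffer, partitionId: Int): ByteBuffer = {
    val copy = ByteBuffer.allocate(template.remaining())
    copy.put(template.duplicate())  // duplicate() leaves the template untouched
    copy.putInt(0, partitionId)     // absolute put: overwrite bytes 0..3
    copy.flip()
    copy
  }
}
```

The per-task cost then becomes a buffer copy plus a 4-byte write, independent of how expensive the original serialization was.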

@squito I like this idea very much. I just encountered a case where the deserialization time is too long (more than 10 s for some tasks). Is there any PR trying to solve this? If there is no related PR, I would like to open an issue and try to solve it. :-D

@squito
Contributor

squito commented May 30, 2017

@djvulee this is https://issues.apache.org/jira/browse/SPARK-19108. Note that there is some discussion there about this being a bit harder than what I originally thought, though I think it's still worth exploring where task serialization is an issue.
