Commit 4a2311c

update comments
1 parent 1081a3f commit 4a2311c

2 files changed: +10 -4 lines changed

sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/PlanQueryStage.scala

Lines changed: 3 additions & 1 deletion
@@ -30,7 +30,9 @@ import org.apache.spark.sql.types.StructType
  * Divide the spark plan into multiple QueryStages. For each Exchange in the plan, it adds a
  * QueryStage and a QueryStageInput. If reusing Exchange is enabled, it finds duplicated exchanges
  * and uses the same QueryStage for all the references. Note this rule must be run after
- * EnsureRequirements rule.
+ * EnsureRequirements rule. The rule divides the plan into multiple sub-trees as QueryStageInput
+ * is a leaf node. Transforming the plan after applying this rule will only transform node in a
+ * sub-tree.
  */
 case class PlanQueryStage(conf: SQLConf) extends Rule[SparkPlan] {
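
The new comment on PlanQueryStage says that, because QueryStageInput is a leaf node, a later transform only reaches nodes within one sub-tree. The following is a minimal sketch of why that holds, using a toy tree rather than Spark's classes. `Node`, `Op`, `StageInput`, and `SubTreeDemo` are all hypothetical names introduced for illustration; the only assumption is that the transform recurses through `children`, and the leaf reports none.

```scala
// Toy model, not Spark's API: every name here is hypothetical.
sealed trait Node {
  def children: Seq[Node]
  def withChildren(newChildren: Seq[Node]): Node

  // Bottom-up rewrite that recurses only through `children`.
  def transformUp(rule: PartialFunction[Node, Node]): Node = {
    val rewritten = withChildren(children.map(_.transformUp(rule)))
    rule.applyOrElse(rewritten, identity[Node])
  }
}

// An ordinary operator with visible children.
final case class Op(name: String, children: Seq[Node]) extends Node {
  def withChildren(newChildren: Seq[Node]): Node = copy(children = newChildren)
}

// Stands in for QueryStageInput: it holds a child stage but exposes no children,
// so a tree traversal stops here.
final case class StageInput(childStage: Node) extends Node {
  def children: Seq[Node] = Nil
  def withChildren(newChildren: Seq[Node]): Node = this
}

object SubTreeDemo {
  val plan: Node =
    Op("Join", Seq(StageInput(Op("Scan", Nil)), StageInput(Op("Scan", Nil))))

  // Try to rename every Op in the plan.
  val renamed: Node = plan.transformUp { case Op(n, c) => Op(n + "'", c) }

  def main(args: Array[String]): Unit = println(renamed)
}
```

In this sketch only the top-level `Join` is renamed. The `Scan` operators sit behind `StageInput` leaves, so the traversal never visits them.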

sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/QueryStageInput.scala

Lines changed: 7 additions & 3 deletions
@@ -25,9 +25,13 @@ import org.apache.spark.sql.catalyst.plans.physical.{HashPartitioning, Partition
 import org.apache.spark.sql.execution._
 
 /**
- * QueryStageInput is the leaf node of a QueryStage and is used to hide its child stage. It gets
- * the result of its child stage and serves it as the input of the QueryStage. A QueryStage knows
- * its child stages by collecting all the QueryStageInputs.
+ * QueryStageInput is the leaf node of a QueryStage and serves as its input. It is responsible for
+ * changing the output partition based on the need of its QueryStage. It gets the ShuffledRowRDD
+ * from its child stage and creates a new ShuffledRowRDD with different partitions by specifying
+ * an optional array of partition start indices. For example, a ShuffledQueryStage can be reused
+ * by two different QueryStages. One QueryStageInput can let the first task read partition 0 to 3,
+ * while in another stage, the QueryStageInput can let the first task read partition 0 to 1.
+ * A QueryStage knows its child stages by collecting all the QueryStageInputs.
  */
 abstract class QueryStageInput extends LeafExecNode {
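
The updated QueryStageInput comment describes re-partitioning a shared shuffle output through an optional array of partition start indices. Here is a rough sketch of how such an array could map tasks to upstream partitions. `PartitionRanges.ranges` is a hypothetical helper, not Spark's API, and the total of 5 upstream partitions is an assumed number chosen to match the "0 to 3" and "0 to 1" example in the comment. Ranges are half-open, so `(0, 4)` means partitions 0 through 3.

```scala
object PartitionRanges {
  // Hypothetical helper, not Spark's API. For each task, returns the half-open
  // range [start, end) of upstream shuffle partitions that the task reads.
  def ranges(numUpstream: Int, startIndices: Option[Array[Int]]): Seq[(Int, Int)] = {
    // With no indices given, each task reads exactly one upstream partition.
    val starts = startIndices.getOrElse(Array.range(0, numUpstream))
    starts.indices.map { i =>
      // A task's range ends where the next task's range starts;
      // the last task reads through the final upstream partition.
      val end = if (i + 1 < starts.length) starts(i + 1) else numUpstream
      (starts(i), end)
    }
  }

  def main(args: Array[String]): Unit = {
    // Assumed: the reused stage produced 5 shuffle partitions.
    println(ranges(5, Some(Array(0, 4))))    // first task reads partitions 0 to 3
    println(ranges(5, Some(Array(0, 2, 4)))) // first task reads partitions 0 to 1
    println(ranges(5, None))                 // one upstream partition per task
  }
}
```

Under these assumptions, two consumers of the same shuffle output can each pass their own start indices and get a different task layout without re-running the upstream stage.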
