
Commit 729b7ef

Author: Ilya Ganelin (committed)
Message: Added config option for stageFailure count and documentation
Parent: e0f8b55

File tree: 2 files changed, 9 additions and 1 deletion


core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala

Lines changed: 1 addition & 1 deletion
@@ -97,7 +97,7 @@ class DAGScheduler(
   private[scheduler] val failedStages = new HashSet[Stage]
 
   // The maximum number of times to retry a stage before aborting
-  val maxStageFailures = 5
+  val maxStageFailures = sc.conf.getInt("spark.stage.maxFailures", 5)
 
   // To avoid cyclical stage failures (see SPARK-5945) we limit the number of times that a stage
   // may be retried. However, it only makes sense to limit the number of times that a stage fails
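The change reads the limit from the Spark configuration instead of hard-coding it. As a minimal sketch of the lookup pattern (the standalone SparkConf below is hypothetical; the key and default come from the diff above):

import org.apache.spark.SparkConf

// SparkConf.getInt returns the configured value when the key is set
// and falls back to the supplied default otherwise.
val conf = new SparkConf()
conf.getInt("spark.stage.maxFailures", 5)   // => 5 (key unset, default used)
conf.set("spark.stage.maxFailures", "10")
conf.getInt("spark.stage.maxFailures", 5)   // => 10 (explicit setting wins)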

docs/configuration.md

Lines changed: 8 additions & 0 deletions
@@ -1153,6 +1153,14 @@ Apart from these, the following properties are also available, and may be useful
     Should be greater than or equal to 1. Number of allowed retries = this value - 1.
   </td>
 </tr>
+<tr>
+  <td><code>spark.stage.maxFailures</code></td>
+  <td>5</td>
+  <td>
+    Number of individual stage failures before aborting the stage and not retrying it.
+    Should be greater than or equal to 1. Number of allowed retries = this value - 1.
+  </td>
+</tr>
 </table>
 
 #### Dynamic Allocation
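With the documentation entry in place, users can override the default in the usual ways. A sketch of one option (the application name and master URL are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

// Allow a stage to fail up to 10 times (9 retries) before the
// DAGScheduler aborts it. App name and master are placeholders.
val conf = new SparkConf()
  .setAppName("StageRetryDemo")
  .setMaster("local[*]")
  .set("spark.stage.maxFailures", "10")
val sc = new SparkContext(conf)

The same property can also be passed on the command line, e.g. spark-submit --conf spark.stage.maxFailures=10.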
