Commit bf56995

Merge pull request apache#462 from mateiz/conf-file-fix
Remove Typesafe Config usage and conf files to fix nested property names

With Typesafe Config we had the subtle problem of no longer allowing nested property names, which are used for a few of our properties:
http://apache-spark-developers-list.1001551.n3.nabble.com/Config-properties-broken-in-master-td208.html

This PR is for branch 0.9 but should be added into master too.

(cherry picked from commit 34e911c)
Signed-off-by: Patrick Wendell <[email protected]>
1 parent aa981e4 commit bf56995

File tree

6 files changed: +41 −71 lines changed

core/src/main/scala/org/apache/spark/SparkConf.scala

Lines changed: 7 additions & 11 deletions

```diff
@@ -20,27 +20,25 @@ package org.apache.spark
 import scala.collection.JavaConverters._
 import scala.collection.mutable.HashMap
 
-import com.typesafe.config.ConfigFactory
 import java.io.{ObjectInputStream, ObjectOutputStream, IOException}
 
 /**
  * Configuration for a Spark application. Used to set various Spark parameters as key-value pairs.
  *
  * Most of the time, you would create a SparkConf object with `new SparkConf()`, which will load
- * values from both the `spark.*` Java system properties and any `spark.conf` on your application's
- * classpath (if it has one). In this case, system properties take priority over `spark.conf`, and
- * any parameters you set directly on the `SparkConf` object take priority over both of those.
+ * values from any `spark.*` Java system properties set in your application as well. In this case,
+ * parameters you set directly on the `SparkConf` object take priority over system properties.
  *
  * For unit tests, you can also call `new SparkConf(false)` to skip loading external settings and
- * get the same configuration no matter what is on the classpath.
+ * get the same configuration no matter what the system properties are.
  *
  * All setter methods in this class support chaining. For example, you can write
  * `new SparkConf().setMaster("local").setAppName("My app")`.
  *
  * Note that once a SparkConf object is passed to Spark, it is cloned and can no longer be modified
  * by the user. Spark does not support modifying the configuration at runtime.
  *
- * @param loadDefaults whether to load values from the system properties and classpath
+ * @param loadDefaults whether to also load values from Java system properties
  */
 class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging {
@@ -50,11 +48,9 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging {
   private val settings = new HashMap[String, String]()
 
   if (loadDefaults) {
-    ConfigFactory.invalidateCaches()
-    val typesafeConfig = ConfigFactory.systemProperties()
-      .withFallback(ConfigFactory.parseResources("spark.conf"))
-    for (e <- typesafeConfig.entrySet().asScala if e.getKey.startsWith("spark.")) {
-      settings(e.getKey) = e.getValue.unwrapped.toString
+    // Load any spark.* system properties
+    for ((k, v) <- System.getProperties.asScala if k.startsWith("spark.")) {
+      settings(k) = v
     }
   }
```
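The core of the fix is that `SparkConf` now copies `spark.*` system properties into a flat key-value map, where dotted "nested" names are just independent string keys. Here is a hedged Python sketch of that idea (an illustration of the approach, not Spark's actual Scala implementation):

```python
# Illustrative sketch (not Spark's implementation): a flat map keeps
# dotted keys like spark.test.a and spark.test.a.b as independent
# entries, which is exactly what the rewritten loading code relies on.
def load_spark_settings(system_properties):
    """Copy every spark.* entry from a dict of system properties."""
    return {k: v for k, v in system_properties.items()
            if k.startswith("spark.")}

props = {
    "java.version": "1.7",       # non-spark keys are skipped
    "spark.test.a": "a",
    "spark.test.a.b": "a.b",     # coexists happily with spark.test.a
}
settings = load_spark_settings(props)
```

Because the store is flat, adding `spark.test.a.b` never disturbs `spark.test.a`, which is the behavior the new test suite below verifies.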

core/src/test/resources/spark.conf

Lines changed: 0 additions & 8 deletions
This file was deleted.

core/src/test/scala/org/apache/spark/SparkConfSuite.scala

Lines changed: 28 additions & 19 deletions

```diff
@@ -20,35 +20,23 @@ package org.apache.spark
 import org.scalatest.FunSuite
 
 class SparkConfSuite extends FunSuite with LocalSparkContext {
-  // This test uses the spark.conf in core/src/test/resources, which has a few test properties
-  test("loading from spark.conf") {
-    val conf = new SparkConf()
-    assert(conf.get("spark.test.intTestProperty") === "1")
-    assert(conf.get("spark.test.stringTestProperty") === "hi")
-    // NOTE: we don't use list properties yet, but when we do, we'll have to deal with this syntax
-    assert(conf.get("spark.test.listTestProperty") === "[a, b]")
-  }
-
-  // This test uses the spark.conf in core/src/test/resources, which has a few test properties
-  test("system properties override spark.conf") {
+  test("loading from system properties") {
     try {
-      System.setProperty("spark.test.intTestProperty", "2")
+      System.setProperty("spark.test.testProperty", "2")
       val conf = new SparkConf()
-      assert(conf.get("spark.test.intTestProperty") === "2")
-      assert(conf.get("spark.test.stringTestProperty") === "hi")
+      assert(conf.get("spark.test.testProperty") === "2")
     } finally {
-      System.clearProperty("spark.test.intTestProperty")
+      System.clearProperty("spark.test.testProperty")
     }
   }
 
   test("initializing without loading defaults") {
     try {
-      System.setProperty("spark.test.intTestProperty", "2")
+      System.setProperty("spark.test.testProperty", "2")
       val conf = new SparkConf(false)
-      assert(!conf.contains("spark.test.intTestProperty"))
-      assert(!conf.contains("spark.test.stringTestProperty"))
+      assert(!conf.contains("spark.test.testProperty"))
     } finally {
-      System.clearProperty("spark.test.intTestProperty")
+      System.clearProperty("spark.test.testProperty")
     }
   }
@@ -124,4 +112,25 @@ class SparkConfSuite extends FunSuite with LocalSparkContext {
     assert(sc.master === "local[2]")
     assert(sc.appName === "My other app")
   }
+
+  test("nested property names") {
+    // This wasn't supported by some external conf parsing libraries
+    try {
+      System.setProperty("spark.test.a", "a")
+      System.setProperty("spark.test.a.b", "a.b")
+      System.setProperty("spark.test.a.b.c", "a.b.c")
+      val conf = new SparkConf()
+      assert(conf.get("spark.test.a") === "a")
+      assert(conf.get("spark.test.a.b") === "a.b")
+      assert(conf.get("spark.test.a.b.c") === "a.b.c")
+      conf.set("spark.test.a.b", "A.B")
+      assert(conf.get("spark.test.a") === "a")
+      assert(conf.get("spark.test.a.b") === "A.B")
+      assert(conf.get("spark.test.a.b.c") === "a.b.c")
+    } finally {
+      System.clearProperty("spark.test.a")
+      System.clearProperty("spark.test.a.b")
+      System.clearProperty("spark.test.a.b.c")
+    }
+  }
 }
```
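For intuition on why an external tree-structured config parser rejects these names: in a hierarchical store, `spark.test.a` must be a leaf value while `spark.test.a.b` needs `spark.test.a` to be an object, and the two cannot coexist. A minimal, hedged Python sketch of that conflict (my own illustration of the general failure mode, not the library's actual code):

```python
# Minimal sketch of why a tree-structured (HOCON-style) store cannot
# hold both spark.test.a and spark.test.a.b: the path segment "a"
# would have to be a leaf value and an object at the same time.
def tree_set(tree, dotted_key, value):
    parts = dotted_key.split(".")
    node = tree
    for part in parts[:-1]:
        node = node.setdefault(part, {})
        if not isinstance(node, dict):
            raise ValueError(f"{part!r} is already a leaf value")
    node[parts[-1]] = value

tree = {}
tree_set(tree, "spark.test.a", "a")
try:
    tree_set(tree, "spark.test.a.b", "a.b")  # clashes with the leaf above
    conflict = False
except ValueError:
    conflict = True
```

A flat `HashMap[String, String]`, as used in the new `SparkConf`, sidesteps the conflict entirely.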

docs/configuration.md

Lines changed: 2 additions & 26 deletions

```diff
@@ -18,8 +18,8 @@ Spark provides three locations to configure the system:
 Spark properties control most application settings and are configured separately for each application.
 The preferred way to set them is by passing a [SparkConf](api/core/index.html#org.apache.spark.SparkConf)
 class to your SparkContext constructor.
-Alternatively, Spark will also load them from Java system properties (for compatibility with old versions
-of Spark) and from a [`spark.conf` file](#configuration-files) on your classpath.
+Alternatively, Spark will also load them from Java system properties, for compatibility with old versions
+of Spark.
 
 SparkConf lets you configure most of the common properties to initialize a cluster (e.g., master URL and
 application name), as well as arbitrary key-value pairs through the `set()` method. For example, we could
@@ -468,30 +468,6 @@ Apart from these, the following properties are also available, and may be useful
 The application web UI at `http://<driver>:4040` lists Spark properties in the "Environment" tab.
 This is a useful place to check to make sure that your properties have been set correctly.
 
-## Configuration Files
-
-You can also configure Spark properties through a `spark.conf` file on your Java classpath.
-Because these properties are usually application-specific, we recommend putting this fine *only* on your
-application's classpath, and not in a global Spark classpath.
-
-The `spark.conf` file uses Typesafe Config's [HOCON format](https://github.com/typesafehub/config#json-superset),
-which is a superset of Java properties files and JSON. For example, the following is a simple config file:
-
-{% highlight awk %}
-# Comments are allowed
-spark.executor.memory = 512m
-spark.serializer = org.apache.spark.serializer.KryoSerializer
-{% endhighlight %}
-
-The format also allows hierarchical nesting, as follows:
-
-{% highlight awk %}
-spark.akka {
-  threads = 8
-  timeout = 200
-}
-{% endhighlight %}
-
 # Environment Variables
 
 Certain Spark settings can be configured through environment variables, which are read from the `conf/spark-env.sh`
```

project/SparkBuild.scala

Lines changed: 0 additions & 1 deletion

```diff
@@ -277,7 +277,6 @@ object SparkBuild extends Build {
       "com.codahale.metrics" % "metrics-graphite" % "3.0.0",
       "com.twitter" %% "chill" % "0.3.1",
       "com.twitter" % "chill-java" % "0.3.1",
-      "com.typesafe" % "config" % "1.0.2",
       "com.clearspring.analytics" % "stream" % "2.5.1"
     )
   )
```

python/pyspark/conf.py

Lines changed: 4 additions & 6 deletions

```diff
@@ -61,14 +61,12 @@ class SparkConf(object):
 
     Most of the time, you would create a SparkConf object with
     C{SparkConf()}, which will load values from C{spark.*} Java system
-    properties and any C{spark.conf} on your Spark classpath. In this
-    case, system properties take priority over C{spark.conf}, and any
-    parameters you set directly on the C{SparkConf} object take priority
-    over both of those.
+    properties as well. In this case, any parameters you set directly on
+    the C{SparkConf} object take priority over system properties.
 
     For unit tests, you can also call C{SparkConf(false)} to skip
     loading external settings and get the same configuration no matter
-    what is on the classpath.
+    what the system properties are.
 
     All setter methods in this class support chaining. For example,
     you can write C{conf.setMaster("local").setAppName("My app")}.
@@ -82,7 +80,7 @@ def __init__(self, loadDefaults=True, _jvm=None):
         Create a new Spark configuration.
 
         @param loadDefaults: whether to load values from Java system
-            properties and classpath (True by default)
+            properties (True by default)
         @param _jvm: internal parameter used to pass a handle to the
             Java VM; does not need to be set by users
         """
```
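The precedence described in these docstrings (explicit `set()` calls beat loaded system properties, and `loadDefaults=False` skips loading entirely) can be sketched in a few lines. This is a hedged toy model of the documented behavior, with a hypothetical `MiniConf` class standing in for the real `SparkConf`:

```python
# Toy model of the documented SparkConf semantics (not PySpark's
# actual implementation): values set directly on the conf object
# take priority over anything loaded from spark.* system properties.
class MiniConf:
    def __init__(self, load_defaults=True, system_properties=None):
        self._settings = {}
        if load_defaults:
            # Mirror the new loading logic: copy spark.* entries only.
            for k, v in (system_properties or {}).items():
                if k.startswith("spark."):
                    self._settings[k] = v

    def set(self, key, value):
        self._settings[key] = value
        return self  # setters chain, as in the real SparkConf

    def get(self, key):
        return self._settings[key]

props = {"spark.app.name": "FromSystemProps"}
conf = MiniConf(system_properties=props).set("spark.app.name", "My app")
```

Here the explicit `set()` wins because it simply overwrites the map entry populated at construction time; `MiniConf(load_defaults=False, ...)` would leave the map empty, matching the `SparkConf(false)` unit-test escape hatch.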
