Users who wish to use previous Hadoop versions will need to configure the Swift driver manually.
Swift security credentials can be provided through <code>core-site.xml</code> or via
<code>SparkContext.hadoopConfiguration</code>.
The current Swift driver requires Swift to use the Keystone authentication method.
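As a sketch of the programmatic route, credentials can be set on the Hadoop configuration from a running <code>spark-shell</code>, where <code>sc</code> is the predefined <code>SparkContext</code>. The provider name <code>SparkTest</code>, the Keystone endpoint, and the credential values below are the example values used later in this document, not required names:

{% highlight scala %}
// Configuration sketch: keys follow the fs.swift.service.<PROVIDER>.* pattern
// described in the Configuration Parameters section.
sc.hadoopConfiguration.set("fs.swift.service.SparkTest.auth.url",
  "http://127.0.0.1:5000/v2.0/tokens")
sc.hadoopConfiguration.set("fs.swift.service.SparkTest.tenant", "test")
sc.hadoopConfiguration.set("fs.swift.service.SparkTest.username", "tester")
sc.hadoopConfiguration.set("fs.swift.service.SparkTest.password", "testing")
{% endhighlight %}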
# Configuring Swift
The proxy server of Swift should include the <code>list_endpoints</code> middleware. More information
about this middleware is available in the OpenStack Swift documentation.
# Dependencies
The Spark application should include the <code>hadoop-openstack</code> dependency.
For example, for Maven support, add the following to the <code>pom.xml</code> file:
{% highlight xml %}
<dependencyManagement>
  ...
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-openstack</artifactId>
    <version>2.3.0</version>
  </dependency>
  ...
</dependencyManagement>
{% endhighlight %}
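For sbt-based applications, an equivalent dependency declaration would look along these lines (a sketch; the coordinates and version are taken from the Maven snippet above):

{% highlight scala %}
// build.sbt (sketch): same coordinates as in the pom.xml example above.
libraryDependencies += "org.apache.hadoop" % "hadoop-openstack" % "2.3.0"
{% endhighlight %}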
# Configuration Parameters
We suggest keeping these parameters in <code>core-site.xml</code> for testing purposes when running Spark
via <code>spark-shell</code>.
For job submissions they should be provided via <code>SparkContext.hadoopConfiguration</code>.
# Usage Examples
Assume Keystone's authentication URL is <code>http://127.0.0.1:5000/v2.0/tokens</code> and Keystone contains a tenant <code>test</code> and a user <code>tester</code> with password <code>testing</code>. In our example we define <code>PROVIDER=SparkTest</code>. Assume that Swift contains a container <code>logs</code> with an object <code>data.log</code>. To access <code>data.log</code> from Spark, the <code>swift://</code> scheme should be used.
## Running Spark via spark-shell
Make sure that <code>core-site.xml</code> contains <code>fs.swift.service.SparkTest.tenant</code>, <code>fs.swift.service.SparkTest.username</code>,
and <code>fs.swift.service.SparkTest.password</code>. Run Spark via <code>spark-shell</code> and access Swift via the <code>swift://</code> scheme.
{% highlight scala %}
val sfdata = sc.textFile("swift://logs.SparkTest/data.log")
sfdata.count()
{% endhighlight %}
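Writing back to Swift goes through the same scheme. As a sketch, with a hypothetical output path in the same container:

{% highlight scala %}
// Hypothetical output path; saveAsTextFile writes through the swift:// scheme
// using the same SparkTest provider configuration.
sfdata.saveAsTextFile("swift://logs.SparkTest/output")
{% endhighlight %}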
## Sample Application
In this case <code>core-site.xml</code> need not contain <code>fs.swift.service.SparkTest.tenant</code>, <code>fs.swift.service.SparkTest.username</code>,
or <code>fs.swift.service.SparkTest.password</code>. Example of Java usage: