
Commit c9914cf

srowen authored and HyukjinKwon committed
[MINOR][DOCS] Add note about Spark network security
## What changes were proposed in this pull request?

In response to a recent question, this reiterates that network access to a Spark cluster should be disabled by default, and that access to its hosts and services from outside a private network should be added back explicitly. Also, some minor touch-ups while I was at it.

## How was this patch tested?

N/A

Author: Sean Owen <[email protected]>

Closes #21947 from srowen/SecurityNote.
1 parent c5fe412 commit c9914cf

File tree

2 files changed: 29 additions & 9 deletions


docs/security.md

Lines changed: 18 additions & 5 deletions
@@ -278,7 +278,7 @@ To enable authorization in the SHS, a few extra options are used:
 <table class="table">
 <tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
 <tr>
-  <td>spark.history.ui.acls.enable</td>
+  <td><code>spark.history.ui.acls.enable</code></td>
   <td>false</td>
   <td>
     Specifies whether ACLs should be checked to authorize users viewing the applications in
@@ -292,15 +292,15 @@ To enable authorization in the SHS, a few extra options are used:
   </td>
 </tr>
 <tr>
-  <td>spark.history.ui.admin.acls</td>
+  <td><code>spark.history.ui.admin.acls</code></td>
   <td>None</td>
   <td>
     Comma separated list of users that have view access to all the Spark applications in history
     server.
   </td>
 </tr>
 <tr>
-  <td>spark.history.ui.admin.acls.groups</td>
+  <td><code>spark.history.ui.admin.acls.groups</code></td>
   <td>None</td>
   <td>
     Comma separated list of groups that have view access to all the Spark applications in history
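To see the properties touched up above in use: enabling history-server ACLs is a matter of setting these options in spark-defaults.conf. A minimal sketch, in which the user and group names are placeholder assumptions, not part of this change:

    spark.history.ui.acls.enable        true
    spark.history.ui.admin.acls         alice,bob
    spark.history.ui.admin.acls.groups  admins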
@@ -501,6 +501,7 @@ can be accomplished by setting `spark.ssl.useNodeLocalConf` to `true`. In that c
 provided by the user on the client side are not used.
 
 ### Mesos mode
+
 Mesos 1.3.0 and newer supports `Secrets` primitives as both file-based and environment based
 secrets. Spark allows the specification of file-based and environment variable based secrets with
 `spark.mesos.driver.secret.filenames` and `spark.mesos.driver.secret.envkeys`, respectively.
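For illustration, these two options are ordinarily supplied at submission time. A sketch, in which the master URL, secret path, and environment key are all placeholder assumptions:

    ./bin/spark-submit \
      --master mesos://mesos-master:5050 \
      --conf spark.mesos.driver.secret.filenames=/mnt/secrets/db-password \
      --conf spark.mesos.driver.secret.envkeys=DB_PASSWORD \
      ...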
@@ -562,8 +563,12 @@ Security.
 
 # Configuring Ports for Network Security
 
-Spark makes heavy use of the network, and some environments have strict requirements for using tight
-firewall settings. Below are the primary ports that Spark uses for its communication and how to
+Generally speaking, a Spark cluster and its services are not deployed on the public internet.
+They are generally private services, and should only be accessible within the network of the
+organization that deploys Spark. Access to the hosts and ports used by Spark services should
+be limited to origin hosts that need to access the services.
+
+Below are the primary ports that Spark uses for its communication and how to
 configure those ports.
 
 ## Standalone mode only
@@ -597,6 +602,14 @@ configure those ports.
   <td><code>SPARK_MASTER_PORT</code></td>
   <td>Set to "0" to choose a port randomly. Standalone mode only.</td>
 </tr>
+<tr>
+  <td>External Service</td>
+  <td>Standalone Master</td>
+  <td>6066</td>
+  <td>Submit job to cluster via REST API</td>
+  <td><code>spark.master.rest.port</code></td>
+  <td>Use <code>spark.master.rest.enabled</code> to enable/disable this service. Standalone mode only.</td>
+</tr>
 <tr>
   <td>Standalone Master</td>
   <td>Standalone Worker</td>

docs/spark-standalone.md

Lines changed: 11 additions & 4 deletions
@@ -362,8 +362,15 @@ You can run Spark alongside your existing Hadoop cluster by just launching it as
 
 # Configuring Ports for Network Security
 
-Spark makes heavy use of the network, and some environments have strict requirements for using
-tight firewall settings. For a complete list of ports to configure, see the
+Generally speaking, a Spark cluster and its services are not deployed on the public internet.
+They are generally private services, and should only be accessible within the network of the
+organization that deploys Spark. Access to the hosts and ports used by Spark services should
+be limited to origin hosts that need to access the services.
+
+This is particularly important for clusters using the standalone resource manager, as they do
+not support fine-grained access control in a way that other resource managers do.
+
+For a complete list of ports to configure, see the
 [security page](security.html#configuring-ports-for-network-security).
 
 # High Availability
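The added guidance (restrict each service to the origin hosts that need it) maps directly onto ordinary firewall rules. A sketch using iptables, where the 10.0.0.0/8 subnet is an assumed example, 7077 is the standalone master's default port, and 6066 is the REST submission port from the table in security.md:

    # Accept Spark master traffic only from the internal network...
    iptables -A INPUT -p tcp --dport 7077 -s 10.0.0.0/8 -j ACCEPT
    iptables -A INPUT -p tcp --dport 6066 -s 10.0.0.0/8 -j ACCEPT
    # ...and drop those ports for everyone else.
    iptables -A INPUT -p tcp --dport 7077 -j DROP
    iptables -A INPUT -p tcp --dport 6066 -j DROP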
@@ -376,7 +383,7 @@ By default, standalone scheduling clusters are resilient to Worker failures (ins
 
 Utilizing ZooKeeper to provide leader election and some state storage, you can launch multiple Masters in your cluster connected to the same ZooKeeper instance. One will be elected "leader" and the others will remain in standby mode. If the current leader dies, another Master will be elected, recover the old Master's state, and then resume scheduling. The entire recovery process (from the time the first leader goes down) should take between 1 and 2 minutes. Note that this delay only affects scheduling _new_ applications -- applications that were already running during Master failover are unaffected.
 
-Learn more about getting started with ZooKeeper [here](http://zookeeper.apache.org/doc/current/zookeeperStarted.html).
+Learn more about getting started with ZooKeeper [here](https://zookeeper.apache.org/doc/current/zookeeperStarted.html).
 
 **Configuration**
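The ZooKeeper-backed mode described in this hunk is configured through SPARK_DAEMON_JAVA_OPTS in spark-env.sh. A sketch, with a placeholder ZooKeeper quorum:

    # spark-env.sh: enable ZooKeeper-based Master recovery (hosts are placeholders)
    SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
      -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
      -Dspark.deploy.zookeeper.dir=/spark"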

@@ -419,6 +426,6 @@ In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spa
 
 **Details**
 
-* This solution can be used in tandem with a process monitor/manager like [monit](http://mmonit.com/monit/), or just to enable manual recovery via restart.
+* This solution can be used in tandem with a process monitor/manager like [monit](https://mmonit.com/monit/), or just to enable manual recovery via restart.
 * While filesystem recovery seems straightforwardly better than not doing any recovery at all, this mode may be suboptimal for certain development or experimental purposes. In particular, killing a master via stop-master.sh does not clean up its recovery state, so whenever you start a new Master, it will enter recovery mode. This could increase the startup time by up to 1 minute if it needs to wait for all previously-registered Workers/clients to timeout.
 * While it's not officially supported, you could mount an NFS directory as the recovery directory. If the original Master node dies completely, you could then start a Master on a different node, which would correctly recover all previously registered Workers/applications (equivalent to ZooKeeper recovery). Future applications will have to be able to find the new Master, however, in order to register.
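Tying these bullets back to configuration: single-node filesystem recovery is likewise enabled through SPARK_DAEMON_JAVA_OPTS, per the hunk header above. A sketch, with the recovery directory as a placeholder (mount it on NFS for the cross-node variant described in the last bullet):

    # spark-env.sh: enable single-node recovery via the local filesystem
    SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM \
      -Dspark.deploy.recoveryDirectory=/var/spark/recovery"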
