mapPartitions Api #19335

listenLearning · 2017-09-25T03:27:18Z

No description provided.

…jars for reusing CliSessionState ## What changes were proposed in this pull request? Set isolated to false while using builtin hive jars and `SessionState.get` returns a `CliSessionState` instance. ## How was this patch tested? 1 Unit Tests 2 Manually verified: `hive.exec.strachdir` was only created once because of reusing cliSessionState ```java ➜ spark git:(SPARK-21428) ✗ bin/spark-sql --conf spark.sql.hive.metastore.jars=builtin log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 17/07/16 23:59:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 17/07/16 23:59:27 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore 17/07/16 23:59:27 INFO ObjectStore: ObjectStore, initialize called 17/07/16 23:59:28 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored 17/07/16 23:59:28 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored 17/07/16 23:59:29 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order" 17/07/16 23:59:30 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 17/07/16 23:59:30 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 17/07/16 23:59:31 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 
17/07/16 23:59:31 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 17/07/16 23:59:31 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY 17/07/16 23:59:31 INFO ObjectStore: Initialized ObjectStore 17/07/16 23:59:31 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0 17/07/16 23:59:31 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException 17/07/16 23:59:32 INFO HiveMetaStore: Added admin role in metastore 17/07/16 23:59:32 INFO HiveMetaStore: Added public role in metastore 17/07/16 23:59:32 INFO HiveMetaStore: No user is added in admin role, since config is empty 17/07/16 23:59:32 INFO HiveMetaStore: 0: get_all_databases 17/07/16 23:59:32 INFO audit: ugi=Kent ip=unknown-ip-addr cmd=get_all_databases 17/07/16 23:59:32 INFO HiveMetaStore: 0: get_functions: db=default pat=* 17/07/16 23:59:32 INFO audit: ugi=Kent ip=unknown-ip-addr cmd=get_functions: db=default pat=* 17/07/16 23:59:32 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table. 
17/07/16 23:59:32 INFO SessionState: Created local directory: /var/folders/k2/04p4k4ws73l6711h_mz2_tq00000gn/T/beea7261-221a-4711-89e8-8b12a9d37370_resources 17/07/16 23:59:32 INFO SessionState: Created HDFS directory: /tmp/hive/Kent/beea7261-221a-4711-89e8-8b12a9d37370 17/07/16 23:59:32 INFO SessionState: Created local directory: /var/folders/k2/04p4k4ws73l6711h_mz2_tq00000gn/T/Kent/beea7261-221a-4711-89e8-8b12a9d37370 17/07/16 23:59:32 INFO SessionState: Created HDFS directory: /tmp/hive/Kent/beea7261-221a-4711-89e8-8b12a9d37370/_tmp_space.db 17/07/16 23:59:32 INFO SparkContext: Running Spark version 2.3.0-SNAPSHOT 17/07/16 23:59:32 INFO SparkContext: Submitted application: SparkSQL::10.0.0.8 17/07/16 23:59:32 INFO SecurityManager: Changing view acls to: Kent 17/07/16 23:59:32 INFO SecurityManager: Changing modify acls to: Kent 17/07/16 23:59:32 INFO SecurityManager: Changing view acls groups to: 17/07/16 23:59:32 INFO SecurityManager: Changing modify acls groups to: 17/07/16 23:59:32 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Kent); groups with view permissions: Set(); users with modify permissions: Set(Kent); groups with modify permissions: Set() 17/07/16 23:59:33 INFO Utils: Successfully started service 'sparkDriver' on port 51889. 
17/07/16 23:59:33 INFO SparkEnv: Registering MapOutputTracker 17/07/16 23:59:33 INFO SparkEnv: Registering BlockManagerMaster 17/07/16 23:59:33 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 17/07/16 23:59:33 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 17/07/16 23:59:33 INFO DiskBlockManager: Created local directory at /private/var/folders/k2/04p4k4ws73l6711h_mz2_tq00000gn/T/blockmgr-9cfae28a-01e9-4c73-a1f1-f76fa52fc7a5 17/07/16 23:59:33 INFO MemoryStore: MemoryStore started with capacity 366.3 MB 17/07/16 23:59:33 INFO SparkEnv: Registering OutputCommitCoordinator 17/07/16 23:59:33 INFO Utils: Successfully started service 'SparkUI' on port 4040. 17/07/16 23:59:33 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.0.8:4040 17/07/16 23:59:33 INFO Executor: Starting executor ID driver on host localhost 17/07/16 23:59:33 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51890. 17/07/16 23:59:33 INFO NettyBlockTransferService: Server created on 10.0.0.8:51890 17/07/16 23:59:33 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 17/07/16 23:59:33 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.0.8, 51890, None) 17/07/16 23:59:33 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.0.8:51890 with 366.3 MB RAM, BlockManagerId(driver, 10.0.0.8, 51890, None) 17/07/16 23:59:33 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.0.8, 51890, None) 17/07/16 23:59:33 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.0.0.8, 51890, None) 17/07/16 23:59:34 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/Users/Kent/Documents/spark/spark-warehouse'). 
17/07/16 23:59:34 INFO SharedState: Warehouse path is 'file:/Users/Kent/Documents/spark/spark-warehouse'. 17/07/16 23:59:34 INFO HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes. 17/07/16 23:59:34 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is /user/hive/warehouse 17/07/16 23:59:34 INFO HiveMetaStore: 0: get_database: default 17/07/16 23:59:34 INFO audit: ugi=Kent ip=unknown-ip-addr cmd=get_database: default 17/07/16 23:59:34 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is /user/hive/warehouse 17/07/16 23:59:34 INFO HiveMetaStore: 0: get_database: global_temp 17/07/16 23:59:34 INFO audit: ugi=Kent ip=unknown-ip-addr cmd=get_database: global_temp 17/07/16 23:59:34 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException 17/07/16 23:59:34 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is /user/hive/warehouse 17/07/16 23:59:34 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint spark-sql> ``` cc cloud-fan gatorsmile Author: Kent Yao <[email protected]> Author: hzyaoqin <[email protected]> Closes #18648 from yaooqinn/SPARK-21428.

## What changes were proposed in this pull request? When running in IntelliJ, we are unable to capture the exception thrown by memory leak detection. > org.apache.spark.executor.Executor: Managed memory leak detected This PR explicitly sets `spark.unsafe.exceptionOnMemoryLeak` in SparkConf when building the SparkSession, instead of reading it from system properties. ## How was this patch tested? N/A Author: gatorsmile <[email protected]> Closes #18967 from gatorsmile/setExceptionOnMemoryLeak.

## What changes were proposed in this pull request? This PR sorts output attributes by name and exprId in `AttributeSet.toSeq` to make the order consistent. If the order differs, Spark may generate different code and then miss the cache in `CodeGenerator`; e.g., `GenerateColumnAccessor` generates code that depends on the input attribute order. ## How was this patch tested? Added tests in `AttributeSetSuite` and manually checked that the cache worked well for the query given in the JIRA. Author: Takeshi Yamamuro <[email protected]> Closes #18959 from maropu/SPARK-18394.
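
To illustrate why a deterministic order matters, here is a minimal Python sketch. It is not Spark's implementation: `Attribute`, `to_seq`, and `generate_code` are hypothetical stand-ins. Attributes are sorted by name and then by expression id, so the same set of attributes always yields the same generated string and therefore the same cache key.

```python
from collections import namedtuple

# Hypothetical stand-in for a Catalyst attribute: a name plus a unique expression id.
Attribute = namedtuple("Attribute", ["name", "expr_id"])


def to_seq(attrs):
    """Return the attributes in a deterministic order: by name, then by expr_id."""
    return sorted(attrs, key=lambda a: (a.name, a.expr_id))


def generate_code(attrs):
    """Toy 'code generator' whose output depends on the input attribute order."""
    return ";".join(f"read({a.name}#{a.expr_id})" for a in attrs)


# The same attributes arriving in two different orders.
first = [Attribute("b", 2), Attribute("a", 1), Attribute("a", 0)]
second = list(reversed(first))

# Count how many distinct pieces of "generated code" end up in the cache.
code_cache = {}
for attrs in (first, second):
    code = generate_code(to_seq(attrs))
    code_cache[code] = code_cache.get(code, 0) + 1
```

Without the `to_seq` call the two inputs would produce two different strings, which is the cache miss described above.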

## What changes were proposed in this pull request? Add Kerberos Support to Mesos. This includes kinit and --keytab support, but does not include delegation token renewal. ## How was this patch tested? Manually against a Secure DC/OS Apache HDFS cluster. Author: ArtRand <[email protected]> Author: Michael Gummelt <[email protected]> Closes #18519 from mgummelt/SPARK-16742-kerberos.

…s null as string type ## What changes were proposed in this pull request? ``` scala scala> Seq(("""{"Hyukjin": 224, "John": 1225}""")).toDS.selectExpr("json_tuple(value, trim(null))").show() ... java.lang.NullPointerException at ... ``` Currently a `null` field name throws a NullPointerException. Since a null field name can't match any field name in the JSON, we just output null as its column value. This PR achieves this by returning a very unlikely column name, `__NullFieldName`, when evaluating the field names. ## How was this patch tested? Added unit test. Author: Jen-Ming Chung <[email protected]> Closes #18930 from jmchung/SPARK-21677.
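
The intended behavior can be sketched in plain Python. This is a toy analogue only: the `json_tuple` function below is hypothetical, returns the raw parsed values, and checks for `None` directly rather than substituting a placeholder name such as `__NullFieldName` as the PR does.

```python
import json


def json_tuple(json_str, *field_names):
    """Toy analogue of json_tuple: one output value per requested field name."""
    try:
        obj = json.loads(json_str)
    except (TypeError, ValueError):
        obj = None
    if not isinstance(obj, dict):
        # Unparseable or non-object input: every requested column is null.
        return tuple(None for _ in field_names)
    # A null field name can never match a JSON key, so it yields null
    # instead of raising an exception.
    return tuple(None if name is None else obj.get(name) for name in field_names)


row = json_tuple('{"Hyukjin": 224, "John": 1225}', "John", None)
```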

## What changes were proposed in this pull request? Decimal is a logical type of AVRO. We need to ensure the support of Hive's AVRO serde works well in Spark ## How was this patch tested? N/A Author: gatorsmile <[email protected]> Closes #18977 from gatorsmile/addAvroTest.

…it is called statically to convert something into TimestampType ## What changes were proposed in this pull request? https://issues.apache.org/jira/projects/SPARK/issues/SPARK-21739 This issue is caused by the introduction of TimeZoneAwareExpression. When the **Cast** expression converts something into TimestampType, it must be resolved with `timezoneId` set. In general, this happens in the LogicalPlan phase. However, there are still some places that use the Cast expression statically to convert data types without setting `timezoneId`. In such cases, `NoSuchElementException: None.get` is thrown for TimestampType. This PR fixes the issue. We have checked the whole project and found two such usages (i.e., in `TableReader` and `HiveTableScanExec`). ## How was this patch tested? unit test Author: donnyzone <[email protected]> Closes #18960 from DonnyZone/spark-21739.

## What changes were proposed in this pull request? Dataset.sample requires a boolean flag withReplacement as the first argument. However, most of the time users simply want to sample some records without replacement. This ticket introduces a new sample function that simply takes in the fraction and seed. ## How was this patch tested? Tested manually. Not sure yet if we should add a test case for just this wrapper ... Author: Reynold Xin <[email protected]> Closes #18988 from rxin/SPARK-21778.
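
The shape of such a convenience wrapper can be sketched in Python. Both functions below are hypothetical and operate on a plain list, not a Dataset; the with-replacement branch is a deliberate simplification.

```python
import random


def sample_with_flag(rows, with_replacement, fraction, seed=None):
    """General form: the caller must always pass the replacement flag first."""
    rng = random.Random(seed)
    if with_replacement:
        # Simplification: draw a fixed number of rows, allowing repeats.
        count = round(fraction * len(rows))
        return [rng.choice(rows) for _ in range(count)]
    # Bernoulli sampling: keep each row independently with probability `fraction`.
    return [row for row in rows if rng.random() < fraction]


def sample(rows, fraction, seed=None):
    """Convenience wrapper for the common case: sampling without replacement."""
    return sample_with_flag(rows, False, fraction, seed)


rows = list(range(100))
subset = sample(rows, 0.1, seed=42)
```

The wrapper adds no new behavior; it only removes the boolean argument that most callers would set to false anyway.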

…Count and sizeInBytes

## What changes were proposed in this pull request?

Added support for the `ANALYZE TABLE [db_name.]tablename PARTITION (partcol1[=val1], partcol2[=val2], ...) COMPUTE STATISTICS [NOSCAN]` SQL command, which calculates the total number of rows and the size in bytes for a subset of partitions. Calculated statistics are stored in the Hive Metastore as user-defined properties attached to partition objects. Property names are the same as the ones used to store table-level statistics: `spark.sql.statistics.totalSize` and `spark.sql.statistics.numRows`.

When the partition specification contains all partition columns with values, the command collects statistics for the single partition that matches the specification. When some partition columns are missing or listed without their values, the command collects statistics for all partitions that match the subset of partition column values specified.

For example, table t has 4 partitions with the following specs:

* Partition1: (ds='2008-04-08', hr=11)
* Partition2: (ds='2008-04-08', hr=12)
* Partition3: (ds='2008-04-09', hr=11)
* Partition4: (ds='2008-04-09', hr=12)

`ANALYZE TABLE t PARTITION (ds='2008-04-09', hr=11)` collects statistics only for partition 3.

`ANALYZE TABLE t PARTITION (ds='2008-04-09')` collects statistics for partitions 3 and 4.

`ANALYZE TABLE t PARTITION (ds, hr)` collects statistics for all four partitions.

When the optional parameter NOSCAN is specified, the command doesn't count the number of rows and only gathers the size in bytes.

The statistics gathered by the ANALYZE TABLE command can be fetched using the `DESC EXTENDED [db_name.]tablename PARTITION` command.

## How was this patch tested?

Added tests.

Author: Masha Basmanova <[email protected]> Closes #18421 from mbasmanova/mbasmanova-analyze-partition.
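
The partition-matching rule described above can be sketched in a few lines of Python. The `matching_partitions` helper and the dictionary representation are assumptions made for illustration; a column listed without a value is modeled as `None`.

```python
def matching_partitions(partitions, spec):
    """Select partitions that agree with every column value given in `spec`.

    `spec` maps a partition column to a value, or to None when the column is
    listed without a value. Columns absent from `spec` are unconstrained.
    """
    return [
        p for p in partitions
        if all(value is None or p[col] == value for col, value in spec.items())
    ]


# The four partitions of table t from the example above.
partitions = [
    {"ds": "2008-04-08", "hr": 11},
    {"ds": "2008-04-08", "hr": 12},
    {"ds": "2008-04-09", "hr": 11},
    {"ds": "2008-04-09", "hr": 12},
]

only_p3 = matching_partitions(partitions, {"ds": "2008-04-09", "hr": 11})
p3_and_p4 = matching_partitions(partitions, {"ds": "2008-04-09"})
all_four = matching_partitions(partitions, {"ds": None, "hr": None})
```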

…leak ## What changes were proposed in this pull request? This is a follow-up of #18955, to fix a bug where whole-stage codegen is broken for `Limit`. ## How was this patch tested? Existing tests. Author: Wenchen Fan <[email protected]> Closes #18993 from cloud-fan/bug.

## What changes were proposed in this pull request? Fix typos ## How was this patch tested? Existing tests Author: Andrew Ash <[email protected]> Closes #18996 from ash211/patch-2.

## What changes were proposed in this pull request? Adds the recently added `summary` method to the python dataframe interface. ## How was this patch tested? Additional inline doctests. Author: Andrew Ray <[email protected]> Closes #18762 from aray/summary-py.

## What changes were proposed in this pull request? [SPARK-17701](https://github.com/apache/spark/pull/18600/files#diff-b9f96d092fb3fea76bcf75e016799678L77) removed the `metadata` function; this PR removes the Docker-based Integration module that was relevant to `SparkPlan.metadata`. ## How was this patch tested? manual tests Author: Yuming Wang <[email protected]> Closes #19000 from wangyum/SPARK-21709.

…SurvivalRegression ## What changes were proposed in this pull request? The line `SchemaUtils.appendColumn(schema, $(predictionCol), IntegerType)` did not modify the variable `schema`, hence only the last line had any effect. A temporary variable is now used to correctly append the two columns `predictionCol` and `probabilityCol`. ## How was this patch tested? Manually. Author: Cédric Pelvet <[email protected]> Closes #18980 from sharp-pixel/master.
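
The bug pattern is easy to reproduce with any immutable structure. The Python sketch below uses a tuple as a stand-in schema; `append_column`, the column names, and the type strings are all hypothetical.

```python
def append_column(schema, name, data_type):
    """Return a NEW schema with the column appended; the input is not modified."""
    return schema + ((name, data_type),)


def transform_schema_buggy(schema):
    # The result of the first call is discarded, so "prediction" is lost.
    append_column(schema, "prediction", "integer")
    return append_column(schema, "probability", "vector")


def transform_schema_fixed(schema):
    # Keep the intermediate schema in a temporary variable and build on it.
    with_prediction = append_column(schema, "prediction", "integer")
    return append_column(with_prediction, "probability", "vector")


base = (("features", "vector"),)
buggy_names = [name for name, _ in transform_schema_buggy(base)]
fixed_names = [name for name, _ in transform_schema_fixed(base)]
```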

…SQL documentation build ## What changes were proposed in this pull request? This PR proposes to install `mkdocs` by `pip install` if missing in the path. Mainly to fix Jenkins's documentation build failure in `spark-master-docs`. See https://amplab.cs.berkeley.edu/jenkins/job/spark-master-docs/3580/console. It also adds `mkdocs` as requirements in `docs/README.md`. ## How was this patch tested? I manually ran `jekyll build` under `docs` directory after manually removing `mkdocs` via `pip uninstall mkdocs`. Also, tested this in the same way but on CentOS Linux release 7.3.1611 (Core) where I built Spark few times but never built documentation before and `mkdocs` is not installed. ``` ... Moving back into docs dir. Moving to SQL directory and building docs. Missing mkdocs in your path, trying to install mkdocs for SQL documentation generation. Collecting mkdocs Downloading mkdocs-0.16.3-py2.py3-none-any.whl (1.2MB) 100% |████████████████████████████████| 1.2MB 574kB/s Requirement already satisfied: PyYAML>=3.10 in /usr/lib64/python2.7/site-packages (from mkdocs) Collecting livereload>=2.5.1 (from mkdocs) Downloading livereload-2.5.1-py2-none-any.whl Collecting tornado>=4.1 (from mkdocs) Downloading tornado-4.5.1.tar.gz (483kB) 100% |████████████████████████████████| 491kB 1.4MB/s Collecting Markdown>=2.3.1 (from mkdocs) Downloading Markdown-2.6.9.tar.gz (271kB) 100% |████████████████████████████████| 276kB 2.4MB/s Collecting click>=3.3 (from mkdocs) Downloading click-6.7-py2.py3-none-any.whl (71kB) 100% |████████████████████████████████| 71kB 2.8MB/s Requirement already satisfied: Jinja2>=2.7.1 in /usr/lib/python2.7/site-packages (from mkdocs) Requirement already satisfied: six in /usr/lib/python2.7/site-packages (from livereload>=2.5.1->mkdocs) Requirement already satisfied: backports.ssl_match_hostname in /usr/lib/python2.7/site-packages (from tornado>=4.1->mkdocs) Collecting singledispatch (from tornado>=4.1->mkdocs) Downloading 
singledispatch-3.4.0.3-py2.py3-none-any.whl Collecting certifi (from tornado>=4.1->mkdocs) Downloading certifi-2017.7.27.1-py2.py3-none-any.whl (349kB) 100% |████████████████████████████████| 358kB 2.1MB/s Collecting backports_abc>=0.4 (from tornado>=4.1->mkdocs) Downloading backports_abc-0.5-py2.py3-none-any.whl Requirement already satisfied: MarkupSafe>=0.23 in /usr/lib/python2.7/site-packages (from Jinja2>=2.7.1->mkdocs) Building wheels for collected packages: tornado, Markdown Running setup.py bdist_wheel for tornado ... done Stored in directory: /root/.cache/pip/wheels/84/83/cd/6a04602633457269d161344755e6766d24307189b7a67ff4b7 Running setup.py bdist_wheel for Markdown ... done Stored in directory: /root/.cache/pip/wheels/bf/46/10/c93e17ae86ae3b3a919c7b39dad3b5ccf09aeb066419e5c1e5 Successfully built tornado Markdown Installing collected packages: singledispatch, certifi, backports-abc, tornado, livereload, Markdown, click, mkdocs Successfully installed Markdown-2.6.9 backports-abc-0.5 certifi-2017.7.27.1 click-6.7 livereload-2.5.1 mkdocs-0.16.3 singledispatch-3.4.0.3 tornado-4.5.1 Generating markdown files for SQL documentation. Generating HTML files for SQL documentation. INFO - Cleaning site directory INFO - Building documentation to directory: .../spark/sql/site Moving back into docs dir. Making directory api/sql cp -r ../sql/site/. api/sql Source: .../spark/docs Destination: .../spark/docs/_site Generating... done. Auto-regeneration: disabled. Use --watch to enable. ``` Author: hyukjinkwon <[email protected]> Closes #18984 from HyukjinKwon/sql-doc-mkdocs.

… paths are successfully removed ## What changes were proposed in this pull request? Fix a typo in test. ## How was this patch tested? Jenkins tests. Author: Liang-Chi Hsieh <[email protected]> Closes #19005 from viirya/SPARK-21721-followup.

… power of 2 ## Problem When an RDD (particularly with a low item-per-partition ratio) is repartitioned to numPartitions = power of 2, the resulting partitions are very uneven-sized, due to using fixed seed to initialize PRNG, and using the PRNG only once. See details in https://issues.apache.org/jira/browse/SPARK-21782 ## What changes were proposed in this pull request? Instead of directly using `0, 1, 2,...` seeds to initialize `Random`, hash them with `scala.util.hashing.byteswap32()`. ## How was this patch tested? `build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.rdd.RDDSuite test` Author: Sergey Serebryakov <[email protected]> Closes #18990 from megaserg/repartition-skew.

…ats ..." ## What changes were proposed in this pull request? Reduce 'Skipping partitions' message to debug ## How was this patch tested? Existing tests Author: Sean Owen <[email protected]> Closes #19010 from srowen/SPARK-21718.

Add Python API for `FeatureHasher` transformer. ## How was this patch tested? New doc test. Author: Nick Pentreath <[email protected]> Closes #18970 from MLnick/SPARK-21468-pyspark-hasher.

## What changes were proposed in this pull request? The previous PR (#19000) removed filter pushdown verification. This PR adds it back. ## How was this patch tested? manual tests Author: Yuming Wang <[email protected]> Closes #19002 from wangyum/SPARK-21790-follow-up.

…in Hive metastore. For Hive tables, the current "replace the schema" code is the correct path, except that an exception in that path should result in an error, and not in retrying in a different way. For data source tables, Spark may generate a non-compatible Hive table; but for that to work with Hive 2.1, the detection of data source tables needs to be fixed in the Hive client, to also consider the raw tables used by code such as `alterTableSchema`. Tested with existing and added unit tests (plus internal tests with a 2.1 metastore). Author: Marcelo Vanzin <[email protected]> Closes #18849 from vanzin/SPARK-21617.

## What changes were proposed in this pull request? MLlib ```LinearRegression/LogisticRegression/LinearSVC``` always standardize the data during training to improve the rate of convergence, regardless of whether _standardization_ is true or false. If _standardization_ is false, we perform reverse standardization by penalizing each component differently, to get effectively the same objective function as when the training dataset is not standardized. We should keep these comments in the code to let developers understand how we handle it correctly. ## How was this patch tested? Existing tests, only adding some comments in code. Author: Yanbo Liang <[email protected]> Closes #18992 from yanboliang/SPARK-19762.
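
A small numeric check may help show what "penalizing each component differently" means. This is a sketch of the idea for an L2 penalty only, not MLlib's code: it assumes features are scaled by their standard deviation (centering is ignored), and all names and numbers are made up for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_penalty(coefs, reg):
    """Plain L2 penalty on coefficients in the original feature space."""
    return reg * sum(w * w for w in coefs)


def l2_penalty_reverse_standardized(std_coefs, stds, reg):
    """L2 penalty applied to standardized-space coefficients, with each
    component divided by its feature's standard deviation first."""
    return reg * sum((w / s) ** 2 for w, s in zip(std_coefs, stds))


stds = [2.0, 0.5, 4.0]                # per-feature standard deviations
original_coefs = [1.5, -3.0, 0.25]    # coefficients in the original space
x = [4.0, 1.0, 8.0]                   # one raw feature vector

# Training on x / std means the learned coefficients are w * std.
std_x = [xi / s for xi, s in zip(x, stds)]
std_coefs = [w * s for w, s in zip(original_coefs, stds)]

same_prediction = abs(dot(original_coefs, x) - dot(std_coefs, std_x)) < 1e-12
same_penalty = abs(
    l2_penalty(original_coefs, 0.1)
    - l2_penalty_reverse_standardized(std_coefs, stds, 0.1)
) < 1e-12
```

Both the prediction and the penalty agree, so optimizing in the standardized space with the per-component penalty targets the same objective as the unstandardized problem.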

## What changes were proposed in this pull request?

Based on #18282 by rgbkrk, this PR attempts to update to the current released cloudpickle and minimize the difference between Spark cloudpickle and "stock" cloudpickle, with the goal of eventually using the stock cloudpickle.

Some notable changes:

* Import submodules accessed by pickled functions (cloudpipe/cloudpickle#80)
* Support recursive functions inside closures (cloudpipe/cloudpickle#89, cloudpipe/cloudpickle#90)
* Fix ResourceWarnings and DeprecationWarnings (cloudpipe/cloudpickle#88)
* Assume modules with `__file__` attribute are not dynamic (cloudpipe/cloudpickle#85)
* Make cloudpickle Python 3.6 compatible (cloudpipe/cloudpickle#72)
* Allow pickling of builtin methods (cloudpipe/cloudpickle#57)
* Add ability to pickle dynamically created modules (cloudpipe/cloudpickle#52)
* Support method descriptor (cloudpipe/cloudpickle#46)
* No more pickling of closed files, was broken on Python 3 (cloudpipe/cloudpickle#32)
* **Remove non-standard `__transient__` check (cloudpipe/cloudpickle#110)**: while we don't use this internally, and have no tests or documentation for its use, downstream code may use `__transient__`, although it has never been part of the API. If we merge this, we should include a note about it in the release notes.
* Support for pickling loggers (yay!) (cloudpipe/cloudpickle#96)
* BUG: Fix crash when pickling dynamic class cycles. (cloudpipe/cloudpickle#102)

## How was this patch tested?

Existing PySpark unit tests + the unit tests from the cloudpickle project on their own.

Author: Holden Karau <[email protected]> Author: Kyle Kelley <[email protected]> Closes #18734 from holdenk/holden-rgbkrk-cloudpickle-upgrades.

…plementation ## What changes were proposed in this pull request? SPARK-21100 introduced a new `summary` method to the Scala/Java Dataset API that included expanded statistics (vs `describe`) and control over which statistics to compute. Currently in the R API `summary` acts as an alias for `describe`. This patch updates the R API to call the new `summary` method in the JVM that includes additional statistics and ability to select which to compute. This does not break the current interface as the present `summary` method does not take additional arguments like `describe` and the output was never meant to be used programmatically. ## How was this patch tested? Modified and additional unit tests. Author: Andrew Ray <[email protected]> Closes #18786 from aray/summary-r.

## What changes were proposed in this pull request? We do not have any Hive-specific parser. It does not make sense to keep a parser-specific test suite `HiveDDLCommandSuite.scala` in the Hive package. This PR is to remove it. ## How was this patch tested? N/A Author: gatorsmile <[email protected]> Closes #19015 from gatorsmile/combineDDL.

…bmit code There are two places, in Launcher and SparkSubmit, that explicitly list all the Spark submodules. The newly added kvstore module is missing from both, so this minor PR fixes that. Author: jerryshao <[email protected]> Closes #19014 from jerryshao/missing-kvstore.

…F(UserDefinedAggregateFunction) ## What changes were proposed in this pull request? This PR enables users to create a persistent Scala UDAF (one that extends UserDefinedAggregateFunction). ```SQL CREATE FUNCTION myDoubleAvg AS 'test.org.apache.spark.sql.MyDoubleAvg' ``` Before this PR, a Spark UDAF could only be registered through the API `spark.udf.register(...)`. ## How was this patch tested? Added test cases Author: gatorsmile <[email protected]> Closes #18700 from gatorsmile/javaUDFinScala.

…schemas inferred/controlled by Spark SQL ## What changes were proposed in this pull request? For Hive-serde tables, we always respect the schema stored in Hive metastore, because the schema could be altered by the other engines that share the same metastore. Thus, we always trust the metastore-controlled schema for Hive-serde tables when the schemas are different (without considering the nullability and cases). However, in some scenarios, Hive metastore also could INCORRECTLY overwrite the schemas when the serde and Hive metastore built-in serde are different. The proposed solution is to introduce a table-specific option for such scenarios. For a specific table, users can make Spark always respect Spark-inferred/controlled schema instead of trusting metastore-controlled schema. By default, we trust Hive metastore-controlled schema. ## How was this patch tested? Added a cross-version test case Author: gatorsmile <[email protected]> Closes #19003 from gatorsmile/respectSparkSchema.

…td contains zero

## What changes were proposed in this pull request?

Fix a bug where MLOR does not work correctly when featureStd contains zero.

The bug can be reproduced with a dataset like the following (one feature has zero variance), which produces a wrong result (all coefficients become 0):

```
val multinomialDatasetWithZeroVar = {
  val nPoints = 100
  val coefficients = Array(
    -0.57997, 0.912083, -0.371077,
    -0.16624, -0.84355, -0.048509)
  val xMean = Array(5.843, 3.0)
  val xVariance = Array(0.6856, 0.0) // including zero variance
  val testData = generateMultinomialLogisticInput(
    coefficients, xMean, xVariance, addIntercept = true, nPoints, seed)
  val df = sc.parallelize(testData, 4).toDF().withColumn("weight", lit(1.0))
  df.cache()
  df
}
```

## How was this patch tested?

Test case added.

Author: WeichenXu <[email protected]> Closes #18896 from WeichenXu123/fix_mlor_stdvalue_zero_bug.

…mator ## What changes were proposed in this pull request? Added call to copy values of Params from Estimator to Model after fit in PySpark ML. This will copy values for any params that are also defined in the Model. Since currently most Models do not define the same params from the Estimator, also added method to create new Params from looking at the Java object if they do not exist in the Python object. This is a temporary fix that can be removed once the PySpark models properly define the params themselves. ## How was this patch tested? Refactored the `check_params` test to optionally check if the model params for Python and Java match and added this check to an existing fitted model that shares params between Estimator and Model. Author: Bryan Cutler <[email protected]> Closes #17849 from BryanCutler/pyspark-models-own-params-SPARK-10931.

…th nullable int columns ## What changes were proposed in this pull request? When calling `DataFrame.toPandas()` (without Arrow enabled), if there is an `IntegralType` column (`IntegerType`, `ShortType`, `ByteType`) that has null values, the following exception is thrown: ValueError: Cannot convert non-finite values (NA or inf) to integer This is because the null values are first converted to float NaN during the construction of the Pandas DataFrame in `from_records`, and the subsequent attempt to convert the column back to an integer type fails. The fix checks whether the Pandas DataFrame would cause such a failure when converting; if so, we skip the conversion and use the type inferred by Pandas. Closes #18945 ## How was this patch tested? Added pyspark test. Author: Liang-Chi Hsieh <[email protected]> Closes #19319 from viirya/SPARK-21766.
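
The decision the fix makes can be sketched without pandas. The helper below is hypothetical and works on a plain list: it converts to integers only when the column has no nulls, and otherwise keeps the float representation in which nulls appear as NaN.

```python
import math


def column_for_pandas(values, wants_integer):
    """Return (converted values, kind) for one column of collected rows."""
    has_null = any(v is None for v in values)
    if wants_integer and not has_null:
        # Safe: every value is a real number, so the integer conversion works.
        return [int(v) for v in values], "int"
    # Nulls present: skip the integer conversion and keep floats with NaN.
    return [float("nan") if v is None else float(v) for v in values], "float"


clean_values, clean_kind = column_for_pandas([1, 2, 3], wants_integer=True)
null_values, null_kind = column_for_pandas([1, None, 3], wants_integer=True)

# Forcing NaN back into an integer is what fails in the original report.
try:
    int(float("nan"))
    nan_to_int_failed = False
except ValueError:
    nan_to_int_failed = True
```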

…st/load bug ## What changes were proposed in this pull request? Currently the params of CrossValidator/TrainValidationSplit are persisted and loaded by hardcoded logic, unlike other ML estimators. This causes a persistence bug for the newly added `parallelism` param. This PR refactors the related code to avoid hardcoding the params that are persisted and loaded, which also fixes the `parallelism` persistence bug. The refactoring is useful because more params will be added in #19208, and hardcoded persisting/loading makes adding new params very troublesome. ## How was this patch tested? Test added. Author: WeichenXu <[email protected]> Closes #19278 from WeichenXu123/fix-tuning-param-bug.

## What changes were proposed in this pull request? Fix for setup of `SPARK_JARS_DIR` on Windows as it looks for `%SPARK_HOME%\RELEASE` file instead of `%SPARK_HOME%\jars` as it should. RELEASE file is not included in the `pip` build of PySpark. ## How was this patch tested? Local install of PySpark on Anaconda 4.4.0 (Python 3.6.1). Author: Jakub Nowacki <[email protected]> Closes #19310 from jsnowacki/master.

… page. ## What changes were proposed in this pull request? The 'job ids' list style needs to be changed in the SQL page. There are two reasons: 1. If each job id is on its own line and there are many job ids, the table row becomes very tall. As shown below: ![3](https://user-images.githubusercontent.com/26266482/30732242-2fb11442-9fa4-11e7-98ea-80a98f280243.png) 2. The style should be consistent with the 'JDBC / ODBC Server' page, which is the style this change adopts. As shown below: ![2](https://user-images.githubusercontent.com/26266482/30732257-3c550820-9fa4-11e7-9d8e-467d3011e0ac.png) My changes are as follows: ![6](https://user-images.githubusercontent.com/26266482/30732318-8f61d8b8-9fa4-11e7-8af5-037ed12b13c9.png) ![5](https://user-images.githubusercontent.com/26266482/30732284-5b6a6c00-9fa4-11e7-8db9-3a2291f37ae6.png) ## How was this patch tested? manual tests Author: guoxiaolong <[email protected]> Closes #19320 from guoxiaolongzte/SPARK-22099.

…r the specific VM array size limitations ## What changes were proposed in this pull request? Try to avoid allocating an array bigger than Integer.MAX_VALUE - 8, which is the actual max size on some JVMs, in several places ## How was this patch tested? Existing tests Author: Sean Owen <[email protected]> Closes #19266 from srowen/SPARK-22033.
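
A capped growth policy of this kind can be sketched as follows. The function name and the doubling strategy are assumptions made for illustration; only the `Integer.MAX_VALUE - 8` limit comes from the description above.

```python
INTEGER_MAX_VALUE = 2**31 - 1
# Conservative upper bound on array length that is safe on more JVMs.
MAX_ARRAY_LENGTH = INTEGER_MAX_VALUE - 8


def grown_capacity(current, required):
    """Pick a new buffer capacity: double the current size, but never exceed
    MAX_ARRAY_LENGTH, and fail clearly when the request itself is too large."""
    if required > MAX_ARRAY_LENGTH:
        raise MemoryError(
            f"Cannot allocate {required} elements; limit is {MAX_ARRAY_LENGTH}"
        )
    return min(max(current * 2, required), MAX_ARRAY_LENGTH)


small = grown_capacity(16, 17)
near_limit = grown_capacity(2**30, 2**30 + 1)
```

Doubling `2**30` would give `2**31`, which is over the limit, so the second call is clamped to `MAX_ARRAY_LENGTH` instead of overflowing.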

…amps in partition column

## What changes were proposed in this pull request?

This PR proposes to resolve the type conflicts between strings and timestamps in partition column values. It looks like we need to set the timezone, because resolving the conflict requires a cast between strings and timestamps.

```scala
val df = Seq((1, "2015-01-01 00:00:00"), (2, "2014-01-01 00:00:00"), (3, "blah")).toDF("i", "str")
val path = "/tmp/test.parquet"
df.write.format("parquet").partitionBy("str").save(path)
spark.read.parquet(path).show()
```

**Before**

```
java.util.NoSuchElementException: None.get
  at scala.None$.get(Option.scala:347)
  at scala.None$.get(Option.scala:345)
  at org.apache.spark.sql.catalyst.expressions.TimeZoneAwareExpression$class.timeZone(datetimeExpressions.scala:46)
  at org.apache.spark.sql.catalyst.expressions.Cast.timeZone$lzycompute(Cast.scala:172)
  at org.apache.spark.sql.catalyst.expressions.Cast.timeZone(Cast.scala:172)
  at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToString$3$$anonfun$apply$16.apply(Cast.scala:208)
  at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToString$3$$anonfun$apply$16.apply(Cast.scala:208)
  at org.apache.spark.sql.catalyst.expressions.Cast.org$apache$spark$sql$catalyst$expressions$Cast$$buildCast(Cast.scala:201)
  at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToString$3.apply(Cast.scala:207)
  at org.apache.spark.sql.catalyst.expressions.Cast.nullSafeEval(Cast.scala:533)
  at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:331)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$$anonfun$org$apache$spark$sql$execution$datasources$PartitioningUtils$$resolveTypeConflicts$1.apply(PartitioningUtils.scala:481)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$$anonfun$org$apache$spark$sql$execution$datasources$PartitioningUtils$$resolveTypeConflicts$1.apply(PartitioningUtils.scala:480)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
```

**After**

```
+---+-------------------+
|  i|                str|
+---+-------------------+
|  2|2014-01-01 00:00:00|
|  1|2015-01-01 00:00:00|
|  3|               blah|
+---+-------------------+
```

## How was this patch tested?

Unit tests added in `ParquetPartitionDiscoverySuite` and manual tests.

Author: hyukjinkwon <[email protected]> Closes #19331 from HyukjinKwon/SPARK-22109.

…torage

Change-Id: I88c272444ca734dc2cbc2592607c11287b90a383

## What changes were proposed in this pull request?

The documentation on File DStreams is enhanced to

1. Detail the exact timestamp logic for examining directories and files.
1. Detail how object stores differ from filesystems, and so how using them as a source of data should be treated with caution, possibly publishing data to the store differently (direct PUTs as opposed to stage + rename).

## How was this patch tested?

n/a

Author: Steve Loughran <[email protected]>

Closes #17743 from steveloughran/cloud/SPARK-20448-document-dstream-blobstore.

… with arguments and examples for trim function

## What changes were proposed in this pull request?

This PR proposes to enhance the documentation for the `trim` functions in the function description section.

- Add more `usage`, `arguments` and `examples` for the trim function
- Adjust spacing in the `usage` section

After the changes, the trim function documentation will look like this:

- `trim`

```
trim(str) - Removes the leading and trailing space characters from str.
trim(BOTH trimStr FROM str) - Remove the leading and trailing trimStr characters from str
trim(LEADING trimStr FROM str) - Remove the leading trimStr characters from str
trim(TRAILING trimStr FROM str) - Remove the trailing trimStr characters from str

Arguments:
str - a string expression
trimStr - the trim string characters to trim, the default value is a single space
BOTH, FROM - these are keywords to specify trimming string characters from both ends of the string
LEADING, FROM - these are keywords to specify trimming string characters from the left end of the string
TRAILING, FROM - these are keywords to specify trimming string characters from the right end of the string

Examples:
> SELECT trim(' SparkSQL ');
SparkSQL
> SELECT trim('SL', 'SSparkSQLS');
parkSQ
> SELECT trim(BOTH 'SL' FROM 'SSparkSQLS');
parkSQ
> SELECT trim(LEADING 'SL' FROM 'SSparkSQLS');
parkSQLS
> SELECT trim(TRAILING 'SL' FROM 'SSparkSQLS');
SSparkSQ
```

- `ltrim`

```
ltrim(str) - Removes the leading space characters from str.
ltrim(trimStr, str) - Removes the leading string contains the characters from the trim string

Arguments:
str - a string expression
trimStr - the trim string characters to trim, the default value is a single space

Examples:
> SELECT ltrim(' SparkSQL ');
SparkSQL
> SELECT ltrim('Sp', 'SSparkSQLS');
arkSQLS
```

- `rtrim`

```
rtrim(str) - Removes the trailing space characters from str.
rtrim(trimStr, str) - Removes the trailing string which contains the characters from the trim string from the str

Arguments:
str - a string expression
trimStr - the trim string characters to trim, the default value is a single space

Examples:
> SELECT rtrim(' SparkSQL ');
SparkSQL
> SELECT rtrim('LQSa', 'SSparkSQLS');
SSpark
```

The JIRA for the trim characters function is: [trim function](https://issues.apache.org/jira/browse/SPARK-14878)

## How was this patch tested?

Manually tested

```
spark-sql> describe function extended trim;
17/09/22 17:03:04 INFO CodeGenerator: Code generated in 153.026533 ms
Function: trim
Class: org.apache.spark.sql.catalyst.expressions.StringTrim
Usage:
    trim(str) - Removes the leading and trailing space characters from `str`.
    trim(BOTH trimStr FROM str) - Remove the leading and trailing `trimStr` characters from `str`
    trim(LEADING trimStr FROM str) - Remove the leading `trimStr` characters from `str`
    trim(TRAILING trimStr FROM str) - Remove the trailing `trimStr` characters from `str`
Extended Usage:
    Arguments:
      * str - a string expression
      * trimStr - the trim string characters to trim, the default value is a single space
      * BOTH, FROM - these are keywords to specify trimming string characters from both ends of the string
      * LEADING, FROM - these are keywords to specify trimming string characters from the left end of the string
      * TRAILING, FROM - these are keywords to specify trimming string characters from the right end of the string
    Examples:
      > SELECT trim(' SparkSQL ');
      SparkSQL
      > SELECT trim('SL', 'SSparkSQLS');
      parkSQ
      > SELECT trim(BOTH 'SL' FROM 'SSparkSQLS');
      parkSQ
      > SELECT trim(LEADING 'SL' FROM 'SSparkSQLS');
      parkSQLS
      > SELECT trim(TRAILING 'SL' FROM 'SSparkSQLS');
      SSparkSQ
```

```
spark-sql> describe function extended ltrim;
Function: ltrim
Class: org.apache.spark.sql.catalyst.expressions.StringTrimLeft
Usage:
    ltrim(str) - Removes the leading space characters from `str`.
    ltrim(trimStr, str) - Removes the leading string contains the characters from the trim string
Extended Usage:
    Arguments:
      * str - a string expression
      * trimStr - the trim string characters to trim, the default value is a single space
    Examples:
      > SELECT ltrim(' SparkSQL ');
      SparkSQL
      > SELECT ltrim('Sp', 'SSparkSQLS');
      arkSQLS
```

```
spark-sql> describe function extended rtrim;
Function: rtrim
Class: org.apache.spark.sql.catalyst.expressions.StringTrimRight
Usage:
    rtrim(str) - Removes the trailing space characters from `str`.
    rtrim(trimStr, str) - Removes the trailing string which contains the characters from the trim string from the `str`
Extended Usage:
    Arguments:
      * str - a string expression
      * trimStr - the trim string characters to trim, the default value is a single space
    Examples:
      > SELECT rtrim(' SparkSQL ');
      SparkSQL
      > SELECT rtrim('LQSa', 'SSparkSQLS');
      SSpark
```

Author: Kevin Yu <[email protected]>

Closes #19329 from kevinyu98/spark-14878-5.
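The examples in this commit message treat `trimStr` as a set of characters rather than as a literal prefix or suffix. That behavior can be cross-checked outside Spark: Python's built-in `str.strip`, `str.lstrip` and `str.rstrip` also take a character set. The sketch below is plain Python, not Spark code, and the wrapper names are made up for this example.

```python
def trim_both(trim_chars: str, s: str) -> str:
    """Remove any characters in trim_chars from both ends of s."""
    return s.strip(trim_chars)


def trim_leading(trim_chars: str, s: str) -> str:
    """Remove any characters in trim_chars from the left end of s."""
    return s.lstrip(trim_chars)


def trim_trailing(trim_chars: str, s: str) -> str:
    """Remove any characters in trim_chars from the right end of s."""
    return s.rstrip(trim_chars)


# Same inputs as the SQL examples in the commit message.
print(trim_both("SL", "SSparkSQLS"))        # parkSQ
print(trim_leading("SL", "SSparkSQLS"))     # parkSQLS
print(trim_trailing("SL", "SSparkSQLS"))    # SSparkSQ
print(trim_leading("Sp", "SSparkSQLS"))     # arkSQLS
print(trim_trailing("LQSa", "SSparkSQLS"))  # SSpark
```

Trimming stops at the first character that is not in the set, which is why `'LQSa'` removes `SQLS` from the right but leaves the `k`.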

…thod in AggregatedDialect

## What changes were proposed in this pull request?

The implementation of `isCascadingTruncateTable` in `AggregatedDialect` is wrong. When no dialect claims cascading truncate but at least one dialect's cascading behavior is unknown, we should return unknown instead of false.

## How was this patch tested?

Added test.

Author: Liang-Chi Hsieh <[email protected]>

Closes #19286 from viirya/SPARK-21338-followup.
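The rule described above is a three-valued "or" over the dialects' answers. A minimal Python sketch of that rule follows; it models "unknown" as `None` and is only an illustration, not the Scala code in `AggregatedDialect`. The function name and the list-of-answers input are assumptions made for this example.

```python
from typing import Iterable, Optional


def aggregated_is_cascading(answers: Iterable[Optional[bool]]) -> Optional[bool]:
    """Combine per-dialect answers: True, False, or None for unknown.

    - True if any dialect claims cascading truncate.
    - None (unknown) if no dialect claims it but at least one is unknown.
    - False only if every dialect says False.
    """
    answers = list(answers)
    if any(a is True for a in answers):
        return True
    if any(a is None for a in answers):
        return None
    return False
```

For example, `aggregated_is_cascading([False, None])` yields `None`, whereas the behavior the commit message calls wrong would have yielded `False` for the same input.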

## What changes were proposed in this pull request?

This PR proposes to remove `assume` in the tests for `Utils.resolveURIs` and to replace `assume` with `assert` in the tests for `Utils.resolveURI` in `UtilsSuite`.

It looks like `Utils.resolveURIs` supports both multiple paths and a single path as input, so checking whether the input contains `,` is not meaningful.

For the `Utils.resolveURI` test, I replaced it with `assert` because the method appears to take a single path, and to prevent future mistakes when more tests are added here.

The `assume` in `HiveDDLSuite` also looks like it should be an `assert`, so that the condition is actually tested at the end.

## How was this patch tested?

Fixed unit tests.

Author: hyukjinkwon <[email protected]>

Closes #19332 from HyukjinKwon/SPARK-22093.

…exception occurs.

## What changes were proposed in this pull request?

`EventLoggingListener` opens the log with `val in = new BufferedInputStream(fs.open(log))` and closes it if `codec.map(_.compressedInputStream(in)).getOrElse(in)` throws an exception. However, if `CompressionCodec.createCodec(new SparkConf, c)` throws an exception, the `BufferedInputStream` `in` is never closed.

## How was this patch tested?

Existing tests.

Author: zuotingbing <[email protected]>

Closes #19277 from zuotingbing/SPARK-22058.
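The leak described above comes from a failure between opening the stream and handing it to the caller. One common remedy is to put the codec creation inside the same guarded region as the wrapping step. The Python sketch below illustrates that pattern only; it is not the Scala patch, and `open_event_log`, `open_stream`, `create_codec` and `codec_name` are hypothetical names.

```python
import io
from typing import Callable, Optional


def open_event_log(
    open_stream: Callable[[], io.BufferedIOBase],
    create_codec: Callable[[str], Callable],
    codec_name: Optional[str] = None,
):
    """Open a log stream and optionally wrap it with a codec.

    Any failure after the raw stream is opened, including a failure while
    creating the codec, closes the raw stream before re-raising.
    """
    in_stream = open_stream()
    try:
        if codec_name is None:
            return in_stream
        wrap = create_codec(codec_name)  # may raise
        return wrap(in_stream)           # may raise
    except Exception:
        in_stream.close()
        raise
```

On success the caller owns the returned stream and is responsible for closing it; the guard only covers the window in which no one else holds a reference.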

… for Scala 2.12 + other 2.12 fixes

## What changes were proposed in this pull request?

Enable Scala 2.12 REPL. Fix most remaining issues with 2.12 compilation and warnings, including:

- Selecting Kafka 0.10.1+ for Scala 2.12 and patching over a minor API difference
- Fixing lots of "eta expansion of zero arg method deprecated" warnings
- Resolving the SparkContext.sequenceFile implicits compile problem
- Fixing an odd but valid jetty-server missing dependency in hive-thriftserver

## How was this patch tested?

Existing tests

Author: Sean Owen <[email protected]>

Closes #19307 from srowen/Scala212.

## What changes were proposed in this pull request?

Updated docs so that a line of Python in the quick start guide executes.

Closes #19283

## How was this patch tested?

Existing tests.

Author: John O'Leary <[email protected]>

Closes #19326 from jgoleary/issues/22107.

HyukjinKwon · 2017-09-25T03:29:22Z

@listenLearning Close this please.

AmplabJenkins · 2017-09-25T03:31:59Z

Can one of the admins verify this patch?

HyukjinKwon · 2017-09-25T03:32:35Z

@listenLearning, if you'd like to ask a question, please ask it on the mailing list (see https://spark.apache.org/community.html).

HyukjinKwon · 2017-09-25T04:38:19Z

ping @listenLearning!

… and change the output type to be the same as input type

## What changes were proposed in this pull request?

The `percentile_approx` function previously accepted numeric type input and output double type results.

But since all numeric types, date and timestamp types are represented as numerics internally, `percentile_approx` can support them easily.

After this PR, it supports date type, timestamp type and numeric types as input types. The result type is also changed to be the same as the input type, which is more reasonable for percentiles.

This change is also required when we generate equi-height histograms for these types.

## How was this patch tested?

Added a new test and modified some existing tests.

Author: Zhenhua Wang <[email protected]>

Closes #19321 from wzhfy/approx_percentile_support_types.
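To make the "same type in, same type out" idea concrete, the sketch below computes a percentile over dates by mapping them to integers, picking a value, and mapping it back. It uses an exact nearest-rank rule, which is a stand-in chosen for this example and not the approximate algorithm behind `percentile_approx`; the function name is hypothetical.

```python
import math
from datetime import date
from typing import Sequence


def percentile_of_dates(values: Sequence[date], p: float) -> date:
    """Nearest-rank percentile over dates, returning a date.

    Dates are converted to their integer ordinals (the internal numeric
    form in this sketch), ranked, and the chosen ordinal is converted back.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    ordinals = sorted(v.toordinal() for v in values)
    rank = max(1, math.ceil(p * len(ordinals)))
    return date.fromordinal(ordinals[rank - 1])
```

Returning a `date` rather than a float keeps the result directly comparable with the input column, which is what a histogram bucket boundary needs.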

## What changes were proposed in this pull request?

`MemoryStore.evictBlocksToFreeSpace` acquires write locks for all the blocks it intends to evict up front. If there is a failure to evict blocks (e.g., some failure dropping a block to disk), then we have to release the locks. Otherwise the locks are never released and an executor trying to get one of them will wait forever.

## How was this patch tested?

Added unit test.

Author: Imran Rashid <[email protected]>

Closes #19311 from squito/SPARK-22083.
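The failure mode above is the usual "acquire several locks, then fail halfway" problem. A minimal Python sketch of the remedy follows. It is a simplification rather than the `MemoryStore` logic: it releases every lock in a `finally` block whether or not eviction succeeds, and `evict_blocks`, `locks` and `drop_block` are hypothetical names.

```python
import threading
from typing import Callable, Dict, List, Sequence


def evict_blocks(
    block_ids: Sequence[str],
    locks: Dict[str, threading.Lock],
    drop_block: Callable[[str], None],
) -> List[str]:
    """Lock all blocks up front, drop them one by one, always unlock.

    If drop_block raises for some block, the exception propagates, but
    no lock is left held for other threads to wait on.
    """
    selected = list(block_ids)
    for block_id in selected:
        locks[block_id].acquire()
    evicted: List[str] = []
    try:
        for block_id in selected:
            drop_block(block_id)  # may raise, e.g. on a disk failure
            evicted.append(block_id)
        return evicted
    finally:
        for block_id in selected:
            locks[block_id].release()
```

Without the `finally`, a failure on the second block would leave both locks held, and any later caller asking for either block would block indefinitely.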

…ction in codegen

## What changes were proposed in this pull request?

HashAggregateExec codegen uses two hash tables: a fast hash table and a generic one. It generates code paths for iterating over both, and both code paths generate the consume code of the parent operator, so that code is expanded twice.

This leads to a long generated function that might be an issue for the compiler (see e.g. SPARK-21603).

I propose to remove the double expansion by generating the consume code in a helper function that can simply be called from both iterating loops.

One issue with moving the `consume` code into a helper function was that a number of places relied on being in the scope of an outer `produce` loop and, for example, used `continue` to jump out. I replaced such code flows with nested scopes. The compiler should handle this code the same way, and it no longer depends on assumptions that lie outside the `consume`'s own scope.

## How was this patch tested?

Existing test coverage.

Author: Juliusz Sompolski <[email protected]>

Closes #19324 from juliuszsompolski/aggrconsumecodegen.

… warehouse directory

## What changes were proposed in this pull request?

During `TestHiveSparkSession.reset()`, which is called after each `TestHiveSingleton` suite, we now delete and recreate the Hive warehouse directory.

## How was this patch tested?

Ran full suite of tests locally, verified that they pass.

Author: Greg Owen <[email protected]>

Closes #19341 from GregOwen/SPARK-22120.

…ctests

## What changes were proposed in this pull request?

This change disables the use of 0-parameter pandas_udfs because the API is overly complex and awkward; the same effect can easily be achieved by using an index column as an input argument. It also adds doctests for pandas_udfs, which revealed bugs in handling empty partitions and in using the pandas_udf decorator.

## How was this patch tested?

Reworked existing 0-parameter test to verify error is raised, added doctest for pandas_udf, added new tests for empty partition and decorator usage.

Author: Bryan Cutler <[email protected]>

Closes #19325 from BryanCutler/arrow-pandas_udf-0-param-remove-SPARK-22106.

caneGuy · 2017-09-26T02:58:52Z

Please ask on the spark-user list: http://apache-spark-user-list.1001560.n3.nabble.com/

2017-09-25 11:29 GMT+08:00 listenLearning <[email protected]>:

…

Hello, I recently ran into a problem during development: when I use the mapPartitions API to store data into HBase, I get a "partition not found" error, followed by an error saying the broadcast variable cannot be found. Why does this happen? The code and the error are below.

    def ASpan(span: DataFrame, time: String): Unit = {
      try {
        span.mapPartitions(iter=>{
          iter.map(line => {
            val put = new Put(Bytes.toBytes(CreateRowkey.Bit16(line.getString(0)) + "_101301"))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_TIME1PER_30"), Bytes.toBytes(line.getString(1)))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_TIME2PER_30"), Bytes.toBytes(line.getString(2)))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_TIME3PER_30"), Bytes.toBytes(line.getString(3)))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_TIME4PER_30"), Bytes.toBytes(line.getString(4)))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_HASCALL_1"), Bytes.toBytes(line.getLong(5).toString))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_HASCALL_3"), Bytes.toBytes(line.getLong(6).toString))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_HASCALL_6"), Bytes.toBytes(line.getLong(7).toString))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_NOCALL_1"), Bytes.toBytes(line.getLong(8).toString))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_NOCALL_3"), Bytes.toBytes(line.getLong(9).toString))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("CALLDT_NOCALL_6"), Bytes.toBytes(line.getLong(10).toString))
            put.addColumn(Bytes.toBytes("CF"), Bytes.toBytes("DB_TIME"), Bytes.toBytes(time))
            (new ImmutableBytesWritable, put)
          })
        }).saveAsNewAPIHadoopDataset(shuliStreaming.indexTable)
      } catch {
        case e: Exception =>
          shuliStreaming.WriteIn.writeLog("shuli", time, "静默期&近几月是否通话储错误", e)
          e.printStackTrace()
          println("静默期&近几月是否通话储错误" + e)
      }
    }

Error:

    17/09/24 23:04:17 INFO spark.CacheManager: Partition rdd_11_1 not found, computing it
    17/09/24 23:04:17 INFO rdd.HadoopRDD: Input split: hdfs://nameservice1/data/input/common/phlibrary/OFFLINEPHONELIBRARY.dat: 1146925+1146926
    17/09/24 23:04:17 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 1
    17/09/24 23:04:17 ERROR executor.Executor: Exception in task 1.0 in stage 250804.0 (TID 3190467)
    java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_1_piece0 of broadcast_1
      at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1223)
      at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
      at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
      at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
      at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
      at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
      at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:144)
      at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:212)
      at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
      at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

------------------------------

You can view, comment on, or merge this pull request online at: #19335

Commit Summary

- [SPARK-13969][ML] Add FeatureHasher transformer - [SPARK-21656][CORE] spark dynamic allocation should not idle timeout executors when tasks still to run - [SPARK-21603][SQL] The wholestage codegen will be much slower then that is closed when the function is too long - [SPARK-21738] Thriftserver doesn't cancel jobs when session is closed - [SPARK-21680][ML][MLLIB] optimize Vector compress - [SPARK-3151][BLOCK MANAGER] DiskStore.getBytes fails for files larger than 2GB -
[SPARK-21743][SQL] top-most limit should not cause memory leak - [SPARK-21642][CORE] Use FQDN for DRIVER_HOST_ADDRESS instead of ip address - [SPARK-21428] Turn IsolatedClientLoader off while using builtin Hive jars for reusing CliSessionState - [SQL][MINOR][TEST] Set spark.unsafe.exceptionOnMemoryLeak to true - [SPARK-18394][SQL] Make an AttributeSet.toSeq output order consistent - [SPARK-16742] Mesos Kerberos Support - [SPARK-21677][SQL] json_tuple throws NullPointException when column is null as string type - [SPARK-21767][TEST][SQL] Add Decimal Test For Avro in VersionSuite - [SPARK-21739][SQL] Cast expression should initialize timezoneId when it is called statically to convert something into TimestampType - [SPARK-21778][SQL] Simpler Dataset.sample API in Scala / Java - [SPARK-21213][SQL] Support collecting partition-level statistics: rowCount and sizeInBytes - [SPARK-21743][SQL][FOLLOW-UP] top-most limit should not cause memory leak - [MINOR][TYPO] Fix typos: runnning and Excecutors - [SPARK-21566][SQL][PYTHON] Python method for summary - [SPARK-21790][TESTS] Fix Docker-based Integration Test errors. - [MINOR] Correct validateAndTransformSchema in GaussianMixture and AFTSurvivalRegression - [SPARK-21773][BUILD][DOCS] Installs mkdocs if missing in the path in SQL documentation build - [SPARK-21721][SQL][FOLLOWUP] Clear FileSystem deleteOnExit cache when paths are successfully removed - [SPARK-21782][CORE] Repartition creates skews when numPartitions is a power of 2 - [SPARK-21718][SQL] Heavy log of type: "Skipping partition based on stats ..." - [SPARK-21468][PYSPARK][ML] Python API for FeatureHasher - [SPARK-21790][TESTS][FOLLOW-UP] Add filter pushdown verification back. - [SPARK-21617][SQL] Store correct table metadata when altering schema in Hive metastore. - [SPARK-19762][ML][FOLLOWUP] Add necessary comments to L2Regularization. 
- [SPARK-21070][PYSPARK] Attempt to update cloudpickle again - [SPARK-21584][SQL][SPARKR] Update R method for summary to call new implementation - [SPARK-21803][TEST] Remove the HiveDDLCommandSuite - [SPARK-20641][CORE] Add missing kvstore module in Laucher and SparkSubmit code - [SPARK-21499][SQL] Support creating persistent function for Spark UDAF(UserDefinedAggregateFunction) - [SPARK-21769][SQL] Add a table-specific option for always respecting schemas inferred/controlled by Spark SQL - [SPARK-21681][ML] fix bug of MLOR do not work correctly when featureStd contains zero - [SPARK-10931][ML][PYSPARK] PySpark Models Copy Param Values from Estimator - [SPARK-21765] Set isStreaming on leaf nodes for streaming plans. - [ML][MINOR] Make sharedParams update. - [SPARK-19326] Speculated task attempts do not get launched in few scenarios - [SPARK-12664][ML] Expose probability in mlp model - [SPARK-21501] Change CacheLoader to limit entries based on memory footprint - [SPARK-21603][SQL][FOLLOW-UP] Change the default value of maxLinesPerFunction into 4000 - [SPARK-21807][SQL] Override ++ operation in ExpressionSet to reduce clone time - [SPARK-21805][SPARKR] Disable R vignettes code on Windows - [SPARK-21694][MESOS] Support Mesos CNI network labels - [MINOR][SQL] The comment of Class ExchangeCoordinator exist a typing and context error - [SPARK-21804][SQL] json_tuple returns null values within repeated columns except the first one - [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as arguments should validate input types for column - [SPARK-21745][SQL] Refactor ColumnVector hierarchy to make ColumnVector read-only and to introduce WritableColumnVector. 
- [SPARK-21759][SQL] In.checkInputDataTypes should not wrongly report unresolved plans for IN correlated subquery - [SPARK-21826][SQL] outer broadcast hash join should not throw NPE - [SPARK-21788][SS] Handle more exceptions when stopping a streaming query - [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF` and ` SO_SNDBUF` in SparkConf. - [SPARK-21830][SQL] Bump ANTLR version and fix a few issues. - [SPARK-21108][ML] convert LinearSVC to aggregator framework - [SPARK-21255][SQL][WIP] Fixed NPE when creating encoder for enum - [SPARK-21527][CORE] Use buffer limit in order to use JAVA NIO Util's buffercache - [MINOR][BUILD] Fix build warnings and Java lint errors - [SPARK-21832][TEST] Merge SQLBuilderTest into ExpressionSQLBuilderSuite - [SPARK-21714][CORE][YARN] Avoiding re-uploading remote resources in yarn client mode - [SPARK-17742][CORE] Fail launcher app handle if child process exits with error. - [SPARK-21756][SQL] Add JSON option to allow unquoted control characters - [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDTs not actually testing what it intends - [SPARK-21831][TEST] Remove `spark.sql.hive.convertMetastoreOrc` config in HiveCompatibilitySuite - [MINOR][DOCS] Minor doc fixes related with doc build and uses script dir in SQL doc gen script - [SPARK-21843] testNameNote should be "(minNumPostShufflePartitions: 5)" - [SPARK-21818][ML][MLLIB] Fix bug of MultivariateOnlineSummarizer.variance generate negative result - [SPARK-21798] No config to replace deprecated SPARK_CLASSPATH config for launching daemons like History Server - [SPARK-19662][SCHEDULER][TEST] Add Fair Scheduler Unit Test coverage for different build cases - [SPARK-17139][ML] Add model summary for MultinomialLogisticRegression - [SPARK-21781][SQL] Modify DataSourceScanExec to use concrete ColumnVector type. 
- [SPARK-21848][SQL] Add trait UserDefinedExpression to identify user-defined functions - [SPARK-21255][SQL] simplify encoder for java enum - [SPARK-21801][SPARKR][TEST] unit test randomly fail with randomforest - [MINOR][ML] Document treatment of instance weights in logreg summary - [SPARK-21728][CORE] Allow SparkSubmit to use Logging. - [SPARK-21813][CORE] Modify TaskMemoryManager.MAXIMUM_PAGE_SIZE_BYTES comments - [SPARK-21845][SQL] Make codegen fallback of expressions configurable - [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to handle FileOutputCommitter.getWorkPath==null - [MINOR][TEST] Off -heap memory leaks for unit tests - [SPARK-21873][SS] - Avoid using `return` inside `CachedKafkaConsumer.get` - [SPARK-21806][MLLIB] BinaryClassificationMetrics pr(): first point (0.0, 1.0) is misleading - [SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths - [SPARK-21469][ML][EXAMPLES] Adding Examples for FeatureHasher - Revert "[SPARK-21845][SQL] Make codegen fallback of expressions configurable" - [MINOR][SQL][TEST] Test shuffle hash join while is not expected - [SPARK-21834] Incorrect executor request in case of dynamic allocation - [SPARK-21839][SQL] Support SQL config for ORC compression - [SPARK-21875][BUILD] Fix Java style bugs - [SPARK-11574][CORE] Add metrics StatsD sink - [SPARK-17321][YARN] Avoid writing shuffle metadata to disk if NM recovery is disabled - [SPARK-21534][SQL][PYSPARK] PickleException when creating dataframe from python row with empty bytearray - [SPARK-21583][SQL] Create a ColumnarBatch from ArrowColumnVectors - [SPARK-21878][SQL][TEST] Create SQLMetricsTestUtils - [SPARK-21886][SQL] Use SparkSession.internalCreateDataFrame to create… - [SPARK-20812][MESOS] Add secrets support to the dispatcher - [SPARK-21583][HOTFIX] Removed intercept in test causing failures - [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown rule for Union - [SPARK-21110][SQL] Structs, arrays, and other orderable 
datatypes should be usable in inequalities - [SPARK-17139][ML][FOLLOW-UP] Add convenient method `asBinary` for casting to BinaryLogisticRegressionSummary - [SPARK-21862][ML] Add overflow check in PCA - [SPARK-21779][PYTHON] Simpler DataFrame.sample API in Python - [SPARK-21789][PYTHON] Remove obsolete codes for parsing abstract schema strings - [SPARK-21728][CORE] Follow up: fix user config, auth in SparkSubmit logging. - [SPARK-21880][WEB UI] In the SQL table page, modify jobs trace information - [SPARK-14280][BUILD][WIP] Update change-version.sh and pom.xml to add Scala 2.12 profiles and enable 2.12 compilation - [SPARK-21895][SQL] Support changing database in HiveClient - [SPARK-21729][ML][TEST] Generic test for ProbabilisticClassifier to ensure consistent output columns - [SPARK-21891][SQL] Add TBLPROPERTIES to DDL statement: CREATE TABLE USING - [SPARK-21897][PYTHON][R] Add unionByName API to DataFrame in Python and R - [SPARK-21654][SQL] Complement SQL predicates expression description - [SPARK-21418][SQL] NoSuchElementException: None.get in DataSourceScanExec with sun.io.serialization.extendedDebugInfo=true - [SPARK-21913][SQL][TEST] withDatabase` should drop database with CASCADE - [SPARK-21903][BUILD] Upgrade scalastyle to 1.0.0. 
- [SPARK-20978][SQL] Bump up Univocity version to 2.5.4 - [SPARK-21845][SQL][TEST-MAVEN] Make codegen fallback of expressions configurable - [SPARK-21925] Update trigger interval documentation in docs with behavior change in Spark 2.2 - [SPARK-21652][SQL] Fix rule confliction between InferFiltersFromConstraints and ConstantPropagation - [MINOR][DOC] Update `Partition Discovery` section to enumerate all available file sources - [SPARK-18061][THRIFTSERVER] Add spnego auth support for ThriftServer thrift/http protocol - [SPARK-9104][CORE] Expose Netty memory metrics in Spark - [SPARK-21924][DOCS] Update structured streaming programming guide doc - [SPARK-19357][ML] Adding parallel model evaluation in ML tuning - [SPARK-21903][BUILD][FOLLOWUP] Upgrade scalastyle-maven-plugin and scalastyle as well in POM and SparkBuild.scala - [SPARK-21835][SQL] RewritePredicateSubquery should not produce unresolved query plans - [SPARK-21801][SPARKR][TEST] set random seed for predictable test - [SPARK-21765] Check that optimization doesn't affect isStreaming bit. - [SPARK-21901][SS] Define toString for StateOperatorProgress - Fixed pandoc dependency issue in python/setup.py - [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery should not produce unresolved query plans - [SPARK-21912][SQL] ORC/Parquet table should not create invalid column names - [SPARK-21890] Credentials not being passed to add the tokens - [SPARK-13656][SQL] Delete spark.sql.parquet.cacheMetadata from SQLConf and docs - [SPARK-21939][TEST] Use TimeLimits instead of Timeouts - [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTests2 should stop SparkContext. - [SPARK-21949][TEST] Tables created in unit tests should be dropped after use - [SPARK-21726][SQL] Check for structural integrity of the plan in Optimzer in test mode. 
- [SPARK-21936][SQL] backward compatibility test framework for HiveExternalCatalog - [SPARK-21726][SQL][FOLLOW-UP] Check for structural integrity of the plan in Optimzer in test mode - [SPARK-21946][TEST] fix flaky test: "alter table: rename cached table" in InMemoryCatalogedDDLSuite - [SPARK-15243][ML][SQL][PYTHON] Add missing support for unicode in Param methods & functions in dataframe - [SPARK-19866][ML][PYSPARK] Add local version of Word2Vec findSynonyms for spark.ml: Python API - [SPARK-21941] Stop storing unused attemptId in SQLTaskMetrics - [SPARK-21954][SQL] JacksonUtils should verify MapType's value type instead of key type - [MINOR][SQL] Correct DataFrame doc. - [SPARK-4131] Support "Writing data into the filesystem from queries" - [SPARK-20098][PYSPARK] dataType's typeName fix - [SPARK-21610][SQL] Corrupt records are not handled properly when creating a dataframe from a file - [BUILD][TEST][SPARKR] add sparksubmitsuite to appveyor tests - [SPARK-21856] Add probability and rawPrediction to MLPC for Python - [MINOR][SQL] remove unuse import class - [SPARK-21976][DOC] Fix wrong documentation for Mean Absolute Error. - [SPARK-14516][ML] Adding ClusteringEvaluator with the implementation of Cosine silhouette and squared Euclidean silhouette. - [SPARK-21610][SQL][FOLLOWUP] Corrupt records are not handled properly when creating a dataframe from a file - [DOCS] Fix unreachable links in the document - [SPARK-17642][SQL] support DESC EXTENDED/FORMATTED table column commands - [SPARK-21027][ML][PYTHON] Added tunable parallelism to one vs. rest in both Scala mllib and Pyspark - [SPARK-21368][SQL] TPCDSQueryBenchmark can't refer query files. 
- [SPARK-18608][ML] Fix double caching - [SPARK-21979][SQL] Improve QueryPlanConstraints framework - [SPARK-21513][SQL] Allow UDF to_json support converting MapType to json - [SPARK-21027][MINOR][FOLLOW-UP] add missing since tag - [BUILD] Close stale PRs - [SPARK-21982] Set locale to US - [SPARK-21893][BUILD][STREAMING][WIP] Put Kafka 0.8 behind a profile - [SPARK-21963][CORE][TEST] Create temp file should be delete after use - [SPARK-21690][ML] one-pass imputer - [SPARK-21970][CORE] Fix Redundant Throws Declarations in Java Codebase - [SPARK-21980][SQL] References in grouping functions should be indexed with semanticEquals - [SPARK-4131] Merge HiveTmpFile.scala to SaveAsHiveFile.scala - [SPARK-20427][SQL] Read JDBC table use custom schema - [SPARK-21973][SQL] Add an new option to filter queries in TPC-DS - [MINOR][SQL] Only populate type metadata for required types such as CHAR/VARCHAR. - [SPARK-21854] Added LogisticRegressionTrainingSummary for MultinomialLogisticRegression in Python API - [MINOR][DOC] Add missing call of `update()` in examples of PeriodicGraphCheckpointer & PeriodicRDDCheckpointer - [SPARK-18608][ML][FOLLOWUP] Fix double caching for PySpark OneVsRest. - [SPARK-4131][FOLLOW-UP] Support "Writing data into the filesystem from queries" - [SPARK-21922] Fix duration always updating when task failed but status is still RUN… - [SPARK-17642][SQL][FOLLOWUP] drop test tables and improve comments - [SPARK-21988] Add default stats to StreamingExecutionRelation. - [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support converting MapType to json for PySpark and SparkR - [SPARK-22018][SQL] Preserve top-level alias metadata when collapsing projects - [SPARK-21902][CORE] Print root cause for BlockManager#doPut - [SPARK-22002][SQL] Read JDBC table use custom schema support specify partial fields. 
- [SPARK-21987][SQL] fix a compatibility issue of sql event logs - [SPARK-21958][ML] Word2VecModel save: transform data in the cluster - [SPARK-15689][SQL] data source v2 read path - [SPARK-22017] Take minimum of all watermark execs in StreamExecution. - [SPARK-21967][CORE] org.apache.spark.unsafe.types.UTF8String#compareTo Should Compare 8 Bytes at a Time for Better Performance - [SPARK-22032][PYSPARK] Speed up StructType conversion - [SPARK-21985][PYSPARK] PairDeserializer is broken for double-zipped RDDs - [SPARK-21953] Show both memory and disk bytes spilled if either is present - [SPARK-22043][PYTHON] Improves error message for show_profiles and dump_profiles - [SPARK-21113][CORE] Read ahead input stream to amortize disk IO cost … - [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSuite - [SPARK-22003][SQL] support array column in vectorized reader with UDF - [SPARK-14878][SQL] Trim characters string function support - [SPARK-22030][CORE] GraphiteSink fails to re-connect to Graphite instances behind an ELB or any other auto-scaled LB - [SPARK-22047][FLAKY TEST] HiveExternalCatalogVersionsSuite - [SPARK-21923][CORE] Avoid calling reserveUnrollMemoryForThisTask for every record - [MINOR][CORE] Cleanup dead code and duplication in Mem. Management - [SPARK-22052] Incorrect Metric assigned in MetricsReporter.scala - [SPARK-21428][SQL][FOLLOWUP] CliSessionState should point to the actual metastore not a dummy one - [SPARK-21917][CORE][YARN] Supporting adding http(s) resources in yarn mode - [MINOR][ML] Remove unnecessary default value setting for evaluators. - [SPARK-21338][SQL] implement isCascadingTruncateTable() method in AggregatedDialect - [SPARK-21969][SQL] CommandUtils.updateTableStats should call refreshTable - [SPARK-22067][SQL] ArrowWriter should use position when setting UTF8String ByteBuffer - [SPARK-18838][CORE] Add separate listener queues to LiveListenerBus. 
- [SPARK-21977] SinglePartition optimizations break certain Streaming Stateful Aggregation requirements
- [SPARK-22066][BUILD] Update checkstyle to 8.2, enable it, fix violations
- [SPARK-22066][BUILD][HOTFIX] Revert scala-maven-plugin to 3.2.2 to work with Maven+zinc again
- [SPARK-22049][DOCS] Confusing behavior of from_utc_timestamp and to_utc_timestamp
- [SPARK-22076][SQL] Expand.projections should not be a Stream
- [SPARK-18838][HOTFIX][YARN] Check internal context state before stopping it.
- [SPARK-21384][YARN] Spark + YARN fails with LocalFileSystem as default FS
- [SPARK-22076][SQL][FOLLOWUP] Expand.projections should not be a Stream
- [SPARK-21934][CORE] Expose Shuffle Netty memory usage to MetricsSystem
- [SPARK-21780][R] Simpler Dataset.sample API in R
- [SPARK-17997][SQL] Add an aggregation function for counting distinct values for multiple intervals
- [SPARK-22086][DOCS] Add expression description for CASE WHEN
- [SPARK-21977][HOTFIX] Adjust EnsureStatefulOpPartitioningSuite to use scalatest lifecycle normally instead of constructor
- [SPARK-21928][CORE] Set classloader on SerializerManager's private kryo
- [INFRA] Close stale PRs.
- [SPARK-22088][SQL] Incorrect scalastyle comment causes wrong styles in stringExpressions
- [SPARK-22075][ML] GBTs unpersist datasets cached by Checkpointer
- [SPARK-22009][ML] Using treeAggregate improve some algs
- [SPARK-22053][SS] Stream-stream inner join in Append Mode
- [SPARK-22094][SS] processAllAvailable should check the query state
- [SPARK-21981][PYTHON][ML] Added Python interface for ClusteringEvaluator
- [SPARK-21998][SQL] SortMergeJoinExec did not calculate its outputOrdering correctly during physical planning
- [SPARK-22072][SPARK-22071][BUILD] Improve release build scripts
- [SPARK-21190][PYSPARK] Python Vectorized UDFs
- [UI][STREAMING] Modify the title, 'Records' instead of 'Input Size'
- [SPARK-22092] Reallocation in OffHeapColumnVector.reserveInternal corrupts struct and array data
- [SPARK-21766][PYSPARK][SQL] DataFrame toPandas() raises ValueError with nullable int columns
- [SPARK-22060][ML] Fix CrossValidator/TrainValidationSplit param persist/load bug
- [SPARK-18136] Fix SPARK_JARS_DIR for Python pip install on Windows
- [SPARK-22099] The 'job ids' list style needs to be changed in the SQL page.
- [SPARK-22033][CORE] BufferHolder, other size checks should account for the specific VM array size limitations
- [SPARK-22109][SQL] Resolves type conflicts between strings and timestamps in partition column
- [SPARK-20448][DOCS] Document how FileInputDStream works with object storage
- [SPARK-22110][SQL][DOCUMENTATION] Add usage and improve documentation with arguments and examples for trim function
- [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTruncateTable() method in AggregatedDialect
- [SPARK-22093][TESTS] Fixes `assume` in `UtilsSuite` and `HiveDDLSuite`
- [SPARK-22058][CORE] the BufferedInputStream will not be closed if an exception occurs.
- [SPARK-22087][SPARK-14650][WIP][BUILD][REPL][CORE] Compile Spark REPL for Scala 2.12 + other 2.12 fixes
- [SPARK-22107] Change as to alias in python quickstart

File Changes

- *A* .github/PULL_REQUEST_TEMPLATE <https://github.com/apache/spark/pull/19335/files#diff-0> (10)
- *M* .gitignore <https://github.com/apache/spark/pull/19335/files#diff-1> (113)
- *D* .rat-excludes <https://github.com/apache/spark/pull/19335/files#diff-2> (85)
- *A* .travis.yml <https://github.com/apache/spark/pull/19335/files#diff-3> (50)
- *M* CONTRIBUTING.md <https://github.com/apache/spark/pull/19335/files#diff-4> (4)
- *M* LICENSE <https://github.com/apache/spark/pull/19335/files#diff-5> (38)
- *M* NOTICE <https://github.com/apache/spark/pull/19335/files#diff-6> (78)
- *M* R/.gitignore <https://github.com/apache/spark/pull/19335/files#diff-7> (2)
- *A* R/CRAN_RELEASE.md <https://github.com/apache/spark/pull/19335/files#diff-8> (91)
- *M* R/DOCUMENTATION.md <https://github.com/apache/spark/pull/19335/files#diff-9> (12)
- *M* R/README.md <https://github.com/apache/spark/pull/19335/files#diff-10> (52)
- *M* R/WINDOWS.md <https://github.com/apache/spark/pull/19335/files#diff-11> (33)
- *A* R/check-cran.sh <https://github.com/apache/spark/pull/19335/files#diff-12> (76)
- *M* R/create-docs.sh <https://github.com/apache/spark/pull/19335/files#diff-13> (24)
- *A* R/create-rd.sh <https://github.com/apache/spark/pull/19335/files#diff-14> (37)
- *A* R/find-r.sh <https://github.com/apache/spark/pull/19335/files#diff-15> (34)
- *M* R/install-dev.bat <https://github.com/apache/spark/pull/19335/files#diff-16> (6)
- *M* R/install-dev.sh <https://github.com/apache/spark/pull/19335/files#diff-17> (16)
- *A* R/install-source-package.sh <https://github.com/apache/spark/pull/19335/files#diff-18> (57)
- *A* R/pkg/.Rbuildignore <https://github.com/apache/spark/pull/19335/files#diff-19> (9)
- *M* R/pkg/.lintr <https://github.com/apache/spark/pull/19335/files#diff-20> (2)
- *M* R/pkg/DESCRIPTION <https://github.com/apache/spark/pull/19335/files#diff-21> (52)
- *M* R/pkg/NAMESPACE <https://github.com/apache/spark/pull/19335/files#diff-22> (240)
- *M* R/pkg/R/DataFrame.R <https://github.com/apache/spark/pull/19335/files#diff-23> (3301)
- *M* R/pkg/R/RDD.R <https://github.com/apache/spark/pull/19335/files#diff-24> (1742)
- *M* R/pkg/R/SQLContext.R <https://github.com/apache/spark/pull/19335/files#diff-25> (830)
- *A* R/pkg/R/WindowSpec.R <https://github.com/apache/spark/pull/19335/files#diff-26> (224)
- *M* R/pkg/R/backend.R <https://github.com/apache/spark/pull/19335/files#diff-27> (25)
- *M* R/pkg/R/broadcast.R <https://github.com/apache/spark/pull/19335/files#diff-28> (9)
- *A* R/pkg/R/catalog.R <https://github.com/apache/spark/pull/19335/files#diff-29> (526)
- *M* R/pkg/R/client.R <https://github.com/apache/spark/pull/19335/files#diff-30> (16)
- *M* R/pkg/R/column.R <https://github.com/apache/spark/pull/19335/files#diff-31> (178)
- *M* R/pkg/R/context.R <https://github.com/apache/spark/pull/19335/files#diff-32> (470)
- *M* R/pkg/R/deserialize.R <https://github.com/apache/spark/pull/19335/files#diff-33> (37)
- *M* R/pkg/R/functions.R <https://github.com/apache/spark/pull/19335/files#diff-34> (3195)
- *M* R/pkg/R/generics.R <https://github.com/apache/spark/pull/19335/files#diff-35> (1085)
- *M* R/pkg/R/group.R <https://github.com/apache/spark/pull/19335/files#diff-36> (152)
- *A* R/pkg/R/install.R <https://github.com/apache/spark/pull/19335/files#diff-37> (312)
- *M* R/pkg/R/jobj.R <https://github.com/apache/spark/pull/19335/files#diff-38> (9)
- *A* R/pkg/R/jvm.R <https://github.com/apache/spark/pull/19335/files#diff-39> (117)
- *D* R/pkg/R/mllib.R <https://github.com/apache/spark/pull/19335/files#diff-40> (101)
- *A* R/pkg/R/mllib_classification.R <https://github.com/apache/spark/pull/19335/files#diff-41> (635)
- *A* R/pkg/R/mllib_clustering.R <https://github.com/apache/spark/pull/19335/files#diff-42> (634)
- *A* R/pkg/R/mllib_fpm.R <https://github.com/apache/spark/pull/19335/files#diff-43> (162)
- *A* R/pkg/R/mllib_recommendation.R <https://github.com/apache/spark/pull/19335/files#diff-44> (162)
- *A* R/pkg/R/mllib_regression.R <https://github.com/apache/spark/pull/19335/files#diff-45> (552)
- *A* R/pkg/R/mllib_stat.R <https://github.com/apache/spark/pull/19335/files#diff-46> (127)
- *A* R/pkg/R/mllib_tree.R <https://github.com/apache/spark/pull/19335/files#diff-47> (765)
- *A* R/pkg/R/mllib_utils.R <https://github.com/apache/spark/pull/19335/files#diff-48> (132)
- *M* R/pkg/R/pairRDD.R <https://github.com/apache/spark/pull/19335/files#diff-49> (970)
- *M* R/pkg/R/schema.R <https://github.com/apache/spark/pull/19335/files#diff-50> (103)
- *M* R/pkg/R/serialize.R <https://github.com/apache/spark/pull/19335/files#diff-51> (6)
- *M* R/pkg/R/sparkR.R <https://github.com/apache/spark/pull/19335/files#diff-52> (462)
- *M* R/pkg/R/stats.R <https://github.com/apache/spark/pull/19335/files#diff-53> (173)
- *A* R/pkg/R/streaming.R <https://github.com/apache/spark/pull/19335/files#diff-54> (214)
- *A* R/pkg/R/types.R <https://github.com/apache/spark/pull/19335/files#diff-55> (85)
- *M* R/pkg/R/utils.R <https://github.com/apache/spark/pull/19335/files#diff-56> (326)
- *A* R/pkg/R/window.R <https://github.com/apache/spark/pull/19335/files#diff-57> (116)
- *M* R/pkg/inst/profile/general.R <https://github.com/apache/spark/pull/19335/files#diff-58> (5)
- *M* R/pkg/inst/profile/shell.R <https://github.com/apache/spark/pull/19335/files#diff-59> (14)
- *D* R/pkg/inst/test_support/sparktestjar_2.10-1.0.jar <https://github.com/apache/spark/pull/19335/files#diff-60> (0)
- *D* R/pkg/inst/tests/jarTest.R <https://github.com/apache/spark/pull/19335/files#diff-61> (32)
- *D* R/pkg/inst/tests/packageInAJarTest.R <https://github.com/apache/spark/pull/19335/files#diff-62> (30)
- *D* R/pkg/inst/tests/test_Serde.R <https://github.com/apache/spark/pull/19335/files#diff-63> (77)
- *D* R/pkg/inst/tests/test_binaryFile.R <https://github.com/apache/spark/pull/19335/files#diff-64> (89)
- *D* R/pkg/inst/tests/test_binary_function.R <https://github.com/apache/spark/pull/19335/files#diff-65> (101)
- *D* R/pkg/inst/tests/test_broadcast.R <https://github.com/apache/spark/pull/19335/files#diff-66> (48)
- *D* R/pkg/inst/tests/test_client.R <https://github.com/apache/spark/pull/19335/files#diff-67> (36)
- *D* R/pkg/inst/tests/test_context.R <https://github.com/apache/spark/pull/19335/files#diff-68> (94)
- *D* R/pkg/inst/tests/test_includeJAR.R <https://github.com/apache/spark/pull/19335/files#diff-69> (37)
- *D* R/pkg/inst/tests/test_includePackage.R <https://github.com/apache/spark/pull/19335/files#diff-70> (57)
- *D* R/pkg/inst/tests/test_mllib.R <https://github.com/apache/spark/pull/19335/files#diff-71> (86)
- *D* R/pkg/inst/tests/test_parallelize_collect.R <https://github.com/apache/spark/pull/19335/files#diff-72> (109)
- *D* R/pkg/inst/tests/test_rdd.R <https://github.com/apache/spark/pull/19335/files#diff-73> (793)
- *D* R/pkg/inst/tests/test_shuffle.R <https://github.com/apache/spark/pull/19335/files#diff-74> (221)
- *D* R/pkg/inst/tests/test_sparkSQL.R <https://github.com/apache/spark/pull/19335/files#diff-75> (1499)
- *D* R/pkg/inst/tests/test_take.R <https://github.com/apache/spark/pull/19335/files#diff-76> (66)
- *D* R/pkg/inst/tests/test_textFile.R <https://github.com/apache/spark/pull/19335/files#diff-77> (161)
- *D* R/pkg/inst/tests/test_utils.R <https://github.com/apache/spark/pull/19335/files#diff-78> (140)
- *A* R/pkg/inst/tests/testthat/test_basic.R <https://github.com/apache/spark/pull/19335/files#diff-79> (90)
- *M* R/pkg/inst/worker/daemon.R <https://github.com/apache/spark/pull/19335/files#diff-80> (62)
- *M* R/pkg/inst/worker/worker.R <https://github.com/apache/spark/pull/19335/files#diff-81> (135)
- *A* R/pkg/tests/fulltests/jarTest.R <https://github.com/apache/spark/pull/19335/files#diff-82> (32)
- *A* R/pkg/tests/fulltests/packageInAJarTest.R <https://github.com/apache/spark/pull/19335/files#diff-83> (30)
- *A* R/pkg/tests/fulltests/test_Serde.R <https://github.com/apache/spark/pull/19335/files#diff-84> (79)
- *A* R/pkg/tests/fulltests/test_Windows.R <https://github.com/apache/spark/pull/19335/files#diff-85> (27)
- *A* R/pkg/tests/fulltests/test_binaryFile.R <https://github.com/apache/spark/pull/19335/files#diff-86> (92)
- *A* R/pkg/tests/fulltests/test_binary_function.R <https://github.com/apache/spark/pull/19335/files#diff-87> (104)
- *A* R/pkg/tests/fulltests/test_broadcast.R <https://github.com/apache/spark/pull/19335/files#diff-88> (51)
- *A* R/pkg/tests/fulltests/test_client.R <https://github.com/apache/spark/pull/19335/files#diff-89> (43)
- *A* R/pkg/tests/fulltests/test_context.R <https://github.com/apache/spark/pull/19335/files#diff-90> (0)
- *M* R/pkg/tests/fulltests/test_includePackage.R <https://github.com/apache/spark/pull/19335/files#diff-91> (0)
- *M* R/pkg/tests/fulltests/test_jvm_api.R <https://github.com/apache/spark/pull/19335/files#diff-92> (0)
- *M* R/pkg/tests/fulltests/test_mllib_classification.R <https://github.com/apache/spark/pull/19335/files#diff-93> (0)
- *M* R/pkg/tests/fulltests/test_mllib_clustering.R <https://github.com/apache/spark/pull/19335/files#diff-94> (0)
- *M* R/pkg/tests/fulltests/test_mllib_fpm.R <https://github.com/apache/spark/pull/19335/files#diff-95> (0)
- *M* R/pkg/tests/fulltests/test_mllib_recommendation.R <https://github.com/apache/spark/pull/19335/files#diff-96> (0)
- *M* R/pkg/tests/fulltests/test_mllib_regression.R <https://github.com/apache/spark/pull/19335/files#diff-97> (0)
- *M* R/pkg/tests/fulltests/test_mllib_stat.R <https://github.com/apache/spark/pull/19335/files#diff-98> (0)
- *M* R/pkg/tests/fulltests/test_mllib_tree.R <https://github.com/apache/spark/pull/19335/files#diff-99> (0)
- *M* R/pkg/tests/fulltests/test_parallelize_collect.R <https://github.com/apache/spark/pull/19335/files#diff-100> (0)
- *M* R/pkg/tests/fulltests/test_rdd.R <https://github.com/apache/spark/pull/19335/files#diff-101> (0)
- *M* R/pkg/tests/fulltests/test_shuffle.R <https://github.com/apache/spark/pull/19335/files#diff-102> (0)
- *M* R/pkg/tests/fulltests/test_sparkR.R <https://github.com/apache/spark/pull/19335/files#diff-103> (0)
- *M* R/pkg/tests/fulltests/test_sparkSQL.R <https://github.com/apache/spark/pull/19335/files#diff-104> (0)
- *M* R/pkg/tests/fulltests/test_streaming.R <https://github.com/apache/spark/pull/19335/files#diff-105> (0)
- *M* R/pkg/tests/fulltests/test_take.R <https://github.com/apache/spark/pull/19335/files#diff-106> (0)
- *M* R/pkg/tests/fulltests/test_textFile.R <https://github.com/apache/spark/pull/19335/files#diff-107> (0)
- *M* R/pkg/tests/fulltests/test_utils.R <https://github.com/apache/spark/pull/19335/files#diff-108> (0)
- *M* R/pkg/tests/run-all.R <https://github.com/apache/spark/pull/19335/files#diff-109> (0)
- *M* R/pkg/vignettes/sparkr-vignettes.Rmd <https://github.com/apache/spark/pull/19335/files#diff-110> (0)
- *M* R/run-tests.sh <https://github.com/apache/spark/pull/19335/files#diff-111> (0)
- *M* README.md <https://github.com/apache/spark/pull/19335/files#diff-112> (0)
- *M* appveyor.yml <https://github.com/apache/spark/pull/19335/files#diff-113> (0)
- *M* assembly/README <https://github.com/apache/spark/pull/19335/files#diff-114> (0)
- *M* assembly/pom.xml <https://github.com/apache/spark/pull/19335/files#diff-115> (0)
- *M* assembly/src/main/assembly/assembly.xml <https://github.com/apache/spark/pull/19335/files#diff-116> (0)
- *M* bagel/pom.xml <https://github.com/apache/spark/pull/19335/files#diff-117> (0)
- *M* bagel/src/main/scala/org/apache/spark/bagel/Bagel.scala <https://github.com/apache/spark/pull/19335/files#diff-118> (0)
- *M* bagel/src/main/scala/org/apache/spark/bagel/package-info.java <https://github.com/apache/spark/pull/19335/files#diff-119> (0)
- *M* bagel/src/main/scala/org/apache/spark/bagel/package.scala <https://github.com/apache/spark/pull/19335/files#diff-120> (0)
- *M* bin/beeline <https://github.com/apache/spark/pull/19335/files#diff-121> (0)
- *M* bin/beeline.cmd <https://github.com/apache/spark/pull/19335/files#diff-122> (0)
- *M* bin/find-spark-home <https://github.com/apache/spark/pull/19335/files#diff-123> (0)
- *M* bin/load-spark-env.cmd <https://github.com/apache/spark/pull/19335/files#diff-124> (0)
- *M* bin/load-spark-env.sh <https://github.com/apache/spark/pull/19335/files#diff-125> (0)
- *M* bin/pyspark <https://github.com/apache/spark/pull/19335/files#diff-126> (0)
- *M* bin/pyspark.cmd <https://github.com/apache/spark/pull/19335/files#diff-127> (0)
- *M* bin/pyspark2.cmd <https://github.com/apache/spark/pull/19335/files#diff-128> (0)
- *M* bin/run-example <https://github.com/apache/spark/pull/19335/files#diff-129> (0)
- *M* bin/run-example.cmd <https://github.com/apache/spark/pull/19335/files#diff-130> (0)
- *M* bin/run-example2.cmd <https://github.com/apache/spark/pull/19335/files#diff-131> (0)
- *M* bin/spark-class <https://github.com/apache/spark/pull/19335/files#diff-132> (0)
- *M* bin/spark-class.cmd <https://github.com/apache/spark/pull/19335/files#diff-133> (0)
- *M* bin/spark-class2.cmd <https://github.com/apache/spark/pull/19335/files#diff-134> (0)
- *M* bin/spark-shell <https://github.com/apache/spark/pull/19335/files#diff-135> (0)
- *M* bin/spark-shell.cmd <https://github.com/apache/spark/pull/19335/files#diff-136> (0)
- *M* bin/spark-shell2.cmd <https://github.com/apache/spark/pull/19335/files#diff-137> (0)
- *M* bin/spark-sql <https://github.com/apache/spark/pull/19335/files#diff-138> (0)
- *M* bin/spark-submit <https://github.com/apache/spark/pull/19335/files#diff-139> (0)
- *M* bin/spark-submit.cmd <https://github.com/apache/spark/pull/19335/files#diff-140> (0)
- *M* bin/spark-submit2.cmd <https://github.com/apache/spark/pull/19335/files#diff-141> (0)
- *M* bin/sparkR <https://github.com/apache/spark/pull/19335/files#diff-142> (0)
- *M* bin/sparkR.cmd <https://github.com/apache/spark/pull/19335/files#diff-143> (0)
- *M* bin/sparkR2.cmd <https://github.com/apache/spark/pull/19335/files#diff-144> (0)
- *M* build/mvn <https://github.com/apache/spark/pull/19335/files#diff-145> (0)
- *M* build/sbt-launch-lib.bash <https://github.com/apache/spark/pull/19335/files#diff-146> (0)
- *M* build/spark-build-info <https://github.com/apache/spark/pull/19335/files#diff-147> (0)
- *M* common/kvstore/pom.xml <https://github.com/apache/spark/pull/19335/files#diff-148> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/ArrayWrappers.java <https://github.com/apache/spark/pull/19335/files#diff-149> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java <https://github.com/apache/spark/pull/19335/files#diff-150> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVIndex.java <https://github.com/apache/spark/pull/19335/files#diff-151> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStore.java <https://github.com/apache/spark/pull/19335/files#diff-152> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreIterator.java <https://github.com/apache/spark/pull/19335/files#diff-153> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreSerializer.java <https://github.com/apache/spark/pull/19335/files#diff-154> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreView.java <https://github.com/apache/spark/pull/19335/files#diff-155> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVTypeInfo.java <https://github.com/apache/spark/pull/19335/files#diff-156> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java <https://github.com/apache/spark/pull/19335/files#diff-157> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java <https://github.com/apache/spark/pull/19335/files#diff-158> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBTypeInfo.java <https://github.com/apache/spark/pull/19335/files#diff-159> (0)
- *M* common/kvstore/src/main/java/org/apache/spark/util/kvstore/UnsupportedStoreVersionException.java <https://github.com/apache/spark/pull/19335/files#diff-160> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/ArrayKeyIndexType.java <https://github.com/apache/spark/pull/19335/files#diff-161> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/ArrayWrappersSuite.java <https://github.com/apache/spark/pull/19335/files#diff-162> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/CustomType1.java <https://github.com/apache/spark/pull/19335/files#diff-163> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/DBIteratorSuite.java <https://github.com/apache/spark/pull/19335/files#diff-164> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryIteratorSuite.java <https://github.com/apache/spark/pull/19335/files#diff-165> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java <https://github.com/apache/spark/pull/19335/files#diff-166> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/LevelDBBenchmark.java <https://github.com/apache/spark/pull/19335/files#diff-167> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/LevelDBIteratorSuite.java <https://github.com/apache/spark/pull/19335/files#diff-168> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/LevelDBSuite.java <https://github.com/apache/spark/pull/19335/files#diff-169> (0)
- *M* common/kvstore/src/test/java/org/apache/spark/util/kvstore/LevelDBTypeInfoSuite.java <https://github.com/apache/spark/pull/19335/files#diff-170> (0)
- *R* common/kvstore/src/test/resources/log4j.properties <https://github.com/apache/spark/pull/19335/files#diff-171> (0)
- *M* common/network-common/pom.xml <https://github.com/apache/spark/pull/19335/files#diff-172> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/TransportContext.java <https://github.com/apache/spark/pull/19335/files#diff-173> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/buffer/FileSegmentManagedBuffer.java <https://github.com/apache/spark/pull/19335/files#diff-174> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/buffer/ManagedBuffer.java <https://github.com/apache/spark/pull/19335/files#diff-175> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/buffer/NettyManagedBuffer.java <https://github.com/apache/spark/pull/19335/files#diff-176> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/buffer/NioManagedBuffer.java <https://github.com/apache/spark/pull/19335/files#diff-177> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/client/ChunkFetchFailureException.java <https://github.com/apache/spark/pull/19335/files#diff-178> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/client/ChunkReceivedCallback.java <https://github.com/apache/spark/pull/19335/files#diff-179> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/client/RpcResponseCallback.java <https://github.com/apache/spark/pull/19335/files#diff-180> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/client/StreamCallback.java <https://github.com/apache/spark/pull/19335/files#diff-181> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/client/StreamInterceptor.java <https://github.com/apache/spark/pull/19335/files#diff-182> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java <https://github.com/apache/spark/pull/19335/files#diff-183> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/client/TransportClientBootstrap.java <https://github.com/apache/spark/pull/19335/files#diff-184> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java <https://github.com/apache/spark/pull/19335/files#diff-185> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/client/TransportResponseHandler.java <https://github.com/apache/spark/pull/19335/files#diff-186> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/crypto/AuthClientBootstrap.java <https://github.com/apache/spark/pull/19335/files#diff-187> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/crypto/AuthEngine.java <https://github.com/apache/spark/pull/19335/files#diff-188> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/crypto/AuthRpcHandler.java <https://github.com/apache/spark/pull/19335/files#diff-189> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/crypto/AuthServerBootstrap.java <https://github.com/apache/spark/pull/19335/files#diff-190> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/crypto/ClientChallenge.java <https://github.com/apache/spark/pull/19335/files#diff-191> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/crypto/README.md <https://github.com/apache/spark/pull/19335/files#diff-192> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/crypto/ServerResponse.java <https://github.com/apache/spark/pull/19335/files#diff-193> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/crypto/TransportCipher.java <https://github.com/apache/spark/pull/19335/files#diff-194> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/AbstractMessage.java <https://github.com/apache/spark/pull/19335/files#diff-195> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/AbstractResponseMessage.java <https://github.com/apache/spark/pull/19335/files#diff-196> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/ChunkFetchFailure.java <https://github.com/apache/spark/pull/19335/files#diff-197> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/ChunkFetchRequest.java <https://github.com/apache/spark/pull/19335/files#diff-198> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/ChunkFetchSuccess.java <https://github.com/apache/spark/pull/19335/files#diff-199> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/protocol/Encodable.java <https://github.com/apache/spark/pull/19335/files#diff-200> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/Encoders.java <https://github.com/apache/spark/pull/19335/files#diff-201> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/Message.java <https://github.com/apache/spark/pull/19335/files#diff-202> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/MessageDecoder.java <https://github.com/apache/spark/pull/19335/files#diff-203> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/MessageEncoder.java <https://github.com/apache/spark/pull/19335/files#diff-204> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java <https://github.com/apache/spark/pull/19335/files#diff-205> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/OneWayMessage.java <https://github.com/apache/spark/pull/19335/files#diff-206> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/RequestMessage.java <https://github.com/apache/spark/pull/19335/files#diff-207> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/ResponseMessage.java <https://github.com/apache/spark/pull/19335/files#diff-208> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/RpcFailure.java <https://github.com/apache/spark/pull/19335/files#diff-209> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/RpcRequest.java <https://github.com/apache/spark/pull/19335/files#diff-210> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/RpcResponse.java <https://github.com/apache/spark/pull/19335/files#diff-211> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/protocol/StreamChunkId.java <https://github.com/apache/spark/pull/19335/files#diff-212> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/StreamFailure.java <https://github.com/apache/spark/pull/19335/files#diff-213> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/StreamRequest.java <https://github.com/apache/spark/pull/19335/files#diff-214> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/protocol/StreamResponse.java <https://github.com/apache/spark/pull/19335/files#diff-215> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/sasl/SaslClientBootstrap.java <https://github.com/apache/spark/pull/19335/files#diff-216> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/sasl/SaslEncryption.java <https://github.com/apache/spark/pull/19335/files#diff-217> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/sasl/SaslEncryptionBackend.java <https://github.com/apache/spark/pull/19335/files#diff-218> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/sasl/SaslMessage.java <https://github.com/apache/spark/pull/19335/files#diff-219> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/sasl/SaslRpcHandler.java <https://github.com/apache/spark/pull/19335/files#diff-220> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/sasl/SaslServerBootstrap.java <https://github.com/apache/spark/pull/19335/files#diff-221> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/sasl/SecretKeyHolder.java <https://github.com/apache/spark/pull/19335/files#diff-222> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/sasl/SparkSaslClient.java <https://github.com/apache/spark/pull/19335/files#diff-223> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/sasl/SparkSaslServer.java <https://github.com/apache/spark/pull/19335/files#diff-224> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/server/MessageHandler.java <https://github.com/apache/spark/pull/19335/files#diff-225> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/server/NoOpRpcHandler.java <https://github.com/apache/spark/pull/19335/files#diff-226> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java <https://github.com/apache/spark/pull/19335/files#diff-227> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/server/RpcHandler.java <https://github.com/apache/spark/pull/19335/files#diff-228> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/server/StreamManager.java <https://github.com/apache/spark/pull/19335/files#diff-229> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/server/TransportChannelHandler.java <https://github.com/apache/spark/pull/19335/files#diff-230> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java <https://github.com/apache/spark/pull/19335/files#diff-231> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/server/TransportServer.java <https://github.com/apache/spark/pull/19335/files#diff-232> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/server/TransportServerBootstrap.java <https://github.com/apache/spark/pull/19335/files#diff-233> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/ByteArrayReadableChannel.java <https://github.com/apache/spark/pull/19335/files#diff-234> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/util/ByteArrayWritableChannel.java <https://github.com/apache/spark/pull/19335/files#diff-235> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/ByteUnit.java <https://github.com/apache/spark/pull/19335/files#diff-236> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/ConfigProvider.java <https://github.com/apache/spark/pull/19335/files#diff-237> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/CryptoUtils.java <https://github.com/apache/spark/pull/19335/files#diff-238> (0)
- *R* common/network-common/src/main/java/org/apache/spark/network/util/IOMode.java <https://github.com/apache/spark/pull/19335/files#diff-239> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/JavaUtils.java <https://github.com/apache/spark/pull/19335/files#diff-240> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java <https://github.com/apache/spark/pull/19335/files#diff-241> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/LimitedInputStream.java <https://github.com/apache/spark/pull/19335/files#diff-242> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/MapConfigProvider.java <https://github.com/apache/spark/pull/19335/files#diff-243> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/NettyMemoryMetrics.java <https://github.com/apache/spark/pull/19335/files#diff-244> (0)
- *M* common/network-common/src/main/java/org/apache/spark/network/util/NettyUtils.java <https://github.com/apache/spark/pull/19335/files#diff-245> (0)
- *M* common/network-common/src/main/java/org/apache/spark/ network/util/TransportConf.java <https://github.com/apache/spark/pull/19335/files#diff-246> (0) - *M* common/network-common/src/main/java/org/apache/spark/ network/util/TransportFrameDecoder.java <https://github.com/apache/spark/pull/19335/files#diff-247> (0) - *M* common/network-common/src/test/java/org/apache/spark/network/ ChunkFetchIntegrationSuite.java <https://github.com/apache/spark/pull/19335/files#diff-248> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/ProtocolSuite.java <https://github.com/apache/spark/pull/19335/files#diff-249> (0) - *M* common/network-common/src/test/java/org/apache/spark/network/ RequestTimeoutIntegrationSuite.java <https://github.com/apache/spark/pull/19335/files#diff-250> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/RpcIntegrationSuite.java <https://github.com/apache/spark/pull/19335/files#diff-251> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/StreamSuite.java <https://github.com/apache/spark/pull/19335/files#diff-252> (0) - *R* common/network-common/src/test/java/org/apache/spark/ network/TestManagedBuffer.java <https://github.com/apache/spark/pull/19335/files#diff-253> (0) - *R* common/network-common/src/test/java/org/apache/spark/ network/TestUtils.java <https://github.com/apache/spark/pull/19335/files#diff-254> (0) - *M* common/network-common/src/test/java/org/apache/spark/network/ TransportClientFactorySuite.java <https://github.com/apache/spark/pull/19335/files#diff-255> (0) - *M* common/network-common/src/test/java/org/apache/spark/network/ TransportRequestHandlerSuite.java <https://github.com/apache/spark/pull/19335/files#diff-256> (0) - *M* common/network-common/src/test/java/org/apache/spark/network/ TransportResponseHandlerSuite.java <https://github.com/apache/spark/pull/19335/files#diff-257> (0) - *M* common/network-common/src/test/java/org/apache/spark/ 
network/crypto/AuthEngineSuite.java <https://github.com/apache/spark/pull/19335/files#diff-258> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/crypto/AuthIntegrationSuite.java <https://github.com/apache/spark/pull/19335/files#diff-259> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/crypto/AuthMessagesSuite.java <https://github.com/apache/spark/pull/19335/files#diff-260> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/protocol/MessageWithHeaderSuite.java <https://github.com/apache/spark/pull/19335/files#diff-261> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/sasl/SparkSaslSuite.java <https://github.com/apache/spark/pull/19335/files#diff-262> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/server/OneForOneStreamManagerSuite.java <https://github.com/apache/spark/pull/19335/files#diff-263> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/util/CryptoUtilsSuite.java <https://github.com/apache/spark/pull/19335/files#diff-264> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/util/NettyMemoryMetricsSuite.java <https://github.com/apache/spark/pull/19335/files#diff-265> (0) - *M* common/network-common/src/test/java/org/apache/spark/ network/util/TransportFrameDecoderSuite.java <https://github.com/apache/spark/pull/19335/files#diff-266> (0) - *M* common/network-common/src/test/resources/log4j.properties <https://github.com/apache/spark/pull/19335/files#diff-267> (0) - *M* common/network-shuffle/pom.xml <https://github.com/apache/spark/pull/19335/files#diff-268> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/sasl/ShuffleSecretManager.java <https://github.com/apache/spark/pull/19335/files#diff-269> (0) - *R* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/BlockFetchingListener.java <https://github.com/apache/spark/pull/19335/files#diff-270> (0) - *M* 
common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/ExternalShuffleBlockHandler.java <https://github.com/apache/spark/pull/19335/files#diff-271> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/ExternalShuffleBlockResolver.java <https://github.com/apache/spark/pull/19335/files#diff-272> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/ExternalShuffleClient.java <https://github.com/apache/spark/pull/19335/files#diff-273> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/OneForOneBlockFetcher.java <https://github.com/apache/spark/pull/19335/files#diff-274> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/RetryingBlockFetcher.java <https://github.com/apache/spark/pull/19335/files#diff-275> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/ShuffleClient.java <https://github.com/apache/spark/pull/19335/files#diff-276> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/ShuffleIndexInformation.java <https://github.com/apache/spark/pull/19335/files#diff-277> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/ShuffleIndexRecord.java <https://github.com/apache/spark/pull/19335/files#diff-278> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/TempShuffleFileManager.java <https://github.com/apache/spark/pull/19335/files#diff-279> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/mesos/MesosExternalShuffleClient.java <https://github.com/apache/spark/pull/19335/files#diff-280> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/protocol/BlockTransferMessage.java <https://github.com/apache/spark/pull/19335/files#diff-281> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/protocol/ExecutorShuffleInfo.java 
<https://github.com/apache/spark/pull/19335/files#diff-282> (0) - *R* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/protocol/OpenBlocks.java <https://github.com/apache/spark/pull/19335/files#diff-283> (0) - *R* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/protocol/RegisterExecutor.java <https://github.com/apache/spark/pull/19335/files#diff-284> (0) - *R* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/protocol/StreamHandle.java <https://github.com/apache/spark/pull/19335/files#diff-285> (0) - *R* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/protocol/UploadBlock.java <https://github.com/apache/spark/pull/19335/files#diff-286> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/protocol/mesos/RegisterDriver.java <https://github.com/apache/spark/pull/19335/files#diff-287> (0) - *M* common/network-shuffle/src/main/java/org/apache/spark/ network/shuffle/protocol/mesos/ShuffleServiceHeartbeat.java <https://github.com/apache/spark/pull/19335/files#diff-288> (0) - *M* common/network-shuffle/src/test/java/org/apache/spark/ network/sasl/SaslIntegrationSuite.java <https://github.com/apache/spark/pull/19335/files#diff-28

…n under codegen

## What changes were proposed in this pull request?

We can override `usedInputs` to claim that an operator defers input evaluation. `Sample` and `Limit` are two operators which should claim it but don't. We should do it.

## How was this patch tested?

Existing tests.

Author: Liang-Chi Hsieh <[email protected]>

Closes #19345 from viirya/SPARK-22124.

## What changes were proposed in this pull request?

Address PR comments that appeared post-merge: rename `addExtraCode` to `addInnerClass`, and do not count the size of the inner class toward the size of the outer class.

## How was this patch tested?

YOLO.

Author: Juliusz Sompolski <[email protected]>

Closes #19353 from juliuszsompolski/SPARK-22103followup.

Closes #13794 Closes #18474 Closes #18897 Closes #18978 Closes #19152 Closes #19238 Closes #19295 Closes #19334 Closes #19335 Closes #19347 Closes #19236 Closes #19244 Closes #19300 Closes #19315 Closes #19356 Closes #15009 Closes #18253 Author: hyukjinkwon <[email protected]> Closes #19348 from HyukjinKwon/stale-prs.

yaooqinn and others added 30 commits August 18, 2017 00:24

[MINOR][TYPO] Fix typos: runnning and Excecutors

a2db5c5

## What changes were proposed in this pull request?

Fix typos

## How was this patch tested?

Existing tests

Author: Andrew Ash <[email protected]>

Closes #18996 from ash211/patch-2.

[SPARK-21468][PYSPARK][ML] Python API for FeatureHasher

988b84d

Add Python API for `FeatureHasher` transformer.

## How was this patch tested?

New doc test.

Author: Nick Pentreath <[email protected]>

Closes #18970 from MLnick/SPARK-21468-pyspark-hasher.

viirya and others added 13 commits September 22, 2017 22:39

listenLearning changed the title ~~mapPartitions Api~~ mapPartitions Api #24 Sep 25, 2017

listenLearning changed the title ~~mapPartitions Api #24~~ mapPartitions Api Sep 25, 2017

wzhfy and others added 5 commits September 25, 2017 09:28

HyukjinKwon mentioned this pull request Sep 26, 2017

[BUILD] Close stale PRs #19348

Closed

viirya and others added 3 commits September 26, 2017 15:23

asfgit closed this in ceaec93 Sep 27, 2017

listenLearning commented Sep 25, 2017 (edited)

HyukjinKwon commented Sep 25, 2017

AmplabJenkins commented Sep 25, 2017

HyukjinKwon commented Sep 25, 2017

HyukjinKwon commented Sep 25, 2017

caneGuy commented Sep 26, 2017 via email
