
Commit 50a0496

Brennon York authored and JoshRosen committed
[SPARK-7017] [BUILD] [PROJECT INFRA] Refactor dev/run-tests into Python
All, this is a first attempt at refactoring `dev/run-tests` into Python. Initially I merely converted all Bash calls over to Python, then moved to a much more modular approach (more functions, moved the calls around, etc.). What is here is the initial culmination and should provide a great base for various downstream issues (e.g. SPARK-7016, modularizing / parallelizing testing, etc.). Would love comments / suggestions on this initial first step!

/cc srowen pwendell nchammas

Author: Brennon York <[email protected]>

Closes #5694 from brennonyork/SPARK-7017 and squashes the following commits:

154ed73 [Brennon York] updated finding java binary if JAVA_HOME not set
3922a85 [Brennon York] removed necessary passed in variable
f9fbe54 [Brennon York] reverted doc test change
8135518 [Brennon York] removed the test check for documentation changes until jenkins can get updated
05d435b [Brennon York] added check for jekyll install
22edb78 [Brennon York] add check if jekyll isn't installed on the path
2dff136 [Brennon York] fixed pep8 whitespace errors
767a668 [Brennon York] fixed path joining issues, ensured docs actually build on doc changes
c42cf9a [Brennon York] unpack set operations with splat (*)
fb85a41 [Brennon York] fixed minor set bug
0379833 [Brennon York] minor doc addition to print the changed modules
aa03d9e [Brennon York] added documentation builds as a top level test component, altered high level project changes to properly execute core tests only when necessary, changed variable names for simplicity
ec1ae78 [Brennon York] minor name changes, bug fixes
b7c72b9 [Brennon York] reverting streaming context
03fdd7b [Brennon York] fixed the tuple () wraps around example lambda
705d12e [Brennon York] changed example to comply with pep3113 supporting python3
60b3d51 [Brennon York] prepend rather than append onto PATH
7d2f5e2 [Brennon York] updated python tests to remove unused variable
2898717 [Brennon York] added a change to streaming test to check if it only runs streaming tests
eb684b6 [Brennon York] fixed sbt_test_goals reference error
db7ae6f [Brennon York] reverted SPARK_HOME from start of command
1ecca26 [Brennon York] fixed merge conflicts
2fcdfc0 [Brennon York] testing targte branch dump on jenkins
1f607b1 [Brennon York] finalizing revisions to modular tests
8afbe93 [Brennon York] made error codes a global
0629de8 [Brennon York] updated to refactor and remove various small bugs, removed pep8 complaints
d90ab2d [Brennon York] fixed merge conflicts, ensured that for regular builds both core and sql tests always run
b1248dc [Brennon York] exec python rather than running python and exiting with return code
f9deba1 [Brennon York] python to python2 and removed newline
6d0a052 [Brennon York] incorporated merge conflicts with SPARK-7249
f950010 [Brennon York] removed building hive-0.12.0 per SPARK-6908
703f095 [Brennon York] fixed merge conflicts
b1ca593 [Brennon York] reverted the sparkR test
afeb093 [Brennon York] updated to make sparkR test fail
1dada6b [Brennon York] reverted pyspark test failure
9a592ec [Brennon York] reverted mima exclude issue, added pyspark test failure
d825aa4 [Brennon York] revert build break, add mima break
f041d8a [Brennon York] added space from commented import to now test build breaking
983f2a2 [Brennon York] comment out import to fail build test
2386785 [Brennon York] Merge remote-tracking branch 'upstream/master' into SPARK-7017
76335fb [Brennon York] reverted rat license issue for sparkconf
e4a96cc [Brennon York] removed the import error and added license error, fixed the way run-tests and run-tests.py report their error codes
56d3cb9 [Brennon York] changed test back and commented out import to break compile
b37328c [Brennon York] fixed typo and added default return is no error block was found in the environment
7613558 [Brennon York] updated to return the proper env variable for return codes
a5bd445 [Brennon York] reverted license, changed test in shuffle to fail
803143a [Brennon York] removed license file for SparkContext
b0b2604 [Brennon York] comment out import to see if build fails and returns properly
83e80ef [Brennon York] attempt at better python output when called from bash
c095fa6 [Brennon York] removed another wait() call
26e18e8 [Brennon York] removed unnecessary wait()
07210a9 [Brennon York] minor doc string change for java version with namedtuple update
ec03bf3 [Brennon York] added namedtuple for java version to add readability
2cb413b [Brennon York] upcased global variables, changes various calling methods from check_output to check_call
639f1e9 [Brennon York] updated with pep8 rules, fixed minor bugs, added run-tests file in bash to call the run-tests.py script
3c53a1a [Brennon York] uncomment the scala tests :)
6126c4f [Brennon York] refactored run-tests into python
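One theme of the refactor is running only the test suites that the changed files actually touch (SQL tests only on SQL changes, streaming tests only on streaming changes, etc.). The decision can be sketched in Python roughly as follows; the helper name and return shape are illustrative, not the actual run-tests.py API:

```python
# Illustrative sketch (not the actual run-tests.py): decide which suites to
# run from a list of changed file paths, mirroring the old bash logic that
# ran the Hive test suite only when SQL sources changed.
SQL_PREFIXES = ("sql/", "bin/spark-sql", "sbin/start-thriftserver.sh")

def identify_test_goals(changed_files):
    """Return (run_sql_tests, sql_tests_only) for a list of changed paths."""
    sql_files = [f for f in changed_files if f.startswith(SQL_PREFIXES)]
    non_sql_files = [f for f in changed_files if not f.startswith(SQL_PREFIXES)]
    run_sql_tests = bool(sql_files)
    # Run only the SQL suites when nothing outside SQL changed.
    sql_tests_only = run_sql_tests and not non_sql_files
    return run_sql_tests, sql_tests_only
```

In the bash version below, the equivalent logic greps `git diff --name-only "$target_branch"` for the same three path prefixes.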
1 parent 6765ef9 commit 50a0496

File tree

5 files changed: +546 −224 lines

dev/run-tests

Lines changed: 1 addition & 218 deletions
@@ -17,224 +17,7 @@
 # limitations under the License.
 #

-# Go to the Spark project root directory
 FWDIR="$(cd "`dirname $0`"/..; pwd)"
 cd "$FWDIR"

-# Clean up work directory and caches
-rm -rf ./work
-rm -rf ~/.ivy2/local/org.apache.spark
-rm -rf ~/.ivy2/cache/org.apache.spark
-
-source "$FWDIR/dev/run-tests-codes.sh"
-
-CURRENT_BLOCK=$BLOCK_GENERAL
-
-function handle_error () {
-  echo "[error] Got a return code of $? on line $1 of the run-tests script."
-  exit $CURRENT_BLOCK
-}
-
-
-# Build against the right version of Hadoop.
-{
-  if [ -n "$AMPLAB_JENKINS_BUILD_PROFILE" ]; then
-    if [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop1.0" ]; then
-      export SBT_MAVEN_PROFILES_ARGS="-Phadoop-1 -Dhadoop.version=1.2.1"
-    elif [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop2.0" ]; then
-      export SBT_MAVEN_PROFILES_ARGS="-Phadoop-1 -Dhadoop.version=2.0.0-mr1-cdh4.1.1"
-    elif [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop2.2" ]; then
-      export SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.2"
-    elif [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop2.3" ]; then
-      export SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0"
-    fi
-  fi
-
-  if [ -z "$SBT_MAVEN_PROFILES_ARGS" ]; then
-    export SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0"
-  fi
-}
-
-export SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Pkinesis-asl"
-
-# Determine Java path and version.
-{
-  if test -x "$JAVA_HOME/bin/java"; then
-    declare java_cmd="$JAVA_HOME/bin/java"
-  else
-    declare java_cmd=java
-  fi
-
-  # We can't use sed -r -e due to OS X / BSD compatibility; hence, all the parentheses.
-  JAVA_VERSION=$(
-    $java_cmd -version 2>&1 \
-    | grep -e "^java version" --max-count=1 \
-    | sed "s/java version \"\(.*\)\.\(.*\)\.\(.*\)\"/\1\2/"
-  )
-
-  if [ "$JAVA_VERSION" -lt 18 ]; then
-    echo "[warn] Java 8 tests will not run because JDK version is < 1.8."
-  fi
-}
-
-# Only run Hive tests if there are SQL changes.
-# Partial solution for SPARK-1455.
-if [ -n "$AMPLAB_JENKINS" ]; then
-  target_branch="$ghprbTargetBranch"
-  git fetch origin "$target_branch":"$target_branch"
-
-  # AMP_JENKINS_PRB indicates if the current build is a pull request build.
-  if [ -n "$AMP_JENKINS_PRB" ]; then
-    # It is a pull request build.
-    sql_diffs=$(
-      git diff --name-only "$target_branch" \
-      | grep -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh"
-    )
-
-    non_sql_diffs=$(
-      git diff --name-only "$target_branch" \
-      | grep -v -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh"
-    )
-
-    if [ -n "$sql_diffs" ]; then
-      echo "[info] Detected changes in SQL. Will run Hive test suite."
-      _RUN_SQL_TESTS=true
-
-      if [ -z "$non_sql_diffs" ]; then
-        echo "[info] Detected no changes except in SQL. Will only run SQL tests."
-        _SQL_TESTS_ONLY=true
-      fi
-    fi
-  else
-    # It is a regular build. We should run SQL tests.
-    _RUN_SQL_TESTS=true
-  fi
-fi
-
-set -o pipefail
-trap 'handle_error $LINENO' ERR
-
-echo ""
-echo "========================================================================="
-echo "Running Apache RAT checks"
-echo "========================================================================="
-
-CURRENT_BLOCK=$BLOCK_RAT
-
-./dev/check-license
-
-echo ""
-echo "========================================================================="
-echo "Running Scala style checks"
-echo "========================================================================="
-
-CURRENT_BLOCK=$BLOCK_SCALA_STYLE
-
-./dev/lint-scala
-
-echo ""
-echo "========================================================================="
-echo "Running Python style checks"
-echo "========================================================================="
-
-CURRENT_BLOCK=$BLOCK_PYTHON_STYLE
-
-./dev/lint-python
-
-echo ""
-echo "========================================================================="
-echo "Building Spark"
-echo "========================================================================="
-
-CURRENT_BLOCK=$BLOCK_BUILD
-
-{
-  HIVE_BUILD_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-thriftserver"
-  echo "[info] Compile with Hive 0.13.1"
-  [ -d "lib_managed" ] && rm -rf lib_managed
-  echo "[info] Building Spark with these arguments: $HIVE_BUILD_ARGS"
-
-  if [ "${AMPLAB_JENKINS_BUILD_TOOL}" == "maven" ]; then
-    build/mvn $HIVE_BUILD_ARGS clean package -DskipTests
-  else
-    echo -e "q\n" \
-      | build/sbt $HIVE_BUILD_ARGS package assembly/assembly streaming-kafka-assembly/assembly \
-      | grep -v -e "info.*Resolving" -e "warn.*Merging" -e "info.*Including"
-  fi
-}
-
-echo ""
-echo "========================================================================="
-echo "Detecting binary incompatibilities with MiMa"
-echo "========================================================================="
-
-CURRENT_BLOCK=$BLOCK_MIMA
-
-./dev/mima
-
-echo ""
-echo "========================================================================="
-echo "Running Spark unit tests"
-echo "========================================================================="
-
-CURRENT_BLOCK=$BLOCK_SPARK_UNIT_TESTS
-
-{
-  # If the Spark SQL tests are enabled, run the tests with the Hive profiles enabled.
-  # This must be a single argument, as it is.
-  if [ -n "$_RUN_SQL_TESTS" ]; then
-    SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-thriftserver"
-  fi
-
-  if [ -n "$_SQL_TESTS_ONLY" ]; then
-    # This must be an array of individual arguments. Otherwise, having one long string
-    # will be interpreted as a single test, which doesn't work.
-    SBT_MAVEN_TEST_ARGS=("catalyst/test" "sql/test" "hive/test" "hive-thriftserver/test" "mllib/test")
-  else
-    SBT_MAVEN_TEST_ARGS=("test")
-  fi
-
-  echo "[info] Running Spark tests with these arguments: $SBT_MAVEN_PROFILES_ARGS ${SBT_MAVEN_TEST_ARGS[@]}"
-
-  if [ "${AMPLAB_JENKINS_BUILD_TOOL}" == "maven" ]; then
-    build/mvn test $SBT_MAVEN_PROFILES_ARGS --fail-at-end
-  else
-    # NOTE: echo "q" is needed because sbt on encountering a build file with failure
-    # (either resolution or compilation) prompts the user for input either q, r, etc
-    # to quit or retry. This echo is there to make it not block.
-    # NOTE: Do not quote $SBT_MAVEN_PROFILES_ARGS or else it will be interpreted as a
-    # single argument!
-    # "${SBT_MAVEN_TEST_ARGS[@]}" is cool because it's an array.
-    # QUESTION: Why doesn't 'yes "q"' work?
-    # QUESTION: Why doesn't 'grep -v -e "^\[info\] Resolving"' work?
-    echo -e "q\n" \
-      | build/sbt $SBT_MAVEN_PROFILES_ARGS "${SBT_MAVEN_TEST_ARGS[@]}" \
-      | grep -v -e "info.*Resolving" -e "warn.*Merging" -e "info.*Including"
-  fi
-}
-
-echo ""
-echo "========================================================================="
-echo "Running PySpark tests"
-echo "========================================================================="
-
-CURRENT_BLOCK=$BLOCK_PYSPARK_UNIT_TESTS
-
-# add path for python 3 in jenkins
-export PATH="${PATH}:/home/anaconda/envs/py3k/bin"
-./python/run-tests
-
-echo ""
-echo "========================================================================="
-echo "Running SparkR tests"
-echo "========================================================================="
-
-CURRENT_BLOCK=$BLOCK_SPARKR_UNIT_TESTS
-
-if [ $(command -v R) ]; then
-  ./R/install-dev.sh
-  ./R/run-tests.sh
-else
-  echo "Ignoring SparkR tests as R was not found in PATH"
-fi
+exec python -u ./dev/run-tests.py
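Among the deleted bash logic is the grep/sed pipeline that turns `java -version` output into a comparable number. The commit log mentions "added namedtuple for java version to add readability"; a minimal Python sketch of that idea, assuming illustrative names and a regex that may differ from the actual run-tests.py:

```python
# Hedged sketch: parse `java -version` output into a namedtuple, replacing
# the bash grep/sed pipeline. Names and regex are illustrative, not
# necessarily those used in dev/run-tests.py.
import re
from collections import namedtuple

JavaVersion = namedtuple("JavaVersion", ["major", "minor", "patch"])

def parse_java_version(version_output):
    """Parse the first `java version "1.8.0_45"`-style line."""
    match = re.search(r'version "(\d+)\.(\d+)\.([\d_]+)"', version_output)
    if match is None:
        raise ValueError("could not find a java version string")
    major, minor, patch = match.groups()
    return JavaVersion(int(major), int(minor), patch)
```

With a tuple of integers, the bash comparison `[ "$JAVA_VERSION" -lt 18 ]` becomes a readable `(version.major, version.minor) < (1, 8)` check.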

dev/run-tests-codes.sh

Lines changed: 6 additions & 5 deletions
@@ -21,8 +21,9 @@ readonly BLOCK_GENERAL=10
 readonly BLOCK_RAT=11
 readonly BLOCK_SCALA_STYLE=12
 readonly BLOCK_PYTHON_STYLE=13
-readonly BLOCK_BUILD=14
-readonly BLOCK_MIMA=15
-readonly BLOCK_SPARK_UNIT_TESTS=16
-readonly BLOCK_PYSPARK_UNIT_TESTS=17
-readonly BLOCK_SPARKR_UNIT_TESTS=18
+readonly BLOCK_DOCUMENTATION=14
+readonly BLOCK_BUILD=15
+readonly BLOCK_MIMA=16
+readonly BLOCK_SPARK_UNIT_TESTS=17
+readonly BLOCK_PYSPARK_UNIT_TESTS=18
+readonly BLOCK_SPARKR_UNIT_TESTS=19
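The diff inserts a new documentation block at code 14 and shifts the later blocks up by one; the commit log's "made error codes a global" suggests the Python side carries the same table. A sketch of such a mirror (the dict itself is hypothetical, but its values are taken from the shell constants above):

```python
# Sketch: the shell block codes mirrored as a Python global, so run-tests.py
# can exit with the same code the Jenkins wrapper expects. The dict name is
# illustrative; the values come from dev/run-tests-codes.sh after this commit.
ERROR_CODES = {
    "BLOCK_GENERAL": 10,
    "BLOCK_RAT": 11,
    "BLOCK_SCALA_STYLE": 12,
    "BLOCK_PYTHON_STYLE": 13,
    "BLOCK_DOCUMENTATION": 14,
    "BLOCK_BUILD": 15,
    "BLOCK_MIMA": 16,
    "BLOCK_SPARK_UNIT_TESTS": 17,
    "BLOCK_PYSPARK_UNIT_TESTS": 18,
    "BLOCK_SPARKR_UNIT_TESTS": 19,
}
```

Keeping the two tables byte-for-byte in sync matters because run-tests-jenkins maps the process exit code back to a human-readable failure message.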

dev/run-tests-jenkins

Lines changed: 2 additions & 0 deletions
@@ -210,6 +210,8 @@ done
     failing_test="Scala style tests"
   elif [ "$test_result" -eq "$BLOCK_PYTHON_STYLE" ]; then
     failing_test="Python style tests"
+  elif [ "$test_result" -eq "$BLOCK_DOCUMENTATION" ]; then
+    failing_test="to generate documentation"
   elif [ "$test_result" -eq "$BLOCK_BUILD" ]; then
     failing_test="to build"
   elif [ "$test_result" -eq "$BLOCK_MIMA" ]; then
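The growing elif chain above is a plain exit-code-to-message lookup, which in Python collapses to a dict. A sketch using only the codes and messages visible in this diff (the function name is illustrative, and unknown codes simply map to None here):

```python
# Sketch: the run-tests-jenkins elif chain as a lookup table. Codes follow
# dev/run-tests-codes.sh after this commit; messages follow the diff above.
FAILING_TESTS = {
    12: "Scala style tests",          # BLOCK_SCALA_STYLE
    13: "Python style tests",         # BLOCK_PYTHON_STYLE
    14: "to generate documentation",  # BLOCK_DOCUMENTATION (new)
    15: "to build",                   # BLOCK_BUILD
}

def failing_test_for(exit_code):
    """Return the failure description for a block exit code, or None."""
    return FAILING_TESTS.get(exit_code)
```

A table like this only needs one new line per future block, instead of a new two-line elif branch.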
