[SPARK-28894][SQL][TESTS] Add a clue to make it easier to debug via Jenkins's test results

HyukjinKwon · dongjoon-hyun · commit 7cc0f0e9a795 · 2019-08-30T15:10:40.000-07:00
### What changes were proposed in this pull request? See https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109834/testReport/junit/org.apache.spark.sql/SQLQueryTestSuite/ ![Screen Shot 2019-08-28 at 4 08 58 PM](https://user-images.githubusercontent.com/6477701/63833484-2a23ea00-c9ae-11e9-91a1-0859cb183fea.png) ```xml <?xml version="1.0" encoding="UTF-8"?> <testsuite hostname="C02Y52ZLJGH5" name="org.apache.spark.sql.SQLQueryTestSuite" tests="3" errors="0" failures="0" skipped="0" time="14.475"> ... <testcase classname="org.apache.spark.sql.SQLQueryTestSuite" name="sql - Scala UDF" time="6.703"> </testcase> <testcase classname="org.apache.spark.sql.SQLQueryTestSuite" name="sql - Regular Python UDF" time="4.442"> </testcase> <testcase classname="org.apache.spark.sql.SQLQueryTestSuite" name="sql - Scalar Pandas UDF" time="3.33"> </testcase> <system-out/> <system-err/> </testsuite> ``` Root cause seems a bug in SBT - it truncates the test name based on the last dot. sbt/sbt#2949 https://github.com/sbt/sbt/blob/v0.13.18/testing/src/main/scala/sbt/JUnitXmlTestsListener.scala#L71-L79 I tried to find a better way but couldn't find. Therefore, this PR proposes a workaround by appending the test file name into the assert log: ```diff [info] - inner-join.sql *** FAILED *** (4 seconds, 306 milliseconds) + [info] inner-join.sql [info] Expected "1 a [info] 1 a [info] 1 b [info] 1[]", but got "1 a [info] 1 a [info] 1 b [info] 1[ b]" Result did not match for query #6 [info] SELECT tb.* FROM ta INNER JOIN tb ON ta.a = tb.a AND ta.tag = tb.tag (SQLQueryTestSuite.scala:377) [info] org.scalatest.exceptions.TestFailedException: [info] at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:528) ``` It will at least prevent us to search full logs to identify which test file is failed by clicking filed test. Note that this PR does not fully fix the issue but only fix the logs on its failed tests. ### Why are the changes needed? To debug Jenkins logs easier. Otherwise, we should open full logs and search which test was failed. ### Does this PR introduce any user-facing change? It will print out the file name of failed tests in Jenkins' test reports. ### How was this patch tested? Manually tested but Jenkins tests are required in this PR. Now it at least shows which file it is: ![Screen Shot 2019-08-30 at 10 16 32 PM](https://user-images.githubusercontent.com/6477701/64023705-de22a200-cb73-11e9-8806-2e98ad35adef.png) Closes #25630 from HyukjinKwon/SPARK-28894-1. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala
@@ -342,39 +342,44 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession {
       stringToFile(resultFile, goldenOutput)
     }
 
-    // Read back the golden file.
-    val expectedOutputs: Seq[QueryOutput] = {
-      val goldenOutput = fileToString(new File(testCase.resultFile))
-      val segments = goldenOutput.split("-- !query.+\n")
-
-      // each query has 3 segments, plus the header
-      assert(segments.size == outputs.size * 3 + 1,
-        s"Expected ${outputs.size * 3 + 1} blocks in result file but got ${segments.size}. " +
-        s"Try regenerate the result files.")
-      Seq.tabulate(outputs.size) { i =>
-        QueryOutput(
-          sql = segments(i * 3 + 1).trim,
-          schema = segments(i * 3 + 2).trim,
-          output = segments(i * 3 + 3).replaceAll("\\s+$", "")
-        )
+    // This is a temporary workaround for SPARK-28894. The test names are truncated after
+    // the last dot due to a bug in SBT. This makes easier to debug via Jenkins test result
+    // report. See SPARK-28894.
+    withClue(s"${testCase.name}${System.lineSeparator()}") {
+      // Read back the golden file.
+      val expectedOutputs: Seq[QueryOutput] = {
+        val goldenOutput = fileToString(new File(testCase.resultFile))
+        val segments = goldenOutput.split("-- !query.+\n")
+
+        // each query has 3 segments, plus the header
+        assert(segments.size == outputs.size * 3 + 1,
+          s"Expected ${outputs.size * 3 + 1} blocks in result file but got ${segments.size}. " +
+            s"Try regenerate the result files.")
+        Seq.tabulate(outputs.size) { i =>
+          QueryOutput(
+            sql = segments(i * 3 + 1).trim,
+            schema = segments(i * 3 + 2).trim,
+            output = segments(i * 3 + 3).replaceAll("\\s+$", "")
+          )
+        }
       }
-    }
 
-    // Compare results.
-    assertResult(expectedOutputs.size, s"Number of queries should be ${expectedOutputs.size}") {
-      outputs.size
-    }
-
-    outputs.zip(expectedOutputs).zipWithIndex.foreach { case ((output, expected), i) =>
-      assertResult(expected.sql, s"SQL query did not match for query #$i\n${expected.sql}") {
-        output.sql
-      }
-      assertResult(expected.schema,
-        s"Schema did not match for query #$i\n${expected.sql}: $output") {
-        output.schema
+      // Compare results.
+      assertResult(expectedOutputs.size, s"Number of queries should be ${expectedOutputs.size}") {
+        outputs.size
       }
-      assertResult(expected.output, s"Result did not match for query #$i\n${expected.sql}") {
-        output.output
+
+      outputs.zip(expectedOutputs).zipWithIndex.foreach { case ((output, expected), i) =>
+        assertResult(expected.sql, s"SQL query did not match for query #$i\n${expected.sql}") {
+          output.sql
+        }
+        assertResult(expected.schema,
+          s"Schema did not match for query #$i\n${expected.sql}: $output") {
+          output.schema
+        }
+        assertResult(expected.output, s"Result did not match for query #$i\n${expected.sql}") {
+          output.output
+        }
       }
     }
   }