[SPARK-18016][SQL][CATALYST] Code Generation: Constant Pool Limit - Class Splitting #18075

bdrillard · 2017-05-23T20:41:33Z

What changes were proposed in this pull request?

This pull-request exclusively includes the class splitting feature described in #16648. When code for a given class would grow beyond 1600k bytes, a private, nested sub-class is generated into which subsequent functions are inlined. Additional sub-classes are generated as the code threshold is met subsequent times. This code includes 3 changes:

1. Includes helper maps, lists, and functions for keeping track of sub-classes during code generation (included in the CodeGenerator class). These helper functions allow nested classes and split functions to be initialized/declared/inlined to the appropriate locations in the various projection classes.
2. Changes addNewFunction to return a string to support instances where a split function is inlined to a nested class and not the outer class (and so must be invoked using the class-qualified name). Uses of addNewFunction throughout the codebase are modified so that the returned name is properly used.
3. Removes instances of the this keyword when used on data inside generated classes. All state declared in the outer class is by default global and accessible to the nested classes. However, if a reference to global state in a nested class is prefixed with the this keyword, it would attempt to reference state belonging to the nested class (which would not exist), rather than the correct variable belonging to the outer class.
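
The bookkeeping described in these three changes can be sketched roughly as follows. This is a simplified illustration, not the code in the PR: `SplitContext`, the `threshold` parameter, and the `NestedClassN` / `nestedClassInstanceN` naming are assumptions made for the example, and the threshold is tiny so that splitting is easy to observe.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for the codegen context.
class SplitContext(threshold: Int) {
  private val outerClassName = "OuterClass"

  // (class name, instance name) pairs, most recently created class first.
  private var classes: List[(String, String)] = List(outerClassName -> "")
  // Accumulated code size per class.
  private val classSize = mutable.Map[String, Int](outerClassName -> 0)
  // Function name -> function code, per class.
  private val classFunctions = mutable.Map[String, mutable.Map[String, String]](
    outerClassName -> mutable.Map.empty[String, String])

  private def addClass(className: String, classInstance: String): Unit = {
    classes = (className -> classInstance) :: classes
    classSize(className) = 0
    classFunctions(className) = mutable.Map.empty[String, String]
  }

  /** Registers a function and returns the name callers must use to invoke it. */
  def addNewFunction(
      funcName: String,
      funcCode: String,
      inlineToOuterClass: Boolean = false): String = {
    val (className, classInstance) =
      if (inlineToOuterClass) {
        outerClassName -> ""
      } else if (classSize(classes.head._1) > threshold) {
        // The current class is full: open a new nested class for this function.
        val id = classes.length
        val name = s"NestedClass$id"
        val instance = s"nestedClassInstance$id"
        addClass(name, instance)
        name -> instance
      } else {
        classes.head
      }

    classSize(className) += funcCode.length
    classFunctions(className) += funcName -> funcCode

    // Functions in the outer class are called by bare name; others via the instance.
    if (className == outerClassName) funcName else s"$classInstance.$funcName"
  }
}

object SplitContextDemo {
  def run(): Seq[String] = {
    val ctx = new SplitContext(threshold = 20)
    Seq(
      ctx.addNewFunction("f1", "private void f1() { /* ... */ }"),
      ctx.addNewFunction("f2", "private void f2() { /* ... */ }"),
      ctx.addNewFunction(
        "stopEarly",
        "protected boolean stopEarly() { return true; }",
        inlineToOuterClass = true))
  }
}
```

In this sketch the first function lands in the outer class, the second overflows into a nested class and comes back with an instance-qualified name, and the third is forced into the outer class regardless of size.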

How was this patch tested?

Added a test case to the GeneratedProjectionSuite that increases the number of columns tested in various projections to a threshold that would previously have triggered a JaninoRuntimeException for the Constant Pool.

Note: This PR does not address the second Constant Pool issue with code generation (also mentioned in #16648): excess global mutable state. A second PR may be opened to resolve that issue.

kiszk · 2017-05-24T05:05:52Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

@@ -629,7 +736,9 @@ class CodegenContext {

  /**
   * Splits the generated code of expressions into multiple functions, because function has
-   * 64kb code size limit in JVM
+   * 64kb code size limit in JVM. If the class the function is to be inlined to would grow beyond
+   * 1600kb, a private, netsted sub-class is declared, and the function is inlined to it, because


nit: netsted -> nested?

Fixed: 90a907a#diff-8bcc5aea39c73d4bf38aef6f6951d42cL740

kiszk · 2017-05-24T05:32:36Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

@@ -792,7 +887,18 @@ class CodegenContext {
      addMutableState(javaType(expr.dataType), value,
        s"$value = ${defaultValue(expr.dataType)};")

-      subexprFunctions += s"$fnName($INPUT_ROW);"
+      // Generate the code for this expression tree and wrap it in a function.


Is there any reason to move this code block from the original place to here?

Not for the scope of this class splitting change. The previous pull-request that also made addMutableState return an accessor reference required the order to change so that correct variables could be declared prior to the code of fn, but for this pull-request, only ensuring that subexprFunctions gets the return of addNewFunction is necessary. I'll change the ordering back.

Fixed: 90a907a#diff-8bcc5aea39c73d4bf38aef6f6951d42cR872
Note: a rebase with the two commits squashed will show the subexprFunctions as the only changed line, which is what we want.

kiszk · 2017-05-24T05:33:43Z

Thank you. Absolutely, it is easier to review this change.

kiszk · 2017-05-24T06:00:40Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+   */
+  private[sql] def declareAddedFunctions(): String = {
+    classFunctions("OuterClass").map {
+      case (funcName, funcCode) => funcCode


nit: funcName -> _

Fixed: 90a907a#diff-8bcc5aea39c73d4bf38aef6f6951d42cL322

kiszk · 2017-05-24T06:01:00Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+        } else {
+          val code = functions.map {
+            case (_, funcCode) =>
+              s"$funcCode"


nit: s"$funcCode" -> funcCode

Fixed: 90a907a#diff-8bcc5aea39c73d4bf38aef6f6951d42cL336

kiszk · 2017-05-24T16:26:26Z

sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala

@@ -299,6 +297,9 @@ case class SampleExec(
          | }
         """.stripMargin.trim)

+      ctx.addMutableState(s"$samplerClass<UnsafeRow>", sampler,


Can we put this block at the original place?

This change should stay the way it is. Notice that the initialization code for the addMutableState call on the line just below is the same code passed to the addNewFunction call. This means that when we create the new function, we may get back a class-qualified function name (which here we store in initSamplerFuncName), so the call to addMutableState must come after the addNewFunction call.
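
The ordering dependency can be illustrated with a small sketch. `MiniContext` and its `splitToNested` flag are hypothetical and exist only to show why the state's initializer has to be built from the name that `addNewFunction` returns; the real context decides on splitting by code size.

```scala
import scala.collection.mutable

// Hypothetical miniature context: names and behavior are illustrative only.
class MiniContext(splitToNested: Boolean) {
  val functions = mutable.LinkedHashMap.empty[String, String]
  val initCode = mutable.ListBuffer.empty[String]

  // Returns the name to call the function by, which may be instance-qualified.
  def addNewFunction(funcName: String, funcCode: String): String = {
    functions(funcName) = funcCode
    if (splitToNested) s"nestedClassInstance1.$funcName" else funcName
  }

  // Records the initialization code for a piece of mutable state.
  def addMutableState(javaType: String, name: String, init: String): Unit = {
    initCode += init
  }
}

object OrderingDemo {
  def initCodeFor(splitToNested: Boolean): String = {
    val ctx = new MiniContext(splitToNested)
    // 1. Register the function first to learn the name callers must use.
    val initSamplerFuncName =
      ctx.addNewFunction("initSampler", "private void initSampler() { /* ... */ }")
    // 2. Only then register the state whose initializer invokes that function.
    ctx.addMutableState("Sampler", "sampler", s"$initSamplerFuncName();")
    ctx.initCode.mkString
  }
}
```

If the two calls were swapped, the initializer would have to guess the function name before knowing whether it had been placed in a nested class.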

Ah, got it. I overlooked the new dependency regarding initSamplerFuncName.

kiszk · 2017-05-24T16:26:41Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+   */
+  private[sql] def initNestedClasses(): String = {
+    // Nested, private sub-classes have no mutable state (though they do reference the outer class'
+    // mutable state), so we declare and initialize them inline ot the OuterClass


nit: ot -> at

Fixed: 78bccda#diff-8bcc5aea39c73d4bf38aef6f6951d42cL306

kiszk · 2017-05-24T17:49:42Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+    val name = classInfo._1
+
+    classSize.update(name, classSize(name) + funcCode.length)
+    classFunctions.update(name, classFunctions(name) += funcCode)


I am not 100% sure, but is " += " required? How about " = "?

+= will be necessary since classFunctions(name) will return the ListBuffer[String] containing all functions belonging to the given class name from the classFunctions map, and we want to append the new funcCode to that buffer. Also, classFunctions.update is going to expect a ListBuffer[String] as its second argument, but the return type of assignment = is just Unit. += will append the element and then return the modified buffer (which is the behavior we want as per the API doc for ListBuffer).

Sorry for confusing you, I made a mistake in my comment. I wanted to say " + " instead of " = ".

Ah, sure. The story seems pretty much the same, we still want the append operation that also returns the reference to the modified buffer, and that's given by +=. Also, it doesn't look like + is defined as an operation on a ListBuffer[A] and an element of type A (see ListBuffer).

Thank you for your clarification.
In that case, is classFunctions(name) += funcCode enough instead of calling classFunctions.update?

Yeah, good point, since we're using a mutable buffer, we can update the referenced object directly even if it's contained inside the map. Since += explicitly returns the reference to the modified buffer, it would probably be most straightforward to use

classFunctions(name).append(funcCode)

since append has Unit return type, and we don't need any results from appending the code to the class's buffer of function code.

Here's a commit with that change if you think it checks out: c225f3a#diff-8bcc5aea39c73d4bf38aef6f6951d42cL290
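
A minimal sketch of the behavior discussed in this thread, using only the standard `scala.collection.mutable` API. The map contents are made up for the example.

```scala
import scala.collection.mutable

object BufferAppendDemo {
  def run(): (List[String], Boolean) = {
    val classFunctions =
      mutable.Map("OuterClass" -> mutable.ListBuffer.empty[String])

    val buffer = classFunctions("OuterClass")
    // `+=` appends in place and returns the same buffer instance.
    val returned = (buffer += "void f1() {}")
    // The map holds a reference to that buffer, so no `update` call is needed.
    classFunctions("OuterClass") += "void f2() {}"

    (classFunctions("OuterClass").toList, returned eq buffer)
  }
}
```

Both appends are visible through the map, and the value returned by `+=` is the original buffer rather than a copy.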

kiszk · 2017-05-25T15:35:11Z

Thanks, sounds good to me for now.
cc @ueshin

ueshin

Thank you for working on this!

I added some comments and I found some other places we might need to modify:

It seems like we need to remove this. at objects.scala#L984.
I guess also we need to add initializations of inner classes at GenerateColumnAccessor.scala#L225.

ueshin · 2017-05-29T04:43:26Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+  def addNewFunction(
+    funcName: String,
+    funcCode: String,
+    inlineToOuterClass: Boolean = false): String = {


nit: indent. Add 2 more spaces.

Fixed: 493113c#diff-8bcc5aea39c73d4bf38aef6f6951d42cL271

ueshin · 2017-05-29T04:57:43Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+    // limit, 65,536. We cannot know how many constants will be inserted for a class, so we use a
+    // threshold of 1600k bytes to determine when a function should be inlined to a private, nested
+    // sub-class.
+    val classInfo = if (inlineToOuterClass) {


nit: We can use val (className, classInstance) = if ... here.

Fixed (with some refactoring based on those variables being available in the scope earlier): 493113c#diff-8bcc5aea39c73d4bf38aef6f6951d42cL278

ueshin · 2017-05-29T05:11:39Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+
+  // Adds a new class. Requires the class' name, and its instance name.
+  private def addClass(className: String, classInstance: String): Unit = {
+    classes.prepend(Tuple2(className, classInstance))


nit: How about classes.prepend((className, classInstance)) or classes.prepend(className -> classInstance)?

Fixed: 493113c#diff-8bcc5aea39c73d4bf38aef6f6951d42cL250

ueshin · 2017-05-29T05:38:23Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala

    })

    // Create the collection.
    val wrapperClass = classOf[mutable.WrappedArray[_]].getName
    ctx.addMutableState(
      s"$wrapperClass<InternalRow>",
      ev.value,
-      s"this.${ev.value} = $wrapperClass$$.MODULE$$.make(this.$rowData);")
+      s"this.${ev.value} = $wrapperClass$$.MODULE$$.make($rowData);")


Should remove one more this. here?

Good catch, fixed: 493113c#diff-16493d6958b6daaf4a24dd7b780ba4bcL201
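
The generated code is Java, but the scoping issue behind the `this.` removals can be shown with a Scala analog, since `this` inside an inner class refers to the inner instance in both languages. The class and member names below are illustrative only.

```scala
// Scala analog of the generated layout: state lives on the outer class,
// while split functions may live in a nested (inner) class.
class OuterClass {
  var value: Int = 0 // "global" mutable state declared on the outer class

  class NestedClass {
    // An unqualified reference resolves outward to OuterClass.value.
    def apply(): Unit = { value = 42 }

    // By contrast, `this.value = 42` would not compile here, because `this`
    // refers to the NestedClass instance, which has no member named `value`.
  }

  val nestedClassInstance = new NestedClass
}

object ThisKeywordDemo {
  def run(): Int = {
    val outer = new OuterClass
    outer.nestedClassInstance.apply()
    outer.value
  }
}
```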

ueshin · 2017-05-29T06:24:02Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+    val name = classInfo._1
+
+    classSize.update(name, classSize(name) + funcCode.length)
+    classFunctions(name).append(funcCode)


How about:

classSize(name) += funcCode.length
classFunctions(name) += funcCode

I suppose that's more concise/readable given the underlying mutable data structures: 493113c#diff-8bcc5aea39c73d4bf38aef6f6951d42cL292
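
For the `Int`-valued map, the concise form works for a different reason than it does for the buffer: `Int` has no `+=` method, so the compiler rewrites the statement into a call to `update`. A small sketch, with made-up contents:

```scala
import scala.collection.mutable

object ClassSizeDemo {
  val funcCode = "void f() {}"

  def run(): Int = {
    val classSize = mutable.Map[String, Int]("OuterClass" -> 0)

    // Rewritten by the compiler to:
    // classSize.update("OuterClass", classSize("OuterClass") + funcCode.length)
    classSize("OuterClass") += funcCode.length

    classSize("OuterClass")
  }
}
```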

bdrillard · 2017-05-30T18:35:36Z

@ueshin As for the remaining this in objects.scala, 493113c#diff-e436c96ea839dfe446837ab2a3531f93L984
and the need for an additional nested classes declaration in GenerateColumnAccessor.scala, 493113c#diff-58a69e526de8182bcb4c840a8cb29e2dR225

ueshin · 2017-05-31T00:23:18Z

ok to test

SparkQA · 2017-05-31T02:29:31Z

Test build #77564 has finished for PR 18075 at commit 493113c.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-05-31T19:48:13Z

Test build #77597 has finished for PR 18075 at commit 7fe5e4a.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

bdrillard · 2017-05-31T21:32:40Z

The earlier failure occurred because the stopEarly() function could be registered to the OuterClass twice. Using a Map from function name to function code fixes the issue: whenever a function of the same name is added more than once, the newer code replaces the older value. The tests pass after the change.

ueshin · 2017-06-01T00:45:52Z

LGTM for now, cc @cloud-fan.

fbertsch · 2017-06-12T16:43:54Z

We're really looking forward to this change! This bug is limiting a lot of the work we'd like to do with Spark. Any idea who we can ping to move this along?

kiszk · 2017-06-13T05:18:57Z

@cloud-fan do you have time to look at this?

cloud-fan · 2017-06-13T23:55:08Z

reviewing

cloud-fan · 2017-06-14T01:22:01Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+
+  // A map holding the current size in bytes of each class to be generated.
+  private val classSize: mutable.Map[String, Int] =
+    mutable.Map[String, Int]("OuterClass" -> 0)


we can create a variable for this "OuterClass" instead of hardcoding it in many places.

Fixed: 678b4ad#diff-8bcc5aea39c73d4bf38aef6f6951d42cR225

cloud-fan · 2017-06-14T01:28:14Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+    classSize(className) += funcCode.length
+    classFunctions(className) += funcName -> funcCode
+
+    if (className.equals("OuterClass")) {


nit: className == "OuterClass"

Fixed: 678b4ad#diff-8bcc5aea39c73d4bf38aef6f6951d42cR296

cloud-fan · 2017-06-14T01:31:20Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+    mutable.Map[String, Int]("OuterClass" -> 0)
+
+  // Nested maps holding function names and their code belonging to each class.
+  private val classFunctions: mutable.Map[String, mutable.Map[String, String]] =


seems we only need a map from class name to function codes? i.e. mutable.Map[String, mutable.ListBuffer[String]]

I had originally thought so, but it turns out that there's at least one instance where the code for a given function name is updated during the code-generation process. The generated stopEarly function can actually be inserted twice, each time returning a different stopEarly variable. With a list, two functions with the same signature would end up in the class, causing a compile error. So we need to use a map to make sure the implementation gets updated for a given function when necessary.

Note also that the old implementation of addedFunctions was a map also.
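
The difference between the two data structures can be sketched as follows. The function bodies and the `stopEarly1` / `stopEarly2` variable names are invented for the example.

```scala
import scala.collection.mutable

object DuplicateFunctionDemo {
  val first = "protected boolean stopEarly() { return stopEarly1; }"
  val second = "protected boolean stopEarly() { return stopEarly2; }"

  // A buffer keeps both definitions, which would yield two methods
  // with the same signature in the generated class.
  def withBuffer(): List[String] = {
    val functions = mutable.ListBuffer.empty[String]
    functions += first
    functions += second
    functions.toList
  }

  // A map keyed by function name keeps only the latest definition.
  def withMap(): List[String] = {
    val functions = mutable.Map.empty[String, String]
    functions += "stopEarly" -> first
    functions += "stopEarly" -> second
    functions.values.toList
  }
}
```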

cloud-fan · 2017-06-14T01:32:52Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

@@ -629,7 +730,9 @@ class CodegenContext {

  /**
   * Splits the generated code of expressions into multiple functions, because function has
-   * 64kb code size limit in JVM
+   * 64kb code size limit in JVM. If the class the function is to be inlined to would grow beyond


If the class the function -> If the class with the function?

I think this is the grammatically correct/hopefully more clear form of that same docstring: 678b4ad#diff-8bcc5aea39c73d4bf38aef6f6951d42cR727

cloud-fan · 2017-06-14T01:36:22Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+  private[sql] def initNestedClasses(): String = {
+    // Nested, private sub-classes have no mutable state (though they do reference the outer class'
+    // mutable state), so we declare and initialize them inline to the OuterClass.
+    classes.map {


nit: classes.filterKeys(_ != "OuterClass").map ...

Fixed: 678b4ad#diff-8bcc5aea39c73d4bf38aef6f6951d42cR327
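
A rough sketch of what skipping the outer class looks like when emitting nested-class declarations. The registry contents and the emitted Java line format are assumptions for illustration. The sketch uses `filter` on the pairs rather than `filterKeys`, since `filterKeys` is deprecated on strict maps as of Scala 2.13.

```scala
object NestedClassInitDemo {
  val outerClassName = "OuterClass"

  // Hypothetical (class name -> instance name) registry.
  val classes = Map(
    outerClassName -> "",
    "NestedClass1" -> "nestedClassInstance1",
    "NestedClass2" -> "nestedClassInstance2")

  // Emits one field declaration per nested class, skipping the outer class itself.
  def initNestedClasses(): String =
    classes
      .filter { case (className, _) => className != outerClassName }
      .toSeq
      .sortBy(_._1)
      .map { case (className, classInstance) =>
        s"private $className $classInstance = new $className();"
      }
      .mkString("\n")
}
```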

cloud-fan · 2017-06-14T01:40:01Z

.../test/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratedProjectionSuite.scala

@@ -83,6 +83,58 @@ class GeneratedProjectionSuite extends SparkFunSuite {
    assert(result === row2)
  }

+  test("SPARK-18016: generated projections on wider table requiring class-splitting") {
+    val N = 4000
+    val wideRow1 = new GenericInternalRow((1 to N).toArray[Any])


nit: new GenericInternalRow(N)

Fixed. I've cleaned up the test you comment on, and the one above it, since they both have the same structure, just different values for N:
678b4ad#diff-a14107cf4a4c41671bba24a82f6042d9R36

cloud-fan · 2017-06-14T01:42:20Z

.../test/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratedProjectionSuite.scala

+    val unsafeProj = UnsafeProjection.create(nestedSchema)
+    val unsafe: UnsafeRow = unsafeProj(nested)
+    (0 until N).foreach { i =>
+      val s = UTF8String.fromString((i + 1).toString)


create input data with 0 until N, or test it with 1 to N, to avoid this i + 1

Fixed. See the above comment. Creating the data with 0 until N cleans up the indexing on i.

cloud-fan · 2017-06-14T01:44:28Z

sql/core/src/main/scala/org/apache/spark/sql/execution/ColumnarBatchScan.scala

@@ -93,7 +93,7 @@ private[sql] trait ColumnarBatchScan extends CodegenSupport {
    }

    val nextBatch = ctx.freshName("nextBatch")


nit: inline this nextBatch

actually, how about we make addNewFunction accept a funcNameHint and generate a unique function name inside addNewFunction? Then users can just call: val nextBatch = ctx.addNewFunction("nextBatch", ...)

I suppose it depends on which implementation we think is cleaner. The freshName generated by the caller is typically used twice, once in the call to addNewFunction, but also immediately in the function code as the method name. If we use a name hint, we'd have to do a string replace inside addNewFunction to update the placeholder method name with the freshname. So it would seem either we keep

val nextBatch = ctx.freshName("nextBatch")
val nextBatchFuncName = ctx.addNewFunction(nextBatch,
  s"""
     |private void $nextBatch() throws java.io.IOException {
     |  long getBatchStart = System.nanoTime();
     |  if ($input.hasNext()) {
     |    $batch = ($columnarBatchClz)$input.next();
     |    $numOutputRows.add($batch.numRows());
     |    $idx = 0;
     |    ${columnAssigns.mkString("", "\n", "\n")}
     |  }
     |  $scanTimeTotalNs += System.nanoTime() - getBatchStart;
     |}""".stripMargin)

or we have

val nextBatchHint = "nextBatch"
val nextBatch = ctx.addNewFunction(nextBatchHint,
  s"""
     |private void $nextBatchHint() throws java.io.IOException {
     |  long getBatchStart = System.nanoTime();
     |  if ($input.hasNext()) {
     |    $batch = ($columnarBatchClz)$input.next();
     |    $numOutputRows.add($batch.numRows());
     |    $idx = 0;
     |    ${columnAssigns.mkString("", "\n", "\n")}
     |  }
     |  $scanTimeTotalNs += System.nanoTime() - getBatchStart;
     |}""".stripMargin)

where addNewFunction would do the proper replacement over the code for the method with a freshname generated from "nextBatch" as a name hint.

Or in every instance, we just duplicate the string hint without creating a variable for it in both the addNewFunction call and the method name:

val nextBatch = ctx.addNewFunction("nextBatch",
  s"""
     |private void nextBatch() throws java.io.IOException {
     |  long getBatchStart = System.nanoTime();
     |  if ($input.hasNext()) {
     |    $batch = ($columnarBatchClz)$input.next();
     |    $numOutputRows.add($batch.numRows());
     |    $idx = 0;
     |    ${columnAssigns.mkString("", "\n", "\n")}
     |  }
     |  $scanTimeTotalNs += System.nanoTime() - getBatchStart;
     |}""".stripMargin)

Which would you prefer?

let's keep it as it was

cloud-fan · 2017-06-14T01:49:13Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+   *
+   * @param funcName the class-unqualified name of the function
+   * @param funcCode the body of the function
+   * @param inlineToOuterClass whether the given code must be inlined to the `OuterClass`. This


can you give an example? I'm not very clear when we need this

Yes, see the portion of doConsume in the Limit class where the stopEarly function is registered, https://github.com/apache/spark/pull/18075/files#diff-379cccace8699ca00b76ff5631222adeR73

In this section of code, the registration of the function is separate from the caller code, so unlike other changes in this patch, we have no way of informing the caller code what the potentially class-qualified name of the function would be if it were inlined to a nested class. Instead, the caller code for the function (in WholeStageCodegenExec) makes a hard assumption that stopEarly will be visible globally, that is, in the outer class. The caller is divorced from the function producer across classes, so it's not clear how to make a generated function name visible, but the hint to inline the function to the outer class fixes that issue.

It seems to me, as the stopEarly in Limit is going to override the stopEarly in BufferedRowIterator, we can only put it in outer class.

yup, whole stage codegen is really tricky...
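
The override constraint mentioned above can be shown with a Scala analog. `BaseIterator` is a stand-in that models only the one method under discussion; it is not the real BufferedRowIterator, and the other names are hypothetical.

```scala
// Stand-in for the base iterator class; only the relevant method is modeled.
abstract class BaseIterator {
  protected def stopEarly(): Boolean = false
  def shouldStop: Boolean = stopEarly()
}

// Override declared on the outer class: the base class sees it.
class OverrideInOuter extends BaseIterator {
  override protected def stopEarly(): Boolean = true
}

// Same-named method declared on a nested class: it overrides nothing,
// so the base class keeps using its own default.
class MethodInNested extends BaseIterator {
  class NestedClass {
    def stopEarly(): Boolean = true
  }
  val nestedClassInstance = new NestedClass
}
```

Here `new OverrideInOuter().shouldStop` is true while `new MethodInNested().shouldStop` stays false, which is the situation the inlineToOuterClass flag avoids.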

cloud-fan · 2017-06-14T01:49:57Z

LGTM except some style comments

SparkQA · 2017-06-14T20:30:45Z

Test build #78059 has finished for PR 18075 at commit 678b4ad.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2017-06-15T05:31:03Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

@@ -233,10 +222,118 @@ class CodegenContext {
  // The collection of sub-expression result resetting methods that need to be called on each row.
  val subexprFunctions = mutable.ArrayBuffer.empty[String]

-  def declareAddedFunctions(): String = {
-    addedFunctions.map { case (funcName, funcCode) => funcCode }.mkString("\n")
+  val outerClassName = "OuterClass"


nit: private val

cloud-fan · 2017-06-15T05:34:54Z

.../test/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratedProjectionSuite.scala

@@ -33,10 +33,10 @@ class GeneratedProjectionSuite extends SparkFunSuite {

  test("generated projections on wider table") {
    val N = 1000
-    val wideRow1 = new GenericInternalRow((1 to N).toArray[Any])
+    val wideRow1 = new GenericInternalRow((0 until N).toArray[Any])


nit: can be new GenericInternalRow(N)

cloud-fan · 2017-06-15T05:35:21Z

.../test/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratedProjectionSuite.scala

+
+  test("SPARK-18016: generated projections on wider table requiring class-splitting") {
+    val N = 4000
+    val wideRow1 = new GenericInternalRow((0 until N).toArray[Any])


cloud-fan · 2017-06-15T05:45:37Z

thanks, merging to master! you can address the remaining comments in your other PRs

viirya · 2017-06-16T03:20:23Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+
+  /**
+   * Adds a function to the generated class. If the code for the `OuterClass` grows too large, the
+   * function will be inlined into a new private, nested class, and a class-qualified name for the


nit: class instance-qualified name

viirya · 2017-06-16T03:20:37Z

...atalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

+   *                           can be necessary when a function is declared outside of the context
+   *                           it is eventually referenced and a returned qualified function name
+   *                           cannot otherwise be accessed.
+   * @return the name of the function, qualified by class if it will be inlined to a private,


Author: ALeksander Eskilson <[email protected]> Closes apache#18075 from bdrillard/class_splitting_only.

…ol Limit - Class Splitting ## What changes were proposed in this pull request? This is a backport patch for Spark 2.1.x of the class splitting feature over excess generated code as was merged in #18075. ## How was this patch tested? The same test provided in #18075 is included in this patch. Author: ALeksander Eskilson <[email protected]> Closes #18354 from bdrillard/class_splitting_2.1.

…ol Limit - Class Splitting ## What changes were proposed in this pull request? This is a backport patch for Spark 2.2.x of the class splitting feature over excess generated code as was merged in #18075. ## How was this patch tested? The same test provided in #18075 is included in this patch. Author: ALeksander Eskilson <[email protected]> Closes #18377 from bdrillard/class_splitting_2.2.

bdrillard · 2017-10-17T18:25:38Z

The second part that follows this merged PR is up as #19518.

bdrillard mentioned this pull request May 23, 2017

[SPARK-18016][SQL][CATALYST] Code Generation: Constant Pool Limit #16648

Closed

kiszk reviewed May 24, 2017

ueshin reviewed May 29, 2017

ALeksander Eskilson added 8 commits May 31, 2017 10:37

class_splitting_only adding class splitting

76e291b

class_splitting_only addressing code review comments: typo, case stat…

b6bf6db

…ements, and ordering

class_splitting_only refactoring classFunctions to use a simpler list

28fc548

class_splitting_only fixing error in pom scalatest-maven-plugin

d30d097

class_splitting_only more consistent use of assoc notation

442332b

class_splitting_only fixing classFunctions buffer append

a1c93fb

class_splitting_only removing instances of this, adding nested class …

1086bb3

…inits, and style

class_splitting_only using a map for functions since some functions m…

7fe5e4a

…ust be replaced

bdrillard force-pushed the class_splitting_only branch from 493113c to 7fe5e4a Compare May 31, 2017 17:01

cloud-fan reviewed Jun 14, 2017

class_splitting_only addressing review comments

678b4ad

cloud-fan reviewed Jun 15, 2017

asfgit closed this in b32b212 Jun 15, 2017

viirya reviewed Jun 16, 2017

This was referenced Jun 19, 2017

[SPARK-18016][SQL][CATALYST][BRANCH-2.1] Code Generation: Constant Pool Limit - Class Splitting #18354

Closed

[SPARK-18016][SQL][CATALYST][BRANCH-2.2] Code Generation: Constant Pool Limit - Class Splitting #18377

Closed

juliuszsompolski mentioned this pull request Sep 22, 2017

[SPARK-22103] Move HashAggregateExec parent consume to a separate function in codegen #19324

Closed

This was referenced Oct 16, 2017

[SPARK-22226][SQL] splitExpression can create too many method calls in the outer class #19480

Closed

[SPARK-18016][SQL][CATALYST] Code Generation: Constant Pool Limit - State Compaction #19518

Closed
