-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-18016][SQL][CATALYST] Code Generation: Constant Pool Limit - Class Splitting #18075
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -629,7 +736,9 @@ class CodegenContext { | |||
|
|||
/** | |||
* Splits the generated code of expressions into multiple functions, because function has | |||
* 64kb code size limit in JVM | |||
* 64kb code size limit in JVM. If the class the function is to be inlined to would grow beyond | |||
* 1600kb, a private, netsted sub-class is declared, and the function is inlined to it, because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: netsted -> nested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -792,7 +887,18 @@ class CodegenContext { | |||
addMutableState(javaType(expr.dataType), value, | |||
s"$value = ${defaultValue(expr.dataType)};") | |||
|
|||
subexprFunctions += s"$fnName($INPUT_ROW);" | |||
// Generate the code for this expression tree and wrap it in a function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to move this code block from the original place to here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for the scope of this class splitting change. The previous pull-request that also made addMutableState
return an accessor reference required the order to change so that correct variables could be declared prior to the code of fn
, but for this pull-request, only ensuring that subexprFunctions
gets the return of addNewFunction
is necessary. I'll change the ordering back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed: 90a907a#diff-8bcc5aea39c73d4bf38aef6f6951d42cR872
Note: a rebase with the two commits squashed will show the subexprFunctions
as the only changed line, which is what we want.
Thank you. Absolutely, it is easier to review this change. |
*/ | ||
private[sql] def declareAddedFunctions(): String = { | ||
classFunctions("OuterClass").map { | ||
case (funcName, funcCode) => funcCode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: funcName -> _
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} else { | ||
val code = functions.map { | ||
case (_, funcCode) => | ||
s"$funcCode" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s"$funcCode" -> funcCode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -299,6 +297,9 @@ case class SampleExec( | |||
| } | |||
""".stripMargin.trim) | |||
|
|||
ctx.addMutableState(s"$samplerClass<UnsafeRow>", sampler, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put this block at the original place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change should stay the way it is. Notice that the initialization code for the addMutableState
call on the line just below is the same code passed to the addNewFunction
call. This means that when we create the new function, we may get back a class-qualified function name (which here we store in initSamplerFuncName
), so the call to addMutableState
must come after the addNewFunction
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, got it. I overlooked the new dependency regarding initSamplerFuncName
.
*/ | ||
private[sql] def initNestedClasses(): String = { | ||
// Nested, private sub-classes have no mutable state (though they do reference the outer class' | ||
// mutable state), so we declare and initialize them inline ot the OuterClass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ot -> at
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val name = classInfo._1 | ||
|
||
classSize.update(name, classSize(name) + funcCode.length) | ||
classFunctions.update(name, classFunctions(name) += funcCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not 100% sure, but is " += " required? How about " = "?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+=
will be necessary since classFunctions(name)
will return the ListBuffer[String]
containing all functions belonging to the given class name from the classFunctions
map, and we want to append the new funcCode to that buffer. Also, classFunctions.update
is going to expect a ListBuffer[String]
as its second argument, but the return type of assignment =
is just Unit
. +=
will append the element and then return the modified buffer (which is the behavior we want as per the API doc for ListBuffer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for confusing you, I made a mistake in my comment. I wanted to say " + " instead of " = ".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sure. The story seems pretty much the same, we still want the append operation that also returns the reference to the modified buffer, and that's given by +=
. Also, it doesn't look like +
is defined as an operation on a ListBuffer[A]
and an element of type A
(see ListBuffer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your clarification.
In that case, is classFunctions(name) += funcCode
enough instead of calling classFunctions.update
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good point, since we're using a mutable buffer, we can update the referenced object directly even if its contained inside the map. Since +=
explicitly returns the reference to the modified buffer, it would probably be most straightforward to use
classFunctions(name).append(funcCode)
since append
has Unit
return type, and we don't need any results from appending the code to the class's buffer of function code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a commit with that change if you think it checks out: c225f3a#diff-8bcc5aea39c73d4bf38aef6f6951d42cL290
Thanks, sound good to me for now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this!
I added some comments and I found some other places we might need to modify:
- It seems like we need to remove
this.
at objects.scala#L984. - I guess also we need to add initializations of inner classes at GenerateColumnAccessor.scala#L225.
def addNewFunction( | ||
funcName: String, | ||
funcCode: String, | ||
inlineToOuterClass: Boolean = false): String = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indent. Add 2 more spaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// limit, 65,536. We cannot know how many constants will be inserted for a class, so we use a | ||
// threshold of 1600k bytes to determine when a function should be inlined to a private, nested | ||
// sub-class. | ||
val classInfo = if (inlineToOuterClass) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We can use val (className, classInstance) = if ...
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed (with some refactoring based on those variables being available in the scope earlier): 493113c#diff-8bcc5aea39c73d4bf38aef6f6951d42cL278
|
||
// Adds a new class. Requires the class' name, and its instance name. | ||
private def addClass(className: String, classInstance: String): Unit = { | ||
classes.prepend(Tuple2(className, classInstance)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: How about classes.prepend((className, classInstance))
or classes.prepend(className -> classInstance)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
}) | ||
|
||
// Create the collection. | ||
val wrapperClass = classOf[mutable.WrappedArray[_]].getName | ||
ctx.addMutableState( | ||
s"$wrapperClass<InternalRow>", | ||
ev.value, | ||
s"this.${ev.value} = $wrapperClass$$.MODULE$$.make(this.$rowData);") | ||
s"this.${ev.value} = $wrapperClass$$.MODULE$$.make($rowData);") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should remove one more this.
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, fixed: 493113c#diff-16493d6958b6daaf4a24dd7b780ba4bcL201
val name = classInfo._1 | ||
|
||
classSize.update(name, classSize(name) + funcCode.length) | ||
classFunctions(name).append(funcCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about:
classSize(name) += funcCode.length
classFunctions(name) += funcCode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose that's more concise/readable given the underlying mutable data structures: 493113c#diff-8bcc5aea39c73d4bf38aef6f6951d42cL292
@ueshin As for the remaining this in |
ok to test |
Test build #77564 has finished for PR 18075 at commit
|
…ements, and ordering
493113c
to
7fe5e4a
Compare
Test build #77597 has finished for PR 18075 at commit
|
The earlier failure occurred when the |
LGTM for now, cc @cloud-fan. |
We're really looking forward to this change! This bug is limiting a lot of the work we'd like to do with Spark. Any idea who we can ping to move this along? |
@cloud-fan can you have a time to look at this? |
reviewing |
|
||
// A map holding the current size in bytes of each class to be generated. | ||
private val classSize: mutable.Map[String, Int] = | ||
mutable.Map[String, Int]("OuterClass" -> 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can create a variable for this "OuterClass"
instead of hardcoding it in many places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
classSize(className) += funcCode.length | ||
classFunctions(className) += funcName -> funcCode | ||
|
||
if (className.equals("OuterClass")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: className == "OuterClass"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mutable.Map[String, Int]("OuterClass" -> 0) | ||
|
||
// Nested maps holding function names and their code belonging to each class. | ||
private val classFunctions: mutable.Map[String, mutable.Map[String, String]] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems we only need a map from class name to function codes? i.e. mutable.Map[String, mutable.ListBuffer[String]]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had originally thought so, but it turns out that there's at least one instance where the code for a given function name is updated during the code-generation process. The generated stopEarly
function can actually be inserted twice, once returning a variable returning a different stopEarly
variable each time. What would end up occurring is that two functions of the same signature would exist in the class, causing a compile error. So we need to use a map to make sure the implementation gets updated for a given function when necessary.
Note also that the old implementation of addedFunctions
was a map also.
@@ -629,7 +730,9 @@ class CodegenContext { | |||
|
|||
/** | |||
* Splits the generated code of expressions into multiple functions, because function has | |||
* 64kb code size limit in JVM | |||
* 64kb code size limit in JVM. If the class the function is to be inlined to would grow beyond |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the class the function
-> If the class with the function
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the grammatically correct/hopefully more clear form of that same docstring: 678b4ad#diff-8bcc5aea39c73d4bf38aef6f6951d42cR727
private[sql] def initNestedClasses(): String = { | ||
// Nested, private sub-classes have no mutable state (though they do reference the outer class' | ||
// mutable state), so we declare and initialize them inline to the OuterClass. | ||
classes.map { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: classes.filterKeys(_ != "OuterClass").map ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -83,6 +83,58 @@ class GeneratedProjectionSuite extends SparkFunSuite { | |||
assert(result === row2) | |||
} | |||
|
|||
test("SPARK-18016: generated projections on wider table requiring class-splitting") { | |||
val N = 4000 | |||
val wideRow1 = new GenericInternalRow((1 to N).toArray[Any]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: new GenericInternalRow(N)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. I've cleaned up the test you comment on, and the one above it, since they both have the same structure, just different values for N:
678b4ad#diff-a14107cf4a4c41671bba24a82f6042d9R36
val unsafeProj = UnsafeProjection.create(nestedSchema) | ||
val unsafe: UnsafeRow = unsafeProj(nested) | ||
(0 until N).foreach { i => | ||
val s = UTF8String.fromString((i + 1).toString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create input data with 0 until N
, or test it with 1 to N
, to avoid this i + 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. See the above comment. Creating the data with 0 until N
cleans up the indexing on i
.
@@ -93,7 +93,7 @@ private[sql] trait ColumnarBatchScan extends CodegenSupport { | |||
} | |||
|
|||
val nextBatch = ctx.freshName("nextBatch") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inline this nextBatch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, how about we make addNewFunction
accepts funcNameHint
and generate unique func name inside addNewFunction
? Then users can just call: val nextBatch = ctx.addNewFunction("nextBatch", ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it depends on which implementation we think is cleaner. The freshName generated by the caller is typically used twice, once in the call to addNewFunction
, but also immediately in the function code as the method name. If we use a name hint, we'd have to do a string replace
inside addNewFunction
to update the placeholder method name with the freshname. So it would seem either we keep
val nextBatch = ctx.freshName("nextBatch")
val nextBatchFuncName = ctx.addNewFunction(nextBatch,
s"""
|private void $nextBatch() throws java.io.IOException {
| long getBatchStart = System.nanoTime();
| if ($input.hasNext()) {
| $batch = ($columnarBatchClz)$input.next();
| $numOutputRows.add($batch.numRows());
| $idx = 0;
| ${columnAssigns.mkString("", "\n", "\n")}
| }
| $scanTimeTotalNs += System.nanoTime() - getBatchStart;
|}""".stripMargin)
or we have
val nextBatchHint = "nextBatch"
val nextBatch = ctx.addNewFunction(nextBatchHint,
s"""
|private void $nextBatchHint() throws java.io.IOException {
| long getBatchStart = System.nanoTime();
| if ($input.hasNext()) {
| $batch = ($columnarBatchClz)$input.next();
| $numOutputRows.add($batch.numRows());
| $idx = 0;
| ${columnAssigns.mkString("", "\n", "\n")}
| }
| $scanTimeTotalNs += System.nanoTime() - getBatchStart;
|}""".stripMargin)
where addNewFunction
would do the proper replacement over the code for the method with a freshname generated from "nextBatch" as a name hint.
Or in every instance, we just duplicate the string hint without creating a variable for it in both the addNewFunction
call and the method name:
val nextBatch = ctx.addNewFunction("nextBatch",
s"""
|private void nextBatch() throws java.io.IOException {
| long getBatchStart = System.nanoTime();
| if ($input.hasNext()) {
| $batch = ($columnarBatchClz)$input.next();
| $numOutputRows.add($batch.numRows());
| $idx = 0;
| ${columnAssigns.mkString("", "\n", "\n")}
| }
| $scanTimeTotalNs += System.nanoTime() - getBatchStart;
|}""".stripMargin)
Which would you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's keep it as it was
* | ||
* @param funcName the class-unqualified name of the function | ||
* @param funcCode the body of the function | ||
* @param inlineToOuterClass whether the given code must be inlined to the `OuterClass`. This |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you give an example? I'm not very clear when we need this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see the portion of doConsume
in the Limit class where the stopEarly
function is registered, https://github.com/apache/spark/pull/18075/files#diff-379cccace8699ca00b76ff5631222adeR73
In this section of code, the registration of the function is separate from the caller code, so unlike other changes in this patch, we have no way of informing the caller code what the potentially class-qualified name of the function would be if it were inlined to a nested class. Instead, the caller code for the function (in WholeStageCodegenExec), makes a hard assumption that stopEarly
will be visible globally, that is, in the outer class. The caller is divorced from the function producer across classes, so it's not clear how to make a generated function name visible, but the hint to inline to just inline the function to the outer class fixes that issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me, as the stopEarly
in Limit
is going to override the stopEarly
in BufferedRowIterator
, we can only put it in outer class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, whole stage codegen is really tricky...
LGTM except some style comments |
Test build #78059 has finished for PR 18075 at commit
|
@@ -233,10 +222,118 @@ class CodegenContext { | |||
// The collection of sub-expression result resetting methods that need to be called on each row. | |||
val subexprFunctions = mutable.ArrayBuffer.empty[String] | |||
|
|||
def declareAddedFunctions(): String = { | |||
addedFunctions.map { case (funcName, funcCode) => funcCode }.mkString("\n") | |||
val outerClassName = "OuterClass" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: private val
@@ -33,10 +33,10 @@ class GeneratedProjectionSuite extends SparkFunSuite { | |||
|
|||
test("generated projections on wider table") { | |||
val N = 1000 | |||
val wideRow1 = new GenericInternalRow((1 to N).toArray[Any]) | |||
val wideRow1 = new GenericInternalRow((0 until N).toArray[Any]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can be new GenericInternalRow(N)
|
||
test("SPARK-18016: generated projections on wider table requiring class-splitting") { | ||
val N = 4000 | ||
val wideRow1 = new GenericInternalRow((0 until N).toArray[Any]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
thanks, merging to master! you can address the remaining comments in your other PRs |
|
||
/** | ||
* Adds a function to the generated class. If the code for the `OuterClass` grows too large, the | ||
* function will be inlined into a new private, nested class, and a class-qualified name for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: class instance-qualified name
* can be necessary when a function is declared outside of the context | ||
* it is eventually referenced and a returned qualified function name | ||
* cannot otherwise be accessed. | ||
* @return the name of the function, qualified by class if it will be inlined to a private, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
…lass Splitting ## What changes were proposed in this pull request? This pull-request exclusively includes the class splitting feature described in apache#16648. When code for a given class would grow beyond 1600k bytes, a private, nested sub-class is generated into which subsequent functions are inlined. Additional sub-classes are generated as the code threshold is met subsequent times. This code includes 3 changes: 1. Includes helper maps, lists, and functions for keeping track of sub-classes during code generation (included in the `CodeGenerator` class). These helper functions allow nested classes and split functions to be initialized/declared/inlined to the appropriate locations in the various projection classes. 2. Changes `addNewFunction` to return a string to support instances where a split function is inlined to a nested class and not the outer class (and so must be invoked using the class-qualified name). Uses of `addNewFunction` throughout the codebase are modified so that the returned name is properly used. 3. Removes instances of the `this` keyword when used on data inside generated classes. All state declared in the outer class is by default global and accessible to the nested classes. However, if a reference to global state in a nested class is prepended with the `this` keyword, it would attempt to reference state belonging to the nested class (which would not exist), rather than the correct variable belonging to the outer class. ## How was this patch tested? Added a test case to the `GeneratedProjectionSuite` that increases the number of columns tested in various projections to a threshold that would previously have triggered a `JaninoRuntimeException` for the Constant Pool. Note: This PR does not address the second Constant Pool issue with code generation (also mentioned in apache#16648): excess global mutable state. A second PR may be opened to resolve that issue. Author: ALeksander Eskilson <[email protected]> Closes apache#18075 from bdrillard/class_splitting_only.
…ol Limit - Class Splitting ## What changes were proposed in this pull request? This is a backport patch for Spark 2.1.x of the class splitting feature over excess generated code as was merged in #18075. ## How was this patch tested? The same test provided in #18075 is included in this patch. Author: ALeksander Eskilson <[email protected]> Closes #18354 from bdrillard/class_splitting_2.1.
…ol Limit - Class Splitting ## What changes were proposed in this pull request? This is a backport patch for Spark 2.2.x of the class splitting feature over excess generated code as was merged in #18075. ## How was this patch tested? The same test provided in #18075 is included in this patch. Author: ALeksander Eskilson <[email protected]> Closes #18377 from bdrillard/class_splitting_2.2.
The second part that follows this merged PR is up as #19518. |
What changes were proposed in this pull request?
This pull-request exclusively includes the class splitting feature described in #16648. When code for a given class would grow beyond 1600k bytes, a private, nested sub-class is generated into which subsequent functions are inlined. Additional sub-classes are generated as the code threshold is met subsequent times. This code includes 3 changes:
CodeGenerator
class). These helper functions allow nested classes and split functions to be initialized/declared/inlined to the appropriate locations in the various projection classes.addNewFunction
to return a string to support instances where a split function is inlined to a nested class and not the outer class (and so must be invoked using the class-qualified name). Uses ofaddNewFunction
throughout the codebase are modified so that the returned name is properly used.this
keyword when used on data inside generated classes. All state declared in the outer class is by default global and accessible to the nested classes. However, if a reference to global state in a nested class is prepended with thethis
keyword, it would attempt to reference state belonging to the nested class (which would not exist), rather than the correct variable belonging to the outer class.How was this patch tested?
Added a test case to the
GeneratedProjectionSuite
that increases the number of columns tested in various projections to a threshold that would previously have triggered aJaninoRuntimeException
for the Constant Pool.Note: This PR does not address the second Constant Pool issue with code generation (also mentioned in #16648): excess global mutable state. A second PR may be opened to resolve that issue.