fix(native): Prevent decimal precision loss in sidecar function registry and expression optimizer#27735
Draft
pramodsatya wants to merge 1 commit intoprestodb:masterfrom
Draft
fix(native): Prevent decimal precision loss in sidecar function registry and expression optimizer#27735pramodsatya wants to merge 1 commit intoprestodb:masterfrom
pramodsatya wants to merge 1 commit intoprestodb:masterfrom
Conversation
…try and expression optimizer
Contributor
Reviewer's GuidePrevents decimal precision loss when the native sidecar is used by (1) disabling native concat overload exposure so Presto’s built-in decimal-safe concat remains authoritative, and (2) disabling native constant folding for any expressions whose type tree contains DECIMAL so they are evaluated by Presto instead of being serialized as lossy doubles. Sequence diagram for decimal-aware native expression optimizationsequenceDiagram
participant Planner
participant NativeExpressionOptimizer
participant CollectingVisitor
participant PrestoEvaluator
participant NativeSidecar
Planner->>NativeExpressionOptimizer: optimize(RowExpression root)
NativeExpressionOptimizer->>CollectingVisitor: visitCall(CallExpression node, context)
CollectingVisitor->>CollectingVisitor: canBeOptimized(node)
CollectingVisitor->>CollectingVisitor: containsDecimalType(node)
alt expression contains DECIMAL
CollectingVisitor->>CollectingVisitor: visitNode(node, false)
Note right of PrestoEvaluator: Expression remains in Presto
NativeExpressionOptimizer->>PrestoEvaluator: evaluate(node) with full DECIMAL precision
else expression does not contain DECIMAL
CollectingVisitor->>CollectingVisitor: visitNode(node, true)
NativeExpressionOptimizer->>NativeSidecar: send expression for constant folding
NativeSidecar-->>NativeExpressionOptimizer: folded constants (non-decimal)
NativeExpressionOptimizer->>Planner: return optimized RowExpression
end
Class diagram for NativeExpressionOptimizer decimal guard changesclassDiagram
class NativeExpressionOptimizer {
- CollectingVisitor collectingVisitor
+ Void visitCall(CallExpression node, Object context)
+ List~RowExpression~ getExpressionsToOptimize()
+ boolean canBeOptimized(RowExpression expression)
}
class CollectingVisitor {
- List~RowExpression~ expressionsToOptimize
+ Void visitCall(CallExpression node, Object context)
+ Void visitNode(RowExpression node, boolean canBeOptimized)
+ List~RowExpression~ getExpressionsToOptimize()
+ static boolean containsDecimalType(RowExpression expression)
+ static boolean containsDecimalType(Type type)
}
class RowExpression {
+ Type getType()
+ List~RowExpression~ getChildren()
}
class CallExpression {
+ List~RowExpression~ getArguments()
+ boolean isConstant()
}
class Type {
+ TypeSignature getTypeSignature()
+ List~Type~ getTypeParameters()
}
class TypeSignature {
+ String getBase()
}
NativeExpressionOptimizer *-- CollectingVisitor : uses
CollectingVisitor ..> RowExpression : traverses
CollectingVisitor ..> CallExpression : visits
CollectingVisitor ..> Type : inspects
Type ..> TypeSignature : owns
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Resolves #27749.
Two related bugs caused decimal precision loss and flaky test failures in
testCoercionswhen the native sidecar is enabled:1. Wrong
concatoverload bound at planning timeNativeSidecarFunctionRegistryToolexposes all native worker function signatures to the Presto coordinator, includingconcatoverloads. The nativeconcatcontains a signature that binds(array(decimal), decimal)operands asarray(double), which wins overload resolution over Presto's built-in decimal-awareconcat. The query's output type is therefore resolved asarray(double)at planning time — precision is lost before execution even begins.2. Lossy decimal serialization during native constant folding
NativeExpressionOptimizercan elect to constant-fold sub-expressions by shipping them to the native sidecar. The sidecar's wire protocol serializesDECIMALconstants as floating-point, so any folded decimal value comes back as aDOUBLE. The precision loss is silent and only affects expressions that are constant-foldable, making it intermittent.Together these produced flaky failures: tests without a live sidecar always passed; tests with sidecar enabled failed on the decimal-array coercion queries.
Solution
NativeSidecarFunctionRegistryTool— filter out all nativeconcatsignatures before exposing them to the coordinator. Presto's built-inconcatoverloads remain the sole candidates for overload resolution, so decimal array/element concat always resolves to the type-preserving implementation.NativeExpressionOptimizer— add acontainsDecimalTypeguard invisitCall: if theCallExpressionor any child expression has aDECIMALtype anywhere in its type tree, immediately mark it non-constant-foldable (visitNode(node, false)). The expression stays in Presto's own evaluator, which handles decimal materialization correctly.Testing
TestTpchDistributedQueries#testCoercionsnow passes consistently with sidecar enabled. No new tests added — the existing coercions test is the regression coverage.Summary by Sourcery
Prevent decimal precision loss when using the native sidecar by avoiding lossy concat overloads and disabling native constant folding for decimal expressions.
Bug Fixes: