[SPARK-52771][PS] Fix float32 type widening in truediv/floordiv #51456

Open: wants to merge 2 commits into base: master

@xinrong-meng xinrong-meng commented Jul 11, 2025

What changes were proposed in this pull request?

Fix float32 type widening in truediv/floordiv, with ANSI mode on or off.

Why are the changes needed?

Ensure pandas on Spark works well with ANSI mode on or off.

Note that the issue exists whether ANSI mode is on or off: a float32 Series is widened to float64 by division, as shown below.

>>> import pandas as pd
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> pser = pd.Series([1.1, 2.2, 3.3], dtype=np.float32)
>>> psser = ps.from_pandas(pser)
>>> spark.conf.set("spark.sql.ansi.enabled", False)
>>> psser / 1
0    1.1
1    2.2
2    3.3
dtype: float64
>>> spark.conf.set("spark.sql.ansi.enabled", True)
>>> psser / 1
0    1.1
1    2.2
2    3.3
dtype: float64
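
For reference, plain pandas keeps float32 for both operators when the divisor is a Python scalar, which appears to be the behavior the fix aims to match. A minimal sketch, assuming a recent pandas and NumPy, with no Spark session involved:

```python
import numpy as np
import pandas as pd

# Same data as in the session above, stored as float32.
pser = pd.Series([1.1, 2.2, 3.3], dtype=np.float32)

# Dividing by a Python int does not widen the dtype in plain pandas.
true_div = pser / 1
floor_div = pser // 1

print(true_div.dtype)
print(floor_div.dtype)
```

Comparing these dtypes against the pandas-on-Spark results is one way to spot the widening described here.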

Does this PR introduce any user-facing change?

Yes. truediv/floordiv now preserve float32 with ANSI mode on or off, as shown below.

>>> import pandas as pd
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> 
>>> ps.set_option("compute.fail_on_ansi_mode", False)
>>> ps.set_option("compute.ansi_mode_support", True)
>>> 
>>> pser = pd.Series([1.1, 2.2, 3.3], dtype=np.float32)
>>> psser = ps.from_pandas(pser)
>>> psser / 1
0    1.1
1    2.2
2    3.3
dtype: float32
>>> psser // 1
0    1.0
1    2.0
2    3.0
dtype: float32

How was this patch tested?

Unit tests.

The commands below all passed:

SPARK_ANSI_SQL_MODE=true ./python/run-tests --python-executables=python3.11 --testnames "pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_binary_operator_truediv"
SPARK_ANSI_SQL_MODE=false ./python/run-tests --python-executables=python3.11 --testnames "pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_binary_operator_truediv"
SPARK_ANSI_SQL_MODE=true ./python/run-tests --python-executables=python3.11 --testnames "pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_binary_operator_floordiv"
SPARK_ANSI_SQL_MODE=false ./python/run-tests --python-executables=python3.11 --testnames "pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_binary_operator_floordiv"
SPARK_ANSI_SQL_MODE=true ./python/run-tests --python-executables=python3.11 --testnames "pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior"
SPARK_ANSI_SQL_MODE=false ./python/run-tests --python-executables=python3.11 --testnames "pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior"
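
The test matrix (three tests, each with ANSI mode on and off) could also be generated rather than typed by hand. The sketch below only builds and prints the command lines; the helper name `build_runs` is hypothetical and not part of the Spark repository, while the script path, flags, and environment variable are taken from the commands listed here.

```python
import itertools
import shlex

TEST_MODULE = "pyspark.pandas.tests.computation.test_binary_ops"
TEST_NAMES = [
    "FrameBinaryOpsTests.test_binary_operator_truediv",
    "FrameBinaryOpsTests.test_binary_operator_floordiv",
    "FrameBinaryOpsTests.test_divide_by_zero_behavior",
]


def build_runs(python="python3.11"):
    """Return (env_overrides, argv) pairs, one per test and ANSI mode.

    Hypothetical helper for illustration only.
    """
    runs = []
    for test, ansi in itertools.product(TEST_NAMES, ("true", "false")):
        argv = [
            "./python/run-tests",
            f"--python-executables={python}",
            "--testnames",
            f"{TEST_MODULE} {test}",
        ]
        runs.append(({"SPARK_ANSI_SQL_MODE": ansi}, argv))
    return runs


# Print each command in a copy-pasteable shell form.
for env, argv in build_runs():
    prefix = " ".join(f"{key}={value}" for key, value in env.items())
    print(prefix, shlex.join(argv))
```

To execute the runs instead of printing them, each `argv` could be passed to `subprocess.run` from the repository root with the overrides merged into `os.environ`.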

Was this patch authored or co-authored using generative AI tooling?

No.

@xinrong-meng changed the title from "[SPARK-52771][PS] Fix float32 type widening in truediv/floordiv under ANSI" to "[SPARK-52771][PS] Fix float32 type widening in truediv/floordiv" on Jul 11, 2025