[SPARK-47547][CORE] Add BloomFilter V2 and use it as default #50933
Conversation
…errors in scala suite
…of the combined hash
```java
long mightContainEven = 0;
```
Please rename these two variables in this test case to clarify that they are actually indices of numbers in a randomly generated stream.
```java
  optimalNumOfBits / Byte.SIZE / 1024 / 1024
);
Assumptions.assumeTrue(
  2 * optimalNumOfBits / Byte.SIZE < 4 * ONE_GB,
```
I guess `4 * ONE_GB` is a reasonable limit; can we extract it to a constant and add a comment to it?
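A minimal sketch of the suggested extraction (the constant name and comment wording are hypothetical, not taken from the PR):

```java
// Hypothetical constant extraction: tests whose backing bit arrays would
// need more than this much memory are skipped via a JUnit 5 assumption.
private static final long MAX_TESTED_FILTER_SIZE_BYTES = 4L * ONE_GB;

// usage at the top of the test method:
Assumptions.assumeTrue(
    2 * optimalNumOfBits / Byte.SIZE < MAX_TESTED_FILTER_SIZE_BYTES,
    "the filters under test would not fit into the test heap");
```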
…eckstyle errors, renaming test vars
…ward compatible with previously serialized streams
"mightContainLong must return true for all inserted numbers" | ||
); | ||
|
||
double actualFpp = (double) mightContainOddIndexed / numItems; |
`/ numItems` doesn't seem correct here, as you don't test `numItems` numbers that were surely not added into the filter.
Indeed, it should probably be very close to the proper value, but this calculation doesn't account for the odd indexes ignored based on the secondary's result. Let me try to address that somehow.
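A rough sketch of the corrected calculation being discussed (variable names here are hypothetical): divide by the number of items the secondary filter confirms as absent, rather than by `numItems`:

```java
// Assumed in scope (illustrative names): BloomFilter primary, secondary;
// long[] oddIndexedItems -- the numbers that were never inserted.
long confirmedAbsent = 0;  // secondary says "definitely not present"
long falsePositives = 0;   // of those, primary still says "might be present"
for (long item : oddIndexedItems) {
  if (!secondary.mightContainLong(item)) {
    confirmedAbsent++;
    if (primary.mightContainLong(item)) {
      falsePositives++;
    }
  }
}
// Divide by the number of confirmed negatives, not by numItems.
double actualFpp = (double) falsePositives / confirmedAbsent;
```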
Force-pushed from 8edf4dd to 57298f0.
… in random test + test formatting
Force-pushed from 57298f0 to f589e2c.
Can you please post the output of the new …?
The tests with the 4GB limit are still running; I'll post a summary from the results tomorrow, and start a new run that can cover all of the 5G element count cases.
The filter-from-hex-constant test started to make me worry about compatibility with serialized instances created with the older logic. Even if we can deserialize the buffer and the seed properly, the actual bits will be set in completely different positions. That is, there's no point in trying to use an old (serialized) buffer with the new logic. Should we create a dedicated BloomFilterImplV2 class for the fixed logic, just so we can keep the old V1 implementation for deserializing old byte streams?
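A hypothetical sketch of what such a split could look like, assuming the serialized stream begins with a version number (class and method names here are illustrative, not the PR's actual code):

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Dispatch on the serialized version: old streams keep the V1 bit-index
// logic so their buffers stay meaningful; new filters are written as V2.
public static BloomFilter readFrom(InputStream in) throws IOException {
  DataInputStream dis = new DataInputStream(in);
  int version = dis.readInt();
  switch (version) {
    case 1:
      return BloomFilterImpl.readFrom(dis);   // legacy indexing
    case 2:
      return BloomFilterImplV2.readFrom(dis); // fixed indexing
    default:
      throw new IOException("Unexpected Bloom filter version: " + version);
  }
}
```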
I don't think we need to keep the old implementation just to support old serialized versions. It seems we use our bloom filter implementation only in … cc @cloud-fan
I ran into some trouble with generating the test results (running on a single thread, the whole batch takes ~10h on my machine). I'll try to make an update on Monday.
…t output capture - 2nd take
…t output capture - 3rd take
Yeah, `actualFpp%` seems to be much better when the number of inserted items (`n`) is huge (~1B). I'm not sure that the bug actually caused any issues in the injected runtime filters, due to the much lower default values of the `spark.sql.optimizer.runtime.bloomFilter.max...` configs, but it is also possible to build a bloom filter manually, so it is better to fix it.
BTW, this issue seems to have been observed in Spark before: https://stackoverflow.com/questions/78162973/why-is-observed-false-positive-rate-in-spark-bloom-filter-higher-than-expected, and there was an earlier attempt to fix it in #46370.
That old PR was similar to how the issue was fixed in Guava, by adding a new strategy / Murmur implementation, while this PR fixes the root cause in the current Bloom filter implementation.
@cloud-fan, as you added the original bloom filter implementation to Spark, could you please take a look at this PR?
…e in switch block
…eam class name in comment
@dongjoon-hyun @LuciferYang
+1, LGTM
…the test dependencies
@LuciferYang, here you go:
@dongjoon-hyun Do you need to take another look?
+1, LGTM. Sorry for being late. I was distracted by other PRs.
Thank you, @ishnagy, @peter-toth, @LuciferYang.
Oh, @ishnagy, we need to change the PR title.
BloomFilter V2 and use it as default
I revised it a little, but please feel free to choose a proper one, @ishnagy and @LuciferYang.
How about …? (not a strong preference, I'm fine with the current title as well)
The current title looks good to me. Thanks @ishnagy for the fix and @dongjoon-hyun, @LuciferYang for the review. Merged to …. @dongjoon-hyun, can you please help me add https://issues.apache.org/jira/secure/ViewProfile.jspa?name=ishnagy to the contributors and assign https://issues.apache.org/jira/browse/SPARK-47547 to him?
thank you,
@peter-toth I have added ishnagy to the contributors group and assigned this ticket to him.
Welcome to the Apache Spark community, @ishnagy.
Thanks @dongjoon-hyun, a pleasure to be here. I'm looking forward to contributing in this area.
Oh, @ishnagy and @peter-toth, the newly added test case seems to take over 12 minutes (721s), which is quite excessive for a unit test. Can we reduce the testing time reasonably?
I filed two JIRA issues and made a PR to disable …
### What changes were proposed in this pull request?

This PR aims to disable `SparkBloomFilterSuite` due to the excessive running time.

- SPARK-53077 is filed to re-enable this with a reasonable running time.

### Why are the changes needed?

Previously, the `common/sketch` module took less than 10s.

```
$ mvn package --pl common/sketch
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 7.177 s
[INFO] Finished at: 2025-08-02T08:25:43-07:00
[INFO] ------------------------------------------------------------------------
```

After `SparkBloomFilterSuite` was newly added, it took over 12 minutes. That's too long for a unit test.

- #50933

```
[info] Test org.apache.spark.util.sketch.SparkBloomFilterSuite#testAccuracyRandomDistribution(long, double, int, org.junit.jupiter.api.TestInfo):#1 started
[info] Test org.apache.spark.util.sketch.SparkBloomFilterSuite#testAccuracyEvenOdd(long, double, int, org.junit.jupiter.api.TestInfo):#1 started
[info] Test run finished: 0 failed, 0 ignored, 2 total, 721.939s
```

### Does this PR introduce _any_ user-facing change?

No, this is a test change.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #51788 from dongjoon-hyun/SPARK-53076.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
…loomFilterSuite

## Reduce insertion count in SparkBloomFilterSuite to mitigate long running time

### What changes were proposed in this pull request?

This change reduces the insertion count in the `SparkBloomFilterSuite` test suite to the bare minimum that's necessary to demonstrate the int truncation bug in the V1 version of `BloomFilterImpl`.

### Why are the changes needed?

#50933 introduced a new `SparkBloomFilterSuite` test suite, which increased the test running time of the common/sketch module from about 7s to a whopping 12 minutes. This change is a workaround to decrease the test running time, until we can devise a way to trigger these long-running tests when (and only when) there are actual changes in `common/sketch`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

The minimum insertion count was selected based on the following measurements with the V1 version of `BloomFilterImpl`:

```
100M testAccuracyRandomDistribution: acceptableFpp(3.000000 %) < actualFpp (3.050257 %) [00m18s] T: ~9.6%
     testAccuracyEvenOdd: acceptableFpp(3.000000 %) < actualFpp (3.053887 %) [00m09s] T: ~9.3%
150M testAccuracyRandomDistribution: acceptableFpp(3.000000 %) < actualFpp (3.080157 %) [00m28s] T: ~15.0%
     testAccuracyEvenOdd: acceptableFpp(3.000000 %) < actualFpp (3.079987 %) [00m15s] T: ~15.4%
200M testAccuracyRandomDistribution: acceptableFpp(3.000000 %) < actualFpp (3.861257 %) [00m37s] T: ~19.8%
     testAccuracyEvenOdd: acceptableFpp(3.000000 %) < actualFpp (3.860424 %) [00m20s] T: ~20.6%
250M testAccuracyRandomDistribution: acceptableFpp(3.000000 %) < actualFpp (3.676172 %) [00m47s] T: ~25.1%
     testAccuracyEvenOdd: acceptableFpp(3.000000 %) < actualFpp (3.675387 %) [00m25s] T: ~25.8%
300M testAccuracyRandomDistribution: acceptableFpp(3.000000 %) < actualFpp (3.210548 %) [00m57s] T: ~30.5%
     testAccuracyEvenOdd: acceptableFpp(3.000000 %) < actualFpp (3.209847 %) [00m30s] T: ~30.1%
350M testAccuracyRandomDistribution: acceptableFpp(3.000000 %) < actualFpp (5.377388 %) [01m07s] T: ~35.8%
     testAccuracyEvenOdd: acceptableFpp(3.000000 %) < actualFpp (5.377483 %) [00m36s] T: ~37.1%
400M testAccuracyRandomDistribution: acceptableFpp(3.000000 %) < actualFpp (8.170380 %) [01m17s] T: ~41.2%
     testAccuracyEvenOdd: acceptableFpp(3.000000 %) < actualFpp (8.170716 %) [00m40s] T: ~41.2%
500M testAccuracyRandomDistribution: acceptableFpp(3.000000 %) < actualFpp (15.392861 %) [01m36s] T: ~51.3%
     testAccuracyEvenOdd: acceptableFpp(3.000000 %) < actualFpp (15.391692 %) [00m50s] T: ~51.5%
1G   testAccuracyRandomDistribution: acceptableFpp(3.000000 %) < actualFpp (59.890330 %) [03m07s] T: 100.0%
     testAccuracyEvenOdd: acceptableFpp(3.000000 %) < actualFpp (59.888499 %) [01m37s] T: 100.0%
```

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #51845 from ishnagy/SPARK-53077_reenable_SparkBloomFilterSuite.

Authored-by: Ish Nagy <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This change fixes a performance degradation issue in the current BloomFilter implementation.
The current bit index calculation logic does not use any part of the indexable space above the first 31 bits, so when the inserted item count approaches (or exceeds) Integer.MAX_VALUE, it will produce significantly worse collision rates than an (ideally) uniformly distributing hash function would.
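To illustrate the truncation, here is a simplified sketch (not the exact Spark source; `h1` and `h2` stand for the two 32-bit halves of the Murmur hash, `i` for the hash-function index):

```java
// V1-style indexing: the combined hash is computed in 32-bit int arithmetic,
// so after the sign flip the index is always below 2^31, no matter how
// large the filter's bitSize is.
static long v1Index(int h1, int h2, int i, long bitSize) {
  int combinedHash = h1 + i * h2;   // wraps around in int range
  if (combinedHash < 0) {
    combinedHash = ~combinedHash;   // flips negatives into [0, 2^31)
  }
  return combinedHash % bitSize;    // never reaches bits above ~2.1G
}

// V2-style indexing: widen to long before combining, so the whole
// bit space stays addressable.
static long v2Index(int h1, int h2, int i, long bitSize) {
  long combinedHash = (long) h1 + (long) i * h2;
  return (combinedHash & Long.MAX_VALUE) % bitSize;
}
```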
Why are the changes needed?
This should qualify as a bug.
The upper bound on the bit capacity of the current BloomFilter implementation in Spark is approx. 137G bits (64-bit longs in an Integer.MAX_VALUE sized array, i.e. about 2.1G * 64 bits). The current indexing scheme can only address about 2G bits of these.
On the other hand, due to the way the BloomFilters are used, the bug won't cause any logical errors; it will just gradually render the BloomFilter instance useless by forcing more and more queries onto the slow path.
Does this PR introduce any user-facing change?
No
How was this patch tested?
New test.

One new Java test class was added to the `sketch` module to test different combinations of item counts and expected fpp rates.

`testAccuracyEvenOdd`: in N iterations, inserts N even numbers (2*i) into the BloomFilter and leaves out the N odd numbers (2*i+1). The test checks the 100% accuracy of `mightContain=true` on all of the even items, and measures the `mightContain=true` (false positive) rate on the not-inserted odd numbers.

`testAccuracyRandom`: in 2N iterations, inserts N pseudorandomly generated numbers into two differently seeded (theoretically independent) BloomFilter instances. All the random numbers generated in an even iteration are inserted into both filters; all the random numbers generated in an odd iteration are left out of both. The test checks the 100% accuracy of `mightContain=true` for all of the items inserted in an even iteration. It counts as a false positive every odd-iteration item for which the primary filter reports `mightContain=true` but the secondary reports `mightContain=false`. Since we inserted the same elements into both instances, and the secondary reports non-insertion, the `mightContain=true` from the primary can only be a false positive.

patched: one minor (test) issue was fixed in …, where potential repetitions in the randomly generated stream of insertable items resulted in slightly worse fpp measurements than the actual rate. The problem affected more those test cases where the cardinality of the tested type is low (so the chance of repetition is high), e.g. Byte and Short.
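A condensed sketch of the even/odd scheme described above (illustrative only; the real SparkBloomFilterSuite is parameterized over item counts and fpp rates and is much larger):

```java
import org.apache.spark.util.sketch.BloomFilter;

public class EvenOddSketch {
  public static void main(String[] args) {
    long numItems = 1_000_000L;
    double expectedFpp = 0.03;
    BloomFilter filter = BloomFilter.create(numItems, expectedFpp);

    // Insert only the even numbers 2*i; the odd numbers 2*i+1 stay out.
    for (long i = 0; i < numItems; i++) {
      filter.putLong(2 * i);
    }

    long falsePositives = 0;
    for (long i = 0; i < numItems; i++) {
      // No false negatives allowed for inserted items.
      if (!filter.mightContainLong(2 * i)) {
        throw new AssertionError("mightContainLong must return true for all inserted numbers");
      }
      // Every hit on a never-inserted odd number is a false positive.
      if (filter.mightContainLong(2 * i + 1)) {
        falsePositives++;
      }
    }
    double actualFpp = (double) falsePositives / numItems;
    System.out.printf("actualFpp = %.6f (expected <= %.6f)%n", actualFpp, expectedFpp);
  }
}
```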
Was this patch authored or co-authored using generative AI tooling?
No