[SPARK-7251][Spark Core] Perform sequential scan when iterating over entries in BytesToBytesMap #5836
Conversation
Can one of the admins verify this patch?
```diff
@@ -89,7 +89,9 @@ public static void putDouble(Object object, long offset, double value) {
   }
 
   public static long allocateMemory(long size) {
-    return _UNSAFE.allocateMemory(size);
+    long address = _UNSAFE.allocateMemory(size);
+    _UNSAFE.setMemory(address, size, (byte) 0);
```
I don't think that we should zero out our allocated memory by default:

- In many cases, it is unnecessary and carries a performance penalty.
- `setMemory` isn't available in all versions of Java 6; this is one of the reasons why I chose not to expose it in the `UNSAFE` facade.

To address your comment from JIRA: we should handle this corner case. One approach might be to store a negative value for the record length. We should add a test case to BytesToBytesMapSuite that tries storing empty keys and values.
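One way to picture that suggestion (hypothetical constant and helper names; the encoding that actually shipped landed later in #6159) is to reserve a negative stored length as a sentinel, leaving zero free to mean a legitimate empty key or value:

```java
// Hypothetical sketch, not the actual BytesToBytesMap code: a negative stored
// length marks a sentinel, so 0 can still denote a valid empty key or value.
final class RecordLengthSketch {
  static final int END_OF_PAGE_LENGTH = -1;

  static boolean isEndOfPage(int storedKeyLength) {
    return storedKeyLength == END_OF_PAGE_LENGTH;
  }

  static boolean isEmptyKey(int storedKeyLength) {
    // Zero-length keys are legal, which is exactly why the sentinel must be negative.
    return storedKeyLength == 0;
  }
}
```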
@JoshRosen Please test this PR.
Jenkins, this is ok to test.
Merged build triggered.
Merged build started.
Test build #31726 has started for PR 5836 at commit
Test build #31726 has finished for PR 5836 at commit
Merged build finished. Test FAILed.
Test FAILed.
@JoshRosen The build failed with `oro#oro;2.0.8!oro.jar origin location must be absolute`. Please retest.
We're still investigating this Jenkins flakiness issue (it might have something to do with certain build machines' ivy caches). In the meantime, let's retest this. I'll loop back later to do a review pass on this.
Jenkins, retest this please.
Merged build triggered.
Merged build started.
Test build #31795 has started for PR 5836 at commit
Test build #31795 has finished for PR 5836 at commit
Merged build finished. Test FAILed.
Test FAILed.
@JoshRosen Please retest this.
Merged build triggered.
Merged build started.
Test build #31815 has started for PR 5836 at commit
Merged build finished. Test FAILed.
Test FAILed.
@JoshRosen Please retest this. It failed because of a SparkSubmitSuite timeout.
Jenkins, retest this please.
(Sorry, we've been battling some test flakiness this week; I think that this test is either fixed or ignored now.)
Merged build triggered.
Merged build started.
Test build #31885 has started for PR 5836 at commit
Test build #31885 has finished for PR 5836 at commit
Merged build finished. Test PASSed.
Test PASSed.
@JoshRosen Please merge this to master.
```java
nextPos = bitset.nextSetBit(nextPos + 1);
return loc.with(pos, 0, true);
if (currentPage == null) {
  currentPage = dataPages.get(pageCur++);
```
We should never use inline increment operations like this. When I see this, I have to take time to remember whether `get()` is called with the value before the increment or after. I don't think that it makes sense to trade off clarity for brevity like this; please move the increment to a separate line.
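For illustration, the rewrite being asked for might look like this (a sketch reusing the names from the excerpt above, not the merged code):

```java
// Move the increment to its own statement so there is no doubt that
// get() receives the index value from before the increment.
if (currentPage == null) {
  currentPage = dataPages.get(pageCur);
  pageCur++;
}
```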
I decided to try to benchmark this using the following harness: https://gist.github.com/JoshRosen/286b26494ab27e657051

When I ran this benchmark, I immediately ran into a bug:

```
[error] Exception in thread "main" java.lang.NullPointerException
[error] at org.apache.spark.unsafe.memory.TaskMemoryManager.getPage(TaskMemoryManager.java:194)
[error] at org.apache.spark.unsafe.map.BytesToBytesMap$1.next(BytesToBytesMap.java:195)
[error] at org.apache.spark.unsafe.map.BytesToBytesMap$1.next(BytesToBytesMap.java:174)
[error] at org.apache.spark.sql.BytesToBytesMapIterationBenchmark$.runBenchmark(BytesToBytesMapIterationBenchmark.scala:36)
[error] at org.apache.spark.sql.BytesToBytesMapIterationBenchmark$$anonfun$main$2$$anonfun$apply$mcVI$sp$1$$anonfun$apply$1.apply$mcVI$sp(BytesToBytesMapIterationBenchmark.scala:64)
[error] at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
[error] at org.apache.spark.sql.BytesToBytesMapIterationBenchmark$$anonfun$main$2$$anonfun$apply$mcVI$sp$1.apply(BytesToBytesMapIterationBenchmark.scala:63)
[error] at org.apache.spark.sql.BytesToBytesMapIterationBenchmark$$anonfun$main$2$$anonfun$apply$mcVI$sp$1.apply(BytesToBytesMapIterationBenchmark.scala:58)
[error] at scala.collection.immutable.List.foreach(List.scala:318)
[error] at org.apache.spark.sql.BytesToBytesMapIterationBenchmark$$anonfun$main$2.apply$mcVI$sp(BytesToBytesMapIterationBenchmark.scala:58)
[error] at org.apache.spark.sql.BytesToBytesMapIterationBenchmark$$anonfun$main$2.apply(BytesToBytesMapIterationBenchmark.scala:57)
[error] at org.apache.spark.sql.BytesToBytesMapIterationBenchmark$$anonfun$main$2.apply(BytesToBytesMapIterationBenchmark.scala:57)
[error] at scala.collection.Iterator$class.foreach(Iterator.scala:727)
[error] at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
[error] at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
[error] at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
[error] at org.apache.spark.sql.BytesToBytesMapIterationBenchmark$.main(BytesToBytesMapIterationBenchmark.scala:57)
[error] at org.apache.spark.sql.BytesToBytesMapIterationBenchmark.main(BytesToBytesMapIterationBenchmark.scala)
```

I noticed many issues in this patch, which implies that we need better unit tests for this code (the external stress-test harnesses manage to exercise it pretty well, but we should integrate those into our CI pipeline).

We also need a better description of the changes; this PR shouldn't have an empty description. The code also needs comments to make it clear that we first iterate through the data pages, then through the records within a page, rolling over to the next page or stopping when we encounter the special negative length. We also need comments on some of the various …
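For context, a stand-in for that kind of iteration harness (hypothetical; the real one is in the gist linked above) simply times a full pass over the map's entry iterator:

```java
import java.util.Iterator;

final class IterationTimerSketch {
  /** Returns the average nanoseconds spent per element in one full scan. */
  static <T> double nanosPerElement(Iterator<T> entries) {
    long count = 0;
    long start = System.nanoTime();
    while (entries.hasNext()) {
      entries.next(); // touch every entry, as a scan-and-copy workload would
      count++;
    }
    long elapsed = System.nanoTime() - start;
    return count == 0 ? 0.0 : (double) elapsed / count;
  }
}
```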
It's fine if you want to continue to work on this patch, but I'd also be happy to just take this over myself, since I don't think it will take me more than 30 minutes tops to implement this with good comments + tests.
I've opened #6159, which cleans up this patch and brings it up to 100% line coverage.
This patch modifies `BytesToBytesMap.iterator()` to iterate through records in the order that they appear in the data pages rather than iterating through the hashtable pointer arrays. This results in fewer random memory accesses, significantly improving performance for scan-and-copy operations. This is possible because our data pages are laid out as sequences of `[keyLength][data][valueLength][data]` entries. In order to mark the end of a partially-filled data page, we write `-1` as a special end-of-page length (BytesToBytesMap supports empty/zero-length keys and values, which is why we had to use a negative length). This patch incorporates / closes #5836.

Author: Josh Rosen <[email protected]>

Closes #6159 from JoshRosen/SPARK-7251 and squashes the following commits:

05bd90a [Josh Rosen] Compare capacity, not size, to MAX_CAPACITY
2a20d71 [Josh Rosen] Fix maximum BytesToBytesMap capacity
bc4854b [Josh Rosen] Guard against overflow when growing BytesToBytesMap
f5feadf [Josh Rosen] Add test for iterating over an empty map
273b842 [Josh Rosen] [SPARK-7251] Perform sequential scan when iterating over entries in BytesToBytesMap

(cherry picked from commit f2faa7a)
Signed-off-by: Josh Rosen <[email protected]>
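To make that layout concrete, here is a minimal sketch of the sequential page scan the commit message describes, using `ByteBuffer` pages as a stand-in for the real off-heap memory pages (all names here are hypothetical):

```java
import java.nio.ByteBuffer;
import java.util.List;

final class PageScanSketch {
  // Special length marking the end of a partially-filled page; it must be
  // negative because zero-length keys and values are legal.
  static final int END_OF_PAGE_LENGTH = -1;

  /** Counts records by scanning pages laid out as [keyLength][data][valueLength][data]. */
  static int countRecords(List<ByteBuffer> dataPages) {
    int numRecords = 0;
    for (ByteBuffer page : dataPages) {        // iterate through data pages in order...
      page.rewind();
      while (page.remaining() >= 4) {          // 4 = size of an int length prefix
        int keyLength = page.getInt();         // ...then through records within a page
        if (keyLength == END_OF_PAGE_LENGTH) {
          break;                               // end-of-page marker: roll over to next page
        }
        page.position(page.position() + keyLength);    // skip the key bytes
        int valueLength = page.getInt();
        page.position(page.position() + valueLength);  // skip the value bytes
        numRecords++;
      }
    }
    return numRecords;
  }
}
```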