SPY-1394: CSD caching policy at block level #189
Merged
The problem stems from the fact that Spark cannot read a cached block from the disk store when its size exceeds 2 GB. When data is distributed highly skewed across partitions, we see the problems recorded in SPY-1394 with any RDD cached on disk.
The ultimate solution, adapting the partitioning to the skew, is a long shot, but imposing a CSD policy on cache block size is feasible.
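For context, the 2 GB ceiling comes from Java's `ByteBuffer`, whose capacity is an `Int`. Below is a minimal sketch of the failure mode; the method name and mapping approach are illustrative, not Spark's actual disk-store code:

```scala
import java.io.RandomAccessFile
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel

// Illustrative only: mapping a whole block file into one ByteBuffer, as a
// disk store might, cannot work past Integer.MAX_VALUE bytes, because a
// ByteBuffer's capacity is an Int -- hence the ~2 GB ceiling.
def readBlock(path: String): MappedByteBuffer = {
  val file = new RandomAccessFile(path, "r")
  try {
    val length = file.length()
    // FileChannel.map itself rejects sizes over Int.MaxValue.
    require(length <= Int.MaxValue,
      s"block of $length bytes exceeds the 2 GB ByteBuffer limit")
    file.getChannel.map(FileChannel.MapMode.READ_ONLY, 0, length)
  } finally {
    file.close()
  }
}
```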
This PR proposes the following policy:

- A negative value disables the block-size policy.
- The default is Integer.MAX_VALUE, but we could choose a smaller value.
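A minimal sketch of how such a guard might look; the names and the behavior on violation (refusing to cache the block) are assumptions for illustration, not the PR's actual code:

```scala
// Hypothetical block-size policy. maxBlockSize and the refuse-to-cache
// behavior are illustrative assumptions, not the PR's implementation.
case class BlockCachePolicy(maxBlockSize: Long = Int.MaxValue) {
  // A negative value disables the policy entirely.
  def allows(blockSize: Long): Boolean =
    maxBlockSize < 0 || blockSize <= maxBlockSize
}

def tryCache(blockId: String, blockSize: Long, policy: BlockCachePolicy): Boolean =
  if (!policy.allows(blockSize)) {
    // Skip caching oversized blocks; the partition is recomputed instead.
    println(s"Block $blockId ($blockSize bytes) exceeds policy limit; not cached")
    false
  } else {
    // ... hand off to the memory/disk store ...
    true
  }
```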
By applying the above policy, we ensure every cache block is less than 2 GB in size, whether in memory or on disk. This braces us against skewed data and other kinds of abnormal caching patterns.
@davidnavas @markhamstra