Skip to content

Conversation

@kletkavrubashku
Copy link
Contributor

@kletkavrubashku kletkavrubashku commented Dec 23, 2025

Summary:

History

Some time ago similar PR was landed (#6699, D49501856) and caused SEV S369242 in Meta. That time ZSTD_DCtx context was created and reused on decompressor level. The optimization was reverted due to OOMs.

Getting back to it again

The optimization still makes sense. For example:

The fix

Instead of creating ZSTD_DCtx per decompressor, we should create it per thread. Then we will be able to reuse the allocation and don't consume so much memory in FlatMaps.

Test plan

I did multiple experiments. The biggest one runs 100k shadow queries on the same cluster with 2 packages: https://fburl.com/scuba/presto_queries/ddi8pxmd
Same upstream revision for prod and test builds. Only this diff on top.

Here is what I got:
https://pxl.cl/8Dsk1
Execution time is 9% better. The data is not probably really accurate, but at least it's better. I appreciate suggestions here.
The memory usage didn't grow: https://pxl.cl/8Dsjn

Differential Revision: D89716393

Summary:
## History
Some time ago similar PR was landed (facebookincubator#6699, D49501856) and caused SEV S369242 in Meta. That time `ZSTD_DCtx` context was created and reused on decompressor level. The optimization was reverted due to OOMs.

## Getting back to it again
The optimization still makes sense. For example:
- In Presto Adhoc we spend 0.07% of CPU cycles in `ZSTD_createDCtx_internal`: https://fburl.com/strobelight/2lt1wn4v
- In Presto batch we spend  0.1% of CPU cycles in `ZSTD_createDCtx_internal`: https://fburl.com/strobelight/js4kn8za

## The fix
Instead of creating `ZSTD_DCtx` per decompressor, we should create it per thread. Then we will be able to reuse the allocation and don't consume so much memory in FlatMaps.

Differential Revision: D89716393
@netlify
Copy link

netlify bot commented Dec 23, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 4381030
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/694b2523872ece00087e77a5

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 23, 2025
@meta-codesync
Copy link

meta-codesync bot commented Dec 23, 2025

@kletkavrubashku has exported this pull request. If you are a Meta employee, you can view the originating Diff in D89716393.

@kletkavrubashku kletkavrubashku changed the title fix: [velox] Reuse context in ZSTD_decompress perf(dwio): Reuse context in ZSTD_decompress Dec 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants