[8.19] Optimize sparse vector stats collection (#128740) #128771

Merged
merged 1 commit into elastic:8.19 from backport/8.19/pr-128740
Jun 2, 2025

Conversation

jimczi
Contributor

@jimczi jimczi commented Jun 2, 2025

Backports the following commits to 8.19:

- Optimize sparse vector stats collection (#128740)

This change improves the performance of sparse vector statistics gathering by reading the document count from each field's terms directly, rather than looking the field up in the _field_names field to compute stats.
By avoiding per-term disk/network reads and instead leveraging statistics already loaded into leaf readers at index opening, we expect to significantly reduce overhead.
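
The difference between the two approaches can be illustrated with a small sketch. It is a toy model, not the code from this PR: `SegmentSketch`, `SparseVectorStatsSketch`, and every method on them are hypothetical names, and the "lookup" is only simulated with a counter. In Lucene, the per-field statistic described above most likely corresponds to `Terms#getDocCount()`, but the sketch avoids any Lucene dependency so that it stays self-contained.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one index segment (leaf). NOT the Lucene or Elasticsearch API.
final class SegmentSketch {
    // Per-field document counts, assumed to be loaded once when the segment is opened.
    private final Map<String, Integer> docCountByField;
    // Number of simulated on-demand reads (stand-in for disk/network access).
    private int lookups = 0;

    SegmentSketch(Map<String, Integer> docCountByField) {
        this.docCountByField = Map.copyOf(docCountByField);
    }

    // Old approach (simulated): one on-demand lookup per field name.
    int docCountViaLookup(String field) {
        lookups++;
        return docCountByField.getOrDefault(field, 0);
    }

    // New approach (simulated): return the statistic that is already in memory.
    int docCountFromLoadedStats(String field) {
        return docCountByField.getOrDefault(field, 0);
    }

    int lookups() {
        return lookups;
    }
}

final class SparseVectorStatsSketch {
    // Sums document counts using one simulated lookup per (segment, field) pair.
    static long countWithLookups(List<SegmentSketch> segments, List<String> fields) {
        long total = 0;
        for (SegmentSketch segment : segments) {
            for (String field : fields) {
                total += segment.docCountViaLookup(field);
            }
        }
        return total;
    }

    // Sums the same document counts from statistics already held by each segment.
    static long countFromLoadedStats(List<SegmentSketch> segments, List<String> fields) {
        long total = 0;
        for (SegmentSketch segment : segments) {
            for (String field : fields) {
                total += segment.docCountFromLoadedStats(field);
            }
        }
        return total;
    }
}
```

Both methods return the same total; the only difference in this model is how many simulated reads are needed to get it, which is the overhead the change aims to remove.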

Relates to elastic#128583
@jimczi jimczi added the labels :Data Management/Stats, >enhancement, auto-merge-without-approval, backport, Team:Data Management on Jun 2, 2025
@elasticsearchmachine elasticsearchmachine merged commit 839aa2b into elastic:8.19 Jun 2, 2025
15 checks passed
@jimczi jimczi deleted the backport/8.19/pr-128740 branch June 2, 2025 16:18
Labels
auto-merge-without-approval, backport, :Data Management/Stats, >enhancement, Team:Data Management, v8.19.0
2 participants