MB-69881: Improved APIs and perf optimizations for vector search#2270
CascadingRadium merged 12 commits into master from
Conversation
Pull request overview
This PR re-architects vector search to improve memory efficiency and reduce garbage collection pressure. The changes replace slice-based eligible document tracking with bitsets, achieving up to 64× memory reduction per segment, and optimize the iterator pattern to eliminate per-call allocations in the unadorned postings iterator.
Key changes:
- Replaced slice-based eligible document tracking (`[]uint64`) with bitsets, reducing memory from 8N bytes to N/8 bytes per segment
- Introduced an iterator-based API for eligible documents that translates directly to bitset iteration at the storage layer
- Fixed garbage creation in `UnadornedPostingsIterator` by reusing a single struct instance instead of allocating one per `Next()` call
- Optimized bytes-read tracking to skip the computation in no-scoring mode
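The first bullet's 64× figure follows from storing one bit per document instead of one `uint64` per document. A minimal sketch of that arithmetic and of a hand-rolled bitset (illustrative only, not bleve's actual type):

```go
package main

import "fmt"

// bitset is a minimal illustrative type: one bit per document in the
// segment, so N documents need N/8 bytes instead of the 8N bytes a
// []uint64 of doc numbers would need in the worst case.
type bitset []uint64

func newBitset(n uint64) bitset { return make(bitset, (n+63)/64) }

func (b bitset) set(i uint64)       { b[i/64] |= 1 << (i % 64) }
func (b bitset) test(i uint64) bool { return b[i/64]&(1<<(i%64)) != 0 }

func main() {
	const numDocs = 1_000_000
	sliceBytes := 8 * numDocs        // one uint64 per eligible doc, worst case
	bitsetBytes := (numDocs + 7) / 8 // one bit per doc in the segment
	fmt.Println(sliceBytes / bitsetBytes) // prints 64

	b := newBitset(numDocs)
	b.set(42)
	fmt.Println(b.test(42), b.test(43)) // prints true false
}
```

The bitset also improves cache locality: membership tests touch densely packed words rather than chasing entries in a large slice.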
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| index/scorch/snapshot_vector_index.go | Introduces bitset-based eligible document storage and iterator API, replacing the previous slice-based approach |
| index/scorch/unadorned.go | Changes UnadornedPosting from uint64 to struct with pointer receivers and adds reusable struct fields to iterators to eliminate per-call allocations |
| index/scorch/snapshot_index_tfr.go | Adds conditional bytes read tracking via updateBytesRead flag to skip computation in no-scoring mode |
| index/scorch/snapshot_index.go | Initializes updateBytesRead flag based on scoring requirements |
| index/scorch/optimize_knn.go | Removes requiresFiltering flag and updates to use new SegmentEligibleDocuments API |
| index/scorch/optimize.go | Sets updateBytesRead to false for unadorned term field readers |
| index/scorch/snapshot_index_vr.go | Updates InterpretVectorIndex call to remove filtering parameter |
| index_test.go | Updates expected bytes read values to reflect the optimization that skips unnecessary computation |
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
- Use a `bitset` to track eligible documents instead of a slice of N `uint64`s, reducing memory usage from `8N bytes` to `N/8 bytes` per segment (up to `64×` reduction) and improving cache locality.
- Pass an iterator over eligible documents that iterates the bitset directly, allowing direct translation into a bitset of eligible vector IDs in the storage layer and eliminating the need for a separate slice intermediary.
- Fix garbage creation in the `UnadornedPostingsIterator`, which previously allocated a temporary struct per `Next()` call to wrap a doc number and satisfy the `Postings` interface; the iterator now returns a single reusable struct (one-time allocation), consistent with the workings of the `PostingsIterator` in the storage layer.
- Avoid unnecessary `BytesRead` statistics computation when executing searches in no-scoring mode, removing redundant work as a micro-optimization.

---------

Co-authored-by: Abhinav Dangeti <abhinav@couchbase.com>
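Iterating a bitset directly, as described above, typically means walking the set bits of each 64-bit word. A minimal sketch of that idea (the `eligibleDocs` type and `iterate` callback are hypothetical, not bleve's actual API):

```go
package main

import (
	"fmt"
	"math/bits"
)

// eligibleDocs is a hypothetical bitset of eligible doc numbers. iterate
// visits set bits directly, so a storage layer could build its own bitset
// of eligible vector IDs without materializing an intermediate []uint64.
type eligibleDocs []uint64

func (e eligibleDocs) iterate(visit func(docNum uint64)) {
	for w, word := range e {
		for word != 0 {
			bit := uint64(bits.TrailingZeros64(word))
			visit(uint64(w)*64 + bit)
			word &= word - 1 // clear the lowest set bit
		}
	}
}

func main() {
	e := make(eligibleDocs, 2)
	for _, d := range []uint64{1, 63, 64, 100} {
		e[d/64] |= 1 << (d % 64)
	}
	e.iterate(func(d uint64) { fmt.Println(d) }) // prints 1, 63, 64, 100
}
```

Skipping zero words and clearing the lowest set bit with `word &= word - 1` makes iteration cost proportional to the number of eligible documents rather than the segment size.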