CPU Faiss + Intel SVS - Overview
Faiss supports Intel Scalable Vector Search (SVS) indexes. The integration lets you build, query, save, and load SVS indexes with the usual Faiss Index API. For end-to-end examples and code snippets, see Faiss + SVS usage guide.
- Faiss-native workflow: SVS indexes have the standard Index lifecycle (train/add/search, I/O, factory strings), facilitating the migration of existing code.
- Graph or flat search: Use Vamana for high-recall graph-based ANN search or SVS Flat for brute-force baselines, each exposed with the same Faiss abstractions.
- Intel compression stack: LVQ and LeanVec operate entirely on compressed vectors, trimming memory while preserving recall when running on Intel CPUs.
- SVS Vamana: Combines the Vamana graph-based search algorithm, introduced by Subramanya et al., with Intel's LVQ and LeanVec compression. It uses a single-layer proximity graph (vs. HNSW's multi-layer graph) for efficient search, and supports float32, float16, and 8-bit quantization with global scaling, plus Intel-specific compression for a low memory footprint and higher throughput.
- SVS Flat: A brute-force implementation that streams queries through the SVS runtime.
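To make the contrast between the two index types concrete, here is a minimal pure-Python sketch of best-first search over a single-layer proximity graph, next to a brute-force scan. It is not the SVS or Faiss implementation: the graph is hand-wired rather than built by Vamana, and the names `greedy_graph_search`, `flat_search`, `entry`, and `window` are hypothetical.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flat_search(vectors, query, k):
    """Brute force: score every vector and return the ids of the k nearest."""
    order = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return order[:k]


def greedy_graph_search(vectors, neighbors, query, k, entry=0, window=4):
    """Best-first search on a single-layer graph.

    neighbors[i] lists the out-edges of node i. `window` caps how many
    candidates are kept (hypothetical name for a search-window parameter).
    """
    dist = {entry: l2(vectors[entry], query)}  # distances computed so far
    pool = [entry]                             # closest candidates seen so far
    expanded = set()
    while True:
        pending = [n for n in pool if n not in expanded]
        if not pending:
            break  # every candidate in the pool has been expanded
        current = min(pending, key=dist.__getitem__)
        expanded.add(current)
        for nb in neighbors[current]:
            if nb not in dist:
                dist[nb] = l2(vectors[nb], query)
                pool.append(nb)
        # Keep only the `window` closest candidates, sorted by distance.
        pool = sorted(pool, key=dist.__getitem__)[:window]
    return pool[:k]


# Six points on a line; each node links to its neighbors, plus one long edge 0 -> 3.
vectors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
neighbors = [[1, 3], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
query = (4.2, 0.0)

exact = flat_search(vectors, query, k=2)
approx = greedy_graph_search(vectors, neighbors, query, k=2)
print(exact, approx)
```

The graph search only computes distances for nodes it reaches through edges, which is where the speedup over the flat scan comes from on large datasets; the flat scan remains the reference for measuring recall.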
SVS indexes in Faiss operate on plain float32 vectors out of the box, but they can also use Intel's proprietary compression methods, Locally-Adaptive Vector Quantization (LVQ) and LeanVec. Both compressors shrink the dataset footprint to reduce memory usage while preserving accuracy for large-scale vector search. LVQ and LeanVec are available on Intel CPUs only.
LVQ
- Method: Per-vector normalization + scalar quantization.
- Benefits:
  - Fast, on-the-fly decompression for distance computation.
  - SIMD-optimized layout (Turbo LVQ).
  - Robust to distribution shifts.
- Variants:
  - LVQ4x4: 8 bits/dim, fast search, large memory savings.
  - LVQ4x8: 12 bits/dim, higher recall when needed.
  - LVQ4x0: 4 bits/dim, single-level, maximum memory savings.
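The "per-vector normalization + scalar quantization" idea can be sketched in a few lines of plain Python. This is a simplified, single-level illustration and not Intel's implementation: it assumes vectors are centered on the dataset mean and then quantized uniformly between each vector's own minimum and maximum. The function names `lvq_encode` and `lvq_decode` are hypothetical.

```python
def column_means(rows):
    """Per-dimension average of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def lvq_encode(vector, mean, bits):
    """Center on the global mean, then quantize with per-vector bounds."""
    centered = [x - m for x, m in zip(vector, mean)]
    lo, hi = min(centered), max(centered)
    levels = (1 << bits) - 1                      # e.g. 15 for 4 bits
    step = (hi - lo) / levels if hi > lo else 1.0  # guard constant vectors
    codes = [min(levels, max(0, round((c - lo) / step))) for c in centered]
    # Each vector stores its codes plus two constants (lo, step).
    return codes, lo, step


def lvq_decode(codes, lo, step, mean):
    """On-the-fly reconstruction: one multiply-add per dimension."""
    return [lo + q * step + m for q, m in zip(codes, mean)]


data = [
    [0.1, 0.9, 0.4, 0.7],
    [1.2, 0.2, 0.8, 1.0],
    [0.5, 0.6, 0.3, 0.2],
]
mean = column_means(data)
encoded = [lvq_encode(v, mean, bits=4) for v in data]
decoded = [lvq_decode(codes, lo, step, mean) for codes, lo, step in encoded]

# Worst-case reconstruction error per vector is bounded by step / 2.
max_errors = [max(abs(x - y) for x, y in zip(v, r)) for v, r in zip(data, decoded)]
print(max_errors)
```

Because the bounds are chosen per vector, the quantization step adapts to each vector's own range instead of a single global range.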
LeanVec
- Method: Query-aware dimensionality reduction + LVQ.
- Benefits:
  - Ideal for high-dimensional vectors.
  - Significant memory and performance gains.
- Variants:
  - LeanVec4x8: Best for high-dimensional datasets, fastest search.
  - LeanVec4x4: Larger memory savings.
  - LeanVec8x8: Higher recall when needed.
- Optional: Further reduce dimensions using the leanvec_dim argument. Default is d//2.
Both LVQ and LeanVec support two-level schemes:
- Level 1: Compress vectors for fast candidate retrieval.
- Level 2: Encode residuals for accurate re-ranking.
- Full-precision vectors are not used at query time; search operates entirely on compressed representations.
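The two-level flow above can be sketched as follows. This is an illustrative toy, not the SVS code path: it assumes plain uniform scalar quantization at both levels (no mean centering, no dimensionality reduction), and the helper names `encode_two_level`, `decode`, and `search` are hypothetical. The database is stored only as codes; the float query is compared against reconstructions.

```python
def quantize(values, lo, hi, bits):
    """Uniform scalar quantization of values onto [lo, hi]."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    codes = [min(levels, max(0, round((v - lo) / step))) for v in values]
    return codes, step


def dequantize(codes, lo, step):
    return [lo + q * step for q in codes]


def encode_two_level(vector, bits1, bits2):
    lo, hi = min(vector), max(vector)
    if hi == lo:
        hi = lo + 1.0  # guard against constant vectors
    codes1, step1 = quantize(vector, lo, hi, bits1)
    approx1 = dequantize(codes1, lo, step1)
    # Level-1 residuals fall within [-step1/2, step1/2]; level 2 encodes them.
    residual = [v - a for v, a in zip(vector, approx1)]
    codes2, step2 = quantize(residual, -step1 / 2, step1 / 2, bits2)
    return {"lo": lo, "step1": step1, "codes1": codes1,
            "step2": step2, "codes2": codes2}


def decode(rec, levels=2):
    """Reconstruct from level 1 only, or from level 1 plus residuals."""
    approx = dequantize(rec["codes1"], rec["lo"], rec["step1"])
    if levels == 2:
        residual = dequantize(rec["codes2"], -rec["step1"] / 2, rec["step2"])
        approx = [a + r for a, r in zip(approx, residual)]
    return approx


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(db, query, k, n_candidates):
    # Level 1: coarse scan to shortlist candidates.
    coarse = sorted(range(len(db)),
                    key=lambda i: sqdist(decode(db[i], 1), query))[:n_candidates]
    # Level 2: re-rank the shortlist with the residual-refined reconstruction.
    return sorted(coarse, key=lambda i: sqdist(decode(db[i], 2), query))[:k]


vectors = [
    [0.0, 1.0, 0.5, 0.2],
    [0.9, 0.1, 0.4, 0.8],
    [0.05, 0.95, 0.55, 0.25],
    [0.5, 0.5, 0.0, 1.0],
]
db = [encode_two_level(v, bits1=4, bits2=8) for v in vectors]  # "4x8"-style
result = search(db, query=[0.0, 1.0, 0.5, 0.2], k=2, n_candidates=2)
print(result)
```

The second level tightens the worst-case per-dimension error bound from `step1 / 2` to `step2 / 2`, which is why re-ranking can recover ordering mistakes made during the coarse scan.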
Naming Convention: <B₁>x<B₂>
- B₁: Bits per dimension (first level).
- B₂: Bits per dimension (second level).
- Example: LVQ4x8 = 4 bits/dim (level 1) + 8 bits/dim (level 2); LeanVec4x8_256 = 256 dims with 4 bits/dim (level 1) + original dimensionality with 8 bits/dim (level 2).
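As a worked example of the naming convention, the sketch below parses a preset name and estimates the code payload per vector. It assumes the payload is simply bits/dim times the number of dimensions at each level and ignores any per-vector constants or padding the real layouts may add; `parse_preset` and `payload_bits_per_vector` are hypothetical helpers, not part of Faiss or SVS.

```python
import re

# <family><B1>x<B2>, optionally followed by _<reduced dims> for LeanVec.
_PRESET = re.compile(r"^(LVQ|LeanVec)(\d+)x(\d+)(?:_(\d+))?$")


def parse_preset(name):
    """Split a preset name into (family, B1, B2, reduced_dims or None)."""
    m = _PRESET.match(name)
    if m is None:
        raise ValueError(f"unrecognized preset name: {name!r}")
    family, b1, b2, dims = m.groups()
    return family, int(b1), int(b2), int(dims) if dims else None


def payload_bits_per_vector(name, d):
    """Estimated code bits per vector for original dimensionality d."""
    family, b1, b2, reduced = parse_preset(name)
    if family == "LVQ":
        # Both levels are stored at the original dimensionality.
        return d * (b1 + b2)
    # LeanVec: level 1 on the reduced dimensions (default d // 2),
    # level 2 on the original dimensionality.
    if reduced is None:
        reduced = d // 2
    return reduced * b1 + d * b2


d = 768
for preset in ["LVQ4x0", "LVQ4x4", "LVQ4x8", "LeanVec4x8_256", "LeanVec4x4"]:
    bits = payload_bits_per_vector(preset, d)
    print(f"{preset}: {bits // 8} bytes/vector vs {d * 4} bytes for float32")
```

For d = 768 this gives 1152 bytes per vector for LVQ4x8 and 896 bytes for LeanVec4x8_256, against 3072 bytes for float32.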
- LeanVec requires training on a representative data sample; large distribution drifts may degrade recall.
- LVQ relies on the global vector average, so distribution shifts may also affect it, although it is highly robust. LVQ determines the global vector average automatically from the first batch of added vectors; it does not support explicit training.
- Two-level presets keep all computations in the compressed domain. Float32 vectors are not required at query time.
- LeanVec's out-of-distribution mode will be available soon!