Commit 476fc2a

LCORE-1446: Add dynamic filter support to Solr vector IO provider
Adds support for dynamic metadata filtering to the Solr vector IO provider, enabling users to filter RAG query results by metadata fields at runtime.

NOTE: Dynamic filters from requests are **combined with** the existing static `chunk_filter_query` configuration (e.g., `"is_chunk:true"`) rather than replacing it. This preserves backward compatibility and maintains the internal schema filtering needed for chunk window expansion to work correctly. Both filters are joined with AND logic:

fq=(is_chunk:true AND )

This approach avoids breaking changes while enabling flexible metadata filtering for use cases like filtering by platform, version, document type, etc.

- Upstream PR: ogx-ai/ogx#4471
1 parent 8cd1b3d commit 476fc2a

4 files changed

Lines changed: 590 additions & 16 deletions

File tree

README.md

Lines changed: 1 addition & 1 deletion
@@ -59,7 +59,7 @@ Manual procedure, assuming an existing PyPI API token available:
 ### Prerequisites
 
 - Python >= 3.12
-- Llama Stack == 0.4.3
+- Llama Stack == 0.6.0
 - pydantic >= 2.10.6
 
 ### Installation

lightspeed_stack_providers/providers/remote/solr_vector_io/solr_vector_io/README.md

Lines changed: 134 additions & 0 deletions
@@ -8,6 +8,7 @@ A read-only vector_io provider implementation for llama-stack that integrates wi
 - **Vector similarity search** using Solr's KNN query parser
 - **Keyword search** using Solr's text search
 - **Hybrid search** combining vector and keyword search with Solr's native reranking
+- **Dynamic metadata filtering** using OpenAI-compatible filter API (llama-stack 0.6.0+)
 - **Chunk window expansion** for retrieving extended context around matched chunks
 - **Schema-agnostic** field mapping for flexible Solr schema support
 - **OpenAI-compatible API** for vector store operations (read-only methods)
@@ -193,6 +194,139 @@ params = {
}
```


## Dynamic Metadata Filtering

The provider supports dynamic filtering of search results by metadata fields using OpenAI-compatible filter syntax (introduced in llama-stack 0.6.0). Filters are translated to Solr's filter query (`fq`) syntax and combined with the static `chunk_filter_query` from configuration.

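The combination step described above can be sketched in a few lines. This is a minimal illustration, not the provider's actual code: `combine_filter_queries` is a hypothetical helper, and it assumes both inputs are already valid Solr filter strings.

```python
from typing import Optional


def combine_filter_queries(
    static_fq: Optional[str], dynamic_fq: Optional[str]
) -> Optional[str]:
    """Join the static and dynamic filter queries with AND.

    Hypothetical helper for illustration; missing or empty parts are skipped.
    """
    parts = [part for part in (static_fq, dynamic_fq) if part]
    if not parts:
        return None
    if len(parts) == 1:
        return parts[0]
    return "(" + " AND ".join(parts) + ")"
```

With a static filter of `is_chunk:true` and a translated request filter of `platform:"ansible"`, this yields the `(is_chunk:true AND platform:"ansible")` form used in the examples in this section.
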
### Filter Types

**Comparison Filters:**
- `eq` - Equal to
- `ne` - Not equal to
- `gt` - Greater than
- `gte` - Greater than or equal to
- `lt` - Less than
- `lte` - Less than or equal to
- `in` - Value in list
- `nin` - Value not in list

**Compound Filters:**
- `and` - All filters must match
- `or` - Any filter must match

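One plausible mapping from the comparison operators listed above to Solr clauses is sketched below. Only the `eq`, `gte`, and `in` output forms appear in this README. The forms for `ne`, `gt`, `lt`, `lte`, and `nin` are assumptions based on standard Solr range and negation syntax, and `format_value` and `comparison_to_solr` are hypothetical names rather than the provider's API.

```python
from typing import Any


def format_value(value: Any) -> str:
    """Render a filter value as a Solr literal (assumed quoting rules)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Inside a quoted phrase only backslash and double quote need escaping.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def comparison_to_solr(op: str, key: str, value: Any) -> str:
    """Translate one comparison filter into a Solr clause (illustrative)."""
    if op in ("in", "nin"):
        if not value:
            raise ValueError(f"'{op}' requires a non-empty list of values")
        clause = f"{key}:(" + " OR ".join(format_value(v) for v in value) + ")"
        # Assumption: anchor negations on *:* so they also work when nested.
        return clause if op == "in" else f"(*:* -{clause})"
    literal = format_value(value)
    if op == "eq":
        return f"{key}:{literal}"
    if op == "ne":
        return f"(*:* -{key}:{literal})"
    if op == "gt":
        return f"{key}:{{{literal} TO *]"  # curly brace = exclusive bound
    if op == "gte":
        return f"{key}:[{literal} TO *]"
    if op == "lt":
        return f"{key}:[* TO {literal}}}"
    if op == "lte":
        return f"{key}:[* TO {literal}]"
    raise ValueError(f"unsupported filter type: {op}")
```

For example, `comparison_to_solr("gte", "version", 4.0)` produces `version:[4.0 TO *]`, matching the range example in this section.
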
### Usage Examples

**Simple equality filter:**

```python
from llama_stack_api import ComparisonFilter

response = await adapter.query_chunks(
    vector_store_id="my-store",
    query="How to install ansible?",
    params={
        "k": 5,
        "filters": ComparisonFilter(
            type="eq",
            key="platform",
            value="ansible"
        )
    }
)
# Solr query: fq=(is_chunk:true AND platform:"ansible")
```

**Range filter:**

```python
response = await adapter.query_chunks(
    vector_store_id="my-store",
    query="Latest features",
    params={
        "k": 5,
        "filters": ComparisonFilter(
            type="gte",
            key="version",
            value=4.0
        )
    }
)
# Solr query: fq=(is_chunk:true AND version:[4.0 TO *])
```

**Multiple values with 'in' filter:**

```python
response = await adapter.query_chunks(
    vector_store_id="my-store",
    query="Security best practices",
    params={
        "k": 5,
        "filters": ComparisonFilter(
            type="in",
            key="platform",
            value=["openshift", "kubernetes", "ansible"]
        )
    }
)
# Solr query: fq=(is_chunk:true AND platform:("openshift" OR "kubernetes" OR "ansible"))
```

**Compound filters (AND/OR):**

```python
from llama_stack_api import CompoundFilter, ComparisonFilter

response = await adapter.query_chunks(
    vector_store_id="my-store",
    query="Advanced configuration",
    params={
        "k": 5,
        "filters": CompoundFilter(
            type="and",
            filters=[
                ComparisonFilter(type="eq", key="platform", value="openshift"),
                ComparisonFilter(type="gte", key="version", value=4.12)
            ]
        )
    }
)
# Solr query: fq=(is_chunk:true AND (platform:"openshift" AND version:[4.12 TO *]))
```

**Nested compound filters:**

```python
response = await adapter.query_chunks(
    vector_store_id="my-store",
    query="Troubleshooting guide",
    params={
        "k": 5,
        "filters": CompoundFilter(
            type="and",
            filters=[
                ComparisonFilter(type="eq", key="doc_type", value="guide"),
                CompoundFilter(
                    type="or",
                    filters=[
                        ComparisonFilter(type="eq", key="platform", value="openshift"),
                        ComparisonFilter(type="eq", key="platform", value="ansible")
                    ]
                )
            ]
        )
    }
)
# Solr query: fq=(is_chunk:true AND (doc_type:"guide" AND (platform:"openshift" OR platform:"ansible")))
```
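
The nesting in the last example suggests a recursive translation: compound nodes join their children with `AND` or `OR` inside parentheses, and leaf nodes become single clauses. The sketch below assumes filters are represented as plain dictionaries and handles only `eq` leaves to stay short. `filter_to_solr` is a hypothetical name, not the provider's implementation, and values are quoted without escaping.

```python
from typing import Any, Dict


def filter_to_solr(node: Dict[str, Any]) -> str:
    """Recursively translate a filter tree into a Solr clause (illustrative)."""
    kind = node["type"]
    if kind in ("and", "or"):
        children = node["filters"]
        if not children:
            raise ValueError("compound filter needs at least one child")
        joiner = f" {kind.upper()} "
        # Parenthesize every compound so nesting keeps its precedence.
        return "(" + joiner.join(filter_to_solr(child) for child in children) + ")"
    if kind == "eq":
        # Assumption: values contain no characters that need escaping.
        return f'{node["key"]}:"{node["value"]}"'
    raise ValueError(f"unsupported filter type: {kind}")


nested = {
    "type": "and",
    "filters": [
        {"type": "eq", "key": "doc_type", "value": "guide"},
        {
            "type": "or",
            "filters": [
                {"type": "eq", "key": "platform", "value": "openshift"},
                {"type": "eq", "key": "platform", "value": "ansible"},
            ],
        },
    ],
}
solr_clause = filter_to_solr(nested)
```

Here `solr_clause` is `(doc_type:"guide" AND (platform:"openshift" OR platform:"ansible"))`, the dynamic part of the filter query shown in the nested example.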

### Filter Behavior

- **Static filters preserved:** The configured `chunk_filter_query` (e.g., `"is_chunk:true"`) is always applied to maintain proper chunk/parent document separation
- **Dynamic filters added:** Request filters are combined with static filters using AND logic
- **String escaping:** Special Solr characters in filter values are automatically escaped
- **Works with all search types:** Filters apply to vector, keyword, and hybrid search

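The README does not show how the escaping is done. As a rough sketch of one common approach, the helper below backslash-escapes the characters the Solr standard query parser treats as special in an unquoted term. `escape_solr_value` is a hypothetical name, and the provider may escape differently (for example, only quotes and backslashes inside quoted phrases).

```python
import re

# Characters and two-character operators that are special to the Solr
# standard query parser: + - && || ! ( ) { } [ ] ^ " ~ * ? : \ /
_SOLR_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_solr_value(value: str) -> str:
    """Prefix each special character or operator with a backslash."""
    return _SOLR_SPECIAL.sub(r"\\\1", value)
```

For instance, the value `a:b` becomes `a\:b`, so the colon is not read as a field separator.
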
## Limitations

This is a **read-only** provider. The following operations are not supported:
