@hwchase17
Contributor

No description provided.

simonfromla and others added 2 commits June 17, 2023 09:45
Allow FAISS's similarity_search_with_score_by_vector to accept kwargs,
specifically: `score_threshold`.
Fixes error for document compression users in cases where there are no
relevant docs - for instance:

```
human_message: "hi" (<--  will return no relevant docs depending on score_threshold)
> IndexError: index 0 is out of bounds for axis 0 with size 0
```
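A minimal sketch of the failure mode, not the actual faiss.py code: the helper names `filter_by_threshold` and `best_match` are hypothetical, and the `>=` comparison is an assumption. It shows how a threshold filter can legitimately return nothing, and why downstream code should not index into the result unguarded.

```python
from typing import List, Optional, Tuple

ScoredDoc = Tuple[str, float]  # (document text, score); a stand-in for (Document, float)


def filter_by_threshold(
    scored: List[ScoredDoc], score_threshold: Optional[float] = None
) -> List[ScoredDoc]:
    """Keep only hits whose score meets the threshold (assumed: higher is better)."""
    if score_threshold is None:
        return list(scored)
    return [(doc, score) for doc, score in scored if score >= score_threshold]


def best_match(scored: List[ScoredDoc]) -> Optional[ScoredDoc]:
    """Return the first hit, or None when the filter removed everything."""
    return scored[0] if scored else None


# A query like "hi" may score poorly against every document.
hits = [("doc-1", 0.42), ("doc-2", 0.17)]
filtered = filter_by_threshold(hits, score_threshold=0.8)

# `filtered[0]` would raise IndexError here; the guarded helper returns None instead.
top = best_match(filtered)
```

Callers that consume the filtered list (a compressor pipeline, for example) need the same kind of guard for the empty case.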

Combined with the new merger_retriever, FAISS users can now perform more
granular vectorstore retrieval based on score_threshold:

```
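from langchain.chains import ConversationalRetrievalChain
from langchain.llms import OpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.merger_retriever import MergerRetriever

# `st` and `prep_compressor()` are not shown in this PR; each `st[i][1]` is assumed to be a FAISS vector store.
# Each store gets its own score_threshold.
store_1 = st[0][1].as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.8, "k": 3})
store_2 = st[1][1].as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.3, "k": 3})
store_3 = st[2][1].as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.5, "k": 3})

# Merge the per-store results, then compress them before they reach the chain.
merged = MergerRetriever(retrievers=[store_1, store_2, store_3])
pipeline_compressor = prep_compressor()
compression_retriever = ContextualCompressionRetriever(base_compressor=pipeline_compressor, base_retriever=merged)

chain = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), chain_type="stuff", retriever=compression_retriever)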
```

@dev2049 or @hwchase17 would appreciate a check if available. Thanks.

---------

Co-authored-by: Sims Juju <[email protected]>
@hwchase17 hwchase17 added the lgtm label Jun 17, 2023
@hwchase17 hwchase17 merged commit 61e4a1a into master Jun 17, 2023
@hwchase17 hwchase17 deleted the harrison/faiss-score branch June 17, 2023 18:00
This was referenced Jun 25, 2023
@tak-s

tak-s commented Jul 5, 2023

This score implementation affects similarity_search_with_relevance_scores (base.py).
In _similarity_search_with_relevance_scores (base.py), the score is converted by relevance_score_fn and then checked against the threshold. faiss.py and base.py use the same 'score_threshold' value.

In similarity_search_with_score_by_vector (faiss.py), the original score is checked against 'score_threshold'.
But in similarity_search_with_relevance_scores (base.py), the score converted by relevance_score_fn is checked against the same 'score_threshold'.

So, when similarity_search_with_relevance_scores is called, the scores are filtered twice, on different criteria: once before the conversion and once after.
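To illustrate the double filtering described above, here is a rough standalone sketch rather than the library's code. All function names are hypothetical, the `>=` comparison is an assumption, and `to_relevance` is an illustrative distance-to-relevance conversion standing in for relevance_score_fn.

```python
import math
from typing import List, Tuple

Hit = Tuple[str, float]  # (document id, score)


def keep_at_or_above(hits: List[Hit], score_threshold: float) -> List[Hit]:
    """Threshold check, applied to whatever kind of score it is given."""
    return [(doc, score) for doc, score in hits if score >= score_threshold]


def to_relevance(distance: float) -> float:
    """Illustrative conversion: a small distance becomes a high relevance."""
    return 1.0 - distance / math.sqrt(2)


def double_filtered(hits: List[Hit], score_threshold: float) -> List[Hit]:
    # Pass 1 (vector-store level): threshold applied to raw distances.
    first = keep_at_or_above(hits, score_threshold)
    # Pass 2 (base-class level): same threshold applied to converted scores.
    converted = [(doc, to_relevance(score)) for doc, score in first]
    return keep_at_or_above(converted, score_threshold)


def single_filtered(hits: List[Hit], score_threshold: float) -> List[Hit]:
    # Threshold applied once, after conversion.
    converted = [(doc, to_relevance(score)) for doc, score in hits]
    return keep_at_or_above(converted, score_threshold)


raw_hits = [("a", 0.1), ("b", 0.6), ("c", 1.2)]  # raw distances, lower is closer
double_ids = [doc for doc, _ in double_filtered(raw_hits, 0.5)]
single_ids = [doc for doc, _ in single_filtered(raw_hits, 0.5)]
```

With these made-up numbers, the closest document "a" is discarded by the first pass because its raw distance is below the threshold, even though its converted relevance would pass the second check.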

@tak-s

tak-s commented Jul 14, 2023

This was fixed by #6570. Thanks.
