@anubhav94N anubhav94N commented Jun 15, 2023

Description

This PR adds Rockset as a vectorstore for LangChain. Rockset is a real-time OLAP database that provides fast vector search. Because it is entirely schemaless, it can store each metadata field in its own column, which allows fast metadata filters during vector similarity search (as opposed to storing all metadata in a single JSON column). The integration currently supports three distance functions: COSINE_SIMILARITY, EUCLIDEAN_DISTANCE, and DOT_PRODUCT.
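
For readers less familiar with the three distance functions and with filtering on per-field metadata, here is a minimal pure-Python sketch. It is not Rockset's implementation and not the code in this PR: the `Doc` record, the `search` helper, and the sample data are hypothetical and exist only to illustrate the idea of filtering on metadata first and ranking by similarity second.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the two vectors after normalising each to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class Doc:
    # Hypothetical record: each metadata key plays the role of its own column.
    text: str
    embedding: List[float]
    metadata: Dict[str, Any] = field(default_factory=dict)


def search(
    docs: Sequence[Doc],
    query: Sequence[float],
    k: int,
    where: Optional[Dict[str, Any]] = None,
) -> List[Doc]:
    """Keep docs whose metadata matches `where`, then rank by cosine similarity."""
    where = where or {}
    candidates = [
        d for d in docs
        if all(d.metadata.get(key) == value for key, value in where.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(d.embedding, query),
        reverse=True,
    )
    return ranked[:k]


docs = [
    Doc("intro to OLAP", [1.0, 0.0], {"year": 2023}),
    Doc("vector search basics", [0.6, 0.8], {"year": 2023}),
    Doc("older note", [0.0, 1.0], {"year": 2021}),
]
top = search(docs, query=[0.0, 1.0], k=1, where={"year": 2023})
print(top[0].text)  # prints: vector search basics
```

Note that "older note" would be the closest match by similarity alone, but the `year` filter excludes it before ranking. A real database would evaluate the filter and the distance function server-side rather than in a Python loop.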

This PR adds the Rockset client as an optional dependency.

We would love a Twitter shoutout; our handle is https://twitter.com/RocksetCloud

Before submitting

  1. Integration test: https://github.com/anubhav94N/langchain/blob/master/tests/integration_tests/vectorstores/test_rocksetdb.py
  2. Example notebook: https://github.com/anubhav94N/langchain/blob/master/docs/modules/indexes/vectorstores/examples/rockset_vector_database.ipynb
  3. Ran `make format` and `make lint` locally

Who can review?

@hwchase17, @dev2049, could you please review?

try:
    from rockset import RocksetClient
except ImportError:
    raise ValueError(
Collaborator

should be `raise ImportError(`

Contributor Author


Updated - thanks!
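
To make the point of this review thread concrete: when an optional dependency is missing, re-raising as `ImportError` (rather than `ValueError`) lets callers that guard optional features with `except ImportError` keep working. The sketch below shows one generic way to do that; the `import_optional` helper and its parameter names are hypothetical and are not the code that was merged.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency, or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Keep the exception type so `except ImportError` handlers still match,
        # and chain the original error so the root cause stays visible.
        raise ImportError(
            f"Could not import the `{module_name}` package. "
            f"Install it with `pip install {pip_name}`."
        ) from exc


# A module that exists (stdlib `json` stands in for a real optional dependency).
json_module = import_optional("json", "json")

# A module that does not exist: the caller sees an ImportError with a hint.
try:
    import_optional("surely_missing_module_for_demo", "surely-missing")
except ImportError as err:
    message = str(err)
    print(message)
```

Under the assumption that the client is published on PyPI as `rockset`, the same helper could be called as `import_optional("rockset", "rockset")` and `RocksetClient` read from the returned module.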

@hwchase17 hwchase17 left a comment

some merge conflicts, otherwise lgtm

@hwchase17 hwchase17 added the lgtm label Jun 18, 2023
@dev2049 dev2049 merged commit 94c7899 into langchain-ai:master Jun 21, 2023
dev2049 commented Jun 21, 2023

thanks @anubhav94N!

tconkling added a commit to tconkling/langchain that referenced this pull request Jun 21, 2023
* master: (28 commits)
  [Feature][VectorStore] Support StarRocks as vector db (langchain-ai#6119)
  Relax string input mapper check (langchain-ai#6544)
  bump to ver 208 (langchain-ai#6540)
  Harrison/multi tool (langchain-ai#6518)
  Infino integration for simplified logs, metrics & search across LLM data & token usage (langchain-ai#6218)
  Update model token mappings/cost to include 0613 models (langchain-ai#6122)
  Fix issue with non-list `To` header in GmailSendMessage Tool (langchain-ai#6242)
  Integrate Rockset as Vectorstore (langchain-ai#6216)
  Feat: Add a prompt template parameter to qa with structure chains (langchain-ai#6495)
  Add async support for HuggingFaceTextGenInference (langchain-ai#6507)
  Be able to use Codey models on Vertex AI (langchain-ai#6354)
  Add KuzuQAChain (langchain-ai#6454)
  Update index.mdx (langchain-ai#6326)
  Export trajectory eval fn (langchain-ai#6509)
  typo(llamacpp.ipynb): 'condiser' -> 'consider' (langchain-ai#6474)
  Fix typo in docstring of format_tool_to_openai_function (langchain-ai#6479)
  Make streamlit import optional (langchain-ai#6510)
  Fixed: 'readible' -> readable (langchain-ai#6492)
  Documentation Fix: Correct the example code output in the prompt templates doc (langchain-ai#6496)
  Fix link (langchain-ai#6501)
  ...
@danielchalef danielchalef mentioned this pull request Jun 25, 2023
rlancemartin pushed a commit that referenced this pull request Jul 14, 2023
Integrate [Rockset](https://rockset.com/docs/) as a document loader.

Issue: None
Dependencies: Nothing new (rockset's dependency was already added
[here](#6216))
Tag maintainer: @rlancemartin

I have added a test for the integration and an example notebook showing
its use. I ran `make lint` and everything looks good.

---------

Co-authored-by: Bagatur <[email protected]>