This repository was archived by the owner on May 10, 2024. It is now read-only.
<div class="select-language">Select a language</div>

<Tabs queryString groupId="lang">
<TabItem value="py" label="Python"></TabItem>
<TabItem value="js" label="JavaScript"></TabItem>
</Tabs>

***
Embeddings are the AI-native way to represent any kind of data, making them the perfect fit for working with all kinds of AI-powered tools and algorithms. They can represent text, images, and soon audio and video. There are many options for creating embeddings, whether locally using an installed library, or by calling an API.
Chroma provides lightweight wrappers around popular embedding providers, making it easy to use them in your apps. You can set an embedding function when you create a Chroma collection, which will be used automatically, or you can call them directly yourself.
We welcome pull requests to add new Embedding Functions to the community.
***
## Default: all-MiniLM-L6-v2
By default, Chroma uses the [Sentence Transformers](https://www.sbert.net/) `all-MiniLM-L6-v2` model to create embeddings. This embedding model can create sentence and document embeddings that can be used for a wide variety of tasks. This embedding function runs locally on your machine, and may require you to download the model files (this will happen automatically).
:::note
Embedding functions can be linked to a collection and used whenever you call `add`, `update`, `upsert` or `query`. You can also use them directly, which can be handy for debugging.
:::
You can pass in an optional `model_name` argument, which lets you choose which Sentence Transformers model to use.
</Tabs>
## OpenAI
Chroma provides a convenient wrapper around OpenAI's embedding API. This embedding function runs remotely on OpenAI's servers, and requires an API key. You can get an API key by signing up for an account at [OpenAI](https://openai.com/api/).
You can pass in an optional `model_name` argument, which lets you choose which OpenAI embeddings model to use. By default, Chroma uses `text-embedding-ada-002`. You can see a list of all available models [here](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings).
## Cohere
Chroma also provides a convenient wrapper around Cohere's embedding API. This embedding function runs remotely on Cohere’s servers, and requires an API key. You can get an API key by signing up for an account at [Cohere](https://dashboard.cohere.ai/welcome/register).
You can pass in an optional `model_name` argument, which lets you choose which Cohere embeddings model to use. By default, Chroma uses the `large` model. You can see the available models under the `Get embeddings` section [here](https://docs.cohere.ai/reference/embed).
For more information on the multilingual model, see [here](https://docs.cohere.ai/docs/multilingual-language-models).
## Instructor models
The [instructor-embeddings](https://github.com/HKUNLP/instructor-embedding) library is another option, especially when running on a machine with a CUDA-capable GPU. They are a good local alternative to OpenAI (see the [Massive Text Embedding Benchmark](https://huggingface.co/blog/mteb) rankings). The embedding function requires the InstructorEmbedding package. To install it, run `pip install InstructorEmbedding`.
There are three models available. The default is `hkunlp/instructor-base`, and for better performance you can use `hkunlp/instructor-large` or `hkunlp/instructor-xl`. You can also specify whether to use `cpu` (default) or `cuda`. For example:
```python
# uses the base model and cpu
ef = embedding_functions.InstructorEmbeddingFunction()
```

or

```python
ef = embedding_functions.InstructorEmbeddingFunction(
    model_name="hkunlp/instructor-xl", device="cuda")
```
Keep in mind that the large and xl models are 1.5GB and 5GB respectively, and are best suited to running on a GPU.
## Google PaLM API models
[Google PaLM APIs](https://developers.googleblog.com/2023/03/announcing-palm-api-and-makersuite.html) are currently in private preview, but if you are part of this preview, you can use them with Chroma via the `GooglePalmEmbeddingFunction`.
To use the PaLM embedding API, you must have the `google.generativeai` Python package installed and an API key.
## HuggingFace

Chroma also provides a convenient wrapper around HuggingFace's embedding API. This embedding function runs remotely on HuggingFace's servers, and requires an API key. You can get an API key by signing up for an account at [HuggingFace](https://huggingface.co/).
You can pass in an optional `model_name` argument, which lets you choose which HuggingFace model to use. By default, Chroma uses `sentence-transformers/all-MiniLM-L6-v2`. You can see a list of all available models [here](https://huggingface.co/models).