Description
Code of Conduct
- I agree to follow Django's Code of Conduct
Feature Description
We propose official support in Django for Oracle Database's native VECTOR data type, introduced in Oracle Database 23ai (release 23.4). The proposal includes:
VectorField model field:
- Accepts optional dimensions, storage_format, and storage_type arguments.
- Supports Dense and Sparse vector storage.
- Auto-converts lists, arrays, and oracledb.SparseVector for insert/update.
Vector Index support:
- VectorIndex class using Meta.indexes.
- Support for HNSW and IVF index types.
- Optional parameters: distance, accuracy, parallel, etc.
Vector distance expressions and lookups:
- Custom Func class VectorDistance wrapping VECTOR_DISTANCE(lhs, rhs, metric).
- Lookups such as CosineDistance, EuclideanDistance, and NegativeDotProduct.
- Query syntax via filter() and order_by() for similarity search.
Testing
- Dense and Sparse vector insert/query tests added.
- Stress test scripts for repeated inserts/queries included.
This brings the Django ORM in line with modern AI/ML and search workloads that rely on vector embeddings (e.g., image, text, and semantic search).
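The auto-conversion listed for VectorField can be pictured with a minimal sketch. It assumes dense vectors are handed to the driver as `array.array` values, which is how python-oracledb represents them. The helper name `to_dense_vector` and the format-name mapping are hypothetical and are not taken from the implementation. Sparse vectors are left out.

```python
import array

# Assumed mapping from storage format names to array.array typecodes,
# mirroring how python-oracledb represents dense vectors.
_TYPECODES = {"FLOAT32": "f", "FLOAT64": "d", "INT8": "b"}


def to_dense_vector(value, storage_format="FLOAT32"):
    """Hypothetical helper: coerce a list, tuple, or array into array.array."""
    typecode = _TYPECODES[storage_format]
    if isinstance(value, array.array) and value.typecode == typecode:
        return value  # already in the form the driver expects
    return array.array(typecode, value)


embedding = to_dense_vector([1.0, 2.0, 3.0])
```

A field's `get_db_prep_value()` is one place where a conversion like this could live, so that callers can assign plain Python lists to the model attribute.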
Problem
Django currently does not natively support Oracle Database's VECTOR data type. This limits Oracle 23ai users who want to:
- Store and query vector embeddings directly in the database.
- Perform similarity search using Oracle's VECTOR_DISTANCE() function.
- Leverage Oracle's native vector indexing (e.g., IVF, HNSW) for high-performance nearest-neighbor search.
Without first-class Django support, developers must fall back to raw SQL or manually patch fields and expressions, leading to poor maintainability and loss of ORM benefits.
Request or proposal
proposal
Additional Details
Implementation Status
We have already implemented:
- Custom VectorField with support for DENSE and SPARSE formats
- Automatic SQL generation for model/table creation
- VectorIndex support with customizable parameters and distance metrics
- ORM expressions and lookups for vector distance queries (e.g., CosineDistance, EuclideanDistance)
- Basic tests for dense vector creation, insertion, indexing, and querying
- Integration with Oracle’s Python driver (oracledb) for runtime behavior
Example:
from django.db import models

# Shorthand aliases for the proposed index class and enums.
VectorIndex = models.VectorIndex
VectorDistanceType = models.VectorDistanceType
VectorIndexType = models.VectorIndexType
# Assumed to be exposed on models like the aliases above.
VectorStorageFormat = models.VectorStorageFormat
VectorStorageType = models.VectorStorageType


class Product(models.Model):
    name = models.CharField(max_length=100)
    embedding = models.VectorField(
        dim=3,
        storage_format=VectorStorageFormat.FLOAT32,
        storage_type=VectorStorageType.DENSE,
    )

    class Meta:
        indexes = [
            VectorIndex(
                fields=["embedding"],
                name="vec_idx_product",
                index_type=VectorIndexType.HNSW,
                distance=VectorDistanceType.COSINE,
            )
        ]
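For illustration, the index declared above might translate to DDL along the following lines. This is a sketch and not the SQL the implementation actually emits. The helper `vector_index_ddl` and the table name `shop_product` are hypothetical, identifier quoting is omitted, and the clause layout is an assumption based on Oracle's documented CREATE VECTOR INDEX form.

```python
def vector_index_ddl(name, table, column, index_type, distance,
                     accuracy=None, parallel=None):
    """Hypothetical helper: build a CREATE VECTOR INDEX statement."""
    # HNSW is an in-memory neighbor graph; IVF uses neighbor partitions.
    organizations = {
        "HNSW": "INMEMORY NEIGHBOR GRAPH",
        "IVF": "NEIGHBOR PARTITIONS",
    }
    sql = (
        f"CREATE VECTOR INDEX {name} ON {table} ({column}) "
        f"ORGANIZATION {organizations[index_type]} DISTANCE {distance}"
    )
    # Optional clauses are appended only when a value is supplied.
    if accuracy is not None:
        sql += f" WITH TARGET ACCURACY {int(accuracy)}"
    if parallel is not None:
        sql += f" PARALLEL {int(parallel)}"
    return sql


ddl = vector_index_ddl(
    "vec_idx_product", "shop_product", "embedding",
    index_type="HNSW", distance="COSINE", accuracy=95,
)
```

Because the optional clauses are added only when a value is given, the index definition in Meta.indexes can stay short for the common case.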
A similarity search can then be performed:
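import array

# Assumed to be exposed on models like the aliases above.
VectorDistance = models.VectorDistance

query_vector = array.array("f", [1.0, 2.0, 3.0])
# Annotate each product with its cosine distance to the query vector
# and return the five closest matches.
products = Product.objects.annotate(
    score=VectorDistance(
        "embedding",
        query_vector,
        metric=VectorDistanceType.COSINE,
    )
).order_by("score")[:5]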
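To make the ordering above concrete, here is a pure-Python sketch of the quantities the three named metrics compute. It is a reference illustration only: in the proposal the computation happens inside the database via VECTOR_DISTANCE(). The function names and sample vectors are made up for the example.

```python
import math


def cosine_distance(a, b):
    """1 - cos(angle between a and b); 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def euclidean_distance(a, b):
    """Straight-line distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_dot_product(a, b):
    """Negated dot product, so that smaller still means more similar."""
    return -sum(x * y for x, y in zip(a, b))


query = [1.0, 2.0, 3.0]
candidates = {
    "parallel": [2.0, 4.0, 6.0],
    "orthogonal": [3.0, 0.0, -1.0],
    "opposite": [-1.0, -2.0, -3.0],
}
# Smaller distance first, mirroring order_by("score").
ranked = sorted(candidates, key=lambda n: cosine_distance(query, candidates[n]))
```

All three functions return smaller values for closer vectors, which is why an ascending order_by() yields the nearest neighbors first.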
Implementation Suggestions
No response