Add Native Vector Support for Oracle: VectorField, VectorIndex, and VectorDistance #60

@savansonii

Code of Conduct

  • I agree to follow Django's Code of Conduct

Feature Description

We propose official Django support for Oracle Database's native VECTOR data type, introduced in Oracle Database 23ai (release 23.4). This includes:

Features Included:

VectorField model field:

  • Accepts optional dimensions, storage_format, and storage_type arguments.
  • Supports Dense and Sparse vector storage.
  • Auto-converts lists, arrays, and oracledb.SparseVector for insert/update.
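
As a rough illustration of the conversions listed above, the sketch below normalizes Python sequences into typed arrays and moves between dense and sparse layouts. The helper names (`to_dense`, `dense_to_sparse`, `sparse_to_dense`) are hypothetical and are not part of Django or python-oracledb. The `(num_dimensions, indices, values)` triple is assumed to mirror the arguments that `oracledb.SparseVector` takes.

```python
import array


def to_dense(value, typecode="f"):
    """Return `value` as a typed array ('f' is a 32-bit float array)."""
    if isinstance(value, array.array) and value.typecode == typecode:
        return value
    return array.array(typecode, value)


def dense_to_sparse(values):
    """Return (num_dimensions, indices, values), keeping only non-zero entries."""
    indices = [i for i, v in enumerate(values) if v != 0]
    return len(values), indices, [values[i] for i in indices]


def sparse_to_dense(num_dimensions, indices, values):
    """Expand a sparse triple back into a plain list, filling gaps with 0.0."""
    dense = [0.0] * num_dimensions
    for i, v in zip(indices, values):
        dense[i] = v
    return dense
```

A field's insert/update path could apply something like `to_dense` before binding a value, so that callers can pass either a list or an array.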

Vector Index support:

  • VectorIndex class using Meta.indexes.
  • Support for HNSW and IVF index types.
  • Optional parameters: distance, accuracy, parallel, etc.
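
To make the index options concrete, here is a minimal sketch of the kind of DDL a vector index might translate to. The function `vector_index_ddl` and its parameters are hypothetical, and the statement shape follows our reading of Oracle's `CREATE VECTOR INDEX` syntax. It is not the SQL emitted by the proposed `VectorIndex` class.

```python
# Maps the two index types to the ORGANIZATION clause Oracle uses for them.
ORGANIZATION = {
    "HNSW": "INMEMORY NEIGHBOR GRAPH",
    "IVF": "NEIGHBOR PARTITIONS",
}


def vector_index_ddl(name, table, column, index_type="HNSW",
                     distance=None, accuracy=None, parallel=None):
    """Build a CREATE VECTOR INDEX statement (identifiers are not quoted here)."""
    parts = [
        f"CREATE VECTOR INDEX {name} ON {table} ({column})",
        f"ORGANIZATION {ORGANIZATION[index_type]}",
    ]
    if distance is not None:
        parts.append(f"DISTANCE {distance}")
    if accuracy is not None:
        parts.append(f"WITH TARGET ACCURACY {int(accuracy)}")
    if parallel is not None:
        parts.append(f"PARALLEL {int(parallel)}")
    return " ".join(parts)
```

A real schema editor would also quote identifiers and validate the option values; both are left out here for brevity.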

Vector distance expressions and lookups:

  • Custom Func class VectorDistance for VECTOR_DISTANCE(lhs, rhs, metric)
  • CosineDistance, EuclideanDistance, NegativeDotProduct, and other metrics as lookups.
  • Query syntax via filter() and order_by() for similarity search.
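
For readers unfamiliar with the metrics named above, the following pure-Python sketch shows what each distance computes. It is only a reference for the math: in the proposal the calculation happens inside the database via VECTOR_DISTANCE, and the function names here are illustrative.

```python
import math


def dot(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus the cosine similarity; 0 means the vectors point the same way."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    """Straight-line (L2) distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_dot_product(a, b):
    """Negated dot product, so that smaller values mean more similar vectors."""
    return -dot(a, b)
```

All three are arranged so that a smaller value means a closer match, which is why ascending order_by() on the distance returns the most similar rows first.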

Testing

  • Dense and Sparse vector insert/query tests added.
  • Stress test scripts for repeated inserts/queries included.

This brings the Django ORM in line with modern AI/ML and search workloads that use vector embeddings (e.g., for images, text, and semantic search).

Problem

Django currently does not natively support Oracle Database's VECTOR data type. This limits users of Oracle Database 23ai who want to:

  • Store and query vector embeddings directly in the database.
  • Perform similarity search using Oracle's VECTOR_DISTANCE() function.
  • Leverage Oracle's native vector indexing (e.g., IVF, HNSW) for high-performance nearest-neighbor search.

Without first-class Django support, developers must fall back to raw SQL or manually patch fields and expressions, leading to poor maintainability and loss of ORM benefits.

Request or proposal

proposal

Additional Details

Implementation Status

We have already implemented:

  • Custom VectorField with support for DENSE and SPARSE formats
  • Automatic SQL generation for model/table creation
  • VectorIndex support with customizable parameters and distance metrics
  • ORM expressions and lookups for vector distance queries (e.g., CosineDistance, EuclideanDistance)
  • Basic tests for dense vector creation, insertion, indexing, and querying
  • Integration with Oracle’s Python driver (oracledb) for runtime behavior

Example:

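from django.db import models

# Proposed names; the storage enums are assumed to be exposed on
# django.db.models in the same way as the index and distance enums.
VectorIndex = models.VectorIndex
VectorDistanceType = models.VectorDistanceType
VectorIndexType = models.VectorIndexType
VectorStorageFormat = models.VectorStorageFormat
VectorStorageType = models.VectorStorageType

class Product(models.Model):
    name = models.CharField(max_length=100)
    # A dense, 3-dimensional vector of 32-bit floats.
    embedding = models.VectorField(
        dim=3,
        storage_format=VectorStorageFormat.FLOAT32,
        storage_type=VectorStorageType.DENSE,
    )

    class Meta:
        indexes = [
            # HNSW index on the embedding column, using cosine distance.
            VectorIndex(
                fields=["embedding"],
                name="vec_idx_product",
                index_type=VectorIndexType.HNSW,
                distance=VectorDistanceType.COSINE,
            )
        ]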

A similarity search can then be performed:

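import array

# Assumed to be exposed on django.db.models like the names above.
VectorDistance = models.VectorDistance

# 32-bit float query vector with the same dimension as the field.
query_vector = array.array("f", [1.0, 2.0, 3.0])

# The five products closest to the query vector, nearest first.
products = Product.objects.annotate(
    score=VectorDistance(
        "embedding",
        query_vector,
        metric=VectorDistanceType.COSINE,
    )
).order_by("score")[:5]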
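
For comparison, the sketch below builds the kind of hand-written SQL that the ORM query above is meant to replace. The helper `similarity_sql`, the table name `shop_product`, and the bind name `:query_vector` are hypothetical. The statement is an approximation written by hand, not the SQL the proposed backend generates.

```python
def similarity_sql(table, vector_column, metric="COSINE", limit=5):
    """Build a top-N similarity query; the query vector is passed as a bind."""
    return (
        f"SELECT t.*, VECTOR_DISTANCE(t.{vector_column}, :query_vector, {metric}) AS score "
        f"FROM {table} t ORDER BY score FETCH FIRST {int(limit)} ROWS ONLY"
    )


# Hypothetical table name for the Product model.
sql = similarity_sql("shop_product", "embedding")
```

Keeping the vector as a bind parameter rather than inlining it into the string avoids both quoting problems and repeated hard parses.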

Implementation Suggestions

No response
