Dynamic index in depth

The "dynamic" index is a "best of both worlds" approach that combines the benefits of the hnsw and flat indexes.

Dynamic index requires ASYNC_INDEXING

Dynamic indexes require asynchronous indexing. To enable asynchronous indexing in a self-hosted Weaviate instance, set the ASYNC_INDEXING environment variable to true. If your instance is hosted in Weaviate Cloud, use the Weaviate Cloud console to enable asynchronous indexing.

Key ideas

Simply put, the dynamic index is a flat index that is automatically converted to an hnsw index when the number of vectors in the collection exceeds a predetermined threshold (10,000 by default).

The motivation for this is that the flat index is very efficient for small collections, but its search time increases linearly with the number of vectors in the collection. The hnsw index, on the other hand, is more efficient for large collections, but includes a memory overhead with little benefit for small collections.
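The linear cost of a flat index can be illustrated with a minimal brute-force search. This is a sketch only, not Weaviate's implementation: `flat_search` and `l2_squared` are hypothetical helper names, and squared Euclidean distance is used purely for simplicity.

```python
def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(vectors, query, k=1):
    """Return (indices of the k nearest vectors, number of distance computations)."""
    # A flat index has no graph to narrow the search: every stored vector is scored.
    scored = sorted((l2_squared(v, query), i) for i, v in enumerate(vectors))
    return [i for _, i in scored[:k]], len(vectors)


vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
nearest, comparisons = flat_search(vectors, [0.9, 1.2], k=1)
print(nearest, comparisons)  # [1] 3
```

Doubling the number of stored vectors doubles the number of distance computations per query, which is the cost an hnsw index is designed to avoid at scale.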

The dynamic index is a good choice if you do not know how large each collection will become, or if you expect some tenants to grow much larger than others.

In a multi-tenancy configuration, every tenant starts with the flat index and automatically switches to the hnsw index once its number of vectors exceeds the threshold.

Currently, this is a one-way conversion: once an index has been converted to hnsw, it is not converted back to flat, even if the number of vectors subsequently falls below the threshold.
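The switching behavior described above can be sketched as a small state machine. This is an illustrative toy under stated assumptions, not Weaviate's implementation: `ToyDynamicIndex` is a hypothetical class that tracks only a vector count and the current index type, and it treats "exceeds" as strictly greater than the threshold.

```python
class ToyDynamicIndex:
    """Toy model of the flat-to-hnsw switch (hypothetical, for illustration only)."""

    def __init__(self, threshold=10_000):
        self.threshold = threshold
        self.kind = "flat"  # every index starts out flat
        self.count = 0

    def add(self, n=1):
        self.count += n
        # Upgrade once the vector count exceeds the threshold.
        if self.kind == "flat" and self.count > self.threshold:
            self.kind = "hnsw"

    def delete(self, n=1):
        # One-way conversion: deleting vectors never downgrades to flat.
        self.count = max(0, self.count - n)


# Each tenant has its own index, so tenants convert independently.
tenants = {"small": ToyDynamicIndex(threshold=3), "large": ToyDynamicIndex(threshold=3)}
tenants["small"].add(2)
tenants["large"].add(5)
tenants["large"].delete(4)
print({name: idx.kind for name, idx in tenants.items()})
# {'small': 'flat', 'large': 'hnsw'}
```

The "large" tenant stays on hnsw even after dropping back to a single vector, while the "small" tenant never pays the hnsw memory overhead.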

Distance metric

The distance metric used in the index determines how the distance between vectors is calculated. In an HNSW index, it impacts where each vector is placed in the graph.

You must choose a metric that suits the vectors in your collection. To find this, consult the documentation of the model that generated your vectors.

Weaviate's default metric is cosine, but you can also use any of the other available metrics.

If you are unsure, the cosine distance is a good, robust, default choice that is used by a majority of models.
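To see why the metric has to match the model, the sketch below compares cosine distance with squared Euclidean distance on the same vectors. The helper functions are hand-written for illustration (they are not Weaviate APIs), and the vectors are made up.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; depends only on the angle between the vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def l2_squared(a, b):
    """Squared Euclidean distance; sensitive to vector magnitude."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = [1.0, 0.0]
same_direction = [10.0, 0.0]  # same angle as the query, much larger magnitude
nearby = [1.0, 1.0]           # close in space, but at a 45 degree angle

for name, vec in [("same_direction", same_direction), ("nearby", nearby)]:
    print(name, round(cosine_distance(query, vec), 4), l2_squared(query, vec))
```

Cosine distance ranks `same_direction` as the closer vector, while squared Euclidean distance ranks `nearby` as closer. The same data can therefore produce different nearest neighbors depending on the metric.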

Configure dynamic index in Weaviate

The dynamic index parameters can be provided when creating a collection in Weaviate. Note that vector_cache_max_objects is only used if quantization is enabled and vector caching is turned on in the quantization settings.

Basic configuration

Set a collection to use the dynamic index as shown below.

from weaviate.classes.config import Configure

# Assumes `client` is a connected Weaviate client and `collection_name` is a string
client.collections.create(
    name=collection_name,
    # ... other parameters
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),  # Dynamic index works well with multi-tenancy set-ups
    vector_index_config=Configure.VectorIndex.dynamic()
)

Custom configuration

You can set the threshold at which the flat index will be converted to hnsw.

Additionally, you can specify any of the flat and hnsw index parameters that will be used depending on the state of the index.

from weaviate.classes.config import Configure, VectorDistances

client.collections.create(
    name=collection_name,
    # ... other parameters
    multi_tenancy_config=Configure.multi_tenancy(   # Dynamic index works well with multi-tenancy set-ups
        enabled=True,
        auto_tenant_creation=True,
        auto_tenant_activation=True,
    ),
    vector_index_config=Configure.VectorIndex.dynamic(
        distance_metric=VectorDistances.COSINE,                     # Distance metric
        threshold=25000,                                            # Vector count above which flat is converted to hnsw
        hnsw=Configure.VectorIndex.hnsw(
            # Your preferred HNSW configuration
        ),
        flat=Configure.VectorIndex.flat(
            # Your preferred flat configuration
        ),
    )
)

Further resources

Questions and feedback

If you have any questions or feedback, let us know in the user forum.