Vector indexes

Vector indexes facilitate efficient, vector-first data storage and retrieval.

HNSW indexes

HNSW indexes are scalable and very fast at query time, but the HNSW algorithm makes it costly to add data while the index is being built.

HNSW index parameters

Some HNSW parameters are mutable, but others cannot be modified after you create your collection.

Parameter	Type	Default	Changeable	Details
`cleanupIntervalSeconds`	integer	300	Yes	Cleanup frequency. This value does not normally need to be adjusted. A higher value means cleanup runs less frequently, but it does more in a single batch. A lower value means cleanup is more frequent, but it may be less efficient on each run.
`distance`	string	`cosine`	No	Distance metric. The metric that measures the distance between two arbitrary vectors. For available distance metrics, see supported distance metrics.
`ef`	integer	-1	Yes	Balance search speed and recall. `ef` is the size of the dynamic list that the HNSW uses during search. Search is more accurate when `ef` is higher, but it is also slower. `ef` values greater than 512 show diminishing improvements in recall. Dynamic `ef`. Weaviate automatically adjusts the `ef` value and creates a dynamic `ef` list when `ef` is set to -1. For more details, see dynamic ef.
`efConstruction`	integer	128	No	Balance index search speed and build speed. A high `efConstruction` value means you can lower your `ef` settings, but importing is slower. `efConstruction` must be greater than 0.
`maxConnections`	integer	32	No	Maximum number of connections per element. `maxConnections` is the connection limit per layer for layers above the zero layer. The zero layer can have (2 * maxConnections) connections. `maxConnections` must be greater than 0.
`dynamicEfMin`	integer	100	Yes	New in `v1.10.0`. Lower bound for dynamic `ef`. Protects against creating a search list that is too short. This setting is only used when `ef` is -1.
`dynamicEfMax`	integer	500	Yes	New in `v1.10.0`. Upper bound for dynamic `ef`. Protects against creating a search list that is too long. If the query limit is higher than `dynamicEfMax`, `dynamicEfMax` does not have any effect. In this case, `ef` is the limit. This setting is only used when `ef` is -1.
`dynamicEfFactor`	integer	8	Yes	Added in `v1.10.0`. Multiplier for dynamic `ef`. Sets the potential length of the search list. This setting is only used when `ef` is -1.
`filterStrategy`	string	`sweeping`	Yes	Added in `v1.27.0`. The strategy used to filter search results. Set it to `sweeping` (the default) or `acorn` (uses Weaviate's ACORN implementation).
`flatSearchCutoff`	integer	40000	Yes	Optional. Threshold for the flat-search cutoff. To force a vector index search, set `"flatSearchCutoff": 0`.
`skip`	boolean	`false`	No	When true, do not index the collection. Weaviate decouples vector creation and vector storage. If you skip vector indexing, but a vectorizer is configured (or a vector is provided manually), Weaviate logs a warning on each import. To skip indexing and vector generation, set `"vectorizer": "none"` when you set `"skip": true`. See When to skip indexing.
`vectorCacheMaxObjects`	integer	`1e12`	Yes	Maximum number of objects in the memory cache. By default, this limit is set to one trillion (`1e12`) objects when a new collection is created. For sizing recommendations, see Vector cache considerations.
`pq`	object	--	Yes	Enable and configure product quantization (PQ) compression. PQ assumes some data has already been loaded. You should have 10,000 to 100,000 vectors per shard loaded before you enable PQ. For PQ configuration details, see PQ configuration parameters.
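The dynamic `ef` rows in the table above can be read as a clamp on a value derived from the query limit. The sketch below is a minimal illustration, assuming that the candidate value is the query limit multiplied by `dynamicEfFactor`, bounded by `dynamicEfMin` and `dynamicEfMax`, and never lower than the limit itself. The function name `effective_ef` is hypothetical and is not part of any Weaviate client.

```python
def effective_ef(limit, ef=-1, ef_min=100, ef_max=500, ef_factor=8):
    """Illustrative model of how the search list size could be derived."""
    if ef != -1:
        # A fixed ef disables the dynamic behavior entirely.
        return ef
    # Scale the query limit, then clamp the result to [ef_min, ef_max].
    candidate = max(ef_min, min(limit * ef_factor, ef_max))
    # Assumption: the search list is never shorter than the requested limit.
    return max(candidate, limit)


for limit in (5, 20, 100, 1000):
    print(f"limit={limit} -> ef={effective_ef(limit)}")
```

With the default bounds, small limits are lifted to `dynamicEfMin`, mid-sized limits scale with the factor, and very large limits override `dynamicEfMax`.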

Database parameters for HNSW

Note that some database-level parameters are available to configure HNSW indexing behavior.

PERSISTENCE_HNSW_MAX_LOG_SIZE is a database-level parameter that sets the maximum size of the HNSW write-ahead-log. The default value is 500MiB.

Increase this value to improve efficiency of the compaction process, but be aware that this will increase the memory usage of the database. Conversely, decreasing this value will reduce memory usage but may slow down the compaction process.

Preferably, PERSISTENCE_HNSW_MAX_LOG_SIZE should be set to a value close to the size of the HNSW graph.

Tombstone cleanup parameters

Environment variable availability

TOMBSTONE_DELETION_CONCURRENCY is available in v1.24.0 and up.
TOMBSTONE_DELETION_MIN_PER_CYCLE and TOMBSTONE_DELETION_MAX_PER_CYCLE are available in v1.24.15 / v1.25.2 and up.

Tombstones are records that mark deleted objects. In an HNSW index, tombstones are cleaned up periodically, at the interval set by the cleanupIntervalSeconds parameter.

As the index grows in size, the cleanup process may take longer to complete and require more resources. For very large indexes, this may cause performance issues.

To control the number of tombstones deleted per cleanup cycle and prevent performance issues, set the TOMBSTONE_DELETION_MAX_PER_CYCLE and TOMBSTONE_DELETION_MIN_PER_CYCLE environment variables.

Set TOMBSTONE_DELETION_MIN_PER_CYCLE to prevent occurrences of unnecessary cleanup cycles.
Set TOMBSTONE_DELETION_MAX_PER_CYCLE to prevent the cleanup process from taking too long and consuming too many resources.

As an example, for a cluster with 300 million objects per shard, a TOMBSTONE_DELETION_MIN_PER_CYCLE value of 1000000 (1 million) and a TOMBSTONE_DELETION_MAX_PER_CYCLE value of 10000000 (10 million) may be good starting points.
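One way to picture how the two per-cycle limits interact is the simplified model below. It assumes that a cleanup cycle is skipped while fewer than TOMBSTONE_DELETION_MIN_PER_CYCLE tombstones are pending, and that a single cycle removes at most TOMBSTONE_DELETION_MAX_PER_CYCLE tombstones. This is an interpretation of the descriptions above, not Weaviate's actual implementation, and the function name is hypothetical.

```python
def tombstones_removed_this_cycle(pending, min_per_cycle, max_per_cycle):
    """Simplified model: how many tombstones one cleanup cycle removes."""
    if pending < min_per_cycle:
        # Too few tombstones: skip the cycle to avoid unnecessary work.
        return 0
    # Cap the work so that a single cycle cannot run for too long.
    return min(pending, max_per_cycle)


# Starting points suggested above for about 300 million objects per shard.
MIN_PER_CYCLE = 1_000_000
MAX_PER_CYCLE = 10_000_000

for pending in (250_000, 4_000_000, 25_000_000):
    removed = tombstones_removed_this_cycle(pending, MIN_PER_CYCLE, MAX_PER_CYCLE)
    print(f"pending={pending} removed={removed}")
```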

You can also set the TOMBSTONE_DELETION_CONCURRENCY environment variable to limit the number of threads used for tombstone cleanup. This can help prevent the cleanup process from consuming too many resources or from taking too long.

The default value for TOMBSTONE_DELETION_CONCURRENCY is set to half the number of CPU cores available to Weaviate.

In a cluster with a large number of cores, you may want to set TOMBSTONE_DELETION_CONCURRENCY to a lower value to prevent the cleanup process from consuming too many resources. Conversely, in a cluster with a small number of cores and a large number of deletions, you may want to set TOMBSTONE_DELETION_CONCURRENCY to a higher value to speed up the cleanup process.
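As a rough illustration of the default described above, the sketch below computes half of the available CPU cores. The rounding behavior (integer division with a floor of one thread) is an assumption, and the function name is hypothetical.

```python
import os


def default_tombstone_concurrency(cpu_cores=None):
    """Half the available cores, with an assumed floor of one thread."""
    if cpu_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        cpu_cores = os.cpu_count() or 1
    return max(1, cpu_cores // 2)


print(default_tombstone_concurrency(16))  # 8
```

Comparing this default with the number of cores you are willing to dedicate to cleanup is one way to decide whether to set TOMBSTONE_DELETION_CONCURRENCY explicitly.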

PQ configuration parameters

Configure pq with these parameters.

Parameter	Type	Default	Details
`enabled`	boolean	`false`	Enable PQ when `true`. The Python client v4 does not use the `enabled` parameter. To enable PQ with the v4 client, set a `quantizer` in the collection definition.
`trainingLimit`	integer	100000	The maximum number of objects, per shard, used to fit the centroids. Larger values increase the time it takes to fit the centroids. Larger values also require more memory.
`segments`	integer	--	The number of segments to use. The number of vector dimensions must be evenly divisible by the number of segments. Starting in `v1.23`, Weaviate uses the number of dimensions to optimize the number of segments.
`centroids`	integer	256	The number of centroids to use (max: 256). We generally recommend you do not change this value. Due to the data structure used, a smaller centroid value does not result in smaller vectors, but it may result in faster compression at the cost of recall.
`encoder`	string	`kmeans`	Encoder specification. There are two encoders. You can specify the `type` of encoder as either `kmeans` (default) or `tile`.
`distribution`	string	`log-normal`	Encoder distribution type. Only used with the `tile` encoder. If you use the `tile` encoder, you can specify the `distribution` as `log-normal` (default) or `normal`.
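The divisibility rule for `segments` is easy to check before you enable PQ. The sketch below is a minimal helper, not part of any Weaviate client: it lists the segment counts that divide a given vector dimensionality evenly and builds a `pq` settings dictionary from the parameter names in the table above. The function names are hypothetical.

```python
def valid_segment_counts(dimensions):
    """Return every segment count that divides the vector dimensions evenly."""
    return [s for s in range(1, dimensions + 1) if dimensions % s == 0]


def pq_settings(dimensions, segments, training_limit=100_000):
    """Build a `pq` settings dict, rejecting segment counts that do not divide evenly."""
    if dimensions % segments != 0:
        raise ValueError(
            f"{dimensions} dimensions cannot be split into {segments} equal segments"
        )
    return {
        "enabled": True,
        "trainingLimit": training_limit,
        "segments": segments,
        "centroids": 256,
    }


print(valid_segment_counts(12))  # [1, 2, 3, 4, 6, 12]
print(pq_settings(768, 96))
```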

HNSW Configuration tips

To determine reasonable settings for your use case, consider the following questions and compare your answers in the table below:

How many queries do you expect per second?
Do you expect a lot of imports or updates?
How high should the recall be?

Number of queries	Many imports or updates	Recall level	Configuration suggestions
not many	no	low	This is the ideal scenario. Keep both the `ef` and `efConstruction` settings low. You don't need a big machine and you will still be happy with the results.
not many	no	high	Here the tricky thing is that your recall needs to be high. Since you're not expecting a lot of requests or imports, you can increase both the `ef` and `efConstruction` settings. Keep increasing them until you are happy with the recall. In this case, you can get pretty close to 100%.
not many	yes	low	Here the tricky thing is the high volume of imports and updates. Be sure to keep `efConstruction` low. Since you don't need a high recall, and you're not expecting a lot of queries, you can adjust the `ef` setting until you've reached the desired recall.
not many	yes	high	The trade-offs are getting harder. You need high recall and you're dealing with a lot of imports or updates. This means you need to keep the `efConstruction` setting low, but you can significantly increase your `ef` setting because your queries per second rate is low.
many	no	low	Many queries per second means you need a low `ef` setting. Luckily you don't need high recall so you can significantly increase the `efConstruction` value.
many	no	high	Many queries per second means a low `ef` setting. Since you need a high recall but you are not expecting a lot of imports or updates, you can increase your `efConstruction` until you've reached the desired recall.
many	yes	low	Many queries per second means you need a low `ef` setting. A high number of imports and updates also means you need a low `efConstruction` setting. Luckily your recall does not have to be as close to 100% as possible. You can set `efConstruction` relatively low to support your import or update throughput, and you can use the `ef` setting to regulate the queries per second rate.
many	yes	high	Aha, this means you're a perfectionist or you have a use case that needs the best of all three worlds. Increase your `efConstruction` value until you hit the time limit of imports and updates. Next, increase your `ef` setting until you reach your desired balance of queries per second versus recall. While many people think they need to maximize all three dimensions, in practice that's usually not the case. We leave it up to you to decide, and you can always ask for help in our forum.
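The table above can also be expressed as a small lookup, which may be convenient if you want to keep the guidance next to your deployment scripts. The sketch below condenses each row into a short direction for `ef` and `efConstruction`. The wording of the directions is a paraphrase of the table, and the names `SUGGESTIONS` and `suggest` are hypothetical.

```python
# Keys: (many_queries, many_imports_or_updates, high_recall)
SUGGESTIONS = {
    (False, False, False): {"ef": "low", "efConstruction": "low"},
    (False, False, True): {"ef": "raise until recall is acceptable",
                           "efConstruction": "raise until recall is acceptable"},
    (False, True, False): {"ef": "adjust until recall is acceptable",
                           "efConstruction": "low"},
    (False, True, True): {"ef": "high", "efConstruction": "low"},
    (True, False, False): {"ef": "low", "efConstruction": "high"},
    (True, False, True): {"ef": "low",
                          "efConstruction": "raise until recall is acceptable"},
    (True, True, False): {"ef": "low", "efConstruction": "low"},
    (True, True, True): {"ef": "raise until the speed and recall balance is acceptable",
                         "efConstruction": "raise until imports hit their time limit"},
}


def suggest(many_queries, many_imports_or_updates, high_recall):
    """Look up the condensed suggestion for one row of the table."""
    return SUGGESTIONS[(many_queries, many_imports_or_updates, high_recall)]


print(suggest(many_queries=True, many_imports_or_updates=False, high_recall=False))
```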

tip

This set of values is a good starting point for many use cases.

Parameter	Value
`ef`	`64`
`efConstruction`	`128`
`maxConnections`	`32`

Flat indexes

Added in v1.23

Flat indexes are recommended for use cases where the number of objects per index is low, such as in multi-tenancy use cases.

Parameter	Type	Default	Changeable	Details
`vectorCacheMaxObjects`	integer	`1e12`	Yes	Maximum number of objects in the memory cache. By default, this limit is set to one trillion (`1e12`) objects when a new collection is created. For sizing recommendations, see Vector cache considerations.
`bq`	object	--	No	Enable and configure binary quantization (BQ) compression. For BQ configuration details, see BQ configuration parameters.

BQ configuration parameters

Configure bq with these parameters.

Parameter	Type	Default	Details
`enabled`	boolean	`false`	Enable BQ. Weaviate uses binary quantization (BQ) compression when `true`.
`rescoreLimit`	integer	-1	The minimum number of candidates to fetch before rescoring.
`cache`	boolean	`false`	Whether to use the vector cache.
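To make the role of `rescoreLimit` more concrete, the sketch below shows a generic binary quantization search in plain Python. It is not Weaviate's implementation. It assumes that each dimension is reduced to one bit by its sign, that candidates are ranked by Hamming distance, and that at least `rescore_limit` candidates are then re-ranked with the original vectors using cosine distance. All function names are hypothetical, and the vectors are assumed to be non-zero.

```python
import math


def binarize(vector):
    """Keep one bit per dimension: 1 for positive values, 0 otherwise."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def search(query, vectors, limit, rescore_limit=-1):
    """Brute-force search over binary codes, then rescoring with the original vectors."""
    codes = [binarize(v) for v in vectors]  # a real index would store these codes
    query_code = binarize(query)
    by_hamming = sorted(range(len(vectors)), key=lambda i: hamming(query_code, codes[i]))
    # Fetch at least `rescore_limit` candidates (or `limit` if that is larger).
    candidates = by_hamming[:max(limit, rescore_limit)]
    rescored = sorted(candidates, key=lambda i: cosine_distance(query, vectors[i]))
    return rescored[:limit]


vectors = [[1, 1, 1, 1], [1, 1, 1, -0.1], [-1, -1, -1, -1], [1, -1, 1, -1]]
print(search([1, 1, 1, 0.5], vectors, limit=2, rescore_limit=3))  # [0, 1]
```

A larger `rescore_limit` gives the exact distance computation more candidates to correct, at the cost of more work per query.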

Dynamic indexes

Experimental feature

Available starting in v1.25. Dynamic indexing is an experimental feature. Use with caution.

Dynamic index requires ASYNC_INDEXING

Dynamic indexes require asynchronous indexing. To enable asynchronous indexing in a self-hosted Weaviate instance, set the ASYNC_INDEXING environment variable to true. If your instance is hosted in Weaviate Cloud, use the Weaviate Cloud console to enable asynchronous indexing.

A dynamic index starts out as a flat index. Once the number of objects exceeds a threshold (10,000 objects by default), Weaviate automatically converts it to an HNSW index.

This is a one-way conversion from a flat index to an HNSW index. The index does not change back to a flat index, even if deletions bring the object count below the threshold.

The goal of dynamic indexing is to shorten latencies during query time at the cost of a larger memory footprint.
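The one-way switch described above can be modeled with a counter and a flag. The sketch below is a toy model only: it stores no vectors, it assumes the conversion happens as soon as the object count is strictly greater than the threshold, and the class name `DynamicIndexModel` is hypothetical.

```python
class DynamicIndexModel:
    """Toy model of the flat-to-HNSW switch; tracks a count, not real vectors."""

    def __init__(self, threshold=10_000):
        self.threshold = threshold
        self.object_count = 0
        self.index_type = "flat"

    def add(self, n=1):
        self.object_count += n
        if self.index_type == "flat" and self.object_count > self.threshold:
            # One-way conversion once the count exceeds the threshold.
            self.index_type = "hnsw"

    def delete(self, n=1):
        # Deletions never convert the index back to flat.
        self.object_count = max(0, self.object_count - n)


index = DynamicIndexModel(threshold=3)
index.add(3)
print(index.index_type)  # flat
index.add(1)
print(index.index_type)  # hnsw
index.delete(4)
print(index.index_type)  # hnsw
```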

Dynamic index parameters

Parameter	Type	Default	Details
`distance`	string	`cosine`	Distance metric. The metric that measures the distance between two arbitrary vectors.
`hnsw`	object	default HNSW	HNSW index configuration to be used.
`flat`	object	default Flat	Flat index configuration to be used.
`threshold`	integer	10000	Threshold object count at which `flat` to `hnsw` conversion happens

Index configuration parameters

Experimental feature

Available starting in v1.25. Dynamic indexing is an experimental feature. Use with caution.

Use these parameters to configure the index type and their properties. They can be set in the collection configuration.

Parameter	Type	Default	Details
`vectorIndexType`	string	`hnsw`	Optional. The index type - can be `hnsw`, `flat` or `dynamic`.
`vectorIndexConfig`	object	-	Optional. Set parameters that are specific to the vector index type.
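As an illustration of how these two parameters fit together, the sketch below builds a collection definition as a Python dictionary and prints it as JSON. The collection name `Article` and the helper function are hypothetical, and the use of a top-level `class` key for the collection name is an assumption about the JSON schema format. Check the collection definition reference for your Weaviate version before you rely on it.

```python
import json

ALLOWED_INDEX_TYPES = {"hnsw", "flat", "dynamic"}


def build_collection_definition(name, index_type="hnsw", index_config=None):
    """Assemble a collection definition with an index type and its settings."""
    if index_type not in ALLOWED_INDEX_TYPES:
        raise ValueError(f"unsupported vectorIndexType: {index_type!r}")
    definition = {"class": name, "vectorIndexType": index_type}
    if index_config:
        # Settings here are specific to the chosen index type.
        definition["vectorIndexConfig"] = dict(index_config)
    return definition


definition = build_collection_definition(
    "Article",
    index_type="hnsw",
    index_config={"distance": "cosine", "ef": 64, "efConstruction": 128, "maxConnections": 32},
)
print(json.dumps(definition, indent=2))
```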

How to select the index type

Generally, the hnsw index type is recommended for most use cases. The flat index type is recommended when the number of objects per index is low, such as in multi-tenancy cases. You can also opt for the dynamic index, which starts as a flat index and automatically converts to an hnsw index once the object count exceeds a specified threshold.

See this section for more information about the different index types and how to choose between them.

If faster import speeds are desired, asynchronous indexing allows de-coupling of indexing from object creation.

Asynchronous indexing

Experimental

Available starting in v1.22. This is an experimental feature. Use with caution.

Starting in Weaviate 1.22, you can use asynchronous indexing by opting in.

To enable asynchronous indexing, set the ASYNC_INDEXING environment variable to true in your Weaviate configuration (the docker-compose.yml file if you use Docker Compose). This setting enables asynchronous indexing for all collections.

Example Docker Compose configuration

---
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.31.3
    restart: on-failure:0
    ports:
     - 8080:8080
     - 50051:50051
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      QUERY_MAXIMUM_RESULTS: 10000
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_API_BASED_MODULES: 'true'
      CLUSTER_HOSTNAME: 'node1'
      AUTOSCHEMA_ENABLED: 'false'
      ASYNC_INDEXING: 'true'
...

To get the index status, check the node status endpoint.

Node status example usage

The nodes/shards/vectorQueueLength field shows the number of objects that still have to be indexed.

import weaviate

client = weaviate.connect_to_local()

try:
    nodes_info = client.cluster.nodes(
        collection="JeopardyQuestion",  # If omitted, all collections will be returned
        output="verbose"  #  If omitted, will be "minimal"
    )
    print(nodes_info)

finally:
    client.close()

API docs

import weaviate

client = weaviate.Client("http://localhost:8080")

nodes_status = client.cluster.get_nodes_status()
print(nodes_status)

import weaviate from 'weaviate-client';

const client = await weaviate.connectToLocal()

const response = await client.cluster.nodes({
  collection: 'JeopardyQuestion',
  output: 'minimal'
})

console.log(response)

import weaviate from 'weaviate-ts-client';

const client = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
});

const response = await client.cluster
  .nodesStatusGetter()
  .do();
console.log(response);

package main

import (
  "context"
  "fmt"

  "github.com/weaviate/weaviate-go-client/v5/weaviate"
)

func main() {
  cfg := weaviate.Config{
    Host:   "localhost:8080",
    Scheme: "http",
  }
  client, err := weaviate.NewClient(cfg)
  if err != nil {
    panic(err)
  }

  nodesStatus, err := client.Cluster().
    NodesStatusGetter().
    Do(context.Background())

  if err != nil {
    panic(err)
  }
  fmt.Printf("%v", nodesStatus)
}

package io.weaviate;

import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.cluster.model.NodesStatusResponse;

public class App {
  public static void main(String[] args) {
    Config config = new Config("http", "localhost:8080");
    WeaviateClient client = new WeaviateClient(config);

    Result<NodesStatusResponse> result = client.cluster()
      .nodesStatusGetter()
      .run();

    if (result.hasErrors()) {
      System.out.println(result.getError());
      return;
    }
    System.out.println(result.getResult());
  }
}

curl http://localhost:8080/v1/nodes

Then, you can check the status of the vector index queue by inspecting the output.

The vectorQueueLength field will show the number of remaining objects to be indexed. In the example below, the vector index queue has 425 objects remaining to be indexed on the TestArticle shard, out of a total of 1000 objects.

{
  "nodes": [
    {
      "batchStats": {
        "ratePerSecond": 0
      },
      "gitHash": "e6b37ce",
      "name": "weaviate-0",
      "shards": [
        {
          "class": "TestArticle",
          "name": "nq1Bg9Q5lxxP",
          "objectCount": 1000,
          "vectorIndexingStatus": "INDEXING",
          "vectorQueueLength": 425
        }
      ],
      "stats": {
        "objectCount": 1000,
        "shardCount": 1
      },
      "status": "HEALTHY",
      "version": "1.22.1"
    }
  ]
}
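If you poll the node status endpoint regularly, a small helper can turn the response into a per-shard progress summary. The sketch below works on a response that has already been parsed into a Python dictionary with the field names shown in the example output above. It assumes that `objectCount` minus `vectorQueueLength` is the number of objects already indexed. The function name is hypothetical.

```python
def indexing_progress(nodes_response):
    """Return (node, collection, shard, indexed, total) for every shard in the response."""
    rows = []
    for node in nodes_response.get("nodes", []):
        # "shards" may be missing or null when the output is not verbose.
        for shard in node.get("shards") or []:
            total = shard["objectCount"]
            queued = shard.get("vectorQueueLength", 0)
            rows.append((node["name"], shard["class"], shard["name"], total - queued, total))
    return rows


example = {
    "nodes": [
        {
            "name": "weaviate-0",
            "shards": [
                {
                    "class": "TestArticle",
                    "name": "nq1Bg9Q5lxxP",
                    "objectCount": 1000,
                    "vectorIndexingStatus": "INDEXING",
                    "vectorQueueLength": 425,
                }
            ],
        }
    ]
}

for node, collection, shard, indexed, total in indexing_progress(example):
    print(f"{node}/{collection}/{shard}: {indexed} of {total} objects indexed")
```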

Multiple vector embeddings (named vectors)

Added in v1.24.0

Weaviate collections support multiple named vectors.

The vectors in a collection can have their own configurations. Each vector space can set its own index, its own compression algorithm, and its own vectorizer. This means you can use different vectorization models, and apply different distance metrics, to the same object.

To work with named vectors, adjust your queries to specify a target vector for vector search or hybrid search queries.
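As a rough sketch of what per-vector configuration can look like, the dictionary below gives three named vectors their own index type, compression and distance metric. The `vectorConfig` key, the `{"none": {}}` vectorizer form, the `dot` distance value, and the collection and vector names are all assumptions made for illustration. Confirm the exact schema in the collection definition reference before you use it.

```python
import json

# Hypothetical collection with three independently configured named vectors.
collection_definition = {
    "class": "Article",
    "vectorConfig": {
        "title": {
            "vectorizer": {"none": {}},  # vectors are supplied by the caller
            "vectorIndexType": "hnsw",
            "vectorIndexConfig": {"distance": "cosine"},
        },
        "summary": {
            "vectorizer": {"none": {}},
            "vectorIndexType": "hnsw",
            "vectorIndexConfig": {"distance": "dot"},
        },
        "body": {
            "vectorizer": {"none": {}},
            "vectorIndexType": "flat",
            "vectorIndexConfig": {"bq": {"enabled": True}},
        },
    },
}

print(json.dumps(collection_definition, indent=2))
```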

Questions and feedback

If you have any questions or feedback, let us know in the user forum.
