Indexes

Vector indexes​

Weaviate uses a vector index to facilitate efficient, vector-first data storage and retrieval. A vector index is a data structure that stores vectors and supports fast similarity searches.

Weaviate supports the Hierarchical Navigable Small Worlds (HNSW) indexing algorithm.

If you want to contribute to developing a new index type at Weaviate, please contact us or make a pull request in our GitHub project. Stay tuned for updates!

Index configuration parameters​

These parameters configure Weaviate indexing across index types. The vectorIndexConfig parameter provides a way to configure specific details for different types of index. Currently the only index type available is HNSW.

| Parameter | Type | Default | Details |
| --- | --- | --- | --- |
| vectorIndexType | string | hnsw | Optional. The algorithm that creates your index. HNSW is the only index type currently available. |
| vectorIndexConfig | object | - | Optional. Set parameters that are specific to the vector index type. See HNSW index parameters. |

HNSW vector indexes​

HNSW indexes are scalable and super fast at query time, but HNSW algorithms are costly when you add data during the index building process.

For an alternative approach to building indexes that may help with some use cases, see Asynchronous indexing.

HNSW index parameters​

HNSW indexes use a combination of techniques to improve search speed. At build time, the HNSW algorithm creates a series of layers. At query time, the HNSW algorithm uses the layers to build a list of approximate nearest neighbors (ANN) quickly and efficiently. This two-phase approach means you can update some HNSW parameters at run time, but others cannot be modified after you create your collection.

The ef parameter controls the size of the nearest neighbors list and helps to balance search speed and recall. You can set an explicit ef value or let Weaviate set a dynamic ef. These parameters let you tune ef, dynamic ef, and other aspects of the HNSW algorithm.

| Parameter | Type | Default | Changeable | Details |
| --- | --- | --- | --- | --- |
| cleanupIntervalSeconds | integer | 300 | Yes | Cleanup frequency. This value does not normally need to be adjusted. A higher value means cleanup runs less frequently, but it does more in a single batch. A lower value means cleanup is more frequent, but it may be less efficient on each run. |
| distance | string | cosine | No | Distance metric. The metric that measures the distance between two arbitrary vectors. For available distance metrics, see supported distance metrics. |
| ef | integer | -1 | Yes | Balance search speed and recall. ef is the size of the dynamic list that the HNSW algorithm uses during search. Search is more accurate when ef is higher, but it is also slower. ef values greater than 512 show diminishing improvements in recall. Dynamic ef. Weaviate automatically adjusts the ef value and creates a dynamic ef list when ef is set to -1. For more details, see dynamic ef. |
| efConstruction | integer | 128 | No | Balance index search speed and build speed. A high efConstruction value means you can lower your ef settings, but importing is slower. efConstruction must be greater than 0. |
| maxConnections | integer | 64 | No | Maximum number of connections per element. maxConnections is the connection limit per layer for layers above the zero layer. The zero layer can have (2 * maxConnections) connections. maxConnections must be greater than 0. |
| dynamicEfMin | integer | 100 | Yes | New in v1.10.0. Lower bound for dynamic ef. Protects against creating a search list that is too short. This setting is only used when ef is -1. |
| dynamicEfMax | integer | 500 | Yes | New in v1.10.0. Upper bound for dynamic ef. Protects against creating a search list that is too long. If the query limit is higher than dynamicEfMax, dynamicEfMax does not have any effect. In this case, ef is equal to the query limit. This setting is only used when ef is -1. |
| dynamicEfFactor | integer | 8 | Yes | New in v1.10.0. Multiplier for dynamic ef. Sets the potential length of the search list. This setting is only used when ef is -1. |
| flatSearchCutoff | integer | 40000 | Yes | Optional. Threshold for the flat-search cutoff. To force a vector index search, set "flatSearchCutoff": 0. |
| skip | boolean | false | No | When true, do not index the collection. Weaviate decouples vector creation and vector storage. If you skip vector indexing, but a vectorizer is configured (or a vector is provided manually), Weaviate logs a warning on each import. To skip indexing and vector generation, set "vectorizer": "none" when you set "skip": true. See When to skip indexing. |
| vectorCacheMaxObjects | integer | 1e12 | Yes | Maximum number of objects in the memory cache. By default, this limit is set to one trillion (1e12) objects when a new collection is created. For sizing recommendations, see Vector cache considerations. |
| pq | document | -- | Yes | Enable and configure product quantization (PQ) compression. PQ assumes some data has already been loaded. You should have 10,000 to 100,000 vectors per shard loaded before you enable PQ. For PQ configuration details, see PQ configuration parameters. |

PQ configuration parameters​

Product quantization (PQ) is a form of data compression that reduces the memory footprint of a vector index. HNSW is an in-memory vector index, so enabling PQ for HNSW lets you work with larger datasets. For a discussion of how PQ saves memory, see this concepts section.

PQ relies on a codebook to compress the original vectors. The codebook defines "centroids" that are used to calculate the compressed vector. Weaviate’s PQ implementation uses existing data to train the codebook. You must have some vectors loaded before you enable PQ so Weaviate can use them to define the centroids. You should have 10,000 to 100,000 vectors loaded before you enable PQ.

These parameters let you fine-tune PQ.

| Parameter | Type | Default | Details |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable PQ. Weaviate uses product quantization (PQ) compression when true. |
| trainingLimit | integer | 100000 | Object limit. The maximum number of objects, per shard, used to fit the centroids. Larger values increase the time it takes to fit the centroids. Larger values also require more memory. |
| segments | integer | -- | The number of segments to use. By default, segments is equal to the number of vector dimensions. Reducing the number of segments reduces the size of the quantized (PQ compressed) vectors. The number of vector dimensions must be evenly divisible by the number of segments. |
| centroids | integer | 256 | The number of centroids to use. Reducing the number of centroids reduces the size of the quantized (PQ compressed) vectors at the price of recall. If you use the kmeans encoder, centroids is set to 256 (one byte) by default. |
| encoder | string | kmeans | Encoder specification. There are two encoders. You can specify the type of encoder as either kmeans (default) or tile. |
| distribution | string | log-normal | Encoder distribution type. Only used with the tile encoder. If you use the tile encoder, you can specify the distribution as log-normal (default) or normal. |
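
The divisibility rule for segments can be checked before you enable PQ. The following is a minimal sketch, not part of Weaviate: the function names are hypothetical, and the size estimates assume one byte per segment code (256 centroids) and 4-byte float32 values for uncompressed vectors.

```python
def valid_segment_counts(dimensions):
    """List every segment count that evenly divides the vector dimensions."""
    return [s for s in range(1, dimensions + 1) if dimensions % s == 0]


def estimate_pq_bytes(dimensions, segments):
    """Estimate the size of one PQ-compressed vector in bytes.

    Assumption: each segment is stored as a single one-byte code,
    which corresponds to the default of 256 centroids.
    """
    if dimensions % segments != 0:
        raise ValueError("dimensions must be evenly divisible by segments")
    return segments


def estimate_raw_bytes(dimensions):
    """Estimate the size of one uncompressed vector (assumes float32 values)."""
    return dimensions * 4


print(valid_segment_counts(12))
print(estimate_raw_bytes(768), estimate_pq_bytes(768, 96))
```

Under these assumptions, fewer segments give smaller compressed vectors, which matches the trade-off described in the table.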

Collection configuration example​

This is a sample collection definition that shows the data schema:

{
  "class": "Article",
  "description": "string",
  "properties": [
    {
      "name": "title",
      "description": "string",
      "dataType": ["text"]
    }
  ],
  "vectorIndexType": "hnsw",
  "vectorIndexConfig": {
    "skip": false,
    "ef": 100,
    "efConstruction": 128,
    "maxConnections": 64
  }
}
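
Before you send a definition like this to Weaviate, it can help to check the constraints from the HNSW parameter table locally. This is a minimal sketch under stated assumptions: validate_hnsw_config is a hypothetical helper, not a Weaviate API, and the rule that an explicit ef must be positive is an assumption (the table only states that -1 selects dynamic ef).

```python
def validate_hnsw_config(config):
    """Return a list of problems found in a vectorIndexConfig-style dict."""
    problems = []

    # The parameter table states that both values must be greater than 0.
    for key in ("efConstruction", "maxConnections"):
        value = config.get(key)
        if value is not None and value <= 0:
            problems.append(f"{key} must be greater than 0, got {value}")

    # Assumption: ef is either -1 (dynamic ef) or an explicit positive size.
    ef = config.get("ef", -1)
    if ef != -1 and ef <= 0:
        problems.append(f"ef must be -1 or a positive integer, got {ef}")

    return problems


article_index_config = {
    "skip": False,
    "ef": 100,
    "efConstruction": 128,
    "maxConnections": 64,
}
print(validate_hnsw_config(article_index_config))
```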

Dynamic ef​

The ef parameter controls the size of the approximate nearest neighbors (ANN) list at query time. You can configure a specific list size or else let Weaviate configure the list dynamically. If you choose dynamic ef, Weaviate provides several options to control the size of the ANN list.

The length of the list is determined by the query response limit that you set in your query. Weaviate uses the query limit as an anchor and modifies the size of the ANN list according to the values you set for the dynamicEf parameters.

  • dynamicEfMin sets a lower bound on the list length.
  • dynamicEfMax sets an upper bound on the list length.
  • dynamicEfFactor sets a range for the list.

To keep search recall high, the actual dynamic ef value stays above dynamicEfMin even if the query limit is small enough to suggest a lower value.

To keep search speed reasonable even when retrieving large result sets, the dynamic ef value is limited to dynamicEfMax. Weaviate doesn't exceed dynamicEfMax even if the query limit multiplied by dynamicEfFactor suggests a higher value. The one exception is a query limit that is itself higher than dynamicEfMax. In that case, dynamicEfMax does not have any effect and the dynamic ef value is equal to the query limit.

To determine the length of the ANN list, Weaviate multiplies the query limit by dynamicEfFactor. The list range is modified by dynamicEfMin and dynamicEfMax.

Consider this GraphQL query that sets a limit of 4.

{
  Get {
    JeopardyQuestion(limit: 4) {
      answer
      question
    }
  }
}

Imagine the collection has dynamic ef configured.

"vectorIndexConfig": {
  "ef": -1,
  "dynamicEfMin": 5,
  "dynamicEfMax": 25,
  "dynamicEfFactor": 10
}

The resulting search list has these characteristics.

  • A potential length of 40 objects ( ("dynamicEfFactor": 10) * (limit: 4) ).
  • A minimum length of 5 objects ("dynamicEfMin": 5).
  • A maximum length of 25 objects ("dynamicEfMax": 25).
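  • An actual size within the range of 5 to 25 objects. Here, the potential length of 40 exceeds dynamicEfMax, so the list is capped at 25 objects.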
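
The rules above can be sketched as a small function. This is an illustration based only on the description in this section, not a copy of Weaviate's implementation; the function name dynamic_ef and its argument names are hypothetical, and the defaults mirror the parameter table.

```python
def dynamic_ef(limit, ef_min=100, ef_max=500, ef_factor=8):
    """Compute the search list size for a query when ef is set to -1."""
    # Potential length: the query limit multiplied by dynamicEfFactor.
    ef = limit * ef_factor

    # Clamp the potential length to the configured bounds.
    if ef > ef_max:
        ef = ef_max
    elif ef < ef_min:
        ef = ef_min

    # A query limit above the upper bound overrides dynamicEfMax,
    # so the list is never shorter than the number of requested results.
    if limit > ef:
        ef = limit

    return ef


# Values from the example: limit 4, dynamicEfMin 5, dynamicEfMax 25, dynamicEfFactor 10.
print(dynamic_ef(4, ef_min=5, ef_max=25, ef_factor=10))
```

With the example values, the potential length of 40 is clamped to the upper bound of 25.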

If you use the docker-compose.yml file from Weaviate to run your local instance, the QUERY_DEFAULTS_LIMIT environment variable sets a reasonable default query limit. To prevent out-of-memory errors, QUERY_DEFAULTS_LIMIT is significantly lower than QUERY_MAXIMUM_RESULTS.

To change the default limit, edit the value for QUERY_DEFAULTS_LIMIT when you configure your Weaviate instance.

Configuration tips​

To determine reasonable settings for your use case, consider the following questions and compare your answers in the table below:

  1. How many queries do you expect per second?
  2. Do you expect a lot of imports or updates?
  3. How high should the recall be?

| Number of queries | Many imports or updates | Recall level | Configuration suggestions |
| --- | --- | --- | --- |
| not many | no | low | This is the ideal scenario. Keep both the ef and efConstruction settings low. You don't need a big machine and you will still be happy with the results. |
| not many | no | high | Here the tricky thing is that your recall needs to be high. Since you're not expecting a lot of requests or imports, you can increase both the ef and efConstruction settings. Keep increasing them until you are happy with the recall. In this case, you can get pretty close to 100%. |
| not many | yes | low | Here the tricky thing is the high volume of imports and updates. Be sure to keep efConstruction low. Since you don't need a high recall, and you're not expecting a lot of queries, you can adjust the ef setting until you've reached the desired recall. |
| not many | yes | high | The trade-offs are getting harder. You need high recall and you're dealing with a lot of imports or updates. This means you need to keep the efConstruction setting low, but you can significantly increase your ef setting because your queries per second rate is low. |
| many | no | low | Many queries per second means you need a low ef setting. Luckily you don't need high recall so you can significantly increase the efConstruction value. |
| many | no | high | Many queries per second means a low ef setting. Since you need a high recall but you are not expecting a lot of imports or updates, you can increase your efConstruction until you've reached the desired recall. |
| many | yes | low | Many queries per second means you need a low ef setting. A high number of imports and updates also means you need a low efConstruction setting. Luckily your recall does not have to be as close to 100% as possible. You can set efConstruction relatively low to support your import or update throughput, and you can use the ef setting to regulate the query per second speed. |
| many | yes | high | Aha, this means you're a perfectionist or you have a use case that needs the best of all three worlds. Increase your efConstruction value until you hit the time limit of imports and updates. Next, increase your ef setting until you reach your desired balance of queries per second versus recall. |

While many people think they need to maximize all three dimensions, in practice that's usually not the case. We leave it up to you to decide, and you can always ask for help in our forum.

Tip

This set of values is a good starting point for many use cases.

| Parameter | Value |
| --- | --- |
| ef | 64 |
| efConstruction | 128 |
| maxConnections | 32 |

Vector index types​

The vectorIndexType parameter only specifies how the vectors of data objects are indexed. The index is used for data retrieval and similarity search.

The vectorizer parameter determines how the data vectors are created (which numbers the vectors contain). vectorizer specifies a module, such as text2vec-contextionary, that Weaviate uses to create the vectors. (You can also set vectorizer to none if you want to import your own vectors.)

To learn more about configuring the data schema, see How to configure a schema.

Vector cache considerations​

For optimal search and import performance, previously imported vectors need to be in memory. A disk lookup for a vector is orders of magnitude slower than a memory lookup, so disk lookups should be kept to a minimum. However, Weaviate can limit the number of vectors in memory. By default, this limit is set to one trillion (1e12) objects when a new collection is created.

During import, set vectorCacheMaxObjects high enough that all vectors can be held in memory. Each import requires multiple searches. Import performance drops drastically when there isn't enough memory to hold all of the vectors in the cache.

After import, when your workload is mostly querying, experiment with vector cache limits that are less than your total dataset size.

Vectors that aren't currently in cache are added to the cache if there is still room. If the cache fills, Weaviate drops the whole cache. All future vectors have to be read from disk for the first time. Then, subsequent queries run against the cache until it fills again and the procedure repeats. Note that the cache can be a very valuable tool if you have a large dataset, and a large percentage of users only query a specific subset of vectors. In this case you might be able to serve the largest user group from cache while requiring disk lookups for "irregular" queries.
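
The fill-and-drop behavior described above can be modeled in a few lines. This is a toy model for reasoning about cache limits, not Weaviate's implementation; the class name, the read_from_disk callback, and the disk_reads counter are all hypothetical.

```python
class DropAllVectorCache:
    """Toy model of a size-limited cache that is emptied entirely when full."""

    def __init__(self, max_objects):
        self.max_objects = max_objects
        self._vectors = {}
        self.disk_reads = 0  # counts how often a lookup had to go to disk

    def get(self, vector_id, read_from_disk):
        # Cache hit: serve the vector from memory.
        if vector_id in self._vectors:
            return self._vectors[vector_id]

        # Cache miss: read the vector from disk.
        vector = read_from_disk(vector_id)
        self.disk_reads += 1

        # If there is no room left, drop the whole cache before adding.
        if len(self._vectors) >= self.max_objects:
            self._vectors.clear()
        self._vectors[vector_id] = vector
        return vector


# Hypothetical "disk" with three stored vectors.
disk = {1: [0.1, 0.2], 2: [0.3, 0.4], 3: [0.5, 0.6]}
cache = DropAllVectorCache(max_objects=2)
for vector_id in (1, 2, 1, 3, 1):
    cache.get(vector_id, disk.__getitem__)
print(cache.disk_reads)
```

In this model, a limit below the working set size causes repeated drops, which is why the limit matters most when many different vectors are queried.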

Deletions​

Cleanup is an asynchronous process that rebuilds the HNSW graph after deletes and updates. Prior to cleanup, objects are marked as deleted, but they are still connected to the HNSW graph. During cleanup, the edges are reassigned and the objects are deleted for good.

When to skip indexing​

There are situations where it doesn't make sense to vectorize a collection. For example, if the collection consists solely of references between two other collections, or if the collection contains mostly duplicate elements.

Importing duplicate vectors into HNSW is very expensive. The import algorithm checks early on if a candidate vector's distance is greater than the worst candidate's distance. When there are lots of duplicate vectors, this early exit condition is never met so each import or query results in an exhaustive search.

To avoid indexing a collection, set "skip": true in the collection's vectorIndexConfig. By default, collections are indexed.

Asynchronous indexing (experimental)​

Experimental

Available starting in v1.22. This is an experimental feature. Please use with caution.

Starting in Weaviate 1.22, you can use asynchronous indexing by opting in.

Asynchronous indexing decouples object creation from vector index updates. Objects are created faster, and the vector index updates in the background. Asynchronous indexing is especially useful for importing large amounts of data.

Asynchronous indexing is off by default. To enable asynchronous indexing, set the ASYNC_INDEXING environment variable to true in your Weaviate configuration (the docker-compose.yml file if you use Docker Compose). This setting enables asynchronous indexing for all collections.

Example Docker Compose configuration
---
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:1.22.5
    restart: on-failure:0
    ports:
    - "8080:8080"
    - "50051:50051"
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      QUERY_MAXIMUM_RESULTS: 10000
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-openai'
      ENABLE_MODULES: 'text2vec-cohere,text2vec-huggingface,text2vec-openai,text2vec-palm,generative-cohere,generative-openai,generative-palm'
      CLUSTER_HOSTNAME: 'node1'
      AUTOSCHEMA_ENABLED: 'false'
      ASYNC_INDEXING: 'true'
...

While the vector index is updating, Weaviate can search a maximum of 100,000 un-indexed objects by brute force, that is, without using the vector index. This means that search performance is slower until the vector index has been fully updated. Also, any new objects beyond the first 100,000 in the queue are not included in the search.

To get the index status, call the node status endpoint. The nodes/shards/vectorQueueLength field shows the number of objects that still have to be indexed.

import weaviate

client = weaviate.Client("http://localhost:8080")

nodes_status = client.cluster.get_nodes_status()
print(nodes_status)

Then, you can check the status of the vector index queue by inspecting the output. The vectorQueueLength field will show the number of remaining objects to be indexed. In the example below, the vector index queue has 425 objects remaining to be indexed on the TestArticle shard, out of a total of 1000 objects.

{
  "nodes": [
    {
      "batchStats": {
        "ratePerSecond": 0
      },
      "gitHash": "e6b37ce",
      "name": "weaviate-0",
      "shards": [
        {
          "class": "TestArticle",
          "name": "nq1Bg9Q5lxxP",
          "objectCount": 1000,
          "vectorIndexingStatus": "INDEXING",
          "vectorQueueLength": 425
        }
      ],
      "stats": {
        "objectCount": 1000,
        "shardCount": 1
      },
      "status": "HEALTHY",
      "version": "1.22.1"
    }
  ]
}
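
To track indexing progress across shards, you can summarize the vectorQueueLength values from a response shaped like the one above. This is a minimal sketch: pending_by_shard is a hypothetical helper, and it accepts either the full document with a "nodes" key or a bare list of nodes, because the exact return shape of the client call is an assumption here.

```python
def pending_by_shard(nodes_status):
    """Map (node name, class, shard name) to the number of un-indexed objects."""
    # Accept either {"nodes": [...]} or the bare list of node entries.
    if isinstance(nodes_status, dict):
        nodes = nodes_status["nodes"]
    else:
        nodes = nodes_status

    pending = {}
    for node in nodes:
        # A node may report no shards; treat a missing or null list as empty.
        for shard in node.get("shards") or []:
            key = (node["name"], shard["class"], shard["name"])
            pending[key] = shard.get("vectorQueueLength", 0)
    return pending


# Trimmed copy of the sample response shown above.
sample = {
    "nodes": [
        {
            "name": "weaviate-0",
            "shards": [
                {
                    "class": "TestArticle",
                    "name": "nq1Bg9Q5lxxP",
                    "objectCount": 1000,
                    "vectorIndexingStatus": "INDEXING",
                    "vectorQueueLength": 425,
                }
            ],
            "status": "HEALTHY",
        }
    ]
}
print(pending_by_shard(sample))
```

A result in which every value is 0 would suggest that the vector index has caught up with the imported objects.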

Inverted index​

Configure the inverted index​

Weaviate provides two inverted indexes for filtering or searching the data. The first (filterable) index is a fast Roaring Bitmaps index used for filtering. The second (searchable) index is used for BM25 or hybrid search.

The indexFilterable and indexSearchable keys can be set to true (on) or false (off) on a property level. Both are on by default.

The filterable index is only capable of filtering, while the searchable index can be used for both searching and filtering (though not as fast as the filterable index).

So, setting "indexFilterable": false while keeping "indexSearchable": true (or not setting it at all) trades worse filtering performance for faster imports (because only one index needs to be updated) and lower disk usage.

You can set these keys in the schema at the property level, as shown below:

{
  "class": "Author",
  "properties": [ // <== note that the inverted index is set per property
    {
      "indexFilterable": false, // <== turn off the filterable (Roaring Bitmap index) by setting `indexFilterable` to false
      "indexSearchable": false, // <== turn off the searchable (for BM25/hybrid) by setting `indexSearchable` to false
      "dataType": [
        "text"
      ],
      "name": "name"
    }
  ]
}

A rule of thumb to follow when determining whether to switch off indexing is: if you will never perform queries based on this property, you can turn it off.

Data types and indexes

Both indexFilterable and indexSearchable are available for all types of data. However, indexSearchable is only relevant for text/text[], and in other cases it will be ignored.
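
When you generate property definitions programmatically, you can leave indexSearchable out for types where it has no effect. This is a minimal sketch: build_property is a hypothetical helper, not part of any Weaviate client, and it only produces a dict in the shape used by the examples in this section.

```python
# Data types for which indexSearchable is relevant, per the text above.
TEXT_TYPES = {"text", "text[]"}


def build_property(name, data_type, filterable=True, searchable=True):
    """Build a property definition with explicit inverted index settings."""
    prop = {
        "name": name,
        "dataType": [data_type],
        "indexFilterable": filterable,
    }
    # Only text properties get indexSearchable; it is ignored for other types.
    if data_type in TEXT_TYPES:
        prop["indexSearchable"] = searchable
    return prop


print(build_property("name", "text", filterable=False, searchable=False))
print(build_property("age", "int", filterable=False))
```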

You can also enable an inverted index to search based on timestamps.

{
  "class": "Author",
  "invertedIndexConfig": {
    "indexTimestamps": true // <== false by default
  },
  "properties": []
}

Collections without indices​

If you don't want to set an index at all, neither ANN nor inverted, this is possible too.

To create the Author collection without any indexes, skip indexing (vector and inverted) on the collection and on the properties.

{
  "class": "Author",
  "description": "A description of this collection, in this case, it's about authors",
  "vectorIndexConfig": {
    "skip": true // <== disable vector index
  },
  "properties": [
    {
      "indexFilterable": false, // <== disable filterable index for this property
      "indexSearchable": false, // <== disable searchable index for this property
      "dataType": [
        "text"
      ],
      "description": "The name of the Author",
      "name": "name"
    },
    {
      "indexFilterable": false, // <== disable filterable index for this property
      "dataType": [
        "int"
      ],
      "description": "The age of the Author",
      "name": "age"
    },
    {
      "indexFilterable": false, // <== disable filterable index for this property
      "dataType": [
        "date"
      ],
      "description": "The date of birth of the Author",
      "name": "born"
    },
    {
      "indexFilterable": false, // <== disable filterable index for this property
      "dataType": [
        "boolean"
      ],
      "description": "A boolean value if the Author won a nobel prize",
      "name": "wonNobelPrize"
    },
    {
      "indexFilterable": false, // <== disable filterable index for this property
      "indexSearchable": false, // <== disable searchable index for this property
      "dataType": [
        "text"
      ],
      "description": "A description of the author",
      "name": "description"
    }
  ]
}