Replication

Weaviate instances can be replicated to increase availability and read throughput, and to enable zero-downtime upgrades. On this page, you will learn how to set replication for your Weaviate instance.

For more about how replication is designed and built in Weaviate, see the Replication Architecture pages.

How to configure

Replication factor change in v1.25

In Weaviate v1.25, a replication factor cannot be changed once it is set.

This is due to the schema consensus algorithm change in v1.25. This will be improved in future versions.

Replication is disabled by default and can be enabled per class in the collection configuration, so different classes in your dataset can have different replication factors. To enable replication on a class, set the replication factor under replicationConfig. The factor is an integer (default 1) that determines how many copies of the class are stored. A class definition with replication enabled looks like the following:

{
  "class": "ExampleClass",
  "properties": [
    {
      "name": "exampleProperty",
      "dataType": [
        "text"
      ]
    }
  ],
  "replicationConfig": {
    "factor": 3
  }
}

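Here's an example using the Python client:

from weaviate.classes.config import Configure

# Assumes `client` is an already connected Weaviate client
client.collections.create(
    "Article",
    replication_config=Configure.replication(
        factor=3  # Store three copies of this collection
    )
)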

When you set a replication factor of 3 in the data schema before you add data, three replicas of the data are stored. Except in v1.25, where the replication factor cannot be changed once it is set, Weaviate can also handle changing this setting after you have imported data. The data is then copied to the new replica nodes, provided there are enough nodes.

Note:

Changing the replication factor after adding data is an experimental feature as of v1.17 and will become more stable in the future.

The data schema has a write consistency level of ALL, which means when you upload or update a schema, this will be sent to ALL nodes (via a coordinator node). The coordinator node waits for a successful acknowledgement from ALL nodes before sending a success message back to the client. This ensures a highly consistent schema in your distributed Weaviate setup.

How to use: Queries

When you add (write) or query (read) data, one or more replica nodes in the cluster respond to the request. The consistency_level determines how many nodes must send a successful response and acknowledgement to the coordinator node. The available consistency levels are ONE, QUORUM (replication_factor / 2 + 1, using integer division, so a replication factor of 3 requires 2 nodes) and ALL.
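The relationship between consistency level and the number of required acknowledgements can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Weaviate's implementation: the Level enum and the required_acks function are hypothetical names, and the QUORUM branch assumes integer (floor) division.

```python
from enum import Enum


class Level(Enum):
    """Hypothetical stand-in for the three consistency levels."""
    ONE = "ONE"
    QUORUM = "QUORUM"
    ALL = "ALL"


def required_acks(level: Level, replication_factor: int) -> int:
    """Return how many replicas must acknowledge a request."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if level is Level.ONE:
        return 1
    if level is Level.QUORUM:
        # Assumption: replication_factor / 2 + 1 uses integer division
        return replication_factor // 2 + 1
    return replication_factor  # ALL: every replica must acknowledge


# Print the required acknowledgements for a few replication factors
for factor in (1, 3, 5):
    print(factor, {lvl.name: required_acks(lvl, factor) for lvl in Level})
```

With a replication factor of 3, QUORUM needs 2 acknowledgements, so a request can still succeed while one replica is unavailable, whereas ALL cannot.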

The consistency_level can be specified at query time:

# Get an object by ID, with consistency level ONE
curl "http://localhost:8080/v1/objects/{ClassName}/{id}?consistency_level=ONE"
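If you build such request URLs in code, a small helper can keep the consistency_level parameter valid. The sketch below uses only the Python standard library and follows the URL pattern of the curl call above. The object_url function and VALID_LEVELS set are hypothetical names, and the default of QUORUM is an assumption based on the default for v1.18 and later.

```python
from urllib.parse import quote, urlencode

# The three consistency levels described on this page
VALID_LEVELS = {"ONE", "QUORUM", "ALL"}


def object_url(base_url, class_name, object_id, consistency_level="QUORUM"):
    """Build the get-object-by-ID URL with a consistency_level parameter."""
    if consistency_level not in VALID_LEVELS:
        raise ValueError(f"unknown consistency level: {consistency_level!r}")
    # Percent-encode the path segments so unusual names cannot break the URL
    path = f"/v1/objects/{quote(class_name, safe='')}/{quote(object_id, safe='')}"
    query = urlencode({"consistency_level": consistency_level})
    return f"{base_url.rstrip('/')}{path}?{query}"


print(object_url(
    "http://localhost:8080",
    "Article",
    "36ddd591-2dee-4e7e-a3cc-eb86d30a4303",
    "ONE",
))
```

Rejecting unknown levels before sending the request surfaces typos such as "QUOROM" on the client side rather than relying on the server's error response.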

Note:

In v1.17, only read queries that get data by ID had a tunable consistency level. All other object-specific REST endpoints (read or write) used the consistency level ALL. Starting with v1.18, all write and read queries are tunable to either ONE, QUORUM (default) or ALL. GraphQL endpoints use the consistency level ONE (in both versions).

from weaviate.classes.config import ConsistencyLevel

# Assumes `client` is a connected Weaviate client and
# `collection_name` holds the name of an existing collection
questions = client.collections.get(collection_name).with_consistency_level(
    consistency_level=ConsistencyLevel.QUORUM
)
response = questions.query.fetch_object_by_id("36ddd591-2dee-4e7e-a3cc-eb86d30a4303")

# The parameter passed to `with_consistency_level` can be one of:
# * ConsistencyLevel.ALL,
# * ConsistencyLevel.QUORUM (default), or
# * ConsistencyLevel.ONE.
#
# It determines how many replicas must acknowledge a request
# before it is considered successful.

# fetch_object_by_id returns a single object, or None if the ID is not found
if response is not None:
    print(response.properties)  # Inspect the returned object

Questions and feedback

If you have any questions or feedback, let us know in the user forum.