Replication

Weaviate instances can be replicated to increase availability and read throughput, and to enable zero-downtime upgrades. On this page, you will learn how to configure replication for your Weaviate instance.

For more about how replication is designed and built in Weaviate, see the Replication Architecture pages.

How to configure

Replication is disabled by default and can be enabled per data class in the collection configuration. This means you can set different replication factors per class in your dataset. To enable replication on a class, the replication factor has to be set, which looks like the following:

{
  "class": "ExampleClass",
  "properties": [
    {
      "name": "exampleProperty",
      "dataType": [
        "text"
      ]
    }
  ],
  "replicationConfig": {
    "factor": 3
  }
}

Here, factor is an integer (default 1) that sets how many copies of this class will be stored.

Here's an example using the Python client:

import weaviate
from weaviate.classes.config import Property, DataType, Configure

client = weaviate.connect_to_local(port=8180, grpc_port=50151)

try:
    articles = client.collections.create(
        name="Article",
        properties=[
            Property(name="title", data_type=DataType.TEXT)
        ],
        replication_config=Configure.replication(factor=3)
    )

finally:
    client.close()

If you set this replication factor in the data schema before you add data, 3 replicas of the data will be stored. You can also change the setting after the data has been imported. Weaviate then copies the data to the new replica nodes, provided the cluster has enough nodes.

Note:

Changing the replication factor after adding data is an experimental feature as of v1.17 and will become more stable in the future.

The data schema has a write consistency level of ALL, which means when you upload or update a schema, this will be sent to ALL nodes (via a coordinator node). The coordinator node waits for a successful acknowledgement from ALL nodes before sending a success message back to the client. This ensures a highly consistent schema in your distributed Weaviate setup.
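
As a rough illustration of the ALL behavior described above, the following Python sketch models a coordinator that reports success only after every node has acknowledged a schema change. It is a toy model, not Weaviate's implementation: the function `coordinate_schema_write`, the `send` callback and the node names are all hypothetical.

```python
from typing import Callable, Iterable


def coordinate_schema_write(nodes: Iterable[str], send: Callable[[str], bool]) -> bool:
    """Forward a schema change to every node and succeed only if ALL acknowledge.

    `send` is a hypothetical callback that delivers the change to one node
    and returns True when that node acknowledges it.
    """
    # Contact every node, recording which ones acknowledged the change.
    acks = {node: send(node) for node in nodes}
    failed = [node for node, ok in acks.items() if not ok]
    # Consistency level ALL: a single missing acknowledgement fails the write.
    return not failed


# Hypothetical three-node cluster in which node-2 does not acknowledge.
nodes = ["node-1", "node-2", "node-3"]
all_ok = coordinate_schema_write(nodes, lambda node: True)
one_down = coordinate_schema_write(nodes, lambda node: node != "node-2")
```

In this model, `all_ok` is True and `one_down` is False, which mirrors the idea that a schema change only succeeds when every node has accepted it.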

How to use: Queries

When you add (write) or query (read) data, one or more replica nodes in the cluster will respond to the request. How many nodes need to send a successful response and acknowledgement to the coordinator node depends on the consistency_level. Available consistency levels are ONE, QUORUM (replication_factor / 2 + 1, using integer division) and ALL. For example, with a replication factor of 3, QUORUM requires 2 nodes to respond.
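
To make the three levels concrete, here is a minimal Python sketch that computes how many replica acknowledgements each level needs for a given replication factor. The `Level` enum and the `required_acks` helper are hypothetical names used only for illustration; they are not part of the Weaviate client.

```python
from enum import Enum


class Level(Enum):
    """Illustrative stand-in for the three consistency levels."""
    ONE = "ONE"
    QUORUM = "QUORUM"
    ALL = "ALL"


def required_acks(level: Level, replication_factor: int) -> int:
    """Return how many replicas must respond successfully for a request."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if level is Level.ONE:
        return 1
    if level is Level.QUORUM:
        # Integer division: a strict majority of the replicas.
        return replication_factor // 2 + 1
    # Level.ALL: every replica must respond.
    return replication_factor


# With a replication factor of 3, QUORUM needs 2 of the 3 replicas.
quorum_for_three = required_acks(Level.QUORUM, 3)
```

Because a quorum is a strict majority, any two quorums of the same replica set share at least one node.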

The consistency_level can be specified at query time:

# Get an object by ID, with consistency level ONE
curl "http://localhost:8080/v1/objects/{ClassName}/{id}?consistency_level=ONE"

Note:

In v1.17, only read queries that get data by ID had a tunable consistency level. All other object-specific REST endpoints (read or write) used the consistency level ALL. Starting with v1.18, all write and read queries are tunable to either ONE, QUORUM (default) or ALL. GraphQL endpoints use the consistency level ONE (in both versions).

With the Python client, the same request looks like this:

import weaviate
from weaviate.classes.config import ConsistencyLevel

client = weaviate.connect_to_local(port=8180, grpc_port=50151)

try:
    # with_consistency_level returns a new collection object, so keep the result
    articles = client.collections.get("Article").with_consistency_level(ConsistencyLevel.ONE)
    response = articles.query.fetch_object_by_id(uuid="36ddd591-2dee-4e7e-a3cc-eb86d30a4303")

finally:
    client.close()