Scalability is one of Weaviate’s core features. The following roadmap aims to give you an understanding of where we are taking Weaviate from a scalability and implementation perspective.
Object and inverted-index storage within Weaviate has been migrated from a B+Tree-based approach to an LSM-Tree-based approach. This can speed up import times by up to 50% and also addresses import times degrading over time.
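As an illustration of why this helps (a minimal sketch with hypothetical types, not Weaviate's actual storage engine), an LSM-style store absorbs writes in an in-memory memtable and periodically flushes them as immutable, sorted segments, instead of rewriting existing on-disk pages in place the way a B+Tree update would:

```go
// Illustrative only: recent writes land in a memtable and are flushed as
// sorted, immutable segments, so import speed does not degrade as data grows.
package main

import (
	"fmt"
	"sort"
)

type entry struct{ key, value string }

type lsmStore struct {
	memtable map[string]string // recent writes, held in memory
	segments [][]entry         // immutable, sorted segments (stand-in for disk)
	maxMem   int               // flush threshold
}

func (s *lsmStore) put(key, value string) {
	s.memtable[key] = value
	if len(s.memtable) >= s.maxMem {
		s.flush()
	}
}

// flush turns the current memtable into one sorted segment and resets it.
func (s *lsmStore) flush() {
	seg := make([]entry, 0, len(s.memtable))
	for k, v := range s.memtable {
		seg = append(seg, entry{k, v})
	}
	sort.Slice(seg, func(i, j int) bool { return seg[i].key < seg[j].key })
	s.segments = append(s.segments, seg)
	s.memtable = map[string]string{}
}

func main() {
	store := &lsmStore{memtable: map[string]string{}, maxMem: 2}
	for i := 0; i < 5; i++ {
		store.put(fmt.Sprintf("object-%d", i), "payload")
	}
	fmt.Printf("segments flushed: %d, writes still in memory: %d\n",
		len(store.segments), len(store.memtable))
}
```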
Multi-shard indices
status: done, to be released with next milestone
A monolithic index (one index per class) can be broken up into smaller, independent shards. This makes better use of the resources of a large (single) machine and allows storage settings to be tuned for specific large-scale cases.
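As a rough sketch of the idea (the hash function and shard count below are illustrative assumptions, not Weaviate's actual sharding scheme), each object can be routed to exactly one shard by hashing its ID, so every shard can be built and tuned independently:

```go
// Illustrative shard routing: hash the object ID to pick a shard.
package main

import (
	"fmt"
	"hash/fnv"
)

func shardFor(objectID string, shardCount int) int {
	h := fnv.New32a()
	h.Write([]byte(objectID))
	return int(h.Sum32()) % shardCount
}

func main() {
	const shards = 4 // e.g. one shard per CPU core on a large single machine
	for _, id := range []string{"article-1", "article-2", "article-3"} {
		fmt.Printf("%s -> shard %d\n", id, shardFor(id, shards))
	}
}
```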
Horizontal Scalability without replication
status: done, released in v1.8.0
An index, comprised of many shards, can be distributed among multiple nodes. A search will touch multiple shards on multiple nodes and combine the results. Major benefit: If a use case does not fit on a single node, you can use *n* nodes to achieve *n* times the use case size. At this point every node in the cluster is still a potential single point of failure.
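The sketch below illustrates this scatter-gather pattern with hypothetical types (it is not Weaviate's actual query path): the query fans out to all shards concurrently, and the partial results are merged and re-sorted by score before being returned:

```go
// Illustrative scatter-gather search across shards that may live on
// different nodes: fan out, collect partial results, merge, re-rank.
package main

import (
	"fmt"
	"sort"
	"sync"
)

type hit struct {
	id    string
	score float64
}

type shard interface {
	search(query string, limit int) []hit
}

type fakeShard struct{ hits []hit }

func (s fakeShard) search(query string, limit int) []hit { return s.hits }

func searchAll(shards []shard, query string, limit int) []hit {
	var mu sync.Mutex
	var wg sync.WaitGroup
	var merged []hit
	for _, sh := range shards {
		wg.Add(1)
		go func(sh shard) { // each shard is searched concurrently
			defer wg.Done()
			local := sh.search(query, limit)
			mu.Lock()
			merged = append(merged, local...)
			mu.Unlock()
		}(sh)
	}
	wg.Wait()
	// Combine the partial results into one globally ranked result set.
	sort.Slice(merged, func(i, j int) bool { return merged[i].score > merged[j].score })
	if len(merged) > limit {
		merged = merged[:limit]
	}
	return merged
}

func main() {
	shards := []shard{
		fakeShard{hits: []hit{{"a", 0.9}, {"b", 0.4}}},
		fakeShard{hits: []hit{{"c", 0.7}}},
	}
	fmt.Println(searchAll(shards, "query", 2))
}
```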
Replication
A node can contain shards that are also present on other nodes. This means that if a node goes down, another node can take over its load without any loss of availability or data. Note that the design calls for leaderless replication, so there is no distinction between primary and secondary shards. This removes all single points of failure.
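A minimal sketch of the leaderless idea, using hypothetical node types rather than Weaviate's actual design: a write goes to every reachable replica and a read can be served by any of them, so losing a single node costs neither availability nor data:

```go
// Illustrative leaderless replication: no primary, any replica can serve reads.
package main

import "fmt"

type node struct {
	name string
	data map[string]string
	up   bool
}

// write stores the object on every replica that is currently reachable.
func write(replicas []*node, key, value string) {
	for _, n := range replicas {
		if n.up {
			n.data[key] = value
		}
	}
}

// read returns the object from the first reachable replica; there is no
// designated primary, so any surviving replica can answer.
func read(replicas []*node, key string) (string, bool) {
	for _, n := range replicas {
		if n.up {
			if v, ok := n.data[key]; ok {
				return v, true
			}
		}
	}
	return "", false
}

func main() {
	replicas := []*node{
		{name: "node-1", data: map[string]string{}, up: true},
		{name: "node-2", data: map[string]string{}, up: true},
		{name: "node-3", data: map[string]string{}, up: true},
	}
	write(replicas, "article-1", "payload")
	replicas[0].up = false // node-1 goes down
	if _, ok := read(replicas, "article-1"); ok {
		fmt.Println("object still available despite node-1 being down")
	}
}
```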
Dynamic scaling
status: pending
Instead of starting out with a fixed cluster of *n* nodes, the cluster can be grown or shrunk at runtime, and Weaviate automatically redistributes the existing shards accordingly.
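As a rough illustration only (round-robin placement is an assumption here, not necessarily Weaviate's rebalancing strategy), redistributing shards means recomputing the shard-to-node assignment over whatever nodes currently exist:

```go
// Illustrative rebalancing: reassign a fixed set of shards whenever the
// set of nodes changes at runtime.
package main

import "fmt"

// assign maps shard IDs onto the current set of nodes round-robin.
func assign(shards []string, nodes []string) map[string]string {
	placement := make(map[string]string, len(shards))
	for i, shard := range shards {
		placement[shard] = nodes[i%len(nodes)]
	}
	return placement
}

func main() {
	shards := []string{"shard-0", "shard-1", "shard-2", "shard-3"}

	fmt.Println("2 nodes:", assign(shards, []string{"node-1", "node-2"}))
	// The cluster grows at runtime: the same shards spread across three nodes.
	fmt.Println("3 nodes:", assign(shards, []string{"node-1", "node-2", "node-3"}))
}
```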