Persistence & Backups


💡 You are looking at older or release candidate documentation. The current Weaviate version is v1.15.1.


Because Weaviate is run using Docker or Kubernetes, you can create a backup of your data by mounting a volume to store the data outside of the containers. When restarting a Weaviate instance, the data from the mounted volume is used to restore the dataset.

Docker Compose

Creating backups is a two-step process. First, make the setup persistent. Second, create backups by copying the folder outside the container that holds the Weaviate DB.


When running Weaviate with docker-compose, you can set the volumes variable under the weaviate service and a unique cluster hostname as an environment variable.

    volumes:
      - /var/weaviate:/var/lib/weaviate
    environment:
      CLUSTER_HOSTNAME: 'node1'
  • About the volumes
    • /var/weaviate is the location where you want to store the data on the local machine
    • /var/lib/weaviate (after the colon) is the location inside the container, don’t change this
  • About the hostname
    • The CLUSTER_HOSTNAME can be any arbitrarily chosen name

If you want more verbose output, set the LOG_LEVEL environment variable:

      LOG_LEVEL: 'debug'

A complete example of a Weaviate setup without modules, but with an externally mounted volume and more verbose output:

version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:v1.15.1
    ports:
    - 8080:8080
    restart: on-failure:0
    volumes:
      - /var/weaviate:/var/lib/weaviate # <== set a volume here
    environment:
      LOG_LEVEL: 'debug' # <== more verbose output
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      CLUSTER_HOSTNAME: 'node1' # <== this can be set to an arbitrary name


The folder you chose for the external Docker volume holds the Weaviate DB. To back it up, copy the folder and store the copy somewhere safe.

For example:

$ mkdir -p /var/weaviate.BAK
$ cp -a /var/weaviate/. /var/weaviate.BAK/

Running vs. stopped instance

  • Ideally, the setup should be stopped first (docker-compose down), because an orderly shutdown will flush everything to disk and make sure it can be read easily.
  • If you create a backup from a running setup, no data is lost, but not all segments have been flushed yet. The next startup therefore recovers the data from an active commit log and logs the message “did Weaviate crash? Trying to recover”. Startup is then slightly slower than after an orderly shutdown.
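
The stop, copy, and restart sequence can be wrapped in a small helper. The following is a minimal sketch, not an official Weaviate tool: the function name backup_weaviate_dir and the timestamped folder naming are assumptions made for illustration, and the paths in the usage comment are placeholders.

```shell
# Copy a Weaviate data directory into a timestamped backup folder.
# backup_weaviate_dir is a hypothetical helper name, not part of Weaviate.
# Usage: backup_weaviate_dir <data-dir> <backup-root>
# Prints the path of the new backup folder on success.
backup_weaviate_dir() (
  src="$1"
  dest_root="$2"
  [ -d "$src" ] || { echo "no such directory: $src" >&2; exit 1; }
  dest="$dest_root/weaviate-$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$dest" || exit 1
  # -a preserves permissions and timestamps; "/." copies the contents,
  # including hidden files, instead of nesting the directory itself.
  cp -a "$src/." "$dest/" || exit 1
  echo "$dest"
)

# Example with a stopped instance (paths are placeholders):
#   docker-compose down
#   backup_weaviate_dir /var/weaviate /var/backups
#   docker-compose up -d
```

The function body runs in a subshell, so its variables and the exit calls do not affect the calling shell.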


For a Kubernetes setup, the only thing to bear in mind is that Weaviate needs PersistentVolumes, provisioned through PersistentVolumeClaims. The Helm chart is already configured to store the data on an external volume.

Disk Pressure Warnings and Limits

Starting with v1.12.0 there are two levels of disk usage notifications and actions configured through environment variables. Both variables are optional. If not set they default to the values outlined below:

Variable | Default Value | Description
DISK_USE_WARNING_PERCENTAGE | 80 | If disk usage is higher than the given percentage, a warning is logged by all shards on the affected node’s disk.
DISK_USE_READONLY_PERCENTAGE | 90 | If disk usage is higher than the given percentage, all shards on the affected node are marked as READONLY, meaning all future write requests will fail.
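
To make the two thresholds concrete, the sketch below classifies a usage percentage the way the descriptions above read: strictly higher than the warning value gives a warning, strictly higher than the read-only value gives READONLY. Both helper names are hypothetical. How Weaviate itself measures disk usage is not covered here, so reading the value from df is only an assumption that is handy for monitoring from outside.

```shell
# Classify a disk usage percentage against the two thresholds.
# disk_pressure_state is a hypothetical helper, not part of Weaviate; its
# defaults mirror DISK_USE_WARNING_PERCENTAGE and DISK_USE_READONLY_PERCENTAGE.
# Usage: disk_pressure_state <used-percent> [warning-percent] [readonly-percent]
disk_pressure_state() (
  used="$1"
  warn="${2:-80}"
  readonly_at="${3:-90}"
  if [ "$used" -gt "$readonly_at" ]; then
    echo "readonly"
  elif [ "$used" -gt "$warn" ]; then
    echo "warning"
  else
    echo "ok"
  fi
)

# Read the usage of the filesystem that holds a path as an integer percentage.
# Assumes the filesystem name contains no spaces.
disk_used_percent() (
  df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
)

# Example (the path is a placeholder):
#   disk_pressure_state "$(disk_used_percent /var/weaviate)"
```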

If a shard was marked READONLY due to disk pressure and you want to mark the shard as ready again (either because you have made more space available or changed the thresholds) you can use the Shards API to do so.
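
As a rough illustration, the Shards API can be called with curl. This sketch assumes the REST endpoints GET /v1/schema/{class}/shards and PUT /v1/schema/{class}/shards/{shard}. The helper names are hypothetical, and the base URL, class name, and shard name are placeholders to replace with your own values.

```shell
# List the shards of a class together with their current status.
# Usage: list_shards <base-url> <class-name>
list_shards() (
  curl -sS "$1/v1/schema/$2/shards"
)

# Mark a single shard as READY again.
# mark_shard_ready is a hypothetical helper name, not part of Weaviate.
# Usage: mark_shard_ready <base-url> <class-name> <shard-name>
mark_shard_ready() (
  base_url="$1"
  class_name="$2"
  shard_name="$3"
  curl -sS -X PUT \
    -H 'Content-Type: application/json' \
    -d '{"status": "READY"}' \
    "$base_url/v1/schema/$class_name/shards/$shard_name"
)

# Example (all three values are placeholders):
#   list_shards http://localhost:8080 Article
#   mark_shard_ready http://localhost:8080 Article <shard-name>
```

Free up disk space or raise the thresholds before marking a shard READY, since otherwise it may go back to READONLY.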

More Resources

If you can’t find the answer to your question here, try one of the following:

  1. The Frequently Asked Questions.
  2. The knowledge base of old issues.
  3. Stackoverflow, for questions.
  4. Github, for issues.
  5. The Slack channel, to ask your question directly.