Docker Compose

Weaviate can be run with Docker Compose. This setup is well suited to developing locally or in the cloud. You can use the configurator to tailor the setup to your needs.


Configurator

You can use the configuration tool below to customize your Weaviate setup for your desired runtime.

Configuration

💡 If you are new to Docker (Compose) and containerization, check out our Docker Introduction for Weaviate Users.

To start Weaviate with docker-compose, you need a docker-compose configuration file. You can obtain it from the configuration tool above or pick one of the examples below. You can set additional environment variables in this file to control your Weaviate setup, authentication and authorization, module settings, and data storage settings.

Persistent Volume

It’s recommended to set a persistent volume to avoid data loss and improve reading and writing speeds.

Add the following snippet to your Docker Compose YAML file:

services:
  weaviate:
    volumes:
      - /var/weaviate:/var/lib/weaviate
    # etc

Make sure to run $ docker-compose down when shutting down. This writes all the files from memory to disk.

Weaviate without any modules

An example docker-compose setup for Weaviate without any modules can be found below. In this case, no model inference is performed at either import or search time. You will need to provide your own vectors (e.g. from an outside ML model) at import and search time:

version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.14.0
    ports:
    - 8080:8080
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node1'
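
Because no vectorizer module is enabled in this setup, every object and every vector search has to carry its own vector. The following is a minimal Python sketch of what that could look like against the REST endpoint /v1/objects and the GraphQL endpoint /v1/graphql. The class name Article, the property title, the two-dimensional example vectors, and the helper names are assumptions made for illustration only; they do not come from this page.

```python
import json
import urllib.request


def build_object_payload(class_name, properties, vector):
    """Build the JSON body for POST /v1/objects with a user-supplied vector."""
    return {
        "class": class_name,
        "properties": properties,
        "vector": [float(x) for x in vector],
    }


def build_near_vector_query(class_name, fields, vector, limit=5):
    """Build a GraphQL Get query that searches by a user-supplied vector."""
    vector_literal = json.dumps([float(x) for x in vector])
    return "{ Get { %s(nearVector: {vector: %s}, limit: %d) { %s } } }" % (
        class_name,
        vector_literal,
        limit,
        " ".join(fields),
    )


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With Weaviate listening on localhost:8080, as in the compose file above, an import could be sent with post_json("http://localhost:8080/v1/objects", build_object_payload("Article", {"title": "Hello"}, [0.1, 0.2])), and a search with post_json("http://localhost:8080/v1/graphql", {"query": build_near_vector_query("Article", ["title"], [0.1, 0.2])}). In practice the vectors would come from your own model and share one fixed dimensionality.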

Weaviate with the text2vec-transformers module

An example docker-compose setup file with the transformers model sentence-transformers/msmarco-distilroberta-base-v2 is:

version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.14.0
    restart: on-failure:0
    ports:
     - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
      CLUSTER_HOSTNAME: 'node1'
  t2v-transformers:
    image: semitechnologies/transformers-inference:sentence-transformers-msmarco-distilroberta-base-v2
    environment:
      ENABLE_CUDA: 0 # set to 1 to enable
      # NVIDIA_VISIBLE_DEVICES: all # enable if running with CUDA

Note that transformer models are neural networks built to run on GPUs. Running Weaviate with the text2vec-transformers module without a GPU is possible, but it will be slower. Enable CUDA if you have a GPU available (ENABLE_CUDA=1).

For more information on how to set up the environment with the text2vec-transformers module, see this page.

The text2vec-transformers module requires at least Weaviate version v1.2.0.

Attaching to the log output of Weaviate only

The output of docker-compose up is quite verbose as it attaches to the logs of all containers.

You can attach to the logs of Weaviate only, for example, by running the following command instead of docker-compose up:

# Run Docker Compose
$ docker-compose up -d && docker-compose logs -f weaviate

Alternatively, you can run docker-compose entirely detached with docker-compose up -d and then poll {bindaddress}:{port}/v1/meta until you receive a status 200 OK.
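
The polling step can be scripted. The following is a minimal Python sketch, assuming Weaviate is reachable at http://localhost:8080 (matching the port mapping in the examples above). The function names, the timeout, and the polling interval are illustrative choices, not part of Weaviate.

```python
import time
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status code of a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 503).
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout, etc.
        return None


def wait_for_weaviate(base_url="http://localhost:8080", timeout_s=60.0,
                      interval_s=1.0, get_status=http_status):
    """Poll {base_url}/v1/meta until it returns 200; False if the timeout expires."""
    url = base_url.rstrip("/") + "/v1/meta"
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status(url) == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

After docker-compose up -d, calling wait_for_weaviate() blocks until the instance answers with 200 OK or the timeout is reached. The get_status parameter exists only so the loop can be exercised without a running container.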

Environment variables

An overview of environment variables in the docker-compose file:

| Variable | Description | Type | Example Value |
| --- | --- | --- | --- |
| GOMEMLIMIT | Set the memory limit for the Go runtime. This should match your available memory. The Go runtime tries to make sure that long-lived and temporary memory allocations do not exceed this value by making the garbage collector more aggressive as the memory usage approaches the limit. Learn more about GOMEMLIMIT. | string - memory limit in SI units | 4096MiB |
| ORIGIN | Set the http(s) origin for Weaviate | string - HTTP origin | https://my-weaviate-deployment.com |
| CONTEXTIONARY_URL | Service discovery for the contextionary container | string - URL | http://contextionary |
| PERSISTENCE_DATA_PATH | Where Weaviate Standalone should store its data | string - file path | /var/lib/weaviate |
| ENABLE_MODULES | Which modules to enable in the setup | string | text2vec-contextionary |
| TRANSFORMERS_INFERENCE_API | The endpoint at which to reach the transformers module, if enabled | string | http://t2v-transformers:8080 |
| DEFAULT_VECTORIZER_MODULE | Default vectorizer module, so this doesn’t need to be defined per class in the schema | string | text2vec-contextionary |
| AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED | Allow users to interact with Weaviate without auth | string - true/false | true |
| AUTHENTICATION_OIDC_ENABLED | Enable OIDC Auth | string - true/false | false |
| AUTHENTICATION_OIDC_ISSUER | OIDC Token Issuer | string - URL | https://myissuer.com |
| AUTHENTICATION_OIDC_CLIENT_ID | OIDC Client ID | string | my-client-id |
| AUTHENTICATION_OIDC_USERNAME_CLAIM | OIDC Username Claim | string | email |
| AUTHENTICATION_OIDC_GROUPS_CLAIM | OIDC Groups Claim | string | groups |
| AUTHORIZATION_ADMINLIST_ENABLED | Enable AdminList Authorization mode | string - true/false | true |
| AUTHORIZATION_ADMINLIST_USERS | Users with admin permission | string - comma-separated list | jane@example.com,john@example.com |
| AUTHORIZATION_ADMINLIST_READONLY_USERS | Users with read-only permission | string - comma-separated list | alice@example.com,dave@example.com |
| DISK_USE_WARNING_PERCENTAGE | If disk usage is higher than the given percentage, a warning will be logged by all shards on the affected node’s disk. See Disk Pressure Warnings and Limits for details. | string - number | 80 |
| DISK_USE_READONLY_PERCENTAGE | If disk usage is higher than the given percentage, all shards on the affected node will be marked as READONLY, meaning all future write requests will fail. See Disk Pressure Warnings and Limits for details. | string - number | 90 |
| PROMETHEUS_MONITORING_ENABLED | If set, Weaviate will collect metrics in a Prometheus-compatible format | string - true/false | false |
| BACKUP_* | Various configuration variables for backup provider modules. They are outlined in detail on the Backups page. | | |
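
To illustrate how the two disk thresholds relate to each other, here is a small Python sketch that classifies a disk usage percentage the way the table describes. It is not Weaviate's implementation. The function names and the state labels are hypothetical, and the fallback values 80 and 90 are simply the example values from the table.

```python
import os
import shutil


def disk_pressure_state(used_percent, warning_percent=80.0, readonly_percent=90.0):
    """Classify disk usage as 'ok', 'warning', or 'readonly' (labels are illustrative)."""
    if used_percent > readonly_percent:
        return "readonly"  # above DISK_USE_READONLY_PERCENTAGE: writes would fail
    if used_percent > warning_percent:
        return "warning"   # above DISK_USE_WARNING_PERCENTAGE: a warning is logged
    return "ok"


def check_path(path, env=os.environ):
    """Read both thresholds from the environment and classify the disk holding path."""
    warning = float(env.get("DISK_USE_WARNING_PERCENTAGE", "80"))
    readonly = float(env.get("DISK_USE_READONLY_PERCENTAGE", "90"))
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    return disk_pressure_state(used_percent, warning, readonly)
```

For example, check_path("/var/lib/weaviate") would classify the disk that holds the data path used in the examples above. Both comparisons are strict, following the "higher than" wording in the table.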

More Resources

If you can’t find the answer to your question here, please look at the:

  1. Frequently Asked Questions. Or,
  2. Knowledge base of old issues. Or,
  3. For questions: Stackoverflow. Or,
  4. For issues: Github. Or,
  5. Ask your question in the Slack channel: Slack.