Installation


💡 You are looking at older or release candidate documentation. The current Weaviate version is v1.15.2.

Weaviate is completely containerized; you can run it with Docker Compose, Kubernetes, or both.


Introduction

There are multiple ways to set up a Weaviate instance. For a try-out setup, we recommend you start with docker-compose. Cloud deployment can be used for small and larger projects. For production setup and/or large scale projects, we encourage you to use Kubernetes.

Docker Compose

If you want to try out Weaviate locally and on a small scale, you can use Docker Compose.

To start Weaviate with docker-compose, you need a docker-compose configuration file. In this file you can set environment variables that control your Weaviate setup, authentication and authorization, module settings, and data storage settings.

Configuration tool

Configure the docker-compose setup file. You can retrieve a docker-compose.yml file from https://configuration.semi.technology. Use the drop-down menus to generate the URL with parameters, and run the curl command to retrieve the file. At the moment, it is not yet possible to turn off the text2vec-contextionary module. This will be supported in the near future, so that you can add custom vectorizers and other modules.

    
  

Next, you can run the setup as follows:

$ docker-compose up

Notes:

  • Default parameters can be omitted.
  • text2vec-contextionary languages other than English are experimental. Any feedback? Share it with us on GitHub or Stack Overflow.
  • For more information about Compound Splitting and other Contextionary parameters, click here.
  • You can modify the configuration file to add, for example, authentication or authorization.

Environment variables

An overview of environment variables in the docker-compose file:

Variable | Description | Type | Example Value
--- | --- | --- | ---
ORIGIN | Set the http(s) origin for Weaviate | string - HTTP origin | https://my-weaviate-deployment.com
CONTEXTIONARY_URL | Service-Discovery for the contextionary container | string - URL | http://contextionary
PERSISTENCE_DATA_PATH | Where should Weaviate Standalone store its data? | string - file path | /var/lib/weaviate
DEFAULT_VECTORIZER_MODULE | Default vectorizer module, so this doesn't need to be defined per class in the schema | string | text2vec-contextionary
AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED | Allow users to interact with Weaviate without auth | string - true/false | true
AUTHENTICATION_OIDC_ENABLED | Enable OIDC Auth | string - true/false | false
AUTHENTICATION_OIDC_ISSUER | OIDC Token Issuer | string - URL | https://myissuer.com
AUTHENTICATION_OIDC_CLIENT_ID | OIDC Client ID | string | my-client-id
AUTHENTICATION_OIDC_USERNAME_CLAIM | OIDC Username Claim | string | email
AUTHENTICATION_OIDC_GROUPS_CLAIM | OIDC Groups Claim | string | groups
AUTHORIZATION_ADMINLIST_ENABLED | Enable AdminList Authorization mode | string - true/false | true
AUTHORIZATION_ADMINLIST_USERS | Users with admin permission | string - comma-separated list | jane@example.com,john@example.com
AUTHORIZATION_ADMINLIST_READONLY_USERS | Users with read-only permission | string - comma-separated list | alice@example.com,dave@example.com
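
As an illustration of how these variables fit together, the sketch below writes a Compose override file that turns off anonymous access and enables OIDC with AdminList authorization, reusing the example values from the overview. It is not a definitive setup: the file name relies on Compose's convention of reading docker-compose.override.yml next to docker-compose.yml, the service name weaviate and the version '3.4' are assumptions that must match your own docker-compose.yml, and the issuer, client ID, and user lists are placeholders.

```shell
#!/usr/bin/env bash
# Write a Compose override file that switches the "weaviate" service from
# anonymous access to OIDC plus AdminList authorization.
# Assumptions: the service is named "weaviate", and the version key matches
# the one in your docker-compose.yml ('3.4' is a placeholder).
cat > docker-compose.override.yml <<'EOF'
version: '3.4'
services:
  weaviate:
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'false'
      AUTHENTICATION_OIDC_ENABLED: 'true'
      AUTHENTICATION_OIDC_ISSUER: 'https://myissuer.com'
      AUTHENTICATION_OIDC_CLIENT_ID: 'my-client-id'
      AUTHENTICATION_OIDC_USERNAME_CLAIM: 'email'
      AUTHENTICATION_OIDC_GROUPS_CLAIM: 'groups'
      AUTHORIZATION_ADMINLIST_ENABLED: 'true'
      AUTHORIZATION_ADMINLIST_USERS: 'jane@example.com,john@example.com'
      AUTHORIZATION_ADMINLIST_READONLY_USERS: 'alice@example.com,dave@example.com'
EOF

echo "Wrote docker-compose.override.yml"
```

Running docker-compose config in the same directory prints the merged configuration, which is a quick way to confirm that the variables ended up on the intended service before starting it.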

Attaching to the log output of only Weaviate

The output is quite verbose. You can attach to the log output of Weaviate only, for example by running the following command instead of docker-compose up:

# Run Docker Compose
$ docker-compose up -d && docker-compose logs -f weaviate

Alternatively, you can run docker-compose entirely detached with docker-compose up -d and poll {bindaddress}:{port}/v1/meta until you receive status 200 OK.
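
The polling approach can be sketched as a small shell function. This is a minimal sketch, not an official helper: the function name wait_for_weaviate, the default address http://localhost:8080, and the retry settings are assumptions, so substitute your own {bindaddress}:{port}.

```shell
#!/usr/bin/env bash
# Poll Weaviate's /v1/meta endpoint until it answers with HTTP 200.
# Arguments (all with assumed defaults): base URL, max attempts, delay in seconds.
wait_for_weaviate() {
  local base_url="${1:-http://localhost:8080}"
  local max_attempts="${2:-30}"
  local delay="${3:-2}"
  local attempt status

  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # -s silences progress, -o discards the body, -w prints only the status code.
    status="$(curl -s -o /dev/null -w '%{http_code}' "${base_url}/v1/meta" || true)"
    if [ "$status" = "200" ]; then
      echo "Weaviate is ready after ${attempt} attempt(s)."
      return 0
    fi
    sleep "$delay"
  done

  echo "Weaviate did not become ready after ${max_attempts} attempts." >&2
  return 1
}

# Example: docker-compose up -d && wait_for_weaviate "http://localhost:8080"
```

The function returns a non-zero exit code when the endpoint never answers with 200, so it can gate later steps in a script.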

Manual installation

You can also download the files manually if you have trouble with the above script.

  1. $ mkdir weaviate && cd weaviate
  2. Save the docker-compose configuration file as docker-compose.yml.
  3. Run docker-compose up in the directory where you saved the file (or, for less verbose output, attach to the log output of only Weaviate).

Cloud deployment

Weaviate is available on Google Cloud Marketplace, where you can find more details on deployment on the cloud.

Weaviate Cloud Service

You can create a free Weaviate sandbox cluster that lasts for 5 days. You can try it out here and, if you do, we would love to hear your feedback.

Kubernetes

Note I: the Kubernetes setup is only for large scale deployments of Weaviate. If you want to work with smaller deployments, you can always use Docker Compose or deploy on the cloud.

Note II: tested up to Kubernetes 1.14.x

Note III: If you are running a very small setup, we advise using Docker Compose, but you can also use this minimal configuration.

To run Weaviate with Kubernetes, first check that Helm is installed and that the cluster is healthy:

# Check if helm is installed
$ helm version
# Check if pods are running properly
$ kubectl -n kube-system get pods

Get the Helm Chart

Get the Helm chart and configuration files.

# Set the Weaviate chart version
export CHART_VERSION="v12.0.0"
# Download Helm charts
wget https://github.com/semi-technologies/weaviate-helm/releases/download/$CHART_VERSION/weaviate.tgz
# Download configuration values
wget https://raw.githubusercontent.com/semi-technologies/weaviate-helm/$CHART_VERSION/weaviate/values.yaml

K8s configuration

In the values.yaml file you can tweak the configuration to align it with your setup. The file is extensively documented to help you do so.

Out of the box, the configuration file is set up for:

  • 1 Weaviate replica.
  • anonymous_access = enabled.
  • 3 esvector replicas.
  • 3 etcd replicas.

As a rule of thumb, you can:

  • increase Weaviate replicas if you have a high load.
  • increase esvector replicas if you have a high load and/or a lot of data.

Deploy

You can deploy the helm charts as follows:

# Create a Weaviate namespace
$ kubectl create namespace weaviate
# Deploy
$ helm upgrade \
  "weaviate" \
  weaviate.tgz \
  --install \
  --namespace "weaviate" \
  --values ./values.yaml
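
After the deploy command returns, it can help to check what was created in the namespace. The function below is a minimal sketch using standard kubectl calls; the name show_weaviate_resources is hypothetical, and the default namespace weaviate simply mirrors the one created above.

```shell
#!/usr/bin/env bash
# List the pods and services that the Helm release created.
# The namespace defaults to "weaviate", matching the deploy command above.
show_weaviate_resources() {
  local namespace="${1:-weaviate}"
  kubectl --namespace "$namespace" get pods
  kubectl --namespace "$namespace" get services
}

# Example: show_weaviate_resources weaviate
```

Pods that stay in a non-running state for a long time usually point to a values.yaml setting that does not fit the cluster, such as resource requests or storage.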

Additional Configuration Help

More Resources

If you can’t find the answer to your question here, please try one of the following:

  1. The Frequently Asked Questions.
  2. The knowledge base of old issues.
  3. For questions: Stack Overflow.
  4. For issues: GitHub.
  5. Ask your question in the Slack channel.