Modules

Weaviate's functionality can be customized by using modules. This page explains how to enable and configure modules.

Instance-level configuration

At the instance (i.e. Weaviate cluster) level, you can:

  • Enable modules
  • Configure the default vectorizer module
  • Configure module-specific variables (e.g. API keys), where applicable

This can be done by setting the appropriate environment variables as shown below.

What about WCD?

Weaviate Cloud (WCD) instances come with modules pre-configured. See this page for details.

Enable individual modules

You can enable modules by specifying the list of modules in the ENABLE_MODULES variable. For example, this code enables the text2vec-transformers module.

services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-transformers'

To enable multiple modules, add them in a comma-separated list.

This example code enables the text2vec-huggingface, generative-cohere, and qna-openai modules.

services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-huggingface,generative-cohere,qna-openai'

Enable all API-based modules

Experimental feature

Available starting in v1.26.0. This is an experimental feature. Use with caution.

You can enable all API-based modules by setting the ENABLE_API_BASED_MODULES variable to true. This enables every API-based model integration, such as those for Anthropic, Cohere, and OpenAI. These modules are lightweight, so enabling them all will not significantly increase resource usage.

services:
  weaviate:
    environment:
      ENABLE_API_BASED_MODULES: 'true'

The list of API-based modules can be found on the model provider integrations page. You can also inspect the source code where the list is defined.

This can be combined with enabling individual modules. For example, this code enables all API-based modules, the Ollama modules, and the backup-s3 module.

services:
  weaviate:
    environment:
      ENABLE_API_BASED_MODULES: 'true'
      ENABLE_MODULES: 'text2vec-ollama,generative-ollama,backup-s3'

Note that enabling multiple vectorizer (e.g. text2vec, multi2vec) modules will disable the Explore functionality. If you need to use Explore, you should only enable one vectorizer module.
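
As a sketch of a configuration that keeps Explore available, the fragment below enables exactly one vectorizer module next to modules that are not vectorizers. The particular modules chosen here are illustrative assumptions, not a recommendation.

```yaml
services:
  weaviate:
    environment:
      # One vectorizer (text2vec-cohere) plus two non-vectorizer modules.
      ENABLE_MODULES: 'text2vec-cohere,generative-cohere,backup-s3'
```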

Module-specific variables

Some modules need additional environment variables. For example, the backup-s3 module requires the backup S3 bucket to be set via BACKUP_S3_BUCKET, and the text2vec-transformers module requires the inference API location via TRANSFORMERS_INFERENCE_API.

Refer to the individual module documentation for more details.
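
As a minimal sketch of how an enabled module is paired with its own variable, the fragment below enables backup-s3 and sets BACKUP_S3_BUCKET. The bucket name my-weaviate-backups is a placeholder assumed for illustration; substitute your own.

```yaml
services:
  weaviate:
    environment:
      # Enable the backup-s3 module.
      ENABLE_MODULES: 'backup-s3'
      # Module-specific variable: the S3 bucket used for backups (placeholder name).
      BACKUP_S3_BUCKET: 'my-weaviate-backups'
```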

Vectorizer modules

The vectorization modules enable Weaviate to vectorize data at import, and to perform near<Media> searches such as nearText or nearImage.

List of available vectorizer (xxx2vec-xxx) modules

Can be found in this section.

Enable vectorizer modules

You can enable vectorizer modules by adding them to the ENABLE_MODULES environment variable. For example, this code enables the text2vec-cohere, text2vec-huggingface, and text2vec-openai vectorizer modules.

services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-cohere,text2vec-huggingface,text2vec-openai'

Default vectorizer module

You can specify a default vectorization module with the DEFAULT_VECTORIZER_MODULE variable as below.

If a default vectorizer module is not set, you must set a vectorizer in the schema before you can use near<Media> or vectorization at import time.

This code sets text2vec-huggingface as the default vectorizer. The text2vec-huggingface module is then used for any class that does not specify another vectorizer.

services:
  weaviate:
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
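
In practice the default vectorizer is usually set together with ENABLE_MODULES. The sketch below assumes that the default module should also appear in the list of enabled modules; the combination of modules is illustrative.

```yaml
services:
  weaviate:
    environment:
      # Enable two vectorizer modules (illustrative choice).
      ENABLE_MODULES: 'text2vec-huggingface,text2vec-openai'
      # Applies to classes that do not specify a vectorizer of their own.
      DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
```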

Generative modules

The generative modules enable generative search functions.

List of available generative (generative-xxx) modules

Can be found in this section.

Enable a generative module

You can enable generative modules by adding the desired module to the ENABLE_MODULES environment variable. For example, this code enables the generative-cohere module and the text2vec-huggingface vectorizer module.

services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-huggingface,generative-cohere'

Generative module selection unrelated to text2vec module selection

Your choice of the text2vec module does not restrict your choice of generative module, or vice versa.

Tenant offload modules

offload-s3 module

The offload-s3 module enables you to offload tenants to an S3 bucket.

To use the offload-s3 module, add offload-s3 to the ENABLE_MODULES environment variable.

services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-cohere,generative-cohere,offload-s3'

The offload-s3 module reads the following environment variables:

  • OFFLOAD_S3_BUCKET: The S3 bucket where INACTIVE tenants are offloaded.
    • The default is weaviate-offload.
    • If the bucket does not exist, and OFFLOAD_S3_BUCKET_AUTO_CREATE is set to true, Weaviate creates the bucket automatically.
  • OFFLOAD_S3_BUCKET_AUTO_CREATE: When true, Weaviate automatically creates an S3 bucket if it does not exist. The default is false.
  • OFFLOAD_S3_CONCURRENCY: The number of concurrent offload operations. The default is 25.
  • OFFLOAD_TIMEOUT: The timeout for offloading operations (create bucket, upload, download). The default is 120 seconds.
    • Offload operations are asynchronous. As a result, the timeout is for the operation to start, not to complete.
    • Each operation will retry up to 10 times on timeouts, except on authentication/authorization errors.

AWS permissions

The Weaviate instance must have the necessary permissions to access the S3 bucket.

  • The provided AWS identity must be able to write to the bucket.
  • If OFFLOAD_S3_BUCKET_AUTO_CREATE is set to true, the AWS identity must have permission to create the bucket.
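
The offload-s3 variables described above can be set together, as in the sketch below. The bucket name my-tenant-offload is a placeholder assumed for illustration, and the concurrency and timeout values simply restate the documented defaults.

```yaml
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-cohere,generative-cohere,offload-s3'
      # Bucket for offloaded tenants (placeholder name; the default is weaviate-offload).
      OFFLOAD_S3_BUCKET: 'my-tenant-offload'
      # Let Weaviate create the bucket if it is missing; requires create-bucket permission.
      OFFLOAD_S3_BUCKET_AUTO_CREATE: 'true'
      # Number of concurrent offload operations (documented default).
      OFFLOAD_S3_CONCURRENCY: 25
      # Timeout in seconds for an offload operation to start (documented default).
      OFFLOAD_TIMEOUT: 120
```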

Custom modules

See here for how to create and use your own modules.

Questions and feedback

If you have any questions or feedback, let us know in the user forum.