Modules
Weaviate's functionality can be customized by using modules. This page explains how to enable and configure modules.
Instance-level configuration
At the instance (i.e. Weaviate cluster) level, you can:
- Enable modules
- Configure the default vectorizer module
- Configure module-specific variables (e.g. API keys), where applicable
This can be done by setting the appropriate environment variables as shown below.
Weaviate Cloud (WCD) instances come with modules pre-configured. See this page for details.
Enable individual modules
You can enable modules by listing them in the ENABLE_MODULES variable. For example, this code enables the text2vec-transformers module.
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-transformers'
To enable multiple modules, add them in a comma-separated list.
This example code enables the text2vec-huggingface, generative-cohere, and qna-openai modules.
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-huggingface,generative-cohere,qna-openai'
Enable all API-based modules
Available starting in v1.26.0. This is an experimental feature. Use with caution.
You can enable all API-based modules by setting the ENABLE_API_BASED_MODULES variable to true. This enables the modules for all API-based model integrations, such as those for Anthropic, Cohere, and OpenAI. These modules are lightweight, so enabling them all will not significantly increase resource usage.
services:
  weaviate:
    environment:
      ENABLE_API_BASED_MODULES: 'true'
The list of API-based modules can be found on the model provider integrations page. You can also inspect the source code where the list is defined.
This can be combined with enabling individual modules. For example, the code below enables all API-based modules, the Ollama modules, and the backup-s3 module.
services:
  weaviate:
    environment:
      ENABLE_API_BASED_MODULES: 'true'
      ENABLE_MODULES: 'text2vec-ollama,generative-ollama,backup-s3'
Note that enabling multiple vectorizer (e.g. text2vec, multi2vec) modules will disable the Explore functionality. If you need to use Explore, you should only enable one vectorizer module.
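As an illustration of the note above, the following sketch keeps Explore available by enabling a single vectorizer module alongside a non-vectorizer module. The module combination is an arbitrary example. The comment about ENABLE_API_BASED_MODULES is an inference from the descriptions on this page, not a documented guarantee.

```yaml
services:
  weaviate:
    environment:
      # Exactly one vectorizer module (text2vec-cohere) is enabled,
      # so Explore should remain usable. generative-cohere is not a
      # vectorizer module and does not count toward that limit.
      ENABLE_MODULES: 'text2vec-cohere,generative-cohere'
      # ENABLE_API_BASED_MODULES is deliberately left unset here:
      # turning it on would presumably enable several vectorizer
      # modules at once (assumption based on the text above).
```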
Module-specific variables
You may need to specify additional environment variables to configure each module where applicable. For example, the backup-s3 module requires the backup S3 bucket to be set via BACKUP_S3_BUCKET, and the text2vec-transformers module requires the inference API location via TRANSFORMERS_INFERENCE_API.
Refer to the individual module documentation for more details.
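A minimal sketch of how module-specific variables might sit next to ENABLE_MODULES is shown here. The bucket name and the inference container address are hypothetical placeholders, and other settings that these modules may need (such as credentials) are not shown.

```yaml
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-transformers,backup-s3'
      # Hypothetical bucket name; replace it with your own bucket.
      BACKUP_S3_BUCKET: 'my-weaviate-backups'
      # Hypothetical address of a separately running inference container.
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
```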
Vectorizer modules
The vectorization modules enable Weaviate to vectorize data at import, and to perform near<Media> searches such as nearText or nearImage.
Vectorizer (xxx2vec-xxx) modules can be found in this section.
Enable vectorizer modules
You can enable vectorizer modules by adding them to the ENABLE_MODULES environment variable. For example, this code enables the text2vec-cohere, text2vec-huggingface, and text2vec-openai vectorizer modules.
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-cohere,text2vec-huggingface,text2vec-openai'
Default vectorizer module
You can specify a default vectorization module with the DEFAULT_VECTORIZER_MODULE variable as below.
If a default vectorizer module is not set, you must set a vectorizer in the schema before you can use near<Media> or vectorization at import time.
This code sets text2vec-huggingface as the default vectorizer. Thus, the text2vec-huggingface module is used for any class that does not specify another vectorizer.
services:
  weaviate:
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
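In practice, the default vectorizer is usually set together with ENABLE_MODULES. The sketch below assumes that the default vectorizer module must also appear in the list of enabled modules. The second vectorizer is included only to show that a class can still opt into a different one.

```yaml
services:
  weaviate:
    environment:
      # Both vectorizer modules are enabled; either can be chosen per class.
      ENABLE_MODULES: 'text2vec-cohere,text2vec-huggingface'
      # Used for any class that does not name its own vectorizer.
      DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
```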
Generative modules
The generative modules enable generative search functions.
Generative (generative-xxx) modules can be found in this section.
Enable a generative module
You can enable generative modules by adding the desired module to the ENABLE_MODULES environment variable. For example, this code enables the generative-cohere module and the text2vec-huggingface vectorizer module.
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-huggingface,generative-cohere'
generative module selection is unrelated to text2vec module selection
Your choice of the text2vec module does not restrict your choice of generative module, or vice versa.
Tenant offload modules
offload-s3 module
The offload-s3 module enables you to offload tenants to an S3 bucket.
To use the offload-s3 module, add offload-s3 to the ENABLE_MODULES environment variable.
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-cohere,generative-cohere,offload-s3'
The offload-s3 module reads the following environment variables:
- OFFLOAD_S3_BUCKET: The S3 bucket where INACTIVE tenants are offloaded.
  - The default is weaviate-offload.
  - If the bucket does not exist, and OFFLOAD_S3_BUCKET_AUTO_CREATE is set to true, Weaviate creates the bucket automatically.
- OFFLOAD_S3_BUCKET_AUTO_CREATE: When true, Weaviate automatically creates an S3 bucket if it does not exist.
  - The default is false.
- OFFLOAD_S3_CONCURRENCY: The number of concurrent offload operations.
  - The default is 25.
- OFFLOAD_TIMEOUT: The timeout for offloading operations (create bucket, upload, download).
  - The default is 120 (in seconds).
  - Offload operations are asynchronous. As a result, the timeout is for the operation to start, not to complete.
  - Each operation will retry up to 10 times on timeouts, except on authentication/authorization errors.
The Weaviate instance must have the necessary permissions to access the S3 bucket.
- The provided AWS identity must be able to write to the bucket.
- If OFFLOAD_S3_BUCKET_AUTO_CREATE is set to true, the AWS identity must have permission to create the bucket.
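Putting the variables above together, a configuration might look like the following sketch. The values are the documented defaults, except that OFFLOAD_S3_BUCKET_AUTO_CREATE is switched on for illustration. How AWS credentials reach the container is deployment-specific and is not shown.

```yaml
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-cohere,generative-cohere,offload-s3'
      # Bucket that receives offloaded tenants (this is the default name).
      OFFLOAD_S3_BUCKET: 'weaviate-offload'
      # Non-default: let Weaviate create the bucket if it is missing.
      # The AWS identity then also needs permission to create buckets.
      OFFLOAD_S3_BUCKET_AUTO_CREATE: 'true'
      # Number of concurrent offload operations (default shown).
      OFFLOAD_S3_CONCURRENCY: '25'
      # Timeout in seconds for an operation to start (default shown).
      OFFLOAD_TIMEOUT: '120'
```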
Custom modules
See here how you can create and use your own modules.