The text2vec-transformers module allows you to use a pre-trained language transformer model as a Weaviate vectorization module. Transformer models differ from the Contextionary in that they allow you to plug in a pre-trained NLP model specific to your use case. This means models like
DistilRoBERTa, etc. can be used out of the box with Weaviate. Transformer models handle text as sequential data, which is a different learning method from that of the text2vec-contextionary.
To use transformers with Weaviate, the
text2vec-transformers module needs to be enabled. The models are encapsulated in Docker containers, which allows for efficient scaling and resource planning. Neural-network-based models run most efficiently on GPU-enabled servers, yet Weaviate is CPU-optimized. This separate-container microservice setup lets you host (and scale) the model independently on GPU-enabled hardware while keeping Weaviate on cheap CPU-only hardware.
To choose your specific model, you simply need to select the correct Docker container. There is a selection of pre-built Docker images available, but you can also build your own with a simple two-line Dockerfile.
How to use
Note: you can also use the Weaviate configuration tool.
Option 1: With an example docker-compose file
You can find an example docker-compose file below, which will spin up Weaviate with the transformers module. In this example we have selected the
sentence-transformers/msmarco-distilroberta-base-v2 model, which works well for asymmetric semantic search. See below for how to select an alternative model.
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.2.1
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
  t2v-transformers:
    image: semitechnologies/transformers-inference:sentence-transformers-msmarco-distilroberta-base-v2
    environment:
      ENABLE_CUDA: 0 # set to 1 to enable
Note that running Weaviate with the text2vec-transformers module without a GPU will be slow. Enable CUDA if you have a GPU available (in the compose file above, ENABLE_CUDA is set to 0; set it to 1 to enable).
Note: at the moment, text vectorization modules cannot be combined in a single setup. This means that you can either enable the
text2vec-transformers or no text vectorization module.
Option 2: Configure your custom setup
Step 1: Enable the
Make sure you set the
ENABLE_MODULES=text2vec-transformers environment variable. Additionally make this module the default vectorizer, so you don’t have to specify it on each schema class:
Important: This setting is now a requirement if you plan on using any module. So, when using the
text2vec-contextionary module, you need to have
ENABLE_MODULES=text2vec-contextionary set. All our configuration-generators / Helm charts will be updated as part of the Weaviate
Step 2: Run your favorite model
Choose any of our pre-built transformers models (for building your own model container, see below) and spin it up, for example using
docker run -itp "8000:8080" semitechnologies/transformers-inference:sentence-transformers-msmarco-distilroberta-base-v2. Use a CUDA-enabled machine for optimal performance.
Step 3: Tell Weaviate where to find the inference
Set the Weaviate environment variable
TRANSFORMERS_INFERENCE_API to where your inference container is running, for example
You can now use Weaviate normally and all vectorization during import and search time will be done with the selected transformers model.
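As a rough illustration of Steps 1 to 3, the following Python sketch collects the three environment variables used in this section and renders them as `-e` flags for `docker run`. The helper names `weaviate_env` and `as_docker_flags` are hypothetical, and the URL `http://localhost:8000` is an assumption derived from the port mapping in Step 2; adjust it to wherever your inference container is actually reachable from Weaviate.

```python
# Sketch only: helper names and the example URL are illustrative assumptions.
MODULE = "text2vec-transformers"


def weaviate_env(inference_url: str) -> dict:
    """Return the environment variables from Steps 1 to 3 as a dict."""
    if not inference_url.startswith(("http://", "https://")):
        raise ValueError("inference_url must start with http:// or https://")
    return {
        # Step 1: enable the module and make it the default vectorizer.
        "ENABLE_MODULES": MODULE,
        "DEFAULT_VECTORIZER_MODULE": MODULE,
        # Step 3: tell Weaviate where the inference container is running.
        "TRANSFORMERS_INFERENCE_API": inference_url,
    }


def as_docker_flags(env: dict) -> list:
    """Render the dict as `-e KEY=VALUE` arguments for `docker run`."""
    flags = []
    for key, value in env.items():
        flags += ["-e", f"{key}={value}"]
    return flags


# Assumption: the Step 2 container was published on host port 8000.
env = weaviate_env("http://localhost:8000")
print(" ".join(as_docker_flags(env)))
```

Keeping the module name in a single constant avoids the two Step 1 variables drifting apart, which would leave the default vectorizer pointing at a module that is not enabled.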
You can download a selection of pre-built images directly from Docker Hub. We have chosen publicly available models that in our opinion are well suited for semantic search.
The pre-built models include:
|Model Name||Image Name|
The above image names always point to the latest version of the inference
container including the model. You can also make that explicit by appending
-latest to the image name. Additionally, you can pin the version to one of
the existing git tags of this repository. E.g. to pin
1.0.0, you can use
Your favorite model is not included? Open a pull-request to include it or build a custom image as outlined below.
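The image naming scheme described above can be sketched in a few lines of Python. The helper `image_name` is hypothetical. The `-latest` suffix follows the description above; treating a pinned version as the same kind of suffix (for example `-1.0.0`) is an assumption, so check the available tags before relying on it.

```python
# Sketch only: `image_name` is a hypothetical helper, not part of any tooling.
REPOSITORY = "semitechnologies/transformers-inference"


def image_name(model_tag: str, version: str = "") -> str:
    """Build an image reference for a pre-built model container.

    An empty `version` yields the plain image name, which points to the
    latest version. Assumption: a pinned version is appended to the tag
    in the same way as the `-latest` suffix.
    """
    if not model_tag or ":" in model_tag or "/" in model_tag:
        raise ValueError("model_tag must be a plain tag without ':' or '/'")
    suffix = f"-{version}" if version else ""
    return f"{REPOSITORY}:{model_tag}{suffix}"


tag = "sentence-transformers-msmarco-distilroberta-base-v2"
print(image_name(tag))            # implicit latest
print(image_name(tag, "latest"))  # explicit latest
print(image_name(tag, "1.0.0"))   # pinned version (assumed format)
```

Pinning a version in a compose file trades automatic updates for reproducible deployments, since the plain image name can change underneath you.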
Run with any transformers module
You have three options to select your desired model:
- Use any of our pre-built transformers model containers. The models selected in this list have proven to work well with semantic search in the past. (If you think we should support another model out of the box, please open an issue or pull request here.)
- Use any model from the Hugging Face Model Hub. Click here to learn how.
- Use any PyTorch or TensorFlow model from your local disk. Click here to learn how.
Transformers-specific module configuration (on classes and properties)
You can use the same module-configuration on your classes and properties which you already know from the
text2vec-contextionary module. This includes
In addition you can use a class-level module config to select the pooling strategy with
poolingStrategy. Allowed values are
cls. They refer to different techniques to obtain a sentence-vector from individual word vectors as outlined in the Sentence-BERT paper.
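To make the idea of pooling concrete, here is a minimal pure-Python sketch of two ways to turn per-token vectors into one sentence vector: taking the vector at the first ([CLS]) position, and averaging the vectors of all non-padding tokens. This is a toy illustration with made-up numbers, not the module's actual implementation. The class name `Article` in the config at the end is hypothetical.

```python
from typing import List

Vector = List[float]


def cls_pooling(token_vectors: List[Vector]) -> Vector:
    """Use the vector at the first ([CLS]) position as the sentence vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    return list(token_vectors[0])


def mean_pooling(token_vectors: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("token_vectors and attention_mask must have equal length")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three real tokens plus one padding position, two dimensions each.
tokens = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

print(cls_pooling(tokens))         # [1.0, 0.0]
print(mean_pooling(tokens, mask))  # [2.0, 2.0]

# Class-level module config selecting the strategy (class name is illustrative).
article_class = {
    "class": "Article",
    "moduleConfig": {"text2vec-transformers": {"poolingStrategy": "cls"}},
}
```

The two strategies generally produce different vectors for the same input, so the pooling strategy should match what the chosen model was trained with.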
Custom build with a private or local model
You can build a docker image which supports any model which is compatible with
In the following example, we are going to build a custom image for a non-public
model which we have locally stored at
Create a new
Dockerfile (you do not need to clone this repository; any folder
on your machine is fine). We will name it
my-model.Dockerfile. Add the
following lines to it:
FROM semitechnologies/transformers-inference:custom
COPY ./my-model /app/models/model
The above will make sure that your model ends up in the image at
/app/models/model. This path is important, so that the application can find the
Now you just need to build and tag your Dockerfile, we will tag it as
$ docker build -f my-model.Dockerfile -t my-model-inference .
That’s it! You can now push your image to your favorite registry or reference
it locally in your Weaviate
docker-compose.yaml using the docker tag