
Locally Hosted Transformers Embeddings + Weaviate (Custom Image)


Weaviate's integration with the Hugging Face Transformers library allows you to access their models' capabilities directly from Weaviate.

Configure a Weaviate vector index to use the Transformers integration, and configure the Weaviate instance with a model image. Weaviate will then generate embeddings for various operations using the specified model running in the Transformers inference container. This feature is called the vectorizer.

This page shows how to build a custom Transformers model image and configure Weaviate with it, for users whose desired model is not available in the pre-built images.

Once a custom image is built and configured, usage patterns are identical to the pre-built images.

Build a custom Transformers model image

You can build a custom Transformers model image to use with Weaviate. This can be a public model from the Hugging Face model hub, or a compatible private or local model.

Embedding (also called 'feature extraction') models from the Hugging Face model hub can be used with Weaviate by building a custom Docker image.

The steps to build a custom image are:

Create a Dockerfile

The Dockerfile to create depends on whether you are using a public model from the Hugging Face model hub, or a private or local model.

This example creates a custom image for the distilroberta-base model. Replace distilroberta-base with the model name you want to use.


To build an image with a model from the Hugging Face Hub, create a new Dockerfile similar to the following.


Save the Dockerfile as my-inference-image.Dockerfile. (You can name it anything you like.)


FROM semitechnologies/transformers-inference:custom
RUN MODEL_NAME=distilroberta-base ./download.py
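
To build an image with a private or local model, copy the model files into the image instead of downloading them. The following is a minimal sketch, assuming the model files are stored in a local ./my-model directory and that the inference container reads them from /app/models/model; adjust both paths to match your setup.

FROM semitechnologies/transformers-inference:custom
# Assumption: model files live in ./my-model and the container expects them under /app/models/model
COPY ./my-model /app/models/model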

Build and tag the image

Build an image from the Dockerfile and tag it with a name of your choice, for example my-inference-image:

docker build -f my-inference-image.Dockerfile -t my-inference-image .

(Optional) Push the image to a registry

If you want to use the image in a different environment, you can push it to a Docker registry:

docker push my-inference-image
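
To push to a remote registry, the image usually needs to be tagged with the registry host (and, depending on the registry, a namespace and version tag) first. A rough sketch, using a hypothetical registry at registry.example.com:

# Hypothetical registry host and tag; substitute your own
docker tag my-inference-image registry.example.com/my-inference-image:1.0
docker push registry.example.com/my-inference-image:1.0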

Use the image

Specify the image in your Weaviate configuration, such as in docker-compose.yml, using the chosen local Docker tag (e.g. my-inference-image), or the image from the registry.
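
For example, in a docker-compose.yml that follows the usual text2vec-transformers layout, the t2v-transformers service would point at the custom image. This is a sketch only; the service names and environment variables shown are assumptions that should be matched to your own configuration:

version: '3.4'
services:
  weaviate:
    # Additional settings not shown
    environment:
      ENABLE_MODULES: 'text2vec-transformers'
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
  t2v-transformers:
    image: my-inference-image  # The custom image built above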

(Optional) Use the sentence-transformers vectorizer

Experimental feature

This is an experimental feature. Use with caution.

When using a custom image, you may set the USE_SENTENCE_TRANSFORMERS_VECTORIZER environment variable to use the sentence-transformers vectorizer instead of the default vectorizer from the transformers library.
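
For example, the variable could be set on the inference container, shown here in the t2v-transformers service of a docker-compose.yml. Treat the placement as an assumption (it may instead need to be set when building the image); check the inference container documentation to confirm:

t2v-transformers:
  image: my-inference-image
  environment:
    USE_SENTENCE_TRANSFORMERS_VECTORIZER: 'true'  # Assumed runtime setting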

Configure the Weaviate instance

Following the above example, set the image parameter in the t2v-transformers service to the name of the custom image, e.g. my-inference-image, as shown in the configuration example above.

Once the custom Transformers model image is built and configured, continue on to the Transformers embeddings integrations guide to use the model with Weaviate.

(Optional) Test the inference container

Once the inference container is configured and running, you can send queries to it directly to test its functionality.

First, expose the inference container. If deployed using Docker, forward the port by adding the following to the t2v-transformers service in your docker-compose.yml:

version: '3.4'
services:
  weaviate:
    # Additional settings not shown
  t2v-transformers:
    # Additional settings not shown
    ports:
      - "9090:8080"  # Add this line to expose the container

Once the container is running and exposed, you can send REST requests to it directly, e.g.:

curl localhost:9090/vectors -H 'Content-Type: application/json' -d '{"text": "foo bar"}'

If the container is running and configured correctly, you should receive a response with the vector representation of the input text.

External resources

If you have any questions or feedback, let us know in the user forum.