
Support for Hugging Face Inference API in Weaviate

Sebastian Witalec


Vector databases use Machine Learning models to offer incredible functionality to operate on your data. We are looking at anything from summarizers (that can summarize any text into a short sentence), through auto-labelers (that can classify your data tokens), to transformers and vectorizers (that can convert any data, such as text, image, or audio, into vectors and use that for context-based queries), and many more use cases.

All of these use cases require Machine Learning model inference, a process of running data through an ML model and calculating an output (e.g. take a paragraph and summarize it into a short sentence), which is a compute-heavy process.

The elephant in the room

Running model inference in production is hard.

  • It requires expensive specialized hardware.
  • You need a lot more computing power during the initial data import.
  • Hardware tends to be underutilized once the bulk of the heavy work is done.
  • Sharing and prioritizing resources with other teams is hard.

The good news is, there are companies, like Hugging Face, OpenAI, and Cohere, that offer running model inference as a service.

"Running model inference in production is hard, let them do it for you."

Support for Hugging Face Inference API in Weaviate

Starting from Weaviate v1.15, Weaviate includes a Hugging Face module, which provides support for Hugging Face Inference straight from the vector database.

The Hugging Face module allows you to use the Hugging Face Inference service with sentence similarity models to vectorize and query your data, straight from Weaviate. No need to run the Inference API yourself.

You can choose between text2vec-huggingface (Hugging Face) and text2vec-openai (OpenAI) modules to delegate your model inference tasks.
Both modules are enabled by default in the Weaviate Cloud.



The Hugging Face module is quite incredible, for many reasons.

Public models

You get access to over 1600 pre-trained sentence similarity models. No need to train your own models if there is already one that works well for your use case.

In case you struggle with picking the right model, see our blog post on choosing a sentence transformer from Hugging Face.

Private models

If you have your own models, trained specially for your data, then you can upload them to Hugging Face (as private models) and use them in Weaviate.

We are working on an article that will guide you on how to create your own model and upload it to Hugging Face.

Fully automated and optimized

Weaviate manages the whole process for you. From the perspective of writing your code, once you have your schema configuration, you can almost forget that Hugging Face is involved at all.

For example, when you import data into Weaviate, Weaviate will automatically extract the relevant text fields, send them to Hugging Face to vectorize, and store the data with the new vectors in the database.
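The import flow described above can be pictured with a small toy sketch. This is a conceptual illustration only, not Weaviate's actual implementation: `extract_text`, `import_object`, and `toy_vectorizer` are hypothetical names, the `name` and `comment` properties are made up, and the stand-in vectorizer replaces the remote Hugging Face call with trivial arithmetic.

```python
from typing import Callable, Dict, List

Vector = List[float]


def extract_text(obj: Dict[str, object], text_fields: List[str]) -> str:
    """Join the values of the chosen text properties into one string."""
    return " ".join(str(obj[f]) for f in text_fields if f in obj)


def import_object(
    store: List[Dict[str, object]],
    obj: Dict[str, object],
    text_fields: List[str],
    vectorize: Callable[[str], Vector],
) -> None:
    """Vectorize the object's text and store the object together with its vector."""
    vector = vectorize(extract_text(obj, text_fields))
    store.append({"properties": obj, "vector": vector})


def toy_vectorizer(text: str) -> Vector:
    """Stand-in for the remote inference call: returns [length, word count]."""
    return [float(len(text)), float(len(text.split()))]


# Hypothetical note; only the text fields are sent to the vectorizer.
store: List[Dict[str, object]] = []
note = {"name": "Groceries", "comment": "Buy milk and eggs", "priority": 2}
import_object(store, note, ["name", "comment"], toy_vectorizer)
```

The point of the sketch is the division of labour: the caller hands over plain objects, and the vectorization step happens behind the import call.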

Ready to use with a minimum of fuss

Every new Weaviate instance created with the Weaviate Cloud has the Hugging Face module enabled out of the box. You don't need to update any configs; it is there, ready and waiting.

On the other hand, to use the Hugging Face module in Weaviate open source (v1.15 or newer), you only need to enable the text2vec-huggingface module and set it as the default vectorizer, like this:

DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
ENABLE_MODULES: text2vec-huggingface

How to get started


This article is not meant as a hands-on tutorial. For more detailed instructions please check the documentation.

The overall process to use a Hugging Face module with Weaviate is fairly straightforward.

Recipe for using the Hugging Face module

If this was a cooking class and you were following a recipe, you would need the following ingredients:

  • Raw Data
  • Hugging Face API token, which you can request from their website
  • A working Weaviate instance with the text2vec-huggingface module enabled

Then you would follow these steps.

Step 1 – initial preparation – create schema and select the HF models

Once you have a Weaviate instance up and running, define your schema (standard stuff: pick a class name, select properties, and data types). As part of the schema definition, you also need to specify which Hugging Face model you want to use for each schema class.

This is done by adding a moduleConfig property with the model name to the schema definition, like this:

{
    "class": "Notes",
    "moduleConfig": {
        "text2vec-huggingface": {
            "model": "sentence-transformers/all-MiniLM-L6-v2",  # model name
        }
    },
    "vectorizer": "text2vec-huggingface",  # vectorizer for hugging face
}

If you are wondering, yes, you can use a different model for each class.
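As a rough sketch of the per-class model choice, the helper below builds one class definition per model. The helper name `build_class_definition`, the `Articles` class, and the property names are assumptions made for illustration; only the `Notes` class and the `all-MiniLM-L6-v2` model come from the example above.

```python
from typing import Dict, List


def build_class_definition(
    class_name: str, model: str, text_properties: List[str]
) -> Dict[str, object]:
    """Build a class definition that vectorizes with the given Hugging Face model."""
    return {
        "class": class_name,
        "vectorizer": "text2vec-huggingface",
        "moduleConfig": {
            "text2vec-huggingface": {"model": model},
        },
        # Hypothetical text properties; use your own names and data types.
        "properties": [
            {"name": name, "dataType": ["text"]} for name in text_properties
        ],
    }


# Two classes, each tied to a different model.
notes_class = build_class_definition(
    "Notes", "sentence-transformers/all-MiniLM-L6-v2", ["name", "comment"]
)
articles_class = build_class_definition(
    "Articles", "sentence-transformers/all-mpnet-base-v2", ["title", "body"]
)

# With a connected v3 Python client, each definition could then be registered:
# client.schema.create_class(notes_class)
```

Keeping the model name as a parameter makes it easy to try a different model on a copy of the class before committing to it.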

Step 2 – cook for some time – import data

Start importing data into Weaviate.

For this, you need your Hugging Face API token, which is used to authorize all calls with 🤗.

Add your token to the Weaviate client configuration. For example, in Python, you do it like this:

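client = weaviate.Client(url="http://localhost:8080",  # example URL; point it at your own instance
    additional_headers={"X-HuggingFace-Api-Key": "YOUR-HUGGINGFACE-API-KEY"})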

Then import the data the same way as always, and Weaviate will handle all the communication with Hugging Face.
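A minimal sketch of this step, assuming the v3 Python client (its `additional_headers` option and batch interface). The helper names `huggingface_headers` and `import_notes`, the local URL, and the sample note are hypothetical.

```python
from typing import Dict, Iterable


def huggingface_headers(api_token: str) -> Dict[str, str]:
    """Header that carries the Hugging Face token on each Weaviate request."""
    return {"X-HuggingFace-Api-Key": api_token}


def import_notes(client, notes: Iterable[Dict[str, str]], class_name: str = "Notes") -> int:
    """Add objects through the client's batch interface; return how many were queued."""
    count = 0
    # Leaving the `with` block sends any objects still waiting in the batch.
    with client.batch as batch:
        for note in notes:
            batch.add_data_object(data_object=note, class_name=class_name)
            count += 1
    return count


# Usage against a running instance (assumes the v3 Python client is installed):
# import weaviate
# client = weaviate.Client(
#     url="http://localhost:8080",  # assumed local instance
#     additional_headers=huggingface_headers("YOUR-HUGGINGFACE-API-KEY"),
# )
# import_notes(client, [{"name": "Groceries", "comment": "Buy milk and eggs"}])
```

Note that the import code never mentions Hugging Face beyond the header: vectorization happens on the Weaviate side.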

Step 3 – serving portions – querying data

Once you have imported some or all of the data, you can start running queries. (Yes, you can start querying your database even during the import.)

Running queries also requires the same token. But you can reuse the same client, so you are good to go.

Then, you just run the queries, as per usual:

nearText = {
    "concepts": ["How to use Hugging Face modules with Weaviate?"],
    "distance": 0.6,
}

result = (
    client.query
    .get("Notes", [
        "_additional {certainty distance} "])
    .with_near_text(nearText)
    .do()
)
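To read the scores back out, a small helper like the one below may be handy. It assumes the usual GraphQL response shape, with matches nested under `data`, `Get`, and the class name. The helper name `extract_hits` is hypothetical, and the sample response uses made-up scores rather than real query output.

```python
from typing import Dict, List


def extract_hits(result: Dict, class_name: str = "Notes") -> List[Dict[str, float]]:
    """Pull the certainty and distance of each match out of a Get response."""
    objects = result.get("data", {}).get("Get", {}).get(class_name) or []
    return [
        {
            "certainty": obj["_additional"]["certainty"],
            "distance": obj["_additional"]["distance"],
        }
        for obj in objects
    ]


# Illustrative response with made-up scores, not real query output.
sample = {
    "data": {
        "Get": {
            "Notes": [
                {"_additional": {"certainty": 0.85, "distance": 0.3}},
                {"_additional": {"certainty": 0.75, "distance": 0.5}},
            ]
        }
    }
}
hits = extract_hits(sample)
```

The helper returns an empty list when the response carries no matches, so callers can loop over the result without extra checks.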


Now you can use Hugging Face or OpenAI modules in Weaviate to delegate model inference out.

Just pick the model, provide your API key, and start working with your data.

Weaviate optimizes the communication process with the Inference API for you, so that you can focus on the challenges and requirements of your applications. No need to run the Inference API yourself.

What next

Check out the text2vec-huggingface documentation to learn more about the new module.

Ready to start building?

Check out the Quickstart tutorial, and begin building amazing apps with the free trial of Weaviate Cloud (WCD).
