

This section of the documentation is deprecated and will be removed in the future.
See the relevant model provider integration page for the most up-to-date information.


The text2vec-huggingface module enables Weaviate to obtain vectors using the Hugging Face Inference API.

Key notes:

  • As it uses a third-party API, you will need an API key.
  • Its usage may incur costs.
    • See the inference pricing page, especially before vectorizing large amounts of data.
  • This module is available on Weaviate Cloud (WCD).
  • Enabling this module will enable the nearText search operator.
  • This module only supports sentence similarity models.

Weaviate instance configuration


If you use Weaviate Cloud (WCD), this module is already enabled and pre-configured. You cannot edit the configuration in WCD.

Docker Compose file

To use text2vec-huggingface, you must enable it in your Docker Compose file (docker-compose.yml). You can do so manually, or create one using the Weaviate configuration tool.


  • ENABLE_MODULES (Required): The modules to enable. Include text2vec-huggingface to enable the module.
  • DEFAULT_VECTORIZER_MODULE (Optional): The default vectorizer module. You can set this to text2vec-huggingface to make it the default for all classes.
  • HUGGINGFACE_APIKEY (Optional): Your Hugging Face API key. You can also provide the key at query time.


This configuration enables text2vec-huggingface, sets it as the default vectorizer, and sets the API key.

version: '3.4'
services:
  weaviate:
    # image: set this to the Weaviate image and version you want to run
    restart: on-failure:0
    ports:
      - 8080:8080
      - 50051:50051
    environment:
      ENABLE_MODULES: text2vec-huggingface
      DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
      HUGGINGFACE_APIKEY: sk-foobar  # Optional; you can also provide the API key at query time.

Class configuration

You can configure how the module will behave in each class through the Weaviate schema.

API settings


The following parameters are available for the API.

Note that you should only set one of:

  • model,
  • passageModel and queryModel, or
  • endpointURL

  • model (string): The model to use. Do not use with queryModel or passageModel. Example: "bert-base-uncased". Can be any public or private Hugging Face model; sentence similarity models work best for vectorization.
  • passageModel (string): DPR passage model. Should be set together with queryModel, but without model.
  • queryModel (string): DPR query model. Should be set together with passageModel, but without model.
  • endpointURL (string): (Private or public) endpoint URL to use. Note: when this variable is set, the module ignores model settings such as model, queryModel, and passageModel. Read more on how to deploy your own Hugging Face Inference Endpoint.
  • options.waitForModel (boolean): If the model is not ready, wait for it instead of receiving a 503 response.
  • options.useGPU (boolean): Use GPU instead of CPU for inference (if your account plan supports it).
  • options.useCache (boolean): Use the HF cache to speed up results. If you use a non-deterministic model, you can set this parameter to prevent the caching mechanism from being used.
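
The "only set one of" rule above can be checked before a class definition is sent to Weaviate. The following is a minimal sketch, not part of the module: check_model_settings is a hypothetical helper, and treating a mixed configuration as an error is an assumption of this sketch (the module itself simply ignores the model settings when endpointURL is set).

```python
from typing import Any, Dict


def check_model_settings(module_config: Dict[str, Any]) -> str:
    """Return which model-selection mode a text2vec-huggingface config uses.

    Hypothetical helper: raises ValueError if the config mixes modes.
    """
    has_model = "model" in module_config
    has_passage = "passageModel" in module_config
    has_query = "queryModel" in module_config
    has_endpoint = "endpointURL" in module_config

    # passageModel and queryModel are documented as a pair.
    if has_passage != has_query:
        raise ValueError("passageModel and queryModel must be set together")

    modes = []
    if has_model:
        modes.append("model")
    if has_passage and has_query:
        modes.append("passageModel+queryModel")
    if has_endpoint:
        modes.append("endpointURL")

    if len(modes) > 1:
        raise ValueError(f"set only one of: {', '.join(modes)}")
    # "unset" means none of the three options was given.
    return modes[0] if modes else "unset"
```

For example, check_model_settings({"model": "sentence-transformers/all-MiniLM-L6-v2"}) returns "model", while a config that contains both model and endpointURL raises ValueError.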


The following example configures the Document class: it sets the vectorizer to text2vec-huggingface and the model to sentence-transformers/all-MiniLM-L6-v2, and it enables waiting for the model to load, GPU inference, and the cache.

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-huggingface",
      "moduleConfig": {
        "text2vec-huggingface": {
          "model": "sentence-transformers/all-MiniLM-L6-v2",
          "options": {
            "waitForModel": true,
            "useGPU": true,
            "useCache": true
          }
        }
      }
    }
  ]
}

Vectorization settings

You can set vectorizer behavior using the moduleConfig section under each class and property:


Class-level

  • vectorizer: the module to use to vectorize the data.
  • vectorizeClassName: whether to vectorize the class name. Default: true.

Property-level

  • skip: whether to skip vectorizing the property altogether. Default: false.
  • vectorizePropertyName: whether to vectorize the property name. Default: false.


{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-huggingface",
      "moduleConfig": {
        "text2vec-huggingface": {
          "model": "sentence-transformers/all-MiniLM-L6-v2",
          "options": {
            "waitForModel": true,
            "useGPU": true,
            "useCache": true
          },
          "vectorizeClassName": false
        }
      },
      "properties": [
        {
          "name": "content",
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-huggingface": {
              "skip": false,
              "vectorizePropertyName": false
            }
          }
        }
      ]
    }
  ]
}
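
If you generate class definitions programmatically, the class-level and property-level settings can be assembled in one place. The sketch below uses only the Python standard library; build_class_definition is a hypothetical helper, and the defaults it applies (every listed property is text, is not skipped, and does not vectorize its name) are assumptions chosen to mirror the example above.

```python
import json
from typing import Any, Dict, List

MODULE = "text2vec-huggingface"


def build_class_definition(
    class_name: str,
    model: str,
    text_properties: List[str],
    vectorize_class_name: bool = False,
) -> Dict[str, Any]:
    """Build a class definition dict with class- and property-level settings."""
    return {
        "class": class_name,
        "vectorizer": MODULE,
        # Class-level settings live under moduleConfig -> module name.
        "moduleConfig": {
            MODULE: {
                "model": model,
                "vectorizeClassName": vectorize_class_name,
            }
        },
        # Property-level settings use the same nesting on each property.
        "properties": [
            {
                "name": name,
                "dataType": ["text"],
                "moduleConfig": {
                    MODULE: {"skip": False, "vectorizePropertyName": False}
                },
            }
            for name in text_properties
        ],
    }


if __name__ == "__main__":
    definition = build_class_definition(
        "Document", "sentence-transformers/all-MiniLM-L6-v2", ["content"]
    )
    print(json.dumps({"classes": [definition]}, indent=2))
```

Keeping the module name in a single constant avoids a mismatch between the vectorizer value and the moduleConfig keys.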

Query-time parameters

API key

You can supply the API key at query time by adding it to the HTTP header:

  • "X-Huggingface-Api-Key": "YOUR-HUGGINGFACE-API-KEY"
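
As an illustration of where the header goes, the following sketch builds (but does not send) a raw HTTP request to Weaviate's GraphQL endpoint using only the Python standard library. The helper name build_near_text_request, the local URL, and the Publication class with a name property are assumptions made for this example; a Weaviate client library would normally set the header for you.

```python
import json
import urllib.request


def build_near_text_request(
    base_url: str, graphql_query: str, hf_api_key: str
) -> urllib.request.Request:
    """Prepare a POST to /v1/graphql that carries the Hugging Face API key."""
    body = json.dumps({"query": graphql_query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/graphql",
        data=body,
        headers={
            "Content-Type": "application/json",
            # The key travels with the request instead of the server config.
            "X-Huggingface-Api-Key": hf_api_key,
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_near_text_request(
        "http://localhost:8080",  # assumed local instance
        '{ Get { Publication(nearText: {concepts: ["fashion"]}) { name } } }',
        "YOUR-HUGGINGFACE-API-KEY",
    )
    print(request.full_url)
    print(request.header_items())
```

Passing the prepared request to urllib.request.urlopen would send it; that step is left out so the sketch runs without a Weaviate instance.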

Additional information

API rate limits

Since this module uses your API key, your account's corresponding rate limits will also apply to the module. Weaviate will output any rate-limit related error messages generated by the API.

Import throttling

One potential solution to rate limiting would be to throttle the import within your application. We include an example below.

from weaviate import Client  # Weaviate Python client v3 API
import time


def configure_batch(client: Client, batch_size: int, batch_target_rate: int):
    """
    Configure the weaviate client's batch so it creates objects at `batch_target_rate`.

    Parameters
    ----------
    client : Client
        The Weaviate client instance.
    batch_size : int
        The batch size.
    batch_target_rate : int
        The batch target rate as # of objects per second.
    """

    def callback(batch_results: dict) -> None:
        # you could print batch errors here
        time_took_to_create_batch = batch_size * (
            client.batch.creation_time / client.batch.recommended_num_objects
        )
        # Sleep for whatever remains of the time budget for this batch.
        time.sleep(
            max(batch_size / batch_target_rate - time_took_to_create_batch + 1, 0)
        )

    # Register the callback so it runs after every batch.
    client.batch.configure(
        batch_size=batch_size,
        callback=callback,
    )
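
To see what the delay formula in the callback works out to, the sketch below isolates it as a pure function and evaluates it for two sets of numbers. The function name throttle_delay and the sample values are illustrative assumptions; seconds_per_object stands in for client.batch.creation_time / client.batch.recommended_num_objects.

```python
def throttle_delay(
    batch_size: int, target_rate: float, seconds_per_object: float
) -> float:
    """Seconds to sleep after a batch so imports stay near target_rate."""
    # Time budget for one batch at the target rate (objects per second).
    budget = batch_size / target_rate
    # Estimated time the batch itself took to create.
    time_took_to_create_batch = batch_size * seconds_per_object
    # Same expression as the callback: remaining budget plus one second,
    # never negative.
    return max(budget - time_took_to_create_batch + 1, 0)


if __name__ == "__main__":
    # 50 objects at 10 objects/s gives a 5 s budget; the batch took about 1 s.
    print(throttle_delay(batch_size=50, target_rate=10, seconds_per_object=0.02))
    # A slow batch that already exceeded its budget needs no extra sleep.
    print(throttle_delay(batch_size=50, target_rate=100, seconds_per_object=0.1))
```

In the first case the delay is about 5 seconds (5 s budget, minus roughly 1 s spent, plus the 1 s the formula adds). In the second case the expression is negative, so max(..., 0) clamps the delay to zero.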


Support for Hugging Face Inference Endpoints

The text2vec-huggingface module also supports Hugging Face Inference Endpoints, where you can deploy your own model as an endpoint.

To use your own Hugging Face Inference Endpoint for vectorization with the text2vec-huggingface module, pass the endpoint URL in the class configuration as the endpointURL setting.

Note that only the feature extraction inference endpoint types are supported.
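
A class definition that points at an Inference Endpoint might look like the sketch below. The helper build_endpoint_class and the URL are placeholders for illustration; substitute the URL of your own feature extraction endpoint. No model, passageModel, or queryModel is included, since the module ignores them when endpointURL is set.

```python
from typing import Any, Dict


def build_endpoint_class(class_name: str, endpoint_url: str) -> Dict[str, Any]:
    """Build a class definition that vectorizes through an Inference Endpoint."""
    return {
        "class": class_name,
        "vectorizer": "text2vec-huggingface",
        "moduleConfig": {
            "text2vec-huggingface": {
                # Only endpointURL is set; the model settings are left out.
                "endpointURL": endpoint_url,
            }
        },
    }


if __name__ == "__main__":
    # Placeholder URL, not a real endpoint.
    print(build_endpoint_class("Document", "https://YOUR-ENDPOINT.example"))
```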

Usage example

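import os

import weaviate
from weaviate.classes.query import MetadataQuery, Move

# Assumption: the API key is read from an environment variable named
# HUGGINGFACE_APIKEY; the header name is the one documented above.
client = weaviate.connect_to_local(
    headers={"X-Huggingface-Api-Key": os.environ["HUGGINGFACE_APIKEY"]}
)

try:
    publications = client.collections.get("Publication")

    response = publications.query.near_text(
        query="fashion",  # placeholder search text
        move_to=Move(force=0.85, concepts="haute couture"),
        move_away=Move(force=0.45, concepts="finance"),
        return_metadata=MetadataQuery(distance=True),
    )

    for o in response.objects:
        print(o.properties)
        print(o.metadata)
finally:
    client.close()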

Model license(s)

The text2vec-huggingface module is compatible with various models, each with its own license. For detailed information, see the license of the model you are using in the Hugging Face Model Hub.

It is your responsibility to evaluate whether the terms of its license(s), if any, are appropriate for your intended use.

Questions and feedback

If you have any questions or feedback, let us know in the user forum.