The text2vec-huggingface module allows you to use Hugging Face models directly in Weaviate as a vectorization module. When you create a Weaviate class that is set to use this module, Weaviate will automatically vectorize your data using the chosen model.

  • Note: this module uses a third-party API.
  • Note: make sure to check the Inference pricing page before vectorizing large amounts of data.
  • Note: Weaviate automatically parallelizes requests to the Inference API when using the batch endpoint.
  • Note: This module only supports sentence similarity models.

How to enable

Request a Hugging Face API Token via their website.

Weaviate Cloud Services

This module is enabled by default on WCS.

Weaviate open source

Here is an example Docker Compose file, which will spin up Weaviate with the Hugging Face module.

version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.19.6
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
      ENABLE_MODULES: text2vec-huggingface
      HUGGINGFACE_APIKEY: sk-foobar  # optional; you can also provide the API key at runtime

How to configure

In your Weaviate schema, you must define how you want this module to vectorize your data. If you are new to Weaviate schemas, you might want to check out the tutorial on the Weaviate schema first.

For example, the following schema configuration will set Weaviate to vectorize the Document class with text2vec-huggingface using the all-MiniLM-L6-v2 model.

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "moduleConfig": {
        "text2vec-huggingface": {
          "model": "sentence-transformers/all-MiniLM-L6-v2",
          "options": {
            "waitForModel": true,
            "useGPU": true,
            "useCache": true
          }
        }
      },
      "properties": [
        {
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-huggingface": {
              "skip": false,
              "vectorizePropertyName": false
            }
          },
          "name": "content"
        }
      ],
      "vectorizer": "text2vec-huggingface"
    }
  ]
}
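As a rough illustration, the same class definition can be assembled programmatically and serialized to JSON before it is sent to Weaviate. This is only a sketch using the Python standard library: the helper name `build_document_class` is hypothetical (it is not part of any Weaviate client), and the `text` data type of the `content` property is an assumption.

```python
import json


def build_document_class(model: str = "sentence-transformers/all-MiniLM-L6-v2") -> dict:
    """Return a class definition mirroring the schema example above.

    Hypothetical helper for illustration; not part of any Weaviate client.
    """
    return {
        "class": "Document",
        "description": "A class called document",
        "vectorizer": "text2vec-huggingface",
        "moduleConfig": {
            "text2vec-huggingface": {
                "model": model,
                "options": {"waitForModel": True, "useGPU": True, "useCache": True},
            }
        },
        "properties": [
            {
                "name": "content",
                "dataType": ["text"],  # assumption: the property holds text
                "description": "Content that will be vectorized",
                "moduleConfig": {
                    "text2vec-huggingface": {
                        "skip": False,
                        "vectorizePropertyName": False,
                    }
                },
            }
        ],
    }


# Wrap the class in a schema object and serialize it.
schema = {"classes": [build_document_class()]}
schema_json = json.dumps(schema, indent=2)
```

The resulting `schema_json` string can then be submitted to a Weaviate instance with whichever client or HTTP tool you prefer.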


  • If the Hugging Face API key is not set in the text2vec-huggingface module, you can set the API key at query time by adding the following to the HTTP header: X-Huggingface-Api-Key: YOUR-HUGGINGFACE-API-KEY.
  • Using this module will enable GraphQL vector search operators.


{
  Get {
    Document(  # class and property names taken from the schema example above
      nearText: {
        concepts: ["fashion"],
        distance: 0.6, # prior to v1.14 use "certainty" instead of "distance"
        moveAwayFrom: {
          concepts: ["finance"],
          force: 0.45
        },
        moveTo: {
          concepts: ["haute couture"],
          force: 0.85
        }
      }
    ) {
      content
      _additional {
        certainty # only supported if distance==cosine.
        distance # always supported
      }
    }
  }
}
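To show how the runtime API key header and a nearText query fit together, here is a minimal sketch that builds (but does not send) an HTTP request using only the Python standard library. It assumes Weaviate is reachable at `http://localhost:8080` (the port mapped in the Docker example) and that a `Document` class with a `content` property exists; the helper name `build_graphql_request` is hypothetical.

```python
import json
import urllib.request

# Assumption: a "Document" class with a "content" property exists in the schema.
NEAR_TEXT_QUERY = """
{
  Get {
    Document(
      nearText: {
        concepts: ["fashion"]
        distance: 0.6
        moveAwayFrom: { concepts: ["finance"], force: 0.45 }
        moveTo: { concepts: ["haute couture"], force: 0.85 }
      }
    ) {
      content
      _additional { distance }
    }
  }
}
"""


def build_graphql_request(base_url: str, api_key: str, query: str) -> urllib.request.Request:
    """Build a POST request for Weaviate's GraphQL endpoint without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/graphql",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Supplies the Hugging Face key at query time.
            "X-Huggingface-Api-Key": api_key,
        },
        method="POST",
    )


request = build_graphql_request(
    "http://localhost:8080", "YOUR-HUGGINGFACE-API-KEY", NEAR_TEXT_QUERY
)

# To run the query against a live instance, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the headers and body before any call reaches the third-party Inference API.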

Additional information

Support for Hugging Face Inference Endpoints

The text2vec-huggingface module also supports Hugging Face Inference Endpoints, where you can deploy your own model as an endpoint. To use your own Inference Endpoint for vectorization, pass the endpoint URL in the class configuration as the endpointURL setting. Note that only feature extraction inference endpoint types are supported.

Available settings

In the schema, on a class level, the following settings can be added:

  • model (string): Any public or private Hugging Face model. Sentence similarity models work best for vectorization. Do not use together with queryModel or passageModel.
  • passageModel (string): DPR passage model. Should be set together with queryModel, but without model.
  • queryModel (string): DPR query model. Should be set together with passageModel, but without model.
  • options.waitForModel (boolean): If the model is not ready, wait for it instead of receiving a 503 response.
  • options.useGPU (boolean): Use GPU instead of CPU for inference (requires Hugging Face's Startup plan or higher).
  • options.useCache (boolean): The Inference API has a cache layer that speeds up requests it has already seen. Because most models are deterministic, cached results can be used as is. If you use a non-deterministic model, set this parameter to false to bypass the cache and force a new query.
  • endpointURL (string): Any public or private Hugging Face Inference Endpoint URL. See the Hugging Face documentation for how to deploy your own Inference Endpoint.

Note: when endpointURL is set, the module ignores the model settings model, queryModel, and passageModel.
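The constraints between these settings can be checked on the client side before a schema is submitted. The following sketch encodes the rules listed above in a hypothetical helper, `validate_module_config`; it illustrates the documented constraints only and is not Weaviate's own validation logic, which may differ.

```python
def validate_module_config(config: dict) -> list:
    """Return a list of problems found in a class-level text2vec-huggingface config.

    Hypothetical helper; an empty list means no problems were detected.
    """
    problems = []
    has_model = "model" in config
    has_query = "queryModel" in config
    has_passage = "passageModel" in config

    # model is mutually exclusive with the DPR pair.
    if has_model and (has_query or has_passage):
        problems.append("model must not be combined with queryModel or passageModel")
    # The DPR models only make sense as a pair.
    if has_query != has_passage:
        problems.append("queryModel and passageModel must be set together")
    # endpointURL takes precedence over any model setting.
    if "endpointURL" in config and (has_model or has_query or has_passage):
        problems.append("endpointURL is set, so model settings will be ignored")

    # All options are documented as booleans.
    for name in ("waitForModel", "useGPU", "useCache"):
        value = config.get("options", {}).get(name)
        if value is not None and not isinstance(value, bool):
            problems.append(f"options.{name} must be a boolean")
    return problems
```

For example, `validate_module_config({"queryModel": "q", "passageModel": "p"})` reports no problems, while a config that sets both `model` and `queryModel` is flagged.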

More resources

If you can't find the answer to your question here, please look at:

  1. The Frequently Asked Questions.
  2. The knowledge base of old issues.
  3. Stack Overflow, for questions.
  4. The Weaviate Community Forum, for more involved discussion.
  5. The Weaviate Slack channel.