
text2vec-huggingface


Introduction​

The text2vec-huggingface module allows you to use Hugging Face models directly in Weaviate as a vectorization module. When you create a Weaviate class that is set to use this module, Weaviate automatically vectorizes your data using the chosen Hugging Face model.

  • Note: this module uses a third-party API.
  • Note: make sure to check the Inference pricing page before vectorizing large amounts of data.
  • Note: Weaviate automatically parallelizes requests to the Inference API when using the batch endpoint; see the previous note regarding pricing.
  • Note: This module only supports sentence similarity models.

How to enable​

Request a Hugging Face API token via their website.

Weaviate Cloud Service​

This module is enabled by default on the WCS.

Weaviate open source​

Here is an example Docker Compose file that spins up Weaviate with the Hugging Face module.

version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.17.2
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
      ENABLE_MODULES: text2vec-huggingface
      HUGGINGFACE_APIKEY: sk-foobar # request a key on huggingface.co; setting this parameter is optional, you can also provide the API key at runtime
      CLUSTER_HOSTNAME: 'node1'

How to configure​

​In your Weaviate schema, you must define how you want this module to vectorize your data. If you are new to Weaviate schemas, you might want to check out the quickstart tutorial on the Weaviate schema first.

The following schema configuration uses the all-MiniLM-L6-v2 model.

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "moduleConfig": {
        "text2vec-huggingface": {
          "model": "sentence-transformers/all-MiniLM-L6-v2",
          "options": {
            "waitForModel": true,
            "useGPU": true,
            "useCache": true
          }
        }
      },
      "properties": [
        {
          "dataType": [
            "text"
          ],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-huggingface": {
              "skip": false,
              "vectorizePropertyName": false
            }
          },
          "name": "content"
        }
      ],
      "vectorizer": "text2vec-huggingface"
    }
  ]
}
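
To apply a configuration like this, one option is the Weaviate Python client. The snippet below is a minimal sketch, assuming the v3 Python client, a local Weaviate instance, and that the JSON above has been saved as schema.json (a hypothetical file name):

import json
import weaviate

# Connect to a local Weaviate instance (adjust the URL for your setup).
client = weaviate.Client("http://localhost:8080")

# Load the schema shown above and create it in Weaviate.
with open("schema.json") as f:
    schema = json.load(f)
client.schema.create(schema)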

How to use​

  • When sending a request to Weaviate, you can set the API key at query time by adding the header X-Huggingface-Api-Key: <huggingface-api-key>; see the sketch after this list.
  • New GraphQL vector search parameters made available by this module can be found here.
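
For example, with the Python client the header can be supplied when the client is created. This is a minimal sketch, assuming the v3 Python client and a local Weaviate instance:

import weaviate

# Pass the Hugging Face API key per request instead of (or in addition to)
# the HUGGINGFACE_APIKEY environment variable.
client = weaviate.Client(
    url="http://localhost:8080",
    additional_headers={"X-Huggingface-Api-Key": "<huggingface-api-key>"},
)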

Example​

{
  Get {
    Publication(
      nearText: {
        concepts: ["fashion"],
        distance: 0.6 # prior to v1.14 use "certainty" instead of "distance"
        moveAwayFrom: {
          concepts: ["finance"],
          force: 0.45
        },
        moveTo: {
          concepts: ["haute couture"],
          force: 0.85
        }
      }
    ) {
      name
      _additional {
        certainty # only supported if distance == cosine
        distance  # always supported
      }
    }
  }
}

Try out this GraphQL example in the Weaviate Console.
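
For reference, a roughly equivalent query with the Python client might look like the sketch below (v3 client syntax assumed; the class, properties, and concepts mirror the GraphQL example above):

import weaviate

client = weaviate.Client("http://localhost:8080")

near_text = {
    "concepts": ["fashion"],
    "distance": 0.6,  # prior to v1.14 use "certainty" instead of "distance"
    "moveAwayFrom": {"concepts": ["finance"], "force": 0.45},
    "moveTo": {"concepts": ["haute couture"], "force": 0.85},
}

result = (
    client.query.get("Publication", ["name"])
    .with_near_text(near_text)
    .with_additional(["certainty", "distance"])
    .do()
)
print(result)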

Additional information​

Support for Hugging Face Inference Endpoints​

The text2vec-huggingface module also supports Hugging Face Inference Endpoints, where you can deploy your own model as an endpoint. To use your own Hugging Face Inference Endpoint for vectorization with the text2vec-huggingface module, pass the endpoint URL in the class configuration as the endpointURL setting. Note that only feature extraction inference endpoint types are supported.
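
For illustration, a class configuration pointing at your own endpoint might look like the sketch below. The endpoint URL is a placeholder, and the v3 Python client is assumed:

import weaviate

client = weaviate.Client("http://localhost:8080")

# Hypothetical class that vectorizes via a self-hosted Inference Endpoint.
# When endpointURL is set, the model / queryModel / passageModel settings are ignored.
endpoint_class = {
    "class": "Document",
    "vectorizer": "text2vec-huggingface",
    "moduleConfig": {
        "text2vec-huggingface": {
            "endpointURL": "https://<your-endpoint>.endpoints.huggingface.cloud"
        }
    },
    "properties": [
        {"name": "content", "dataType": ["text"]}
    ],
}
client.schema.create_class(endpoint_class)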

Available settings​

​In the schema, on a class level, the following settings can be added:

| setting | type | description | example |
| --- | --- | --- | --- |
| model | string | Any public or private Hugging Face model; sentence similarity models work best for vectorization. Do not use together with queryModel or passageModel. | "bert-base-uncased" |
| passageModel | string | DPR passage model. Should be set together with queryModel, but without model. | "sentence-transformers/facebook-dpr-ctx_encoder-single-nq-base" |
| queryModel | string | DPR query model. Should be set together with passageModel, but without model. | "sentence-transformers/facebook-dpr-question_encoder-single-nq-base" |
| options.waitForModel | boolean | If the model is not ready, wait for it instead of receiving a 503 error. | |
| options.useGPU | boolean | Use GPU instead of CPU for inference (requires a Hugging Face Startup plan or higher). | |
| options.useCache | boolean | The Inference API has a cache layer that speeds up repeated requests. Most models are deterministic, so cached results can be used as-is. If you use a non-deterministic model, set this to false to bypass the cache and force a fresh query. | |
| endpointURL | string | Any public or private Hugging Face Inference Endpoint URL. See the Hugging Face documentation on deploying your own Inference Endpoint. When this is set, the module ignores the model, queryModel, and passageModel settings. | |

More resources​

If you can't find the answer to your question here, please look at the:

  1. Frequently Asked Questions. Or,
  2. Knowledge base of old issues. Or,
  3. For questions: Stack Overflow. Or,
  4. For issues: GitHub. Or,
  5. Ask your question in the Slack channel: Slack.