
💡 This module is enabled by default on the Weaviate Cloud Service.


The text2vec-huggingface module allows you to use Hugging Face models directly in Weaviate as a vectorization module. When you create a Weaviate class that is set to use this module, Weaviate automatically vectorizes your data using the model you chose.

  • Note: this module uses a third-party API.
  • Note: make sure to check the Inference pricing page before vectorizing large amounts of data.
  • Note: Weaviate automatically parallelizes requests to the Inference API when you use the batch endpoint, so costs can add up quickly (see the previous note).
  • Note: This module only supports sentence similarity models.

How to enable

Request a Hugging Face API token via the Hugging Face website.

Weaviate Cloud Service

This module is enabled by default on the WCS.

Weaviate open source

Here is an example Docker Compose file that spins up Weaviate with the Hugging Face module.

version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.16.5
    restart: on-failure:0
    ports:
     - "8080:8080"
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
      ENABLE_MODULES: text2vec-huggingface
      HUGGINGFACE_APIKEY: sk-foobar # request a key on huggingface.co; this parameter is optional, you can also provide the API key at runtime
      CLUSTER_HOSTNAME: 'node1'

How to configure

In your Weaviate schema, you must define how this module should vectorize your data. If you are new to Weaviate schemas, consider reading the getting started guide on the Weaviate schema first.

The following schema configuration uses the all-MiniLM-L6-v2 model.

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "moduleConfig": {
        "text2vec-huggingface": {
          "model": "sentence-transformers/all-MiniLM-L6-v2",
          "options": {
            "waitForModel": true,
            "useGPU": true,
            "useCache": true
          }
        }
      },
      "properties": [
        {
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-huggingface": {
              "skip": false,
              "vectorizePropertyName": false
            }
          },
          "name": "content"
        }
      ],
      "vectorizer": "text2vec-huggingface"
    }
  ]
}

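As a rough sketch of how the configuration above might be assembled programmatically, the following Python builds the same class definition as a plain dictionary. The helper name build_document_class is hypothetical and not part of any Weaviate client, the "text" data type is an assumption for free-text content, and the commented-out call assumes the v3 Python client's client.schema.create_class.

```python
import json


def build_document_class(model: str) -> dict:
    """Return a class definition that vectorizes `content` with text2vec-huggingface.

    Hypothetical helper for illustration; it only assembles a dictionary.
    """
    return {
        "class": "Document",
        "description": "A class called document",
        "vectorizer": "text2vec-huggingface",
        "moduleConfig": {
            "text2vec-huggingface": {
                "model": model,
                "options": {"waitForModel": True, "useGPU": True, "useCache": True},
            }
        },
        "properties": [
            {
                "name": "content",
                "dataType": ["text"],  # assumed data type for free-text content
                "description": "Content that will be vectorized",
                "moduleConfig": {
                    "text2vec-huggingface": {
                        "skip": False,
                        "vectorizePropertyName": False,
                    }
                },
            }
        ],
    }


document_class = build_document_class("sentence-transformers/all-MiniLM-L6-v2")
print(json.dumps(document_class, indent=2))

# With a running instance and the Python client installed (assumption: v3 client):
# client.schema.create_class(document_class)
```

Keeping the model name as a parameter makes it easy to try a different sentence similarity model without touching the rest of the definition.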
How to use

  • When sending a request to Weaviate, you can set the API key at query time by adding the header X-Huggingface-Api-Key: <huggingface-api-key>.
  • New GraphQL vector search parameters made available by this module, such as nearText, are shown in the examples below.
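To illustrate the first point, here is a minimal sketch of a raw GraphQL request that carries the API key header, built with the Python standard library only. It assumes a local instance at http://localhost:8080 and Weaviate's /v1/graphql endpoint; the helper name build_graphql_request is hypothetical. The request is constructed but not sent.

```python
import json
import urllib.request

WEAVIATE_URL = "http://localhost:8080/v1/graphql"  # assumed local instance


def build_graphql_request(query: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that passes the Hugging Face key per query."""
    # json.dumps takes care of escaping quotes and newlines inside the query string.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        WEAVIATE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Huggingface-Api-Key": api_key,
        },
    )


query = """
{
  Get {
    Publication(nearText: {concepts: ["fashion"], distance: 0.6}) {
      name
      _additional { distance }
    }
  }
}
"""

request = build_graphql_request(query, "<huggingface-api-key>")
print(request.get_method(), request.full_url)

# To actually send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Because the key travels with each request, it does not have to be stored in the Weaviate environment configuration.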


GraphQL

{
  Get {
    Publication(
      nearText: {
        concepts: ["fashion"],
        distance: 0.6 # prior to v1.14 use "certainty" instead of "distance"
        moveAwayFrom: {
          concepts: ["finance"],
          force: 0.45
        },
        moveTo: {
          concepts: ["haute couture"],
          force: 0.85
        }
      }
    ) {
      name
      _additional {
        certainty # only supported if distance==cosine.
        distance  # always supported
      }
    }
  }
}

Python

import weaviate

client = weaviate.Client(
    url="http://localhost:8080",
    additional_headers={
        'X-HuggingFace-Api-Key': '<THE-KEY>'
    }
)

nearText = {
  "concepts": ["fashion"],
  "distance": 0.6, # prior to v1.14 use "certainty" instead of "distance"
  "moveAwayFrom": {
    "concepts": ["finance"],
    "force": 0.45
  },
  "moveTo": {
    "concepts": ["haute couture"],
    "force": 0.85
  }
}

result = (
  client.query
  .get("Publication", ["name", "_additional {certainty distance} "]) # note that certainty is only supported if distance==cosine
  .with_near_text(nearText)
  .do()
)

print(result)

JavaScript

const weaviate = require("weaviate-client");

const client = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
  headers: {'X-HuggingFace-Api-Key': '<THE-KEY>'},
});

client.graphql
  .get()
  .withClassName('Publication')
  .withFields('name _additional{certainty distance}') // note that certainty is only supported if distance==cosine
  .withNearText({
    concepts: ["fashion"],
    distance: 0.6, // prior to v1.14 use certainty instead of distance
    moveAwayFrom: {
      concepts: ["finance"],
      force: 0.45
    },
    moveTo: {
      concepts: ["haute couture"],
      force: 0.85
    }
  })
  .do()
  .then(console.log)
  .catch(console.error);

Go

package main

import (
  "context"
  "fmt"

  // Import paths assumed for the v4 Go client; adjust them to your client version.
  "github.com/semi-technologies/weaviate-go-client/v4/weaviate"
  "github.com/semi-technologies/weaviate-go-client/v4/weaviate/graphql"
)

func main() {
  cfg := weaviate.Config{
    Host:    "localhost:8080",
    Scheme:  "http",
    Headers: map[string]string{"X-HuggingFace-Api-Key": "<THE-KEY>"},
  }
  client := weaviate.New(cfg)

  className := "Publication"

  name := graphql.Field{Name: "name"}
  _additional := graphql.Field{
    Name: "_additional", Fields: []graphql.Field{
      {Name: "certainty"}, // only supported if distance==cosine
      {Name: "distance"},  // always supported
    },
  }

  concepts := []string{"fashion"}
  distance := float32(0.6)
  moveAwayFrom := &graphql.MoveParameters{
    Concepts: []string{"finance"},
    Force:    0.45,
  }
  moveTo := &graphql.MoveParameters{
    Concepts: []string{"haute couture"},
    Force:    0.85,
  }
  nearText := client.GraphQL().NearTextArgBuilder().
    WithConcepts(concepts).
    WithDistance(distance). // use WithCertainty(certainty) prior to v1.14
    WithMoveTo(moveTo).
    WithMoveAwayFrom(moveAwayFrom)

  ctx := context.Background()

  result, err := client.GraphQL().Get().
    WithClassName(className).
    WithFields(name, _additional).
    WithNearText(nearText).
    Do(ctx)

  if err != nil {
    panic(err)
  }
  fmt.Printf("%v", result)
}

Java

package technology.semi.weaviate;

import technology.semi.weaviate.client.Config;
import technology.semi.weaviate.client.WeaviateClient;
import technology.semi.weaviate.client.base.Result;
import technology.semi.weaviate.client.v1.graphql.model.GraphQLResponse;
import technology.semi.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import technology.semi.weaviate.client.v1.graphql.query.argument.NearTextMoveParameters;
import technology.semi.weaviate.client.v1.graphql.query.fields.Field;

import java.util.HashMap;
import java.util.Map;

public class App {
  public static void main(String[] args) {
    Map<String, String> headers = new HashMap<String, String>() { {
      put("X-HuggingFace-Api-Key", "<THE-KEY>");
    } };
    Config config = new Config("http", "localhost:8080", headers);
    WeaviateClient client = new WeaviateClient(config);

    NearTextMoveParameters moveTo = NearTextMoveParameters.builder()
      .concepts(new String[]{ "haute couture" }).force(0.85f).build();

    NearTextMoveParameters moveAway = NearTextMoveParameters.builder()
      .concepts(new String[]{ "finance" }).force(0.45f)
      .build();

    NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
      .concepts(new String[]{ "fashion" })
      .distance(0.6f) // use .certainty(0.7f) prior to v1.14
      .moveTo(moveTo)
      .moveAwayFrom(moveAway)
      .build();

    Field name = Field.builder().name("name").build();
    Field _additional = Field.builder()
      .name("_additional")
      .fields(new Field[]{
        Field.builder().name("certainty").build(), // only supported if distance==cosine
        Field.builder().name("distance").build(),  // always supported
      }).build();

    Result<GraphQLResponse> result = client.graphQL().get()
      .withClassName("Publication")
      .withFields(name, _additional)
      .withNearText(nearText)
      .run();

    if (result.hasErrors()) {
      System.out.println(result.getError());
      return;
    }
    System.out.println(result.getResult());
  }
}

Curl

# Use certainty instead of distance prior to v1.14.
# The certainty field is only supported if distance==cosine; distance is always supported.
$ echo '{
  "query": "{ Get { Publication( nearText: { concepts: [\"fashion\"], distance: 0.6, moveAwayFrom: { concepts: [\"finance\"], force: 0.45 }, moveTo: { concepts: [\"haute couture\"], force: 0.85 } } ) { name _additional { certainty distance } } } }"
}' | curl \
    -X POST \
    -H 'Content-Type: application/json' \
    -H "X-HuggingFace-Api-Key: <THE-KEY>" \
    -d @- \
    http://localhost:8080/v1/graphql

Additional information

Support for Hugging Face Inference Endpoints

The text2vec-huggingface module also supports Hugging Face Inference Endpoints, where you can deploy your own model as an endpoint. To use your own Inference Endpoint for vectorization, pass the endpoint URL in the class configuration as the endpointURL setting. Note that only endpoints of the feature extraction type are supported.
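For illustration, a class configuration that points at a custom endpoint might look like the following sketch. The URL is a made-up placeholder, and the dictionary layout mirrors the schema shown under "How to configure"; substitute the URL of your own deployed endpoint.

```python
import json

# Placeholder only: replace with the URL of your own Inference Endpoint.
ENDPOINT_URL = "https://<your-endpoint>.example.com"

document_class = {
    "class": "Document",
    "vectorizer": "text2vec-huggingface",
    "moduleConfig": {
        "text2vec-huggingface": {
            # When endpointURL is set, model, queryModel and passageModel are ignored,
            # so none of them is included here.
            "endpointURL": ENDPOINT_URL,
        }
    },
}

print(json.dumps(document_class, indent=2))
```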

Available settings

In the schema, the following settings can be added at the class level:

  • model (string): Any public or private Hugging Face model; sentence similarity models work best for vectorization. Do not use together with queryModel or passageModel.
  • passageModel (string): DPR passage model. Set it together with queryModel, and without model.
  • queryModel (string): DPR query model. Set it together with passageModel, and without model.
  • options.waitForModel (boolean): If the model is not ready, wait for it instead of receiving a 503 response.
  • options.useGPU (boolean): Use GPU instead of CPU for inference (requires Hugging Face's Startup plan or higher).
  • options.useCache (boolean): The Inference API has a cache layer that speeds up requests it has already seen. Because most models are deterministic, cached results can be used as is. If you use a non-deterministic model, set this parameter to false to bypass the cache and force a new query.
  • endpointURL (string): Any public or private Hugging Face Inference Endpoint URL. See the Hugging Face documentation for how to deploy your own Inference Endpoint. Note: when this setting is present, the module ignores the model settings model, queryModel, and passageModel.
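The combination rules above can be expressed as a small client-side check. This is only a sketch of the rules as stated in this section, not Weaviate's own validation; the function name check_huggingface_settings and the message wording are hypothetical.

```python
from typing import Dict, List


def check_huggingface_settings(settings: Dict[str, object]) -> List[str]:
    """Return human-readable problems found in a text2vec-huggingface class config."""
    problems: List[str] = []
    has_model = "model" in settings
    has_query = "queryModel" in settings
    has_passage = "passageModel" in settings

    if "endpointURL" in settings:
        # endpointURL takes precedence; the model settings would be ignored.
        ignored = [k for k in ("model", "queryModel", "passageModel") if k in settings]
        if ignored:
            problems.append("ignored because endpointURL is set: " + ", ".join(ignored))
        return problems

    if has_model and (has_query or has_passage):
        problems.append("model must not be combined with queryModel or passageModel")
    if has_query != has_passage:
        problems.append("queryModel and passageModel must be set together")
    return problems


# A single sentence similarity model: no problems expected.
print(check_huggingface_settings({"model": "sentence-transformers/all-MiniLM-L6-v2"}))

# Placeholder model names; mixing model with the DPR pair is reported.
print(check_huggingface_settings(
    {"model": "<model>", "queryModel": "<query-model>", "passageModel": "<passage-model>"}
))
```

Running such a check before creating the class surfaces conflicting settings early, instead of discovering them when vectorization behaves unexpectedly.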

More resources

If you can’t find the answer to your question here, please try one of the following:

  1. Frequently Asked Questions.
  2. Knowledge base of old issues.
  3. For questions: Stack Overflow.
  4. For issues: GitHub.
  5. Ask your question in the Slack channel.