text2vec-huggingface
Overview
The text2vec-huggingface module enables Weaviate to obtain vectors using the Hugging Face Inference API.
Key notes:
- As it uses a third-party API, you will need an API key.
- Its usage may incur costs.
- Please check the inference pricing page, especially before vectorizing large amounts of data.
- This module is available on Weaviate Cloud Services (WCS).
- Enabling this module will enable the nearText search operator.
- This module only supports sentence similarity models.
Weaviate instance configuration
This module is enabled and pre-configured on Weaviate Cloud Services.
Docker Compose file
To use text2vec-huggingface, you must enable it in your Docker Compose file (docker-compose.yml). You can do so manually, or create one using the Weaviate configuration tool.
Parameters
- ENABLE_MODULES (Required): The modules to enable. Include text2vec-huggingface to enable the module.
- DEFAULT_VECTORIZER_MODULE (Optional): The default vectorizer module. You can set this to text2vec-huggingface to make it the default for all classes.
- HUGGINGFACE_APIKEY (Optional): Your Hugging Face API key. You can also provide the key at query time.
Example
This configuration enables text2vec-huggingface, sets it as the default vectorizer, and sets the API key.
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.22.6
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      ENABLE_MODULES: text2vec-huggingface
      DEFAULT_VECTORIZER_MODULE: text2vec-huggingface
      HUGGINGFACE_APIKEY: sk-foobar  # Setting this parameter is optional; you can also provide the API key at query time.
      CLUSTER_HOSTNAME: 'node1'
Class configuration
You can configure how the module will behave in each class through the Weaviate schema.
API settings
Parameters
The following parameters are available for the API.
Note that you should set only one of the following: model, the passageModel and queryModel pair, or endpointURL.
setting | type | description | example | notes |
---|---|---|---|---|
model | string | The model to use. Do not use with queryModel or passageModel. | "bert-base-uncased" | Can be any public or private Hugging Face model; sentence similarity models work best for vectorization. |
passageModel | string | DPR passage model. Should be set together with queryModel, but without model. | "sentence-transformers/facebook-dpr-ctx_encoder-single-nq-base" | |
queryModel | string | DPR query model. Should be set together with passageModel, but without model. | "sentence-transformers/facebook-dpr-question_encoder-single-nq-base" | |
endpointURL | string | Endpoint URL (private or public) to use. Note: when this variable is set, the module ignores model settings such as model, queryModel and passageModel. | | Read more on how to deploy your own Hugging Face Inference Endpoint. |
options.waitForModel | boolean | If the model is not ready, wait for it instead of receiving a 503 error. | | |
options.useGPU | boolean | Use GPU instead of CPU for inference (if your account plan supports it). | | |
options.useCache | boolean | Use the HF cache to speed up results. | | If you use a non-deterministic model, you can set this parameter to false to prevent the caching mechanism from being used. |
Example
The following example configures the Document class by setting the vectorizer to text2vec-huggingface and the model to sentence-transformers/all-MiniLM-L6-v2, and sets the options to wait for the model to load, use GPU, and use the cache.
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-huggingface",
      "moduleConfig": {
        "text2vec-huggingface": {
          "model": "sentence-transformers/all-MiniLM-L6-v2",
          "options": {
            "waitForModel": true,
            "useGPU": true,
            "useCache": true
          }
        }
      }
    }
  ]
}
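The example above uses a single model. If you instead use a DPR passage/query model pair, set passageModel and queryModel rather than model. The following is a minimal sketch that creates such a class with the Python client; the model names are the examples from the parameter table above, and the class name, instance URL and options are placeholders to adapt to your setup.

import weaviate

client = weaviate.Client("http://localhost:8080")  # replace with your Weaviate instance URL

# Hypothetical class definition using a DPR passage/query model pair instead of a single model
dpr_class = {
    "class": "Document",
    "description": "A class called document",
    "vectorizer": "text2vec-huggingface",
    "moduleConfig": {
        "text2vec-huggingface": {
            "passageModel": "sentence-transformers/facebook-dpr-ctx_encoder-single-nq-base",
            "queryModel": "sentence-transformers/facebook-dpr-question_encoder-single-nq-base",
            "options": {
                "waitForModel": True
            }
        }
    }
}

client.schema.create_class(dpr_class)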
Vectorization settings
You can set vectorizer behavior using the moduleConfig section under each class and property:
Class-level
- vectorizer – which module to use to vectorize the data.
- vectorizeClassName – whether to vectorize the class name. Default: true.
Property-level
- skip – whether to skip vectorizing the property altogether. Default: false.
- vectorizePropertyName – whether to vectorize the property name. Default: false.
Example
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-huggingface",
      "moduleConfig": {
        "text2vec-huggingface": {
          "model": "sentence-transformers/all-MiniLM-L6-v2",
          "options": {
            "waitForModel": true,
            "useGPU": true,
            "useCache": true
          },
          "vectorizeClassName": false
        }
      },
      "properties": [
        {
          "name": "content",
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-huggingface": {
              "skip": false,
              "vectorizePropertyName": false
            }
          }
        }
      ]
    }
  ]
}
Query-time parameters
API key
You can supply the API key at query time by adding it to the HTTP header:
"X-Huggingface-Api-Key": "YOUR-HUGGINGFACE-API-KEY"
Additional information
API rate limits
Since this module uses your API key, your account's corresponding rate limits will also apply to the module. Weaviate will output any rate-limit related error messages generated by the API.
Import throttling
One potential solution to rate limiting would be to throttle the import within your application. We include an example below.
See code example
- Python
- Go
from weaviate import Client
import time

def configure_batch(client: Client, batch_size: int, batch_target_rate: int):
    """
    Configure the weaviate client's batch so it creates objects at `batch_target_rate`.

    Parameters
    ----------
    client : Client
        The Weaviate client instance.
    batch_size : int
        The batch size.
    batch_target_rate : int
        The batch target rate as # of objects per second.
    """

    def callback(batch_results: dict) -> None:
        # you could print batch errors here
        time_took_to_create_batch = batch_size * (client.batch.creation_time / client.batch.recommended_num_objects)
        time.sleep(
            max(batch_size / batch_target_rate - time_took_to_create_batch + 1, 0)
        )

    client.batch.configure(
        batch_size=batch_size,
        timeout_retries=5,
        callback=callback,
    )
package main

import (
    "context"
    "fmt"
    "time"

    "github.com/weaviate/weaviate-go-client/v4/weaviate"
    "github.com/weaviate/weaviate/entities/models"
)

var (
    // adjust to your liking
    targetRatePerMin = 600
    batchSize        = 50
)

func main() {
    cfg := weaviate.Config{
        Host:   "localhost:8080",
        Scheme: "http",
    }
    client, err := weaviate.NewClient(cfg)
    if err != nil {
        panic(err)
    }

    // replace those 10000 empty objects with your actual data
    objects := make([]*models.Object, 10000)

    // we aim to send one batch every tickInterval
    tickInterval := time.Minute * time.Duration(batchSize) / time.Duration(targetRatePerMin)
    t := time.NewTicker(tickInterval)

    before := time.Now()
    for i := 0; i < len(objects); i += batchSize {
        // create a fresh batch
        batch := client.Batch().ObjectsBatcher()

        // add batchSize objects to the batch
        for j := i; j < i+batchSize; j++ {
            batch = batch.WithObject(objects[j])
        }

        // send off batch
        res, err := batch.Do(context.Background())

        // TODO: inspect result for individual errors
        _ = res
        // TODO: check request error
        _ = err

        // we wait for the next tick. If the previous batch took longer than
        // tickInterval, we won't need to wait, effectively making this an
        // unthrottled import.
        <-t.C
    }
    fmt.Printf("imported %d objects in %s\n", len(objects), time.Since(before))
}
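For reference, the Python helper above could be wired up as in the following minimal sketch. The instance URL, the Document class, the my_objects list, and the rate numbers are placeholders for your own setup.

from weaviate import Client

client = Client("http://localhost:8080")  # replace with your Weaviate instance URL

# Throttle imports to roughly 120 objects per second, sent in batches of 50 (example numbers)
configure_batch(client, batch_size=50, batch_target_rate=120)

# my_objects stands in for your own data, e.g. a list of dicts matching the Document class
with client.batch as batch:
    for obj in my_objects:
        batch.add_data_object(obj, "Document")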
Support for Hugging Face Inference Endpoints
The text2vec-huggingface module also supports Hugging Face Inference Endpoints, where you can deploy your own model as an endpoint.
To use your own Hugging Face Inference Endpoint for vectorization with the text2vec-huggingface module, pass the endpoint URL in the class configuration as the endpointURL setting.
Please note that only feature extraction inference endpoint types are supported.
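For example, a class that vectorizes through your own endpoint could be created as in the sketch below; the endpoint URL, class name, and instance URL are placeholders for your own deployment.

import weaviate

client = weaviate.Client("http://localhost:8080")  # replace with your Weaviate instance URL

# Hypothetical class vectorized via a custom Hugging Face Inference Endpoint
endpoint_class = {
    "class": "Document",
    "vectorizer": "text2vec-huggingface",
    "moduleConfig": {
        "text2vec-huggingface": {
            # replace with the URL of your deployed (feature extraction) Inference Endpoint
            "endpointURL": "https://YOUR-ENDPOINT-URL"
        }
    }
}

client.schema.create_class(endpoint_class)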
Usage example
- Python
- JavaScript/TypeScript
- Go
- Java
- Curl
- GraphQL
import weaviate

client = weaviate.Client(
    url="http://localhost:8080",
    additional_headers={
        "X-HuggingFace-Api-Key": "YOUR-HUGGINGFACE-API-KEY"
    }
)

nearText = {
    "concepts": ["fashion"],
    "distance": 0.6,  # prior to v1.14 use "certainty" instead of "distance"
    "moveAwayFrom": {
        "concepts": ["finance"],
        "force": 0.45
    },
    "moveTo": {
        "concepts": ["haute couture"],
        "force": 0.85
    }
}

result = (
    client.query
    .get("Publication", "name")
    .with_additional(["certainty", "distance"])  # note that certainty is only supported if distance==cosine
    .with_near_text(nearText)
    .do()
)

print(result)
import weaviate from 'weaviate-ts-client';

const client = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
  headers: { 'X-HuggingFace-Api-Key': 'YOUR-HUGGINGFACE-API-KEY' },
});

const response = await client.graphql
  .get()
  .withClassName('Publication')
  .withFields('name _additional{ certainty distance }') // note that certainty is only supported if distance==cosine
  .withNearText({
    concepts: ['fashion'],
    distance: 0.6, // prior to v1.14 use certainty instead of distance
    moveAwayFrom: {
      concepts: ['finance'],
      force: 0.45,
    },
    moveTo: {
      concepts: ['haute couture'],
      force: 0.85,
    },
  })
  .do();

console.log(response);
package main

import (
    "context"
    "fmt"

    "github.com/weaviate/weaviate-go-client/v4/weaviate"
    "github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)

func main() {
    cfg := weaviate.Config{
        Host:    "localhost:8080",
        Scheme:  "http",
        Headers: map[string]string{"X-HuggingFace-Api-Key": "YOUR-HUGGINGFACE-API-KEY"},
    }
    client, err := weaviate.NewClient(cfg)
    if err != nil {
        panic(err)
    }

    className := "Publication"
    name := graphql.Field{Name: "name"}
    _additional := graphql.Field{
        Name: "_additional", Fields: []graphql.Field{
            {Name: "certainty"}, // only supported if distance==cosine
            {Name: "distance"},  // always supported
        },
    }

    concepts := []string{"fashion"}
    distance := float32(0.6)
    moveAwayFrom := &graphql.MoveParameters{
        Concepts: []string{"finance"},
        Force:    0.45,
    }
    moveTo := &graphql.MoveParameters{
        Concepts: []string{"haute couture"},
        Force:    0.85,
    }
    nearText := client.GraphQL().NearTextArgBuilder().
        WithConcepts(concepts).
        WithDistance(distance). // use WithCertainty(certainty) prior to v1.14
        WithMoveTo(moveTo).
        WithMoveAwayFrom(moveAwayFrom)

    ctx := context.Background()
    result, err := client.GraphQL().Get().
        WithClassName(className).
        WithFields(name, _additional).
        WithNearText(nearText).
        Do(ctx)
    if err != nil {
        panic(err)
    }
    fmt.Printf("%v", result)
}
package io.weaviate;

import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.argument.NearTextMoveParameters;
import io.weaviate.client.v1.graphql.query.fields.Field;
import java.util.HashMap;
import java.util.Map;

public class App {
  public static void main(String[] args) {
    Map<String, String> headers = new HashMap<String, String>() { {
      put("X-HuggingFace-Api-Key", "YOUR-HUGGINGFACE-API-KEY");
    } };
    Config config = new Config("http", "localhost:8080", headers);
    WeaviateClient client = new WeaviateClient(config);

    NearTextMoveParameters moveTo = NearTextMoveParameters.builder()
      .concepts(new String[]{ "haute couture" }).force(0.85f).build();

    NearTextMoveParameters moveAway = NearTextMoveParameters.builder()
      .concepts(new String[]{ "finance" }).force(0.45f)
      .build();

    NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
      .concepts(new String[]{ "fashion" })
      .distance(0.6f) // use .certainty(0.7f) prior to v1.14
      .moveTo(moveTo)
      .moveAwayFrom(moveAway)
      .build();

    Field name = Field.builder().name("name").build();
    Field _additional = Field.builder()
      .name("_additional")
      .fields(new Field[]{
        Field.builder().name("certainty").build(), // only supported if distance==cosine
        Field.builder().name("distance").build(),  // always supported
      }).build();

    Result<GraphQLResponse> result = client.graphQL().get()
      .withClassName("Publication")
      .withFields(name, _additional)
      .withNearText(nearText)
      .run();

    if (result.hasErrors()) {
      System.out.println(result.getError());
      return;
    }
    System.out.println(result.getResult());
  }
}
# Note: Under nearText, use `certainty` instead of `distance` prior to v1.14
# Under _additional, `certainty` is only supported if distance==cosine, but `distance` is always supported
echo '{
  "query": "{
    Get {
      Publication(
        nearText: {
          concepts: [\"fashion\"],
          distance: 0.6,
          moveAwayFrom: {
            concepts: [\"finance\"],
            force: 0.45
          },
          moveTo: {
            concepts: [\"haute couture\"],
            force: 0.85
          }
        }
      ) {
        name
        _additional {
          certainty
          distance
        }
      }
    }
  }"
}' | curl \
    -X POST \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer learn-weaviate' \
    -H "X-HuggingFace-Api-Key: $HUGGINGFACE_API_KEY" \
    -d @- \
    https://edu-demo.weaviate.network/v1/graphql
{
  Get {
    Publication(
      nearText: {
        concepts: ["fashion"],
        distance: 0.6 # prior to v1.14 use "certainty" instead of "distance"
        moveAwayFrom: {
          concepts: ["finance"],
          force: 0.45
        },
        moveTo: {
          concepts: ["haute couture"],
          force: 0.85
        }
      }
    ) {
      name
      _additional {
        certainty # only supported if distance==cosine
        distance  # always supported
      }
    }
  }
}
Model license(s)
The text2vec-huggingface module is compatible with various models, each with its own license. For detailed information, please review the license of the model you are using in the Hugging Face Model Hub.
It is your responsibility to evaluate whether the terms of its license(s), if any, are appropriate for your intended use.