multi2vec-clip
Overview
The multi2vec-clip module enables Weaviate to obtain vectors locally from text or images using a Sentence-BERT CLIP model.
multi2vec-clip encapsulates the model in a Docker container, which allows independent scaling on GPU-enabled hardware while keeping Weaviate on CPU-only hardware, as Weaviate is CPU-optimized.
Key notes:
- This module is not available on Weaviate Cloud Services (WCS).
- Enabling this module will enable the nearText and nearImage search operators.
- Model encapsulated in a Docker container.
- This module is not compatible with Auto-schema. You must define your classes manually as shown below.
Weaviate instance configuration
This module is not available on Weaviate Cloud Services.
Docker Compose file
To use multi2vec-clip, you must enable it in your Docker Compose file (e.g. docker-compose.yml).
While you can do so manually, we recommend using the Weaviate configuration tool to generate the Docker Compose file.
Parameters
Weaviate:
- ENABLE_MODULES (Required): The modules to enable. Include multi2vec-clip to enable the module.
- DEFAULT_VECTORIZER_MODULE (Optional): The default vectorizer module. You can set this to multi2vec-clip to make it the default for all classes.
- CLIP_INFERENCE_API (Required): The URL of the inference container.
Inference container:
- image (Required): The image name of the inference container.
- ENABLE_CUDA (Optional): Set to 1 to enable GPU usage. Default is 0 (CPU only).
Example
This configuration enables multi2vec-clip, sets it as the default vectorizer, and sets the parameters for the Docker container, including setting it to use the multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1 image and to disable CUDA acceleration.
version: '3.4'
services:
weaviate:
image: semitechnologies/weaviate:1.22.5
restart: on-failure:0
ports:
- "8080:8080"
environment:
QUERY_DEFAULTS_LIMIT: 20
AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
PERSISTENCE_DATA_PATH: "./data"
ENABLE_MODULES: multi2vec-clip
DEFAULT_VECTORIZER_MODULE: multi2vec-clip
CLIP_INFERENCE_API: http://multi2vec-clip:8080
CLUSTER_HOSTNAME: 'node1'
multi2vec-clip:
image: semitechnologies/multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1
environment:
ENABLE_CUDA: 0 # set to 1 to enable
...
This module will benefit greatly from GPU usage. Make sure to enable CUDA if you have a compatible GPU available (ENABLE_CUDA=1).
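For reference, a sketch of how the inference service section might look with CUDA enabled. This assumes a compatible NVIDIA GPU and a container runtime on the host that exposes it to the container:
  multi2vec-clip:
    image: semitechnologies/multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1
    environment:
      ENABLE_CUDA: 1  # requires a compatible GPU to be available to the container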
Alternative: Run a separate container
As an alternative, you can run the inference container independently from Weaviate. To do so, you can:
- Enable multi2vec-clip in your Docker Compose file,
- Omit multi2vec-clip parameters,
- Run the inference container separately, e.g. using Docker, and
- Set CLIP_INFERENCE_API to the URL of the inference container.
Then, for example if Weaviate is running outside of Docker, set CLIP_INFERENCE_API="http://localhost:8000". Alternatively, if Weaviate is part of the same Docker network, e.g. because they are part of the same docker-compose.yml file, you can use Docker networking/DNS, such as CLIP_INFERENCE_API=http://multi2vec-clip:8080.
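For example, a minimal sketch of running the pre-built inference image on its own with Docker (the host port 8000 is an arbitrary choice; the container listens on 8080 internally):
docker run -d -p 8000:8080 semitechnologies/multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1
With this setup, CLIP_INFERENCE_API would point to http://localhost:8000 as described above.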
Class configuration
You can configure how the module will behave in each class through the Weaviate schema.
Vectorization settings
You can set vectorizer behavior using the moduleConfig section under each class and property:
Class-level
- vectorizer - what module to use to vectorize the data.
- vectorizeClassName – whether to vectorize the class name. Default: true.
- <media>Fields - property names to map for different modalities (under moduleConfig.multi2vec-clip), i.e. one or more of [textFields, imageFields].
- weights - optional parameter to weigh the different modalities in producing the final vector.
Property-level
- skip – whether to skip vectorizing the property altogether. Default: false
- vectorizePropertyName – whether to vectorize the property name. Default: false
- dataType - the data type of the property. For use in the appropriate <media>Fields, must be set to text or blob as appropriate (see the sketch after this list).
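For illustration, a property-level configuration sketch (the description property name is hypothetical; the values shown are the defaults):
{
  "dataType": ["text"],
  "name": "description",
  "moduleConfig": {
    "multi2vec-clip": {
      "skip": false,
      "vectorizePropertyName": false
    }
  }
}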
Example
The following example class definition sets the multi2vec-clip module as the vectorizer for the class ClipExample. It also sets:
- the name property as a text datatype and as the text field,
- the image property as a blob datatype and as the image field.
{
"classes": [
{
"class": "ClipExample",
"description": "An example class for multi2vec-clip",
"vectorizer": "multi2vec-clip",
"moduleConfig": {
"multi2vec-clip": {
"textFields": ["name"],
"imageFields": ["image"],
}
},
"properties": [
{
"dataType": ["text"],
"name": "name"
},
{
"dataType": ["blob"],
"name": "image"
}
]
}
]
}
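As a usage sketch, the class definition above could be created with the Python client (assuming a local Weaviate instance, as in the query examples below):
import weaviate

client = weaviate.Client("http://localhost:8080")

# The class definition shown above
class_obj = {
    "class": "ClipExample",
    "vectorizer": "multi2vec-clip",
    "moduleConfig": {
        "multi2vec-clip": {
            "textFields": ["name"],
            "imageFields": ["image"]
        }
    },
    "properties": [
        {"dataType": ["text"], "name": "name"},
        {"dataType": ["blob"], "name": "image"}
    ]
}

# Register the class; objects added to it will be vectorized by multi2vec-clip
client.schema.create_class(class_obj)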
Example with weights
The following example adds weights for the different modalities, with the textFields weighted at 0.7 and the imageFields at 0.3.
{
"classes": [
{
"class": "ClipExample",
"moduleConfig": {
"multi2vec-clip": {
...
"weights": {
"textFields": [0.7],
"imageFields": [0.3],
}
}
}
}
]
}
blob properties must be in base64-encoded data.
Adding blob data objects
Any blob property type data must be base64 encoded. To obtain the base64-encoded value of an image for example, you can use the helper methods in the Weaviate clients or run the following command:
cat my_image.png | base64
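For example, a minimal Python sketch that base64-encodes a local file (my_image.png, as above) and stores it in the ClipExample class defined earlier:
import base64
import weaviate

client = weaviate.Client("http://localhost:8080")

# Read the image and base64-encode it for use in the blob property
with open("my_image.png", "rb") as f:
    encoded_image = base64.b64encode(f.read()).decode("utf-8")

# Create an object; the module vectorizes the mapped text and image fields on import
client.data_object.create(
    data_object={"name": "my image", "image": encoded_image},
    class_name="ClipExample",
)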
Additional search operators
The multi2vec-clip vectorizer module enables the nearText and nearImage search operators.
These operators can be used to perform cross-modal search and retrieval.
This means that when using the multi2vec-clip module, any query using one modality (e.g. text) will include results in all available modalities, as all objects are encoded into a single vector space.
Usage example
NearText
- Python
- JavaScript/TypeScript
- Go
- Java
- Curl
- GraphQL
import weaviate
client = weaviate.Client("http://localhost:8080")
nearText = {
"concepts": ["fashion"],
"distance": 0.6, # prior to v1.14 use "certainty" instead of "distance"
"moveAwayFrom": {
"concepts": ["finance"],
"force": 0.45
},
"moveTo": {
"concepts": ["haute couture"],
"force": 0.85
}
}
result = (
client.query
.get("Publication", "name")
.with_additional(["certainty", "distance"])  # note that certainty is only supported if distance==cosine
.with_near_text(nearText)
.do()
)
print(result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
});
const response = await client.graphql
.get()
.withClassName('Publication')
.withFields('name _additional{certainty distance}') // note that certainty is only supported if distance==cosine
.withNearText({
concepts: ['fashion'],
distance: 0.6, // prior to v1.14 use certainty instead of distance
moveAwayFrom: {
concepts: ['finance'],
force: 0.45,
},
moveTo: {
concepts: ['haute couture'],
force: 0.85,
},
})
.do();
console.log(response);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Publication"
name := graphql.Field{Name: "name"}
_additional := graphql.Field{
Name: "_additional", Fields: []graphql.Field{
{Name: "certainty"}, // only supported if distance==cosine
{Name: "distance"}, // always supported
},
}
concepts := []string{"fashion"}
distance := float32(0.6)
moveAwayFrom := &graphql.MoveParameters{
Concepts: []string{"finance"},
Force: 0.45,
}
moveTo := &graphql.MoveParameters{
Concepts: []string{"haute couture"},
Force: 0.85,
}
nearText := client.GraphQL().NearTextArgBuilder().
WithConcepts(concepts).
WithDistance(distance). // use WithCertainty(certainty) prior to v1.14
WithMoveTo(moveTo).
WithMoveAwayFrom(moveAwayFrom)
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(name, _additional).
WithNearText(nearText).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.argument.NearTextMoveParameters;
import io.weaviate.client.v1.graphql.query.fields.Field;
public class App {
public static void main(String[] args) {
Config config = new Config("http", "localhost:8080");
WeaviateClient client = new WeaviateClient(config);
NearTextMoveParameters moveTo = NearTextMoveParameters.builder()
.concepts(new String[]{ "haute couture" }).force(0.85f).build();
NearTextMoveParameters moveAway = NearTextMoveParameters.builder()
.concepts(new String[]{ "finance" }).force(0.45f)
.build();
NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
.concepts(new String[]{ "fashion" })
.distance(0.6f) // use .certainty(0.7f) prior to v1.14
.moveTo(moveTo)
.moveAwayFrom(moveAway)
.build();
Field name = Field.builder().name("name").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder().name("certainty").build(), // only supported if distance==cosine
Field.builder().name("distance").build(), // always supported
}).build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Publication")
.withFields(name, _additional)
.withNearText(nearText)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
# Note: Under nearText, use `certainty` instead of distance prior to v1.14
# Under _additional, `certainty` is only supported if distance==cosine, but `distance` is always supported
echo '{
"query": "{
Get {
Publication(
nearText: {
concepts: [\"fashion\"],
distance: 0.6,
moveAwayFrom: {
concepts: [\"finance\"],
force: 0.45
},
moveTo: {
concepts: [\"haute couture\"],
force: 0.85
}
}
) {
name
_additional {
certainty
distance
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get{
Publication(
nearText: {
concepts: ["fashion"],
distance: 0.6 # prior to v1.14 use "certainty" instead of "distance"
moveAwayFrom: {
concepts: ["finance"],
force: 0.45
},
moveTo: {
concepts: ["haute couture"],
force: 0.85
}
}
){
name
_additional {
certainty # only supported if distance==cosine.
distance # always supported
}
}
}
}
NearImage
- Python
- JavaScript/TypeScript
- Go
- Java
- Curl
- GraphQL
import weaviate
client = weaviate.Client("http://localhost:8080")
nearImage = {"image": "/9j/4AAQSkZJRgABAgE..."}
result = (
client.query
.get("FashionItem", "image")
.with_near_image(nearImage)
.do()
)
print(result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
});
const response = await client.graphql
.get()
.withClassName('FashionItem')
.withFields('image')
.withNearImage({ image: '/9j/4AAQSkZJRgABAgE...' })
.do();
console.log(JSON.stringify(response, null, 2));
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "FashionItem"
image := graphql.Field{Name: "image"}
nearImage := client.GraphQL().NearImageArgBuilder().WithImage("/9j/4AAQSkZJRgABAgE...")
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(image).
WithNearImage(nearImage).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearImageArgument;
import io.weaviate.client.v1.graphql.query.fields.Field;
public class App {
public static void main(String[] args) {
Config config = new Config("http", "localhost:8080");
WeaviateClient client = new WeaviateClient(config);
String className = "FashionItem";
Field image = Field.builder().name("image").build();
NearImageArgument nearImage = client.graphQL().arguments().nearImageArgBuilder()
.image("/9j/4AAQSkZJRgABAgE...")
.build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName(className)
.withFields(image)
.withNearImage(nearImage)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
echo '{
"query": "{
Get {
FashionItem(nearImage: {
image: "/9j/4AAQSkZJRgABAgE..."
}) {
image
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-d @- \
http://localhost:8080/v1/graphql
{
Get {
FashionItem(nearImage: {
image: "/9j/4AAQSkZJRgABAgE..."
}) {
image
}
}
}
Model selection
To select a model, please point multi2vec-clip to the appropriate Docker container.
You can use our pre-built Docker image as shown above, or build your own (with just a few lines of code).
This allows you to use any suitable model from the Hugging Face model hub or your own custom model.
Using a public Hugging Face model
You can build a Docker image to use any public SBERT CLIP model from the Hugging Face model hub with a two-line Dockerfile. In the following example, we are going to build a custom image for the clip-ViT-B-32 model.
Step 1: Create a Dockerfile
Create a new Dockerfile. We will name it clip.Dockerfile. Add the following lines to it:
FROM semitechnologies/multi2vec-clip:custom
RUN CLIP_MODEL_NAME=clip-ViT-B-32 TEXT_MODEL_NAME=clip-ViT-B-32 ./download.py
Step 2: Build and tag your Dockerfile
We will tag our Dockerfile as clip-inference:
docker build -f clip.Dockerfile -t clip-inference .
Step 3: Use the image
You can now push your image to your favorite registry or reference it locally in your Weaviate docker-compose.yml using the Docker tag clip-inference.
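For example, the inference service in your Docker Compose file might then look like this (a sketch based on the configuration shown earlier, with the locally built tag; the CLIP_INFERENCE_API setting on the Weaviate service stays the same):
  multi2vec-clip:
    image: clip-inference
    environment:
      ENABLE_CUDA: 0  # set to 1 to enable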
Using a private or local model
You can build a Docker image which supports any model which is compatible with Hugging Face's SentenceTransformers and CLIPModel.
To ensure that the text embeddings are compatible with the image embeddings, you should only use models that have been specifically trained for use with CLIP models.
In the following example, we are going to build a custom image for a non-public model which we have locally stored at ./my-clip-model and ./my-text-model.
Both models were trained to produce embeddings which are compatible with one another.
Create a new Dockerfile (you do not need to clone this repository; any folder on your machine is fine). We will name it my-models.Dockerfile. Add the following lines to it:
FROM semitechnologies/multi2vec-clip:custom
COPY ./my-text-model /app/models/text
COPY ./my-clip-model /app/models/clip
The above will make sure that your models end up in the image at /app/models/text and /app/models/clip, respectively. These paths are important, so that the application can find the models.
Now you just need to build and tag your Dockerfile. We will tag it as my-models-inference:
docker build -f my-models.Dockerfile -t my-models-inference .
That's it! You can now push your image to your favorite registry or reference it locally in your Weaviate docker-compose.yml using the Docker tag my-models-inference.
To debug if your inference container is working correctly, you can send queries to the vectorizer module's inference container directly, so you can see exactly what vectors it would produce for which input.
To do so, you need to expose the inference container in your Docker Compose file by adding something like this:
ports:
- "9090:8080"
to your multi2vec-clip service.
Then you can send REST requests to it directly, e.g.:
curl localhost:9090/vectorize -d '{"texts": ["foo bar"], "images":[]}'
and it will print the created vector(s) directly.
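Similarly, you can vectorize an image by sending its base64-encoded content in the images array. A sketch, assuming GNU base64 (adjust the flag on macOS):
# Encode my_image.png inline and send it to the inference container
curl localhost:9090/vectorize -d "{\"texts\": [], \"images\": [\"$(base64 -w 0 my_image.png)\"]}"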
Model license(s)
The multi2vec-clip module uses the clip-ViT-B-32 model from the Hugging Face model hub. Please see the model page for the license information.
It is your responsibility to evaluate whether the terms of its license(s), if any, are appropriate for your intended use.