Image search
Overview
This page covers aspects of similarity search that are unique to using an image as the input.
If you wish to search for images using a vector or another object, please refer to the How-to: similarity search page.
Image-based search is currently not available in WCS, as it does not support the required modules.
Target object types
To search using an image as an input, you must use the img2vec-neural or the multi2vec-clip vectorizer module. More specifically:
- To find similar images, you can use img2vec-neural or multi2vec-clip.
- To find related text and image objects (i.e. for multi-modal search), you must use multi2vec-clip (see the sketch below).
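For example, with multi2vec-clip a text query can retrieve image objects, because CLIP places text and images in the same vector space. Below is a minimal sketch using the Python client; it assumes a ClipExample class vectorized by multi2vec-clip, like the one defined later on this page, and a local Weaviate instance without authentication.
import weaviate

client = weaviate.Client('http://localhost:8080')  # Replace with your Weaviate URL

# Text-to-image (multi-modal) search: a text query against image objects
response = (
    client.query
    .get('ClipExample', 'image')            # retrieve the stored base64 image blob(s)
    .with_near_text({'concepts': ['dog']})  # text input; CLIP encodes it into the shared vector space
    .with_limit(1)
    .do()
)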
Requirements
To search using an input image, you must:
- Configure Weaviate with an image vectorizer module (img2vec-neural or multi2vec-clip), and
- Configure the target class to use the image vectorizer module.
How do I configure Weaviate with an image vectorizer module?
You must enable the desired vectorizer module and specify the inference API address in the relevant Docker Compose file (e.g. docker-compose.yml). You can generate this file using the Weaviate configuration tool.
An example img2vec-neural configuration is shown below:
services:
  weaviate:
    environment:
      IMAGE_INFERENCE_API: "http://i2v-neural:8080"
      DEFAULT_VECTORIZER_MODULE: 'img2vec-neural'
      ENABLE_MODULES: 'img2vec-neural'
  i2v-neural:
    image: semitechnologies/img2vec-pytorch:resnet50
And an example multi2vec-clip configuration is shown below:
services:
  weaviate:
    environment:
      CLIP_INFERENCE_API: 'http://multi2vec-clip:8080'
      DEFAULT_VECTORIZER_MODULE: 'multi2vec-clip'
      ENABLE_MODULES: 'multi2vec-clip'
  multi2vec-clip:
    image: semitechnologies/multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1
    environment:
      ENABLE_CUDA: '0'
How do I configure the target class with the image vectorizer module?
You must:
- Configure the target class to use the image vectorizer module, such as by explicitly setting it as the vectorizer for the class, and
- Specify the blob field(s) that will store the images in the imageFields property.
For using img2vec-neural, an example class definition may look as follows:
{
  "classes": [
    {
      "class": "ImageExample",
      "moduleConfig": {
        "img2vec-neural": {
          "imageFields": [
            "image"
          ]
        }
      },
      "properties": [
        {
          "dataType": [
            "blob"
          ],
          "description": "Grayscale image",
          "name": "image"
        }
      ],
      "vectorizer": "img2vec-neural"
    }
  ]
}
For using multi2vec-clip, an example class definition may look as follows:
{
  "classes": [
    {
      "class": "ClipExample",
      "moduleConfig": {
        "multi2vec-clip": {
          "imageFields": [
            "image"
          ]
        }
      },
      "properties": [
        {
          "dataType": [
            "blob"
          ],
          "name": "image"
        }
      ],
      "vectorizer": "multi2vec-clip"
    }
  ]
}
Note that the multi2vec-clip vectorizer module offers additional settings, such as how to balance text- and image-derived vectors. See the relevant module page for details.
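As an illustration only (the module page is authoritative), a class that vectorizes both a text property and an image property could balance them with a weights entry in its moduleConfig, roughly as in this sketch; the property names and weight values below are arbitrary examples:
"moduleConfig": {
  "multi2vec-clip": {
    "textFields": ["name"],
    "imageFields": ["image"],
    "weights": {
      "textFields": [0.7],
      "imageFields": [0.3]
    }
  }
}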
base64 nearImage search
You can find similar images by performing a nearImage search for the base64-encoded representation of the image.
You can obtain this representation (a long string) as shown below:
- Python
- JavaScript/TypeScript
- Shell
import base64  # standard library module
base64_string = base64.b64encode(content).decode('utf-8')  # `content` holds the raw image bytes
const base64String = content.toString('base64');  // `content` is a Buffer holding the raw image bytes
base64 -i Corgi.jpg
Then, you can search for similar images as follows:
- Python
- JavaScript/TypeScript
import weaviate
import requests
import base64
import json
client = weaviate.Client(
'http://localhost:8080', # Replace with your Weaviate URL
# Uncomment if authentication is on and replace w/ your Weaviate instance API key.
# auth_client_secret=weaviate.AuthApiKey("YOUR-WEAVIATE-API-KEY"),
)
# Fetch URL into `content` variable
image_url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/f/fb/Welchcorgipembroke.JPG/640px-Welchcorgipembroke.JPG'
image_response = requests.get(image_url)
content = image_response.content
# Encode content into base64 string
base64_string = base64.b64encode(content).decode('utf-8')
# Perform query
response = (
client.query
.get('Dog', 'breed')
.with_near_image(
{'image': base64_string},
encode=False # False because the image is already base64-encoded
)
.with_limit(1)
.do()
)
print(json.dumps(response, indent=2))
import weaviate from 'weaviate-ts-client';
import fetch from 'node-fetch';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080', // Replace with your Weaviate URL
// Uncomment if authentication is on, and replace w/ your Weaviate instance API key.
// apiKey: new weaviate.ApiKey('YOUR-WEAVIATE-API-KEY'),
});
const imageUrl = 'https://upload.wikimedia.org/wikipedia/commons/thumb/f/fb/Welchcorgipembroke.JPG/640px-Welchcorgipembroke.JPG'
// Fetch URL into `content` variable
const response = await fetch(imageUrl);
const content = await response.buffer();
const base64String = content.toString('base64');
// Perform query
let result = await client.graphql
.get()
.withClassName('Dog')
.withNearImage({
image: base64String,
})
.withLimit(1)
.withFields('breed')
.do();
console.log(JSON.stringify(result, null, 2));
Example response
{
  "data": {
    "Get": {
      "Dog": [
        {
          "breed": "Corgi"
        }
      ]
    }
  }
}
Specify image by filename
If your target image is stored in a file, you can use the Python client to search for the image by its filename.
- Python
- JavaScript/TypeScript
response = (
client.query
.get('Dog', 'breed')
.with_near_image({'image': 'image.jpg'}) # default `encode=True` reads & encodes the file
.with_limit(1)
.do()
)
Not yet available in the JavaScript/TypeScript client. Vote for the feature request. DIY code below:
import fs from 'fs';
// Read the file into a base-64 encoded string
const contentsBase64 = await fs.promises.readFile('image.jpg', { encoding: 'base64' });
// Query based on base64-encoded image
const result = await client.graphql
.get()
.withClassName('Dog')
.withNearImage({
image: contentsBase64,
})
.withLimit(1)
.withFields('breed')
.do();
console.log(JSON.stringify(result, null, 2));
Example response
{
  "data": {
    "Get": {
      "Dog": [
        {
          "breed": "Corgi"
        }
      ]
    }
  }
}
Distance threshold
You can set a threshold for similarity search by specifying a maximum distance. The distance indicates how dissimilar two images are.
The syntax is the same as for the other nearXXX operators, as sketched below.
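For example, using the Python client, a maximum distance can be included alongside the image in the nearImage argument. This is a sketch, not a definitive reference: the 0.2 threshold is an arbitrary illustration, and it assumes the client and the base64-encoded image from the earlier examples.
response = (
    client.query
    .get('Dog', 'breed')
    .with_near_image(
        {
            'image': base64_string,  # base64-encoded query image, as above
            'distance': 0.2,         # maximum allowed distance; lower means more similar
        },
        encode=False  # the image is already base64-encoded
    )
    .with_limit(5)
    .do()
)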
More Resources
For additional information, try these sources.