
Image search


Overview

This page covers the aspects of similarity search that are specific to using an image as the input.

If you wish to search for images using a vector or another object, please refer to the How-to: similarity search page.

Not available in WCS

Image-based search is currently not available in WCS, as the required modules are not available.

Target object types

To search using an image as an input, you must use the img2vec-neural or the multi2vec-clip vectorizer module. More specifically:

  • To find similar images, you can use img2vec-neural or multi2vec-clip
  • To find related text and image objects (i.e. for multi-modal search), you must use multi2vec-clip

Requirements

To search using an input image, you must:

  • Configure Weaviate with an image vectorizer module (img2vec-neural or multi2vec-clip), and
  • Configure the target class to use the image vectorizer module
How do I configure Weaviate with an image vectorizer module?

You must enable the desired vectorizer module and specify the inference API address in the relevant Docker Compose file (e.g. docker-compose.yml). You can generate this file using the Weaviate configuration tool.

An example img2vec-neural configuration is shown below:

services:
  weaviate:
    environment:
      IMAGE_INFERENCE_API: 'http://i2v-neural:8080'
      DEFAULT_VECTORIZER_MODULE: 'img2vec-neural'
      ENABLE_MODULES: 'img2vec-neural'
  i2v-neural:
    image: semitechnologies/img2vec-pytorch:resnet50

And an example multi2vec-clip configuration is shown below:

services:
  weaviate:
    environment:
      CLIP_INFERENCE_API: 'http://multi2vec-clip:8080'
      DEFAULT_VECTORIZER_MODULE: 'multi2vec-clip'
      ENABLE_MODULES: 'multi2vec-clip'
  multi2vec-clip:
    image: semitechnologies/multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1
    environment:
      ENABLE_CUDA: '0'
How do I configure the target class with the image vectorizer module?

You must configure the target class to:

  • Use the image vectorizer module, for example by explicitly setting it as the vectorizer for the class, and
  • Specify the blob field(s) that will store the images in the imageFields property.

For using img2vec-neural, an example class definition may look as follows:

{
  "classes": [
    {
      "class": "ImageExample",
      "moduleConfig": {
        "img2vec-neural": {
          "imageFields": [
            "image"
          ]
        }
      },
      "properties": [
        {
          "dataType": [
            "blob"
          ],
          "description": "Grayscale image",
          "name": "image"
        }
      ],
      "vectorizer": "img2vec-neural"
    }
  ]
}

For using multi2vec-clip, an example class definition may look as follows:

{
  "classes": [
    {
      "class": "ClipExample",
      "moduleConfig": {
        "multi2vec-clip": {
          "imageFields": [
            "image"
          ]
        }
      },
      "properties": [
        {
          "dataType": [
            "blob"
          ],
          "name": "image"
        }
      ],
      "vectorizer": "multi2vec-clip"
    }
  ]
}

Note that the multi2vec-clip vectorizer module offers additional settings, such as weights to balance the text- and image-derived contributions to the vector.
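As a sketch of such a configuration (the field name `name` and the 0.7/0.3 split are hypothetical choices), the class-level moduleConfig might include a weights object:

```json
{
  "moduleConfig": {
    "multi2vec-clip": {
      "textFields": ["name"],
      "imageFields": ["image"],
      "weights": {
        "textFields": [0.7],
        "imageFields": [0.3]
      }
    }
  }
}
```

With this configuration, text fields would contribute more heavily than image fields to each object's vector.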

For more detail

See the relevant module page (img2vec-neural or multi2vec-clip) for further details.

Specify image by base64 representation

You can find similar images by performing a nearImage search for the base64-encoded representation of the image.

You can obtain this representation (a long string) as below, where `content` contains the raw image bytes:

base64_string = base64.b64encode(content).decode('utf-8')  # `base64` is a standard library module
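Sketched end to end with stand-in bytes (the placeholder PNG header below is not a real image), the encoding round-trip looks like this:

```python
import base64

# Stand-in for raw image bytes, normally read from a file or an HTTP response.
content = b'\x89PNG\r\n\x1a\n'

# Encode the bytes as the base64 string that nearImage expects.
base64_string = base64.b64encode(content).decode('utf-8')
print(base64_string)  # iVBORw0KGgo=

# Decoding the string recovers the original bytes.
assert base64.b64decode(base64_string) == content
```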

Then, you can search for similar images as follows:

import weaviate
import requests
import base64
import json

client = weaviate.Client(
    'http://localhost:8080',  # Replace with your Weaviate URL
    # Uncomment if authentication is on and replace with your Weaviate instance API key.
    # auth_client_secret=weaviate.AuthApiKey('YOUR-WEAVIATE-API-KEY'),
)

# Fetch URL into `content` variable
image_url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/f/fb/Welchcorgipembroke.JPG/640px-Welchcorgipembroke.JPG'
image_response = requests.get(image_url)
content = image_response.content

# Encode content into base64 string
base64_string = base64.b64encode(content).decode('utf-8')

# Perform query
response = (
    client.query
    .get('Dog', 'breed')
    .with_near_image(
        {'image': base64_string},
        encode=False  # False because the image is already base64-encoded
    )
    .with_limit(1)
    .do()
)

print(json.dumps(response, indent=2))
Example response
{
  "data": {
    "Get": {
      "Dog": [
        {
          "breed": "Corgi"
        }
      ]
    }
  }
}

Specify image by filename

If your target image is stored in a file, you can use the Python client to search for the image by its filename.

response = (
    client.query
    .get('Dog', 'breed')
    .with_near_image({'image': 'image.jpg'})  # default `encode=True` reads & encodes the file
    .with_limit(1)
    .do()
)
Example response
{
  "data": {
    "Get": {
      "Dog": [
        {
          "breed": "Corgi"
        }
      ]
    }
  }
}

Distance threshold

You can set a threshold for similarity search by setting a maximum distance. The distance indicates how dissimilar two images are. The syntax is the same as for the other nearXXX operators.
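For instance, with the Python client, a maximum distance can be included in the nearImage argument. A minimal sketch, assuming a hypothetical cutoff of 0.25 and stand-in image bytes:

```python
import base64

# Stand-in for raw image bytes, normally read from a file or an HTTP response.
content = b'\x89PNG\r\n\x1a\n'

# nearImage argument with a maximum-distance threshold; matches whose
# distance exceeds 0.25 (a hypothetical value) are excluded.
near_image = {
    'image': base64.b64encode(content).decode('utf-8'),
    'distance': 0.25,
}
print(near_image['distance'])  # 0.25
```

The dictionary would then be passed to the query builder as before, e.g. `.with_near_image(near_image, encode=False)`.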

More Resources

For additional information, try these sources.

  1. Frequently Asked Questions
  2. Weaviate Community Forum
  3. Knowledge base of old issues
  4. Stack Overflow
  5. Weaviate Slack channel