Multimodal search
With Weaviate, you can perform semantic searches to find similar items based on their meaning. This is done by comparing the vector embeddings of the items in the database.
Because we are using a multimodal model, we can search for objects based on their similarity to inputs in any of the supported modalities: in other words, we can search for movies by their similarity to a text query or to an image.
Image query
Code
This example finds entries in "Movie" based on their similarity to this image of the International Space Station, and prints out the title and release year of the top 5 matches.
Query image
import weaviate, { WeaviateClient, WeaviateReturn } from "weaviate-client";
let client: WeaviateClient;
let response: WeaviateReturn<undefined>
// Instantiate your client (not shown). e.g.:
// const requestHeaders = {'X-VoyageAI-Api-Key': process.env.VOYAGEAI_API_KEY as string,}
// client = weaviate.connectToWeaviateCloud(..., headers: requestHeaders) or
// client = weaviate.connectToLocal(..., headers: requestHeaders)
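// Helper: fetch an image from a URL and return it as a base64-encoded string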
async function urlToBase64(imageUrl: string) {
  const response = await fetch(imageUrl);
  const arrayBuffer = await response.arrayBuffer();
  const content = Buffer.from(arrayBuffer);
  return content.toString('base64');
}
// Get the collection
const movies = client.collections.get("Movie")
// Perform query
const srcImgPath = "https://github.com/weaviate-tutorials/edu-datasets/blob/main/img/International_Space_Station_after_undocking_of_STS-132.jpg?raw=true"
const queryB64 = await urlToBase64(srcImgPath)
response = await movies.query.nearImage(queryB64, {
  limit: 5,
  returnMetadata: ['distance'],
  returnProperties: ["title", "tmdb_id", "release_date", "poster"],
})
// Inspect the response
for (let item of response.objects) {
  // Print the title and release year (note the release date is a datetime object)
  console.log(`${item.properties.title} - ${item.properties.release_date}`)
  // Print the distance of the object from the query
  console.log(`Distance to query: ${item.metadata?.distance}`)
}
client.close()
Explain the code
The results are based on similarity of the vector embeddings between the query and the database object. In this case, the vectorizer module generates an embedding of the input image.
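The query image does not have to come from a URL; any base64-encoded image works. For example, here is a minimal sketch using a local file (the file name space.jpg is an assumption for illustration):
import { readFileSync } from "node:fs";
// Read a local image and base64-encode it for the nearImage query
const localB64 = readFileSync("space.jpg").toString("base64");
const localResponse = await movies.query.nearImage(localB64, { limit: 5 });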
The limit parameter here sets the maximum number of results to return.
The returnMetadata parameter takes an array of strings to set metadata to return in the search results. The current query returns the vector distance to the query.
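If you prefer a similarity cutoff instead of, or alongside, a result cap, near-media queries in the v3 client also take a distance option. A sketch reusing queryB64 from the example above (the 0.4 threshold is an arbitrary assumption; tune it for your data, and check your client version supports this option):
const closeMatches = await movies.query.nearImage(queryB64, {
  distance: 0.4,               // only return objects within this vector distance
  limit: 5,                    // still cap the number of results
  returnMetadata: ['distance'],
})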
Example results
Posters for the top 5 matches:
Weaviate output:
Interstellar 2014 157336
Distance to query: 0.354
Gravity 2013 49047
Distance to query: 0.384
Arrival 2016 329865
Distance to query: 0.386
Armageddon 1998 95
Distance to query: 0.400
Godzilla 1998 929
Distance to query: 0.441
Response object
The returned object is an instance of a custom class. Its objects attribute is a list of search results, each object being an instance of another custom class.
Each returned object will:
- Include all properties and its UUID by default except those with blob data types.
  - Since the poster property is a blob, it is not included by default.
  - To include the poster property, you must specify it, along with the other properties to fetch, in the returnProperties parameter (see the sketch after this list).
- Not include any other information (e.g. references, metadata, vectors) by default.
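As a quick illustration, here is a sketch of reading those fields from the response in the image query above (the comment on the poster value is an assumption; blob properties are typically returned base64-encoded):
const first = response.objects[0];
console.log(first.uuid);                 // the object's UUID, included by default
console.log(first.properties.title);     // a regular property, included by default
console.log(first.properties.poster);    // blob property; present only because it was requested via returnProperties
console.log(first.metadata?.distance);   // metadata appears only when requested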
Text search
Code
This example finds entries in "Movie" based on their similarity to the query "red", and prints out the title and release year of the top 5 matches.
import weaviate, { WeaviateClient, WeaviateReturn } from "weaviate-client";
let client: WeaviateClient;
let response: WeaviateReturn<undefined>
// Instantiate your client (not shown). e.g.:
// const requestHeaders = {'X-VoyageAI-Api-Key': process.env.VOYAGEAI_API_KEY as string,}
// client = weaviate.connectToWeaviateCloud(..., headers: requestHeaders) or
// client = weaviate.connectToLocal(..., headers: requestHeaders)
// Get the collection
const movies = client.collections.get("Movie")
// Perform query
response = await movies.query.nearText("red", {
  limit: 5,
  returnMetadata: ['distance'],
  returnProperties: ["title", "tmdb_id", "release_date"],
})
// Inspect the response
for (let item of response.objects) {
  // Print the title and release year (note the release date is a datetime object)
  console.log(`${item.properties.title} - ${item.properties.release_date}`)
  // Print the distance of the object from the query
  console.log(`Distance to query: ${item.metadata?.distance}`)
}
client.close()
Explain the code
The results are based on similarity of the vector embeddings between the query and the database object. In this case, the vectorizer module generates an embedding of the input text.
The remaining parameters are the same as in the previous example.
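One variation worth knowing: nearText also accepts an array of query strings. A minimal sketch (how multiple concepts are combined into the query vector depends on the vectorizer):
const multiConcept = await movies.query.nearText(["red", "crimson"], {
  limit: 5,
  returnMetadata: ['distance'],
})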
Example results
Posters for the top 5 matches:
Weaviate output:
Deadpool 2 2018 383498
Distance to query: 0.670
Bloodshot 2020 338762
Distance to query: 0.677
Deadpool 2016 293660
Distance to query: 0.678
300 2007 1271
Distance to query: 0.682
The Hunt for Red October 1990 1669
Distance to query: 0.683
Response object
The returned object is in the same format as in the previous example.
Questions and feedback
If you have any questions or feedback, let us know in the user forum.