
Multiple vectors

Added in v1.24.0

Weaviate collections support multiple, named vectors.

Each named vector is independent: each vector space has its own index, its own compression, and its own vectorizer. This means you can create vectors for individual properties, use different vectorization models, and apply different distance metrics to the same object.
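As a rough illustration of what independent vector spaces mean in practice, the following plain-Python sketch keeps two named vectors per object and searches only the space the caller names. It does not use Weaviate. The names `SPACES`, `OBJECTS`, and `search`, the toy vectors, and the choice of metrics are all assumptions made for this example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def l2_squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical setup: each named vector space has its own metric
# (and may have its own dimensionality).
SPACES = {
    "title": cosine_distance,
    "body": l2_squared_distance,
}

# Toy objects: one vector per named space, values invented for illustration.
OBJECTS = [
    {"id": "a", "vectors": {"title": [1.0, 0.0], "body": [0.0, 0.0, 1.0]}},
    {"id": "b", "vectors": {"title": [0.0, 1.0], "body": [1.0, 0.0, 0.0]}},
]


def search(query_vector, target_vector, limit=1):
    """Rank objects using only the vector space named by target_vector."""
    if target_vector not in SPACES:
        raise KeyError(f"unknown named vector: {target_vector}")
    metric = SPACES[target_vector]
    ranked = sorted(
        OBJECTS,
        key=lambda obj: metric(query_vector, obj["vectors"][target_vector]),
    )
    return [obj["id"] for obj in ranked[:limit]]


print(search([0.9, 0.1], "title"))
print(search([0.9, 0.0, 0.2], "body"))
```

The same object can rank differently depending on which space is searched, which is why queries against such a collection need a target vector.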

You do not have to use multiple vectors in your collections. If you do, vector and hybrid queries must specify a target vector.

Syntax

Single vector collections are valid and continue to use the original collection syntax. However, if you configure multiple vectors, you must use the new, named vector syntax.

Python client v4

Starting in v4.5.0, the Python client supports named vectors.

Create a schema

Weaviate collections require a schema. Use the schema definition to configure the vector spaces for each data object.

  • To configure named vectors, use NamedVectors.
  • To specify which inputs go to which vectorizers, set source_properties.

import weaviate
import weaviate.classes as wvc
import os

client = weaviate.connect_to_local(
    headers={
        "X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"],
        "X-Cohere-Api-Key": os.environ["COHERE_API_KEY"],
    }
)

# Define a new schema
collection = client.collections.create(
    name="Named_Vector_Jeopardy_Collection",
    description="Jeopardy game show questions",
    vectorizer_config=[
        # Vectorize the "question" property with Cohere
        wvc.config.Configure.NamedVectors.text2vec_cohere(
            name="jeopardy_questions_vector",
            source_properties=["question"],
            vectorize_collection_name=False,
        ),
        # Vectorize the "answer" property with OpenAI
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="jeopardy_answers_vector",
            source_properties=["answer"],
            vectorize_collection_name=False,
        ),
    ],
    properties=[
        wvc.config.Property(name="category", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="question", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="answer", data_type=wvc.config.DataType.TEXT),
    ],
)

## CHECK VALUES - uncomment the next line to see the collection definition
# print(collection)

Data values can be stored as properties, vectors or both. In this example, each data object has two named vectors, jeopardy_questions_vector and jeopardy_answers_vector. Each object also has three properties, question, answer, and category. The schema specifies how Weaviate manages your data.

Data field    Property    Vectorizer
category      yes         none
question      yes         text2vec_cohere
answer        yes         text2vec_openai
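To make the mapping above concrete, this sketch mimics how source_properties decides which property text reaches which named vector. It is a plain-Python illustration, not Weaviate's implementation: `SOURCE_PROPERTIES` and `texts_to_vectorize` are hypothetical names, the sample object is invented, and joining multiple properties with a space is an assumption.

```python
# Mirrors the source_properties settings from the schema in this section.
SOURCE_PROPERTIES = {
    "jeopardy_questions_vector": ["question"],
    "jeopardy_answers_vector": ["answer"],
}


def texts_to_vectorize(properties):
    """Return, per named vector, the text that would be sent to its vectorizer."""
    return {
        vector_name: " ".join(
            str(properties[prop]) for prop in sources if prop in properties
        )
        for vector_name, sources in SOURCE_PROPERTIES.items()
    }


# Invented sample object with the three properties from the schema.
sample = {
    "category": "Example category",
    "question": "Example question text",
    "answer": "Example answer text",
}

routed = texts_to_vectorize(sample)
for name, text in routed.items():
    print(f"{name}: {text}")
```

Note that category is stored as a property but never appears in the routed text, which matches the "none" entry in the Vectorizer column.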

Query a named vector

Keyword searches in collections with named vectors use the same syntax as keyword searches in collections without named vectors. However, if you run a vector search on a collection with named vectors, specify the vector space to search.

Use named vectors with vector similarity searches (near_text, near_object, near_vector, near_image) and hybrid search.

To run the example query, first create the sample collection.

Create sample collection.

This code creates a sample collection and imports a small amount of data.

To run the code, you must have an OpenAI API key and a Cohere API key defined as environment variables (OPENAI_API_KEY and COHERE_API_KEY) on your system.

OpenAI and Cohere are third party services. You may incur a cost if you exceed the limits of their free tiers.

import weaviate
import weaviate.classes as wvc
import os, json, requests

client = weaviate.connect_to_local(
    headers={
        "X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"],
        "X-Cohere-Api-Key": os.environ["COHERE_API_KEY"],
    }
)

# Define a new schema
collection = client.collections.create(
    name="Named_Vector_Jeopardy_Collection",
    description="Jeopardy game show questions",
    vectorizer_config=[
        wvc.config.Configure.NamedVectors.text2vec_cohere(
            name="jeopardy_questions_vector",
            source_properties=["question"],
            vectorize_collection_name=False,
        ),
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="jeopardy_answers_vector",
            source_properties=["answer"],
            vectorize_collection_name=False,
        ),
    ],
    properties=[
        wvc.config.Property(name="category", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="question", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="answer", data_type=wvc.config.DataType.TEXT),
    ],
)

## CHECK VALUES - uncomment the next line to see the collection definition
# print(collection)

# Get the sample data set
resp = requests.get(
    "https://raw.githubusercontent.com/weaviate-tutorials/quickstart/main/data/jeopardy_tiny.json"
)
data = json.loads(resp.text)

# Prepare the sample data for upload
question_objects = list()
for row in data:
    question_objects.append(
        {
            "question": row["Question"],
            "answer": row["Answer"],
            "category": row["Category"],
        }
    )

# Upload the sample data
nvjc_collection = client.collections.get("Named_Vector_Jeopardy_Collection")
with nvjc_collection.batch.dynamic() as batch:
    for q in question_objects:
        batch.add_object(properties=q)

# Release the connection; the query script below opens its own
client.close()

After the import finishes, query the collection and target one of the named vectors:

import weaviate
import weaviate.classes as wvc
import os

client = weaviate.connect_to_local(
    headers={
        "X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"],
        "X-Cohere-Api-Key": os.environ["COHERE_API_KEY"],
    }
)

nvjc_collection = client.collections.get("Named_Vector_Jeopardy_Collection")

# Search only the vector space built from the "question" property
response = nvjc_collection.query.near_text(
    query="what's a crocodile",
    include_vector=True,  # boolean, not the string "True"
    target_vector="jeopardy_questions_vector",
    limit=1,
)

for r in response.objects:
    print(r.properties)

client.close()

REST API

The legacy, single vector syntax is valid for use with collections that don't have named vectors:

{
  "class": "Article",
  "vector": [0.3, -0.012, 0.071, ..., -0.09],
  "properties": {
    "content": "Really cool things"
  }
}

For collections with multiple, named vectors, use the new syntax to specify each vector by name:

{
  "class": "ArticleNamedVector",
  "vectors": {
    "title_vector": [0.3, 0.2, 0.6, ..., 0.1],
    "image_vector": [1, 2, 3, 4]
  },
  "properties": {
    "content": "Really cool things"
  }
}
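The difference between the two request bodies can be sketched in Python. The helper below is hypothetical (`build_object_payload` is not part of any Weaviate client) and only assembles the JSON shapes shown above; the short vectors are placeholder values, not real embeddings.

```python
import json


def build_object_payload(class_name, properties, vector=None, vectors=None):
    """Build a request body using either the legacy or the named vector syntax.

    vector:  a single list of floats (legacy, single vector syntax)
    vectors: a dict mapping vector names to lists of floats (named vector syntax)
    """
    if vector is not None and vectors is not None:
        raise ValueError("pass either 'vector' or 'vectors', not both")
    payload = {"class": class_name, "properties": properties}
    if vectors is not None:
        payload["vectors"] = vectors
    elif vector is not None:
        payload["vector"] = vector
    return payload


# Placeholder vectors for illustration only.
legacy = build_object_payload(
    "Article",
    {"content": "Really cool things"},
    vector=[0.3, -0.012, 0.071],
)
named = build_object_payload(
    "ArticleNamedVector",
    {"content": "Really cool things"},
    vectors={"title_vector": [0.3, 0.2, 0.6], "image_vector": [1, 2, 3, 4]},
)

print(json.dumps(legacy, indent=2))
print(json.dumps(named, indent=2))
```

Serializing with json.dumps also avoids hand-written mistakes such as unbalanced quotes or trailing commas, which make a request body invalid JSON.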

To retrieve all vectors at once, use this endpoint:

GET /v1/objects/<ClassName>/<uuid>?include=vector
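A small sketch of building that request URL in Python follows. The helper name `object_url` is hypothetical, the base URL `http://localhost:8080` assumes a default local instance, and the all-zero UUID is a placeholder rather than a real object ID.

```python
from urllib.parse import quote, urlencode


def object_url(base_url, class_name, uuid, include_vector=True):
    """Build the object URL, optionally asking for vectors to be included."""
    path = f"/v1/objects/{quote(class_name, safe='')}/{quote(uuid, safe='')}"
    query = "?" + urlencode({"include": "vector"}) if include_vector else ""
    return base_url.rstrip("/") + path + query


# Assumed base URL and placeholder UUID, for illustration only.
url = object_url(
    "http://localhost:8080",
    "ArticleNamedVector",
    "00000000-0000-0000-0000-000000000000",
)
print(url)
```

The resulting URL can be passed to any HTTP client, for example requests.get(url).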

GraphQL

If a collection has one vector, you don't have to specify a vector name. For example, a nearVector query with a single vector looks like this:

{
  Get {
    Publication(
      nearVector: {
        vector: [0.1, -0.15, 0.3]
      }
    ) {
      content
      _additional {
        vector  # backward compatible if only one vector present
        vectors {
          title
        }
        distance
      }
    }
  }
}

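If a collection has multiple vectors, set targetVectors in the search operator to specify the vector that you want to search. To return a named vector with the results, request it by name in the _additional { vectors { ... } } field.

This example shows a GraphQL query that searches and returns the vector named title: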

{
  Get {
    Publication(
      nearVector: {
        vector: [0.1, -0.15, 0.3]
        targetVectors: ["title"]  # array field
      }
    ) {
      content
      _additional {
        vectors {
          title
        }
        distance
      }
    }
  }
}
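If you assemble GraphQL queries in code, the single vector and named vector forms differ only in the targetVectors argument and the vectors field. The following is a minimal sketch under that assumption; `near_vector_query` is a hypothetical helper, not a Weaviate client function, and it only builds the query string without sending it.

```python
import json


def near_vector_query(collection, vector, fields, target_vectors=None):
    """Build a nearVector Get query, adding targetVectors when names are given."""
    args = [f"vector: {json.dumps(vector)}"]
    returned_vectors = ""
    if target_vectors:
        # json.dumps renders ["title"] with double quotes, as GraphQL expects
        args.append(f"targetVectors: {json.dumps(target_vectors)}")
        returned_vectors = "vectors { " + " ".join(target_vectors) + " } "
    return (
        "{ Get { "
        + collection
        + "(nearVector: { " + " ".join(args) + " }) { "
        + " ".join(fields)
        + " _additional { " + returned_vectors + "distance } } } }"
    )


single = near_vector_query("Publication", [0.1, -0.15, 0.3], ["content"])
named = near_vector_query(
    "Publication", [0.1, -0.15, 0.3], ["content"], target_vectors=["title"]
)
print(single)
print(named)
```

Building the string from one function keeps the two variants in sync, so a collection that later gains named vectors only needs the extra argument.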

Named vector collections support hybrid search, but only for one vector at a time. To run a hybrid search, specify the vector to use:

reviews = client.collections.get("WineReviewNV")
response = reviews.query.hybrid(
    query="A French Riesling",
    target_vector="title_country",
    limit=3
)

for o in response.objects:
    print(o.properties)

Questions and feedback

If you have any questions or feedback, let us know in our user forum.