Manage relationships with cross-references
This tutorial is currently being updated to reflect the latest features and improvements in Weaviate. We appreciate your patience and invite you to check back soon for the updated content.
Queries that involve cross-references can be slower than queries that do not, especially at scale, such as when retrieving many objects or running complex queries.
As a first step, we strongly encourage you to consider whether you can avoid using cross-references in your data schema. As a scalable AI-native database, Weaviate is well-placed to perform complex queries with vector, keyword and hybrid searches involving filters. You may benefit from rethinking your data schema to avoid cross-references where possible.
For example, instead of creating separate "Author" and "Book" collections with cross-references, consider embedding author information directly in Book objects and using searches and filters to find books by author characteristics.
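As a rough illustration of this alternative, the sketch below copies author fields into each book object so that a single collection can be filtered by author characteristics. It uses plain Python dictionaries rather than a Weaviate client, and the property names (author_name, author_country) and the sample data are hypothetical.

```python
# Hypothetical sample data: authors and books kept separately,
# as they would be with cross-references.
authors = {
    "a1": {"name": "Ursula K. Le Guin", "country": "US"},
    "a2": {"name": "Stanislaw Lem", "country": "PL"},
}
books = [
    {"title": "The Dispossessed", "author_id": "a1"},
    {"title": "Solaris", "author_id": "a2"},
]


def embed_author(book: dict, authors: dict) -> dict:
    """Return a Book object with the author's fields copied into it."""
    author = authors[book["author_id"]]
    return {
        "title": book["title"],
        "author_name": author["name"],        # hypothetical property name
        "author_country": author["country"],  # hypothetical property name
    }


# Each flattened dict could be stored as the properties of one Book object.
flat_books = [embed_author(b, authors) for b in books]

# Filtering by an author characteristic no longer needs a second collection.
polish_titles = [b["title"] for b in flat_books if b["author_country"] == "PL"]
print(polish_titles)
```

The trade-off is duplication: an author's details are repeated in every one of their books, so a change to the author has to be written to each of those objects.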
In this tutorial, you will learn how to use cross-references to manage relationships between objects, and to use them to enhance your queries.
Many applications require the ability to manage relationships between objects. For example, a blog application might need to store information about the author of each post. Or, a document store may chunk documents into smaller pieces and store them in separate objects, each with a reference to the original document.
In Weaviate, you can use cross-references to manage relationships between objects. In the preceding examples, a blog post class could have a cross-reference property called hasAuthor to link each post to its author, or a chunk class could have a cross-reference property called sourceDocument to link each chunk to its original document.
We will refer to the originating object as the source object, and the object that is being linked to (cross-referenced object) as the target object.
Prerequisites
This tutorial assumes that you have completed the QuickStart tutorial and have write access to a Weaviate instance.
When to use cross-references
Cross-references are useful when you need to establish relationships between objects in Weaviate. For example, you might want to link:
- A blog post (source) to its author (target).
- A document chunk (source) to its original document (target).
- A product (source) to its manufacturer (target).
- A quiz item (source) to its category (target).
In each of these cases, you can use a cross-reference property to link the objects together.
How cross-references work
In Weaviate, a cross-reference relationship is defined as a property of the source class. Each such property has a name and a data type, and can be directed to one or more target classes. For example, a hasAuthor cross-reference property might be directed to the Author class, while a sourceDocument cross-reference property might be directed to the Document class.
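To make the "name and data type" description concrete, the sketch below builds cross-reference property definitions as plain Python dictionaries, with dataType listing the target class names. The helper names and the relatedItem property are hypothetical. The check assumes that class names start with an uppercase letter while primitive data types such as text or int do not.

```python
import json


def reference_property(name: str, *target_classes: str) -> dict:
    """Build a cross-reference property directed to one or more target classes."""
    if not target_classes:
        raise ValueError("a cross-reference needs at least one target class")
    return {"name": name, "dataType": list(target_classes)}


def is_reference(prop: dict) -> bool:
    """Heuristic: every dataType entry looks like a class name (capitalized)."""
    return all(dt[:1].isupper() for dt in prop["dataType"])


has_author = reference_property("hasAuthor", "Author")
source_document = reference_property("sourceDocument", "Document")
# Hypothetical multi-target property: a reference that may point to either class.
related_item = reference_property("relatedItem", "Author", "Document")
# An ordinary (non-reference) property, for comparison.
title = {"name": "title", "dataType": ["text"]}

print(json.dumps(has_author))
print([is_reference(p) for p in (has_author, source_document, related_item, title)])
```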
Directionality of cross-references
Cross-references are uni-directional. To establish a bi-directional relationship, define two separate cross-reference properties, one in each direction.
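The toy model below illustrates what uni-directional means in practice. It is an in-memory stand-in, not Weaviate's implementation, and the wrotePost property name is hypothetical. A link from a post to its author does not make the post reachable from the author until a second reference is added in the other direction.

```python
from collections import defaultdict


class ReferenceStore:
    """Toy in-memory model of uni-directional references."""

    def __init__(self) -> None:
        # (source id, property name) -> list of target ids
        self._refs = defaultdict(list)

    def add(self, from_id: str, from_property: str, to_id: str) -> None:
        self._refs[(from_id, from_property)].append(to_id)

    def targets(self, from_id: str, from_property: str) -> list:
        return list(self._refs.get((from_id, from_property), []))


store = ReferenceStore()

# One direction only: the post points to its author.
store.add("post-1", "hasAuthor", "author-1")
print(store.targets("post-1", "hasAuthor"))
print(store.targets("author-1", "wrotePost"))  # no reverse link exists yet

# The reverse direction needs its own property and its own reference.
store.add("author-1", "wrotePost", "post-1")
print(store.targets("author-1", "wrotePost"))
```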
Cross-references and vectors
Linking objects with cross-references does not affect the vectorization of the objects.
For example, linking a blog post to its author with a cross-reference will not affect the vector of the author object or the blog object.
Managing cross-references
Each cross-reference can be created, updated, and deleted independently of the objects that it links. This allows you to manage relationships between objects without having to modify the objects themselves.
To create a cross-reference, you must include the cross-reference property in the source class, and then add the cross-reference to the source object.
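The sketch below shows what managing a reference independently of its objects could look like with the Python client. It assumes that a collection's data object offers reference_delete and reference_replace with the same keyword arguments as the reference_add call used later in this tutorial. The helper functions and the recording stand-in are hypothetical; the stand-in exists only so the example runs without a Weaviate instance.

```python
def move_reference(collection, source_uuid, prop, old_target_uuid, new_target_uuid):
    """Point one reference at a different target without editing either object."""
    collection.data.reference_delete(
        from_uuid=source_uuid, from_property=prop, to=old_target_uuid
    )
    collection.data.reference_add(
        from_uuid=source_uuid, from_property=prop, to=new_target_uuid
    )


def set_single_reference(collection, source_uuid, prop, target_uuid):
    """Replace whatever `prop` currently references with a single target."""
    collection.data.reference_replace(
        from_uuid=source_uuid, from_property=prop, to=target_uuid
    )


# Stand-in objects that record calls instead of contacting a server.
class _RecordingData:
    def __init__(self):
        self.calls = []

    def __getattr__(self, method_name):
        def record(**kwargs):
            self.calls.append((method_name, kwargs))
        return record


class _RecordingCollection:
    def __init__(self):
        self.data = _RecordingData()


demo = _RecordingCollection()
move_reference(demo, "question-uuid", "hasCategory", "old-category-uuid", "new-category-uuid")
set_single_reference(demo, "question-uuid", "hasCategory", "new-category-uuid")
for name, kwargs in demo.data.calls:
    print(name, kwargs)
```

With a real client, the collection argument would be the source collection, for example the object returned by client.collections.get("JeopardyQuestion").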
Include a cross-reference property
A cross-reference property must be included in the class definition. For example, to create a hasAuthor cross-reference property in the BlogPost class, you would include the following in the class definition:
{
  "class": "BlogPost",
  "properties": [
    ... // other class properties
    {
      "name": "hasAuthor",
      "dataType": ["Author"]
    }
  ],
  ... // other class attributes (e.g. vectorizer)
}
Create a cross-reference
To create a cross-reference, Weaviate requires the following information:
- The class and UUID of the source (from) object.
- The class and UUID of the target (to) object.
- The name of the cross-reference property.
An example syntax is shown below:
Python:

# Add a reference from a JeopardyQuestion object (source) to a JeopardyCategory object (target)
questions = client.collections.get("JeopardyQuestion")
questions.data.reference_add(
    from_uuid=question_obj_id,
    from_property="hasCategory",
    to=category_obj_id
)

JS/TS Client v2:

// The hasCategory property belongs to the source collection, JeopardyQuestion
const jeopardy = client.collections.get('JeopardyQuestion')
await jeopardy.data.referenceAdd({
  fromProperty: 'hasCategory',
  fromUuid: questionObjectId,
  to: categoryObjectId,
})

Java:

String sfId = "00ff6900-e64f-5d94-90db-c8cfa3fc851b";        // source JeopardyQuestion object
String usCitiesId = "20ffc68d-986b-5e71-a680-228dba18d7ef";  // target JeopardyCategory object

client.data().referenceCreator()
  .withClassName("JeopardyQuestion")
  .withID(sfId)
  .withReferenceProperty("hasCategory")
  .withReference(client.data()
    .referencePayloadBuilder()
    .withClassName("JeopardyCategory")
    .withID(usCitiesId)
    .payload())
  .run();

Go:

sfID := "00ff6900-e64f-5d94-90db-c8cfa3fc851b"       // source JeopardyQuestion object
usCitiesID := "20ffc68d-986b-5e71-a680-228dba18d7ef" // target JeopardyCategory object

client.Data().ReferenceCreator().
  WithClassName("JeopardyQuestion").
  WithID(sfID).
  WithReferenceProperty("hasCategory").
  WithReference(client.Data().ReferencePayloadBuilder().
    WithClassName("JeopardyCategory").
    WithID(usCitiesID).
    Payload()).
  Do(ctx)
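The three pieces of information listed above can also be seen in the shape of a raw request. The sketch below only builds the request path and body; it sends nothing. It assumes the REST endpoint for adding a reference has the form /v1/objects/{class}/{id}/references/{property} and that the target is identified by a beacon string, so check both against the REST API reference before relying on them. The helper name reference_request is hypothetical.

```python
import json
import uuid


def reference_request(from_class, from_uuid, from_property, to_class, to_uuid):
    """Return (path, body) for adding one cross-reference (assumed REST shape)."""
    # Validate that both identifiers are well-formed UUIDs.
    from_uuid = str(uuid.UUID(from_uuid))
    to_uuid = str(uuid.UUID(to_uuid))
    # Source class, source UUID and property name go into the path.
    path = f"/v1/objects/{from_class}/{from_uuid}/references/{from_property}"
    # Target class and target UUID go into the beacon.
    body = {"beacon": f"weaviate://localhost/{to_class}/{to_uuid}"}
    return path, body


path, body = reference_request(
    from_class="JeopardyQuestion",
    from_uuid="00ff6900-e64f-5d94-90db-c8cfa3fc851b",
    from_property="hasCategory",
    to_class="JeopardyCategory",
    to_uuid="20ffc68d-986b-5e71-a680-228dba18d7ef",
)
print("POST", path)
print(json.dumps(body))
```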
Queries with cross-references
Once you have established cross-references between objects, you can use them to enhance your search queries. For example, you can use cross-references to:
- Retrieve properties of target objects.
- Filter objects based on properties of target objects.
Retrieve properties of a target object
You can retrieve properties of a target object just as you would retrieve properties of the source object.
For example, where a document chunk includes a cross-reference to its original document, you can use the cross-reference to retrieve properties of the original document. Accordingly, you can retrieve the title of the document or the author of the document, just as you would retrieve a property of the chunk itself such as the text of the chunk.
Take a look at the snippet below, in which we retrieve objects from the JeopardyQuestion class. Here, the JeopardyQuestion class contains the hasCategory cross-reference property, linking objects to the JeopardyCategory class. This query retrieves the title property of the target JeopardyCategory class, as well as the question property of the source JeopardyQuestion class.
Python:

from weaviate.classes.query import QueryReference

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    return_references=[
        QueryReference(
            link_on="hasCategory",
            return_properties=["title"]
        ),
    ],
    limit=2
)

for o in response.objects:
    print(o.properties["question"])
    # print referenced objects
    for ref_obj in o.references["hasCategory"].objects:
        print(ref_obj.properties)

JS/TS Client v2:

const myCollection = client.collections.get('JeopardyQuestion');
const result = await myCollection.query.fetchObjects({
  limit: 2,
  returnReferences: [{
    linkOn: 'hasCategory',
    returnProperties: ['title'],
  }]
})

result.objects.forEach(item =>
  console.log(JSON.stringify(item.references, null, 2))
);
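The same retrieval can be written as a GraphQL Get query, where properties of the target class are requested through an inline fragment on the reference property. The sketch below only assembles the query string. The helper name get_with_reference is hypothetical, and the result would still need to be sent to your instance's GraphQL endpoint.

```python
def get_with_reference(source_class, source_props, link_on, target_class, target_props, limit):
    """Build a GraphQL Get query that also returns properties of referenced objects."""
    source_fields = "\n      ".join(source_props)
    target_fields = "\n          ".join(target_props)
    return f"""{{
  Get {{
    {source_class}(limit: {limit}) {{
      {source_fields}
      {link_on} {{
        ... on {target_class} {{
          {target_fields}
        }}
      }}
    }}
  }}
}}"""


query = get_with_reference(
    source_class="JeopardyQuestion",
    source_props=["question"],
    link_on="hasCategory",
    target_class="JeopardyCategory",
    target_props=["title"],
    limit=2,
)
print(query)
```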
Filter using cross-references
You can configure a filter to include or exclude objects based on properties of the target object.
For example, you can filter articles based on an attribute of the author, such as the author's name or the author's location.
Take a look at the snippet below. This query searches the JeopardyQuestion class, but filters the results using the title property of the cross-referenced JeopardyCategory class. The title property must include the substring Sport.
Python:

from weaviate.classes.query import Filter, QueryReference

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    # Filter on the title property of the cross-referenced JeopardyCategory object
    filters=Filter.by_ref(link_on="hasCategory").by_property("title").like("*Sport*"),
    return_references=QueryReference(link_on="hasCategory", return_properties=["title"]),
    limit=3
)

for o in response.objects:
    print(o.properties)
    print(o.references["hasCategory"].objects[0].properties["title"])

JS/TS Client v2:

const jeopardy = client.collections.get('JeopardyQuestion');
const result = await jeopardy.query.fetchObjects({
  // Filter on the title property of the cross-referenced JeopardyCategory object
  filters: jeopardy.filter.byRef('hasCategory').byProperty('title').like('*Sport*'),
  returnReferences: [{
    linkOn: 'hasCategory',
    returnProperties: ['title'],
  }],
  limit: 3,
})

for (let object of result.objects) {
  console.log(JSON.stringify(object.properties, null, 2));
  console.log(JSON.stringify(object.references, null, 2));
}
Because cross-references do not affect vectors, you cannot use vector searches to filter objects based on properties of the target object.
However, you could use two separate queries to achieve a similar result. For example, you could perform a vector search to identify JeopardyCategory objects that are similar to a given vector, resulting in a list of JeopardyCategory objects. You could then use the unique title properties of these objects in a second query to filter the results as shown above. This will result in JeopardyQuestion objects that are cross-referenced to the JeopardyCategory objects identified in the first query.
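The two-query approach can be sketched with in-memory data. This is a simplified stand-in: in practice step 1 would be a vector search on the JeopardyCategory collection and step 2 a cross-reference filter on JeopardyQuestion, as in the filter example above. The vectors, titles, questions and function names below are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical data: category vectors, and questions that reference a category by title.
categories = [
    {"title": "SPORTS", "vector": [0.9, 0.1]},
    {"title": "WINTER SPORTS", "vector": [0.8, 0.3]},
    {"title": "OPERA", "vector": [0.1, 0.9]},
]
questions = [
    {"question": "q1", "hasCategory": "SPORTS"},
    {"question": "q2", "hasCategory": "OPERA"},
    {"question": "q3", "hasCategory": "WINTER SPORTS"},
]


def similar_category_titles(query_vector, limit):
    """Step 1: vector search over the target collection."""
    ranked = sorted(
        categories, key=lambda c: cosine(query_vector, c["vector"]), reverse=True
    )
    return [c["title"] for c in ranked[:limit]]


def questions_in_categories(titles):
    """Step 2: filter the source collection by the titles found in step 1."""
    wanted = set(titles)
    return [q["question"] for q in questions if q["hasCategory"] in wanted]


titles = similar_category_titles([1.0, 0.0], limit=2)
print(titles)
print(questions_in_categories(titles))
```

This only works cleanly if the property used in step 2 identifies each target object uniquely, which is why the text above relies on unique title values.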
Further information
Managing cross-references
For further information on how to manage cross-references, see the how-to guide on cross-references. It includes information on how to: