Query

Finally! It's time to start and query Weaviate!

For this guide, you don't have to load any data into Weaviate; we are going to use the Wikipedia demo dataset.

But before we start, some basics:

  • Weaviate's main API is its GraphQL-API
    • New to GraphQL? Check this 100-second explainer video.
  • Weaviate also has a RESTful API but it is used for other operations.
  • You can also use the clients to query Weaviate natively in your language of choice. The clients will automatically determine which API to use for the request.
  • The Weaviate Console contains an auto-complete feature to help you write queries.

Let's get started!

Root GraphQL functions

Weaviate's GraphQL-API has three root functions:

  • Get{} to retrieve data based on the schema.
  • Aggregate{} to retrieve meta information (e.g., the number of data objects in a class).
  • Explore{} to explore the complete vector space without using the schema.
{
Get {
# etc...
}
}
{
Aggregate {
# etc...
}
}
{
Explore {
# etc...
}
}
note

A more detailed explanation of Weaviate's GraphQL design is available here.
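
As a rough sketch of how such a query can reach Weaviate without a client library, the Python below wraps a query string in a JSON body and prepares an HTTP POST request. The `/v1/graphql` path, the `http://localhost:8080` base URL, and the helper name `build_graphql_request` are assumptions made for illustration; they are not taken from this guide. The request is only constructed, never sent.

```python
import json
import urllib.request


def build_graphql_request(base_url: str, query: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying a GraphQL query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/graphql",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A Get{} query over the demo dataset's Paragraph class.
query = "{ Get { Paragraph { title } } }"
request = build_graphql_request("http://localhost:8080", query)
print(request.get_method(), request.full_url)

# To run it against a live instance, something like:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

In practice the clients mentioned above hide this step; the sketch only shows that a root function is sent as ordinary query text.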

Get{}

In the basics getting started guide, you learned that Weaviate uses a class-property structure, and in the schema getting started guide, you learned how to define that structure.

Our demo dataset has two classes: Article and Paragraph. The Article class has the properties: title of the data type string, hasParagraphs of the data type Paragraph, and linksToArticles of the data type Article. The Paragraph class has the properties: title of the data type string, content of the data type text, order of the data type int, and inArticle of the data type Article. You can also inspect the schema of the demo dataset in JSON format here.

If we now want to Get{} all Paragraphs without cross-references (where "all" means up to the default limit; more about this later), we can run the following query:

{
Get {
Paragraph {
content
order
title
}
}
}

Or we can follow a cross-reference like this:

{
Get {
Paragraph {
content
order
title
inArticle { # <= cross reference
... on Article {
title
}
}
}
}
}

Let's break down what's happening in this section:

inArticle {
... on Article {
title
}
}
  • inArticle is the name of the Paragraph property that holds the cross-reference to Article.
  • ... on Article is needed because a single cross-reference property can point to multiple classes (did you notice that the cross-reference is set inside an array?). In this specific dataset, inArticle only references Article, but it could reference more than one class.
  • title is a property of Article. Because we told GraphQL that we are going to query over the cross-reference Article, it knows that the properties of Article should be the available options.
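
Because a cross-reference can point to several objects, it comes back as a list. A minimal sketch of reading such a result in Python follows. The `response` dictionary is made-up sample data shaped like a typical GraphQL result (`data`, then `Get`, then the class name); the titles and the `None` value for an unlinked paragraph are assumptions, not real output from the demo dataset.

```python
# Made-up sample shaped like a GraphQL response; not real demo-dataset output.
response = {
    "data": {
        "Get": {
            "Paragraph": [
                {
                    "title": "Example paragraph",
                    "order": 1,
                    "inArticle": [{"title": "Example article"}],
                },
                # Assumption: a paragraph without a linked article has no list.
                {"title": "Unlinked paragraph", "order": 2, "inArticle": None},
            ]
        }
    }
}


def article_titles(paragraph: dict) -> list[str]:
    """Return the titles of all articles that a paragraph points to."""
    # The cross-reference is a list, so iterate even when it holds one item.
    return [ref["title"] for ref in paragraph.get("inArticle") or []]


for paragraph in response["data"]["Get"]["Paragraph"]:
    print(paragraph["title"], "->", article_titles(paragraph))
```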

You might remember that the Article class also has a cross-reference back to Paragraph. So, the following query is valid:

{
Get {
Paragraph {
content
order
title
inArticle {
... on Article {
title
hasParagraphs {
... on Paragraph {
title
order
}
}
}
}
}
}
}

The GraphQL-API has additional properties that can be retrieved with _additional{}. Some modules extend the _additional{} property too; more about this later.

An example with basic _additional{} properties:

{
Get {
Paragraph {
content
title
_additional {
id # <= the UUID of the data object
vector # <= the vector (if any) of the object
creationTimeUnix # <= the creation time as a Unix timestamp
lastUpdateTimeUnix # <= the time of the latest update as a Unix timestamp
}
}
}
}

Get{} with filters

This is where the fun really begins! In Weaviate, arguments are set at the class level, between parentheses after the class name.

Let's start with the simplest filter we have: the limit filter, which, as you might have guessed, limits the number of results to a given number.

{
Get {
Paragraph(
limit: 3
) {
content
title
}
}
}

Weaviate also allows you to paginate over the results:

{
Get {
Paragraph(
limit: 3
offset: 3
) {
content
title
}
}
}
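
The offset for a given page is simply the page index times the page size. A small sketch of that arithmetic, assuming zero-based page numbers and a hypothetical helper named `paragraph_page_query` (neither is part of Weaviate):

```python
def paragraph_page_query(page: int, page_size: int = 3) -> str:
    """Build a Get{} query for one zero-based page of Paragraph results."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size must be >= 1")
    offset = page * page_size  # number of objects to skip before this page
    return (
        "{ Get { Paragraph(limit: %d, offset: %d) { content title } } }"
        % (page_size, offset)
    )


# The second page (index 1) with three results per page uses
# limit 3 and offset 3, matching the query above.
print(paragraph_page_query(1))
```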

You can also use the filters to search by a specific vector, like this:

{
Get {
Paragraph(
nearVector: {
certainty: 0.95
vector: [
-0.14980282,
-0.18726847,
-0.20329526,
# ... this may include hundreds (e.g. 384) of dimensions
-0.028092828,
0.41721362,
-0.09374439
]
}
) {
content
title
_additional {
certainty
}
}
}
}

Did you see certainty? This is a score for how close each data object is to the query vector; the higher the certainty, the more similar the object. You can also calculate the cosine similarity from the certainty if you like, more about this here.
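
For reference, a sketch of that conversion. It assumes the class uses cosine distance and that certainty is the cosine similarity rescaled from the range [-1, 1] to [0, 1], that is, certainty = (1 + cosine similarity) / 2. The function names are illustrative only.

```python
def certainty_to_cosine(certainty: float) -> float:
    """Map a certainty in [0, 1] back to a cosine similarity in [-1, 1]."""
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty is expected to lie in [0, 1]")
    return 2.0 * certainty - 1.0


def cosine_to_certainty(cosine_similarity: float) -> float:
    """Map a cosine similarity in [-1, 1] to a certainty in [0, 1]."""
    if not -1.0 <= cosine_similarity <= 1.0:
        raise ValueError("cosine similarity is expected to lie in [-1, 1]")
    return (1.0 + cosine_similarity) / 2.0


# Under this assumption, the certainty of 0.95 used in the query above
# corresponds to a cosine similarity of roughly 0.9.
print(round(certainty_to_cosine(0.95), 6))
```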

You can also do the equivalent based on the UUID of any object in the same vector space. "The same" here means that the vectors have the same length; if you use the same model to vectorize different classes, you can also mix them.

{
Get {
Paragraph(
nearObject: {
id: "fd7383f7-f2e3-3d50-a272-db9b614417cb"
}
limit: 5
) {
content
title
_additional {
certainty
}
}
}
}

Traditional inverted index filtering is also possible. In Weaviate we call this the where filter.

The where filter takes three arguments of its own:

  1. path is the graph path in your schema.
  2. operator defines what you want to do with the value inside the path (e.g., Equal or GreaterThan; see the list here).
  3. value* is set based on the type of the property defined in path. So, if the property in path is an int, this becomes valueInt, if it's a string it becomes valueString. The value itself is whatever you want to filter on.
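
The three parts above can be sketched as a small Python helper that picks the value* key for you. The helper names `where_filter` and `combine` are hypothetical, and guessing the key from the Python type is a simplification: the key really depends on the property's type in the schema (a text property, for instance, would need a different key). The `valueBoolean` and `valueNumber` keys are not mentioned in this guide and are included as assumptions.

```python
def where_filter(path: list[str], operator: str, value) -> dict:
    """Build a single where-filter, choosing the value* key from the value's type."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        key = "valueBoolean"  # assumed key name
    elif isinstance(value, int):
        key = "valueInt"
    elif isinstance(value, float):
        key = "valueNumber"  # assumed key name
    elif isinstance(value, str):
        key = "valueString"
    else:
        raise TypeError(f"unsupported value type: {type(value).__name__}")
    return {"path": path, "operator": operator, key: value}


def combine(operator: str, *operands: dict) -> dict:
    """Combine several filters under one operator such as And or Or."""
    return {"operator": operator, "operands": list(operands)}


title_filter = where_filter(["title"], "Equal", "Italian cuisine")
order_filter = where_filter(["order"], "GreaterThan", 5)
print(combine("And", title_filter, order_filter))
```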

The examples below make this more concrete.

Let's filter for "Italian cuisine" in the title of the Paragraph.

{
Get {
Paragraph(
where: {
path: ["title"]
operator: Equal
valueString: "Italian cuisine"
}
limit: 10
) {
title
order
}
}
}

Or for Paragraphs where the order is higher than 5.

{
Get {
Paragraph(
where: {
path: ["order"]
operator: GreaterThan
valueInt: 5
}
limit: 10
) {
title
order
}
}
}

Or combine them by setting multiple operands:

{
Get {
Paragraph(
where: {
operator: And # <= We can have And, Or, etc.
operands: [{
path: ["title"]
operator: Equal
valueString: "Italian cuisine"
},{
path: ["order"]
operator: GreaterThan
valueInt: 5
}]
}
limit: 5
) {
title
order
}
}
}

The path is an array, so this means you can also set the filter specifically for a cross reference:

{
Get {
Paragraph(
where: {
path: ["inArticle", "Article", "title"]
operator: Equal
valueString: "Francesco Bellissimo"
}
limit: 25
) {
content
inArticle {
... on Article {
title
}
}
}
}
}

And yes, you can combine vector search with where filters.

{
Get {
Paragraph(
nearObject: { # <= vector search
id: "fd7383f7-f2e3-3d50-a272-db9b614417cb"
}
where: { # <= where filter
path: ["inArticle", "Article", "title"]
operator: Equal
valueString: "Francesco Bellissimo"
}
limit: 25
) {
content
inArticle {
... on Article {
title
}
}
_additional {
certainty
}
}
}
}

info

We call Weaviate "vector first". This means that when you combine vector search with a where filter, the where filter first creates an allow-list of matching objects, and the search through the ANN index then skips any entry that is not on that list.
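
A toy illustration of the allow-list idea follows. It is not Weaviate's implementation: a brute-force cosine scan stands in for the ANN index, and the objects, vectors, and function names are made up for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def filtered_vector_search(objects, query_vector, predicate, limit):
    """Return the ids of the closest objects that also pass the filter."""
    # Step 1: the where-style filter produces an allow-list of ids.
    allow_list = {obj["id"] for obj in objects if predicate(obj)}
    # Step 2: the vector search skips every id that is not on the allow-list.
    scored = [
        (cosine_similarity(obj["vector"], query_vector), obj["id"])
        for obj in objects
        if obj["id"] in allow_list
    ]
    scored.sort(reverse=True)  # most similar first
    return [obj_id for _, obj_id in scored[:limit]]


# Made-up objects: "a" is the nearest to the query but fails the filter.
objects = [
    {"id": "a", "order": 2, "vector": [1.0, 0.0]},
    {"id": "b", "order": 7, "vector": [0.9, 0.1]},
    {"id": "c", "order": 9, "vector": [0.0, 1.0]},
]
result = filtered_vector_search(objects, [1.0, 0.0], lambda o: o["order"] > 5, limit=2)
print(result)  # → ['b', 'c']
```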

If you use Weaviate with modules (the current Wikipedia demo dataset uses the text2vec-transformers vectorizer module and the Q&A generator module), they might add custom filters and custom _additional properties. These arguments are described in the documentation of the respective modules themselves.

Let's explore the additional filters for the modules which are part of this dataset.

First, there is the additional nearText filter exposed by the text2vec-transformers module.

{
Get {
Paragraph(
nearText: {
concepts: ["Italian cuisine"]
}
limit: 10
) {
content
inArticle {
... on Article {
title
}
}
_additional {
certainty
}
}
}
}

Second, we can use the ask argument exposed by the Q&A module. Note how there are also additional _additional properties.

{
Get {
Paragraph(
ask: {
question: "Who is Francesco Bellissimo?"
}
limit: 1
) {
title
inArticle {
... on Article {
title
}
}
_additional {
answer {
result
}
}
}
}
}

And last but not least, we can combine all of them together!

{
Get {
Paragraph(
ask: {
question: "When was the program with Daniele Macuglia launched?"
}
where: { # <= where filter
path: ["inArticle", "Article", "title"]
operator: Equal
valueString: "Francesco Bellissimo"
}
limit: 1
) {
content
inArticle {
... on Article {
title
}
}
_additional {
answer {
result
}
}
}
}
}

Talking about filters, wanna see something cool? Weaviate has many more functions out-of-the-box, like feature projection to visualize your results, here reduced to three dimensions:

{
Get {
Paragraph(
nearText: {
concepts: ["Italian cuisine"]
}
limit: 10
) {
content
inArticle {
... on Article {
title
}
}
_additional {
featureProjection( # <= feature projection
dimensions: 3
){
vector
}
}
}
}
}

Last but not least, all the standard filters are documented in the filters section of the GraphQL references documentation or in the documentation of the individual modules.

Aggregate{}

The Aggregate{} function can be used to show aggregated data, for example, how many objects the Paragraph class contains.

There are three core concepts to keep in mind for the Aggregate{} function.

  1. Anything on a class level is done in the meta property.
  2. Anything on a property level is done inside the property itself.
  3. Different property types (e.g., string, int) support different aggregate functions.

The examples below make this more concrete.

Let's start with counting the number of data objects in the Paragraph class:

{
Aggregate {
Paragraph {
meta { # <= the meta property
count
}
}
}
}

You can also mix in filters like this:

{
Aggregate {
Paragraph(
nearObject: { # <= vector search
id: "fd7383f7-f2e3-3d50-a272-db9b614417cb"
certainty: 0.5
}
where: { # <= where filter
path: ["inArticle", "Article", "title"]
operator: Equal
valueString: "Francesco Bellissimo"
}
) {
meta {
count
}
}
}
}

The order property in the Paragraph class is a nice example of how you can use the Aggregate{} function for integers.

{
Aggregate {
Paragraph {
order {
count
maximum
mean
median
minimum
mode
sum
type
}
}
}
}
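
To make the integer aggregations concrete, here is a sketch that computes the same statistics locally with Python's standard library. The `orders` list is made-up sample data, not values from the demo dataset, the helper name `aggregate_int` is hypothetical, and the type field from the query is left out.

```python
import statistics


def aggregate_int(values: list[int]) -> dict:
    """Compute the integer aggregations requested in the query above."""
    return {
        "count": len(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "minimum": min(values),
        "mode": statistics.mode(values),  # the most frequent value
        "sum": sum(values),
    }


orders = [1, 2, 2, 3, 7]  # made-up "order" values
summary = aggregate_int(orders)
print(summary)
```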

You can find detailed documentation on the Aggregate{} function here.

Explore{}

The Explore{} function can be used if you want to search through the complete vector space but don't know the class that you're targeting. Bear in mind that once you know the class, you also know the properties, the types, etc.

In short, the Explore{} function lets you explore the vector space.

Important to know: in almost any situation, you need to do two queries when using the Explore{} function, and you must set a nearObject or nearVector search parameter. The two steps are:

  1. Target candidates based on your vector search or similarity search.
  2. Collect these candidates.
{
Explore(
nearObject: {
id: "fd7383f7-f2e3-3d50-a272-db9b614417cb"
certainty: 0.95
}
) {
beacon
className
certainty
distance
}
}

The Explore{} function is very straightforward and only returns four properties.

  1. beacon contains the URL and the id. It's a "beacon in the vector space" that you can use to target a data object.
  2. className contains the name of the class that this data object has.
  3. certainty is the similarity score between the query and the data object (higher is closer).
  4. distance is the distance between the query and the data object.
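
As a sketch of preparing the second, "collect" step, the Python below groups Explore{} hits by class and pulls the id out of each beacon, so that each class can then be fetched separately. It assumes the id is the last path segment of the beacon. The `weaviate://localhost/...` beacon shape, the second UUID, and the helper names are assumptions made for illustration.

```python
import uuid
from collections import defaultdict


def id_from_beacon(beacon: str) -> str:
    """Extract the object id, assumed to be the beacon's last path segment."""
    candidate = beacon.rstrip("/").rsplit("/", 1)[-1]
    return str(uuid.UUID(candidate))  # raises ValueError if not a valid UUID


def group_candidates(hits: list[dict]) -> dict[str, list[str]]:
    """Group Explore{} hits by class name, keeping only the object ids."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit["className"]].append(id_from_beacon(hit["beacon"]))
    return dict(grouped)


# Made-up hits; the beacon format is an assumption.
hits = [
    {
        "beacon": "weaviate://localhost/fd7383f7-f2e3-3d50-a272-db9b614417cb",
        "className": "Paragraph",
    },
    {
        "beacon": "weaviate://localhost/00000000-0000-0000-0000-000000000001",
        "className": "Article",
    },
]
candidates = group_candidates(hits)
print(candidates)
```
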
Warning

Data objects without vectors are skipped.

Data objects whose vector length differs from that of the input vector (or of the object given by ID) are skipped.

Recap

Weaviate's GraphQL-API is used to query your datasets. The structure of the dataset is based on the schema you've defined. You can add vector filters, where-filters, filters from modules, and you can mix them all together.

More Resources

If you can't find the answer to your question here, please try one of the following:

  1. Frequently Asked Questions.
  2. Knowledge base of old issues.
  3. For questions: Stack Overflow.
  4. For issues: GitHub.
  5. Ask your question in the Slack channel: Slack.