Filters

Overview

This page shows you how to add conditional filters to your searches with the where operator.

A filter is a set of Boolean (i.e. True or False) conditions. Accordingly, a filter will only include or exclude objects and will not affect their rankings.

List of filter operators

For a list of filter operators, see the API references: Filters page.

ContainsAny and ContainsAll

The ContainsAny and ContainsAll operators filter objects using values of an array as criteria.

To use either of these operators, provide the filter criterion array as valueText. Note that the usage of ContainsAny and ContainsAll is different for batch deletion operations (read more).
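The difference between the two operators can be sketched in plain Python. This is only a local model of the matching rule, not Weaviate client code; the `tags` property and the helper names are hypothetical. The dict at the end shows how the criterion array would sit under valueText in a raw where filter.

```python
def contains_any(property_values, criteria):
    """True if the property holds at least one of the criterion values."""
    return any(value in property_values for value in criteria)


def contains_all(property_values, criteria):
    """True if the property holds every one of the criterion values."""
    return all(value in property_values for value in criteria)


# Hypothetical array property on a stored object, for illustration only.
tags = ["history", "science", "sports"]

print(contains_any(tags, ["science", "art"]))     # True: "science" is present
print(contains_all(tags, ["science", "art"]))     # False: "art" is missing
print(contains_all(tags, ["science", "sports"]))  # True: both are present

# Raw where filter with the criterion array passed as valueText.
# The property name "tags" is an assumption made for this sketch.
where_filter = {
    "path": ["tags"],
    "operator": "ContainsAny",
    "valueText": ["science", "art"],
}
```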

A single-condition filter

To add a filter, you must provide at least one where condition to your query.

The following example specifies that the round property must equal "Double Jeopardy!". Note that the valueText parameter is used since the property datatype is text.

Filter arguments list

See this page for the list of available filter arguments.

import json
from weaviate.classes import Filter

# Assumes `client` is an already-connected Weaviate client.
jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    filters=Filter("round").equal("Double Jeopardy!"),
    limit=3
)

# print result objects
for o in response.objects:
    print(json.dumps(o.properties, indent=2))

Example response

It should produce a response like the one below:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "garage",
"question": "This French word originally meant \"a place where one docks\" a boat, not a car",
"round": "Double Jeopardy!"
},
{
"answer": "Mexico",
"question": "The Colorado River provides much of the border between this country's Baja California Norte & Sonora",
"round": "Double Jeopardy!"
},
{
"answer": "Amy Carter",
"question": "On September 1, 1996 this former first daughter married Jim Wentzel at the Pond House near Plains",
"round": "Double Jeopardy!"
}
]
}
}
}
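For reference, the single condition used in this section can also be written as a raw where filter, which is where the valueText parameter appears explicitly. The following is a minimal sketch that only builds the filter as a Python dict; `single_condition` is a hypothetical helper, not part of any Weaviate client.

```python
import json


def single_condition(path, operator, value_key, value):
    """Build one where condition as a plain dict (hypothetical helper)."""
    return {"path": path, "operator": operator, value_key: value}


# The round property is of type text, so the value goes under valueText.
where_filter = single_condition(
    ["round"], "Equal", "valueText", "Double Jeopardy!"
)

print(json.dumps(where_filter, indent=2))
```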

With a search operator

Conditional filters can be combined with a search operator such as nearXXX, hybrid or bm25.

The following example adds a points filter to a nearText query, where the points property must be greater than 200. Note that valueInt is used because the property datatype is int.

import json
from weaviate.classes import Filter

# Assumes `client` is an already-connected Weaviate client.
jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.near_text(
    query="fashion icons",
    filters=Filter("points").greater_than(200),
    limit=3
)

# print result objects
for o in response.objects:
    print(json.dumps(o.properties, indent=2))

Example response

It should produce a response like the one below:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "fashion designers",
"points": 400,
"question": "Ted Lapidus, Guy Laroche, Christian Lacroix",
"round": "Jeopardy!"
},
{
"answer": "Dapper Flapper",
"points": 400,
"question": "A stylish young woman of the 1920s",
"round": "Double Jeopardy!"
},
{
"answer": "Women's Wear Daily",
"points": 800,
"question": "This daily chronicler of the fashion industry launched \"W\", a bi-weekly, in 1972",
"round": "Jeopardy!"
}
]
}
}
}
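The choice between valueText and valueInt follows the datatype of the filtered property. One way to picture the rule is a small lookup from a Python value to the matching key in a raw where filter. `value_key_for` is a hypothetical helper and covers only four common types; treat it as a sketch rather than a complete mapping.

```python
def value_key_for(value):
    """Pick the where-filter value key for a Python value (hypothetical helper)."""
    # bool is a subclass of int in Python, so it has to be checked first.
    if isinstance(value, bool):
        return "valueBoolean"
    if isinstance(value, int):
        return "valueInt"
    if isinstance(value, float):
        return "valueNumber"
    if isinstance(value, str):
        return "valueText"
    raise TypeError(f"no value key known for {type(value).__name__}")


# points is an int property, so the condition carries valueInt.
threshold = 200
where_filter = {
    "path": ["points"],
    "operator": "GreaterThan",
    value_key_for(threshold): threshold,
}

print(where_filter)
```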

By partial matches (text)

With text data type properties, you can use the Like operator to filter by partial matches.

The following example filters for objects including the text "inter" in any part of a token in the answer property.

* vs ?

* matches zero or more characters, whereas ? matches exactly one unknown character.

import json
from weaviate.classes import Filter

# Assumes `client` is an already-connected Weaviate client.
jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    filters=Filter("answer").like("*inter*"),
    limit=3
)

# print result objects
for o in response.objects:
    print(json.dumps(o.properties, indent=2))

Example response

It should produce a response like the one below:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "interglacial",
"question": "This term refers to the warm periods within ice ages; we're in one of those periods now",
"round": "Jeopardy!"
},
{
"answer": "the Interior",
"question": "In 1849, Thomas Ewing, \"The Logician of the West\", became the USA's first Secy. of this Cabinet Dept.",
"round": "Jeopardy!"
},
{
"answer": "Interlaken, Switzerland",
"question": "You can view the Jungfrau Peak from the main street of this town between the Brienz & Thun Lakes",
"round": "Final Jeopardy!"
}
]
}
}
}
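The wildcard rules for Like can be modelled locally by translating a pattern into a regular expression. This is a sketch of the matching rule only, not of how Weaviate implements it. Case-insensitive matching is an assumption here, chosen to mirror the example response above; in Weaviate it depends on how the property is tokenized. The helper names are hypothetical.

```python
import re


def like_to_regex(pattern):
    """Translate a Like pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")  # zero or more characters
        elif ch == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like_matches(pattern, text):
    """True if the whole text matches the Like pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None


print(like_matches("*inter*", "interglacial"))  # True
print(like_matches("?at", "cat"))               # True: one character before "at"
print(like_matches("?at", "at"))                # False: "?" needs one character
print(like_matches("*at", "at"))                # True: "*" may match nothing
```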

Multiple-condition filters

To add a multiple-condition filter, you must set the operator to And or Or, and set two or more conditions under the corresponding operands parameter.

The following example specifies an And condition, so that both of the following must hold:

  • the round property must equal "Double Jeopardy!", and
  • the points property must be less than 600.

import json
from weaviate.classes import Filter

# Assumes `client` is an already-connected Weaviate client.
jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    # `&` combines the two conditions with And
    filters=Filter("round").equal("Double Jeopardy!") &
            Filter("points").less_than(600),
    limit=3
)

# print result objects
for o in response.objects:
    print(json.dumps(o.properties, indent=2))

Example response

It should produce a response like the one below:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "Mexico",
"points": 200,
"question": "The Colorado River provides much of the border between this country's Baja California Norte & Sonora",
"round": "Double Jeopardy!"
},
{
"answer": "Amy Carter",
"points": 200,
"question": "On September 1, 1996 this former first daughter married Jim Wentzel at the Pond House near Plains",
"round": "Double Jeopardy!"
},
{
"answer": "Greek",
"points": 400,
"question": "Athenians speak the Attic dialect of this language",
"round": "Double Jeopardy!"
}
]
}
}
}
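In raw where filter form, the same And condition is an outer object whose operator is And and whose operands list holds the individual conditions. The sketch below builds that structure as a Python dict; `combine` is a hypothetical helper written for this illustration.

```python
def combine(operator, *conditions):
    """Wrap two or more conditions in an And/Or filter (hypothetical helper)."""
    if operator not in ("And", "Or"):
        raise ValueError("operator must be 'And' or 'Or'")
    if len(conditions) < 2:
        raise ValueError("at least two conditions are required")
    return {"operator": operator, "operands": list(conditions)}


where_filter = combine(
    "And",
    {"path": ["round"], "operator": "Equal", "valueText": "Double Jeopardy!"},
    {"path": ["points"], "operator": "LessThan", "valueInt": 600},
)

print(where_filter["operator"], len(where_filter["operands"]))  # And 2
```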

Nested multiple conditions

Conditional filters can be nested in Weaviate. To do so, use an And or Or filter as one of the operands of an outer filter, then provide two or more conditions as the inner filter's own operands.

The following example specifies that:

  • the answer property must contain the substring "nest", and
  • either the points property must be greater than 700, or the points property must be less than 300.

import json
from weaviate.classes import Filter

# Assumes `client` is an already-connected Weaviate client.
jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    # Like on "answer", And (points > 700 Or points < 300)
    filters=Filter("answer").like("*nest*") &
            (Filter("points").greater_than(700) | Filter("points").less_than(300)),
    limit=3
)

# print result objects
for o in response.objects:
    print(json.dumps(o.properties, indent=2))

Example response

It should produce a response like the one below:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "rhinestones",
"points": 100,
"question": "Imitation diamonds, they were originally gems obtained from a certain German river",
"round": "Jeopardy!"
},
{
"answer": "Clytemnestra",
"points": 1000,
"question": "In \"Absalom! Absalom!\", it's the \"mythological\" name of Thomas Stupen's daughter, known as Clytie for short",
"round": "Double Jeopardy!"
},
{
"answer": "Ernest Hemingway",
"points": 200,
"question": "His 1926 novel \"The Sun Also Rises\" has been published in England as \"Fiesta\"",
"round": "Jeopardy!"
}
]
}
}
}
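To see how the nesting is evaluated, the filter can be written as a raw where structure and checked against a few objects with a small local evaluator. This is a sketch of the Boolean logic only: `evaluate` is a hypothetical function that supports just the operators used here, and it matches Like patterns case-insensitively against the whole string, which is an assumption.

```python
import re


def _value(condition):
    """Return the value stored under whichever value* key is present."""
    for key, val in condition.items():
        if key.startswith("value"):
            return val
    raise KeyError("condition has no value* key")


def evaluate(where, obj):
    """Evaluate a where structure against a plain dict (hypothetical)."""
    op = where["operator"]
    if op == "And":
        return all(evaluate(operand, obj) for operand in where["operands"])
    if op == "Or":
        return any(evaluate(operand, obj) for operand in where["operands"])
    actual, expected = obj[where["path"][0]], _value(where)
    if op == "Equal":
        return actual == expected
    if op == "GreaterThan":
        return actual > expected
    if op == "LessThan":
        return actual < expected
    if op == "Like":
        regex = "".join(
            ".*" if c == "*" else "." if c == "?" else re.escape(c)
            for c in expected
        )
        return re.fullmatch(regex, actual, re.IGNORECASE) is not None
    raise ValueError(f"unsupported operator: {op}")


where_filter = {
    "operator": "And",
    "operands": [
        {"path": ["answer"], "operator": "Like", "valueText": "*nest*"},
        {
            "operator": "Or",
            "operands": [
                {"path": ["points"], "operator": "GreaterThan", "valueInt": 700},
                {"path": ["points"], "operator": "LessThan", "valueInt": 300},
            ],
        },
    ],
}

print(evaluate(where_filter, {"answer": "rhinestones", "points": 100}))    # True
print(evaluate(where_filter, {"answer": "Clytemnestra", "points": 1000}))  # True
# Hypothetical object: matches "nest" but fails both points conditions.
print(evaluate(where_filter, {"answer": "nesting doll", "points": 500}))   # False
```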

Filter using cross-references

You can filter objects using properties from a cross-referenced object.

The following example filters JeopardyQuestion objects using properties of JeopardyCategory that they are cross-referencing.

More specifically, the example filters for the title property of JeopardyCategory objects that are cross-referenced from the JeopardyQuestion object. The title property must include the substring Sport.

Case-sensitivity

The results are case-insensitive here, as the title property is defined with word tokenization.

import json
from weaviate.classes import Filter

# Assumes `client` is an already-connected Weaviate client.
jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    # Path: reference property, target collection, then its property
    filters=Filter(["hasCategory", "JeopardyCategory", "title"]).like("*Sport*"),
    limit=3
)

# print result objects
for o in response.objects:
    print(json.dumps(o.properties, indent=2))

Example response

It should produce a response like the one below:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "Sampan",
"hasCategory": [
{
"title": "TRANSPORTATION"
}
],
"question": "Smaller than a junk, this Oriental boat usually has a cabin with a roof made of mats",
"round": "Jeopardy!"
},
{
"answer": "Emmitt Smith",
"hasCategory": [
{
"title": "SPORTS"
}
],
"question": "In 1994 this Dallas Cowboy scored 22 touchdowns; in 1995 he topped that with 25",
"round": "Jeopardy!"
},
{
"answer": "Lee Iacocca",
"hasCategory": [
{
"title": "TRANSPORTATION"
}
],
"question": "Chrysler executive who developed the Ford Mustang",
"round": "Jeopardy!"
}
]
}
}
}
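The response includes TRANSPORTATION as well as SPORTS because the pattern is a case-insensitive substring match. The sketch below shows the cross-reference path as it would appear in a raw where filter, then reproduces the matching rule locally. The helper name and the HISTORY title are hypothetical and used only for illustration.

```python
# Path: reference property, then target collection, then the property on it.
where_filter = {
    "path": ["hasCategory", "JeopardyCategory", "title"],
    "operator": "Like",
    "valueText": "*Sport*",
}


def contains_ignoring_case(title, substring):
    """Local model of a case-insensitive *substring* match."""
    return substring.lower() in title.lower()


for title in ["TRANSPORTATION", "SPORTS", "HISTORY"]:
    print(title, contains_ignoring_case(title, "Sport"))
```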

Filter by metadata

You can filter by any number of metadata properties, such as object id, property length, timestamp, null state and more.

See the API references: Filters page for the full list of available metadata filters and any special usage patterns.
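As a rough sketch, metadata filters use the same where structure as property filters, with a special path in place of a property name. The path names and value keys below are written from memory of the raw filter syntax and should be checked against the API reference. The UUID and timestamp are placeholders, and some of these filters only work when the corresponding index options are enabled on the collection.

```python
# Filter by object id (placeholder UUID).
by_id = {
    "path": ["id"],
    "operator": "Equal",
    "valueText": "00000000-0000-0000-0000-000000000000",
}

# Filter by property length: answers longer than 20 characters.
by_length = {
    "path": ["len(answer)"],
    "operator": "GreaterThan",
    "valueInt": 20,
}

# Filter by creation timestamp; the value key used here is an assumption.
by_creation_time = {
    "path": ["_creationTimeUnix"],
    "operator": "GreaterThan",
    "valueDate": "2022-01-01T00:00:00Z",
}

# Filter by null state: objects where points is not set.
by_null_state = {
    "path": ["points"],
    "operator": "IsNull",
    "valueBoolean": True,
}

# Several metadata conditions can be combined like any other conditions.
where_filter = {"operator": "And", "operands": [by_length, by_null_state]}
```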

Improving filter performance

In some edge cases, filter performance may be slow due to a mismatch between the filter architecture and the data structure. For example, if a property has very large cardinality (i.e. a large number of unique values), its range-based filter performance may be slow.

If you are experiencing slow filter performance, we suggest further restricting your query by adding more conditions to the where operator, or adding a limit parameter to your query.

We are working on improving the performance of these filters in a future release. Please upvote this feature if this is important to you, so we can prioritize it accordingly.