Skip to main content

Hybrid search

Hybrid search combines results of a vector search and a keyword (BM25F) search. You can set the weights or the ranking method.

Named vectors

Added in v1.24

A hybrid on collections with named vectors configured must include a target vector name in the query. This allows Weaviate to find the correct vector to compare with the query vector.

reviews = client.collections.get("WineReviewNV")
response = reviews.query.hybrid(
query="A French Riesling",
target_vector="title_country",
limit=3
)

for o in response.objects:
print(o.properties)
Example response

The output is like this:


Combines results of a vector search and a keyword search based on the query string.

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
query="food",
limit=3
)

for o in response.objects:
print(o.properties)
Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "Famine",
"question": "From the Latin for \"hunger\", it's a period when food is extremely scarce"
},
{
"answer": "Tofu",
"question": "A popular health food, this soybean curd is used to make a variety of dishes & an ice cream substitute"
}
]
}
}
}

Explain the search results

Use the metadata properties to understand why an object is selected.

from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
query="food",
alpha=0.5,
return_metadata=MetadataQuery(score=True, explain_score=True),
limit=3
)

for o in response.objects:
print(o.properties)
print(o.metadata.score, o.metadata.explain_score)
Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"_additional": {
"explainScore": "(bm25)\n(hybrid) Document df958a90-c3ad-5fde-9122-cd777c22da6c contributed 0.003968253968253968 to the score\n(hybrid) Document df958a90-c3ad-5fde-9122-cd777c22da6c contributed 0.012295081967213115 to the score",
"score": "0.016263336"
},
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"_additional": {
"explainScore": "(vector) [0.0223698 -0.02752683 -0.0061537363 0.0023812135 -0.00036100898 -0.0078375945 -0.018505432 -0.037500713 -0.0042215516 -0.012620432]... \n(hybrid) Document ec776112-e651-519d-afd1-b48e6237bbcb contributed 0.012096774193548387 to the score",
"score": "0.012096774"
},
"answer": "Famine",
"question": "From the Latin for \"hunger\", it's a period when food is extremely scarce"
},
{
"_additional": {
"explainScore": "(vector) [0.0223698 -0.02752683 -0.0061537363 0.0023812135 -0.00036100898 -0.0078375945 -0.018505432 -0.037500713 -0.0042215516 -0.012620432]... \n(hybrid) Document 98807640-cd16-507d-86a1-801902d784de contributed 0.011904761904761904 to the score",
"score": "0.011904762"
},
"answer": "Tofu",
"question": "A popular health food, this soybean curd is used to make a variety of dishes & an ice cream substitute"
}
]
}
}
}

Use the alpha argument to change how much each search affects the results.

  • An alpha of 1 is a pure vector search.
  • An alpha of 0 is a pure keyword search.
client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
query="food",
alpha=0.25,
limit=3
)

for o in response.objects:
print(o.properties)
Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "food stores (supermarkets)",
"question": "This type of retail store sells more shampoo & makeup than any other"
},
{
"answer": "cake",
"question": "Devil's food & angel food are types of this dessert"
}
]
}
}
}

Change the ranking method

Added in v1.20

Ranked Fusion is the default fusion algorithm.

  • To use objects' keyword and vector search scores instead of ranks, use Relative Score Fusion.
  • To use autocut with the hybrid operator, use Relative Score Fusion.
from weaviate.classes.query import HybridFusion

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
query="food",
fusion_type=HybridFusion.RELATIVE_SCORE,
limit=3
)

for o in response.objects:
print(o.properties)
Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "food stores (supermarkets)",
"question": "This type of retail store sells more shampoo & makeup than any other"
},
{
"answer": "cake",
"question": "Devil's food & angel food are types of this dessert"
}
]
}
}
}
Additional information

For a discussion of fusion methods, see this blog post and this reference page

Added in v1.19.0

The keyword search portion of hybrid search can be directed to only search a subset of object properties. This does not affect the vector search portion.

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
query="food",
query_properties=["question"],
alpha=0.25,
limit=3
)

for o in response.objects:
print(o.properties)
Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "cake",
"question": "Devil's food & angel food are types of this dessert"
},
{
"answer": "honey",
"question": "The primary source of this food is the Apis mellifera"
}
]
}
}
}

Set weights on property values

Specify the relative value of an object's properties in the keyword search. Higher values increase the property's contribution to the search score.

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
query="food",
query_properties=["question^2", "answer"],
alpha=0.25,
limit=3
)

for o in response.objects:
print(o.properties)
Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "cake",
"question": "Devil's food & angel food are types of this dessert"
},
{
"answer": "food stores (supermarkets)",
"question": "This type of retail store sells more shampoo & makeup than any other"
}
]
}
}
}

Specify a vector

To specify a vector instead of using a vector of the query string, pass it in addition to the query string for the keyword search.

query_vector = [-0.02] * 1536  # Some vector that is compatible with object vectors

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
query="food",
query_properties=["question^2", "answer"],
vector=query_vector,
limit=3
)

for o in response.objects:
print(o.properties)