Hybrid search

Hybrid search combines the results of a vector search and a keyword (BM25F) search by fusing the two result sets.

The fusion method and the relative weights are configurable.

A basic hybrid search takes a single query string. Weaviate uses that string for both the vector search and the keyword search, then combines the results.

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(query="food", limit=3)

for o in response.objects:
    print(o.properties)

Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "Famine",
"question": "From the Latin for \"hunger\", it's a period when food is extremely scarce"
},
{
"answer": "Tofu",
"question": "A popular health food, this soybean curd is used to make a variety of dishes & an ice cream substitute"
}
]
}
}
}

Named vectors

Added in v1.24

A hybrid search on a collection that has named vectors must specify a target vector. Weaviate uses the query vector to search the target vector space.

reviews = client.collections.get("WineReviewNV")
response = reviews.query.hybrid(
    query="A French Riesling",
    target_vector="title_country",
    limit=3,
)

for o in response.objects:
    print(o.properties)

Example response

The output is like this:


Explain the search results

To see how each result was ranked, request the score and explain score fields in your query. Weaviate returns both as part of the object metadata and uses the score to order the search results.

from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
    query="food",
    alpha=0.5,
    return_metadata=MetadataQuery(score=True, explain_score=True),
    limit=3,
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.score, o.metadata.explain_score)

Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"_additional": {
"explainScore": "(bm25)\n(hybrid) Document df958a90-c3ad-5fde-9122-cd777c22da6c contributed 0.003968253968253968 to the score\n(hybrid) Document df958a90-c3ad-5fde-9122-cd777c22da6c contributed 0.012295081967213115 to the score",
"score": "0.016263336"
},
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"_additional": {
"explainScore": "(vector) [0.0223698 -0.02752683 -0.0061537363 0.0023812135 -0.00036100898 -0.0078375945 -0.018505432 -0.037500713 -0.0042215516 -0.012620432]... \n(hybrid) Document ec776112-e651-519d-afd1-b48e6237bbcb contributed 0.012096774193548387 to the score",
"score": "0.012096774"
},
"answer": "Famine",
"question": "From the Latin for \"hunger\", it's a period when food is extremely scarce"
},
{
"_additional": {
"explainScore": "(vector) [0.0223698 -0.02752683 -0.0061537363 0.0023812135 -0.00036100898 -0.0078375945 -0.018505432 -0.037500713 -0.0042215516 -0.012620432]... \n(hybrid) Document 98807640-cd16-507d-86a1-801902d784de contributed 0.011904761904761904 to the score",
"score": "0.011904762"
},
"answer": "Tofu",
"question": "A popular health food, this soybean curd is used to make a variety of dishes & an ice cream substitute"
}
]
}
}
}
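
As a rough illustration of how the explain score relates to the score, the sketch below parses the "contributed ... to the score" fragments from the first sample result above and adds them up. The regular expression and the helper name sum_contributions are assumptions made for this example. The explain score text is meant for people to read, and its format may differ between Weaviate versions.

```python
import re

# Assumed pattern, based only on the sample output shown above.
CONTRIBUTION = re.compile(r"contributed ([0-9.eE+-]+) to the score")


def sum_contributions(explain_score: str) -> float:
    """Add up every per-search contribution listed in an explain score string."""
    return sum(float(value) for value in CONTRIBUTION.findall(explain_score))


# The explainScore string of the first object in the sample output.
explain = (
    "(bm25)\n(hybrid) Document df958a90-c3ad-5fde-9122-cd777c22da6c "
    "contributed 0.003968253968253968 to the score\n"
    "(hybrid) Document df958a90-c3ad-5fde-9122-cd777c22da6c "
    "contributed 0.012295081967213115 to the score"
)

total = sum_contributions(explain)
print(round(total, 9))  # compare with the "score" value reported for that object
```

For this object the two contributions add up to the reported score of 0.016263336, which suggests the score is the sum of the keyword and vector contributions.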

Hybrid search results can favor the keyword component or the vector component. To change the relative weights of the keyword and vector components, set the alpha value in your query.

  • An alpha of 1 is a pure vector search.
  • An alpha of 0 is a pure keyword search.

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
    query="food",
    alpha=0.25,
    limit=3,
)

for o in response.objects:
    print(o.properties)

Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "food stores (supermarkets)",
"question": "This type of retail store sells more shampoo & makeup than any other"
},
{
"answer": "cake",
"question": "Devil's food & angel food are types of this dessert"
}
]
}
}
}
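
The effect of alpha can be sketched without a running Weaviate instance. The example below fuses two ranked lists of object IDs, giving the vector list a weight of alpha and the keyword list a weight of 1 - alpha. The rank-based formula weight / (k + rank) with k = 60, the function name ranked_fusion, and the sample IDs are all assumptions chosen for illustration. This is not a description of Weaviate's internal implementation.

```python
def ranked_fusion(keyword_ids, vector_ids, alpha, k=60):
    """Fuse two ranked ID lists into one list, best match first.

    Each hit adds weight / (k + rank) to its object's score, with rank
    starting at 1. Both the formula and k=60 are assumptions for this sketch.
    """
    scores = {}
    for weight, ranking in ((1 - alpha, keyword_ids), (alpha, vector_ids)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


keyword_hits = ["grocer", "cake", "honey"]   # hypothetical keyword ranking
vector_hits = ["famine", "tofu", "grocer"]   # hypothetical vector ranking

print(ranked_fusion(keyword_hits, vector_hits, alpha=0.0)[:3])   # keyword only
print(ranked_fusion(keyword_hits, vector_hits, alpha=1.0)[:3])   # vector only
print(ranked_fusion(keyword_hits, vector_hits, alpha=0.25)[:3])  # favors keyword
```

With alpha set to 0 the top results follow the keyword ranking, and with alpha set to 1 they follow the vector ranking. Values in between blend the two lists.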

Change the fusion method

Relative Score Fusion is the default fusion method starting in v1.24.

  • To use the keyword and vector search relative scores instead of the search rankings, use Relative Score Fusion.
  • To use autocut with the hybrid operator, use Relative Score Fusion.

from weaviate.classes.query import HybridFusion

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
    query="food",
    fusion_type=HybridFusion.RELATIVE_SCORE,
    limit=3,
)

for o in response.objects:
    print(o.properties)

Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "food stores (supermarkets)",
"question": "This type of retail store sells more shampoo & makeup than any other"
},
{
"answer": "cake",
"question": "Devil's food & angel food are types of this dessert"
}
]
}
}
}
Additional information

For a discussion of fusion methods, see this blog post and this reference page.
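
To make the idea of fusing relative scores rather than rankings more concrete, here is a minimal sketch. It assumes that each search's raw scores are min-max scaled to the range 0 to 1 and then combined with the alpha weighting. The scaling choice, the function names, and the sample scores are assumptions for illustration and may not match what Weaviate does internally.

```python
def normalize(scores):
    """Min-max scale a {doc_id: raw_score} mapping into the range 0..1."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal, so treat every hit as a top hit.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def relative_score_fusion(keyword_scores, vector_scores, alpha):
    """Combine normalized scores: alpha weights the vector side."""
    kw, vec = normalize(keyword_scores), normalize(vector_scores)
    return {
        doc_id: (1 - alpha) * kw.get(doc_id, 0.0) + alpha * vec.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }


keyword_scores = {"a": 4.0, "b": 2.0, "c": 1.0}  # hypothetical BM25 scores
vector_scores = {"a": 0.8, "c": 0.9, "d": 0.7}   # hypothetical similarity scores

fused = relative_score_fusion(keyword_scores, vector_scores, alpha=0.5)
for doc_id in sorted(fused, key=fused.get, reverse=True):
    print(doc_id, round(fused[doc_id], 3))
```

Unlike a rank-based fusion, this approach keeps information about how far apart the scores are, so a hit that scores far ahead of the rest in one search keeps that advantage after fusion.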

Specify keyword search properties

Added in v1.19.0

The keyword search portion of hybrid search can be restricted to a subset of object properties. This does not affect the vector search portion.

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
    query="food",
    query_properties=["question"],
    alpha=0.25,
    limit=3,
)

for o in response.objects:
    print(o.properties)

Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "cake",
"question": "Devil's food & angel food are types of this dessert"
},
{
"answer": "honey",
"question": "The primary source of this food is the Apis mellifera"
}
]
}
}
}

Set weights on property values

Specify the relative weight of each property in the keyword search. To weight a property, append ^ and a number to its name, as in question^2. Higher values increase the property's contribution to the search score.

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
    query="food",
    query_properties=["question^2", "answer"],
    alpha=0.25,
    limit=3,
)

for o in response.objects:
    print(o.properties)

Example response

The output is like this:

{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "a closer grocer",
"question": "A nearer food merchant"
},
{
"answer": "cake",
"question": "Devil's food & angel food are types of this dessert"
},
{
"answer": "food stores (supermarkets)",
"question": "This type of retail store sells more shampoo & makeup than any other"
}
]
}
}
}
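
The property weight syntax can be read as a property name followed by an optional ^ and a multiplier. The sketch below parses that syntax and applies the multipliers to per-property keyword scores. The helper names and the simple multiply-and-add scoring are assumptions for illustration. The actual BM25F weighting happens inside Weaviate and is more involved.

```python
def parse_property(spec):
    """Split 'name^boost' into (name, boost); a bare name gets a boost of 1.0."""
    name, sep, boost = spec.partition("^")
    return name, float(boost) if sep else 1.0


def weighted_keyword_score(per_property_scores, query_properties):
    """Sum per-property scores, each multiplied by its boost (a simplification)."""
    total = 0.0
    for spec in query_properties:
        name, boost = parse_property(spec)
        total += boost * per_property_scores.get(name, 0.0)
    return total


# Hypothetical per-property keyword scores for one object.
scores = {"question": 1.5, "answer": 0.5}

print(parse_property("question^2"))
print(weighted_keyword_score(scores, ["question^2", "answer"]))
print(weighted_keyword_score(scores, ["question", "answer"]))
```

Under this simplified model, a match in a weighted property counts for more than the same match in an unweighted one, which is why the result order can change when weights are added.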

Specify a search vector

The vector component of hybrid search can use a query string or a query vector. To specify a query vector instead of a query string, provide a query vector (for the vector search) and a query string (for the keyword search) in your query.

query_vector = [-0.02] * 1536  # Some vector that is compatible with object vectors

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
    query="food",
    vector=query_vector,
    alpha=0.25,
    limit=3,
)

for o in response.objects:
    print(o.properties)
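
Because the query vector has to be compatible with the stored object vectors, it can help to check it before sending the query. The helper below is a hypothetical client-side check, not part of the Weaviate client. The expected dimension of 1536 simply mirrors the example above and depends on the vectorizer your collection uses.

```python
def check_query_vector(vector, expected_dim):
    """Return the vector as a list of floats, or raise if it looks incompatible."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"query vector has {len(vector)} dimensions, expected {expected_dim}"
        )
    if not all(isinstance(x, (int, float)) for x in vector):
        raise TypeError("query vector must contain only numbers")
    return [float(x) for x in vector]


query_vector = check_query_vector([-0.02] * 1536, expected_dim=1536)
print(len(query_vector))
```

The checked list can then be passed as the vector argument in the hybrid query shown above.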