Search operators
This page covers the search operators that can be used in queries: vector search operators (nearText, nearVector, nearObject, etc.), the keyword search operator (bm25), and the hybrid search operator (hybrid).
A query can include only one search operator at the collection level.
Operator availability
Built-in operators
These operators are available in all Weaviate instances regardless of configuration.
Module-specific operators
Module-specific search operators are made available in certain Weaviate modules.
By adding relevant modules, you can use the following operators:
Vector search operators
The nearXXX operators allow you to find data objects based on their vector similarity to the query. The query can be a raw vector (nearVector) or an object UUID (nearObject).
If the appropriate vectorizer model is enabled, a text query (nearText), an image (nearImage), or another media input may be used as the query.
All vector search operators can be used with a certainty or distance threshold to specify the desired similarity or distance between the query and the results, as well as a limit operator or an autocut operator to restrict the number of results.
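The effect of a distance threshold combined with a limit can be illustrated with a brute-force search over a handful of vectors. This is a minimal sketch in plain Python, not how Weaviate executes queries internally; the function names, the toy vectors, and the choice of cosine distance are assumptions made for illustration, and autocut is not modelled.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0 for the same direction, 2 for opposite directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_search(query, objects, max_distance=None, limit=None):
    """Rank objects by distance to the query, then apply threshold and limit."""
    scored = sorted((cosine_distance(query, vec), name) for name, vec in objects.items())
    if max_distance is not None:
        # Keep only results within the distance threshold
        scored = [(d, n) for d, n in scored if d <= max_distance]
    if limit is not None:
        # Keep at most `limit` of the closest results
        scored = scored[:limit]
    return [(name, dist) for dist, name in scored]


# Hypothetical toy data: three 2-dimensional vectors
objects = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [-1.0, 0.0],
}
results = brute_force_search([1.0, 0.0], objects, max_distance=1.5, limit=2)
print(results)
```

The threshold removes results that are too far away regardless of how many remain, while the limit caps the count regardless of how close the results are.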
nearVector
The nearVector operator finds data objects closest to an input vector.
Variables
Variable | Required | Type | Description |
---|---|---|---|
vector | yes | [float] | This variable takes a vector embedding in the form of an array of floats. The array should have the same length as the vectors in this collection. |
distance | no | float | The maximum allowed distance to the provided search input. Cannot be used together with the certainty variable. The interpretation of the value of the distance field depends on the distance metric used. |
certainty | no | float | Normalized Distance between the result item and the search vector. Normalized to be between 0 (perfect opposite) and 1 (identical vectors). Can't be used together with the distance variable. |
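The relationship between the two thresholds can be sketched as a pair of conversion helpers. This assumes the cosine distance metric, where distance ranges from 0 to 2, and a linear mapping onto the 0 to 1 certainty range; that mapping is consistent with the distance 0.6 and certainty 0.7 pairs used in the examples on this page, but treat it as an assumption rather than a specification. The helper names are hypothetical.

```python
def distance_to_certainty(distance):
    """Map a cosine distance in [0, 2] onto a certainty in [0, 1] (assumed linear mapping)."""
    return 1.0 - distance / 2.0


def certainty_to_distance(certainty):
    """Inverse of distance_to_certainty."""
    return 2.0 * (1.0 - certainty)


for d in (0.0, 0.6, 2.0):
    print(d, distance_to_certainty(d))
```

For other distance metrics there is no such normalization, which is why certainty is only supported with cosine distance while distance is always supported.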
Example
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
query_vector = [0.1, -0.15, 0.3]  # Replace with a compatible vector
try:
    collection = client.collections.get("Article")
    response = collection.query.near_vector(
        near_vector=query_vector,
        distance=0.7,
        limit=5,
    )
    for o in response.objects:
        print(o.properties)
finally:
    client.close()
import weaviate
client = weaviate.Client("http://localhost:8080")
nearVector = {
    "vector": [0.1, -0.15, 0.3]  # Replace with a compatible vector
}
result = (
client.query
.get("Publication", "name")
.with_additional("distance")
.with_near_vector(nearVector)
.do()
)
print(result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
});
const response = await client.graphql
.get()
.withClassName('Publication')
.withFields('name _additional {certainty}')
.withNearVector({
vector: [0.1, -0.15, 0.3], // Replace with a compatible vector
})
.do();
console.log(response);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Publication"
name := graphql.Field{Name: "name"}
_additional := graphql.Field{
Name: "_additional", Fields: []graphql.Field{
{Name: "certainty"}, // only supported if distance==cosine
{Name: "distance"}, // always supported
},
}
nearVector := client.GraphQL().NearVectorArgBuilder().
WithVector([]float32{0.1, -0.15, 0.3}) // Replace with a compatible vector
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(name, _additional).
WithNearVector(nearVector).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearVectorArgument;
import io.weaviate.client.v1.graphql.query.fields.Field;
public class App {
public static void main(String[] args) {
Config config = new Config("http", "localhost:8080");
WeaviateClient client = new WeaviateClient(config);
String className = "Publication";
Field name = Field.builder().name("name").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder().name("certainty").build(), // only supported if distance==cosine
Field.builder().name("distance").build() // always supported
}).build();
Float[] vector = new Float[]{0.1f, -0.15f, 0.3f}; // Replace with a compatible vector
NearVectorArgument nearVector = NearVectorArgument.builder()
.vector(vector)
.build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName(className)
.withFields(name, _additional)
.withNearVector(nearVector)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
Replace the placeholder vector with a compatible vector.
# Note: under _additional, `certainty` is only supported if distance==cosine, but `distance` is always supported
echo '{
"query": "{
Get {
Publication(
nearVector: {
vector: [0.1, -0.15, 0.3]
}
) {
name
_additional {
certainty
distance
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get{
Publication(
nearVector: {
vector: [0.1, -0.15, 0.3] # Replace with a compatible vector
}
){
name
_additional {
certainty
}
}
}
}
nearObject
The nearObject operator finds data objects closest to an existing object in the same collection. The object is typically specified by its UUID.
- Note: You can specify an object's id or beacon in the argument, along with a desired certainty.
- Note that the first result will always be the object used for search.
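Because the object used for the search is always returned as the first result, callers that only want its neighbours may need to filter it out. A minimal sketch in plain Python, assuming each result carries its ID; the result shape and the IDs below are hypothetical.

```python
def drop_query_object(results, query_id):
    """Return the results without the object that was used as the query."""
    return [obj for obj in results if obj["id"] != query_id]


# Hypothetical result list ordered by distance; the IDs are made up.
query_id = "id-0"
results = [
    {"id": "id-0", "distance": 0.0},
    {"id": "id-1", "distance": 0.06},
    {"id": "id-2", "distance": 0.09},
]
neighbours = drop_query_object(results, query_id)
print(neighbours)
```

If a fixed number of neighbours is needed, one option is to request one extra result and then drop the query object.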
Variables
Variable | Required | Type | Description |
---|---|---|---|
id | yes | UUID | Data object identifier in the uuid format. |
beacon | no | url | Data object identifier in the beacon URL format. E.g., weaviate://<hostname>/<kind>/id . |
distance | no | float | The maximum allowed distance to the provided search input. Cannot be used together with the certainty variable. The interpretation of the value of the distance field depends on the distance metric used. |
certainty | no | float | Normalized Distance between the result item and the search vector. Normalized to be between 0 (perfect opposite) and 1 (identical vectors). Can't be used together with the distance variable. |
Example
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
object_id = "c4209549-7981-3699-9648-61a78c2124b9"  # Replace with the UUID of an existing object
try:
    collection = client.collections.get("Article")
    response = collection.query.near_object(
        near_object=object_id,
        distance=0.6,
        limit=5,
    )
    for o in response.objects:
        print(o.properties)
finally:
    client.close()
import weaviate
client = weaviate.Client("http://localhost:8080")
nearObject = {
"id": "32d5a368-ace8-3bb7-ade7-9f7ff03eddb6", # or {"beacon": "weaviate://localhost/32d5a368-ace8-3bb7-ade7-9f7ff03eddb6"}
"distance": 0.6,
}
result = (
client.query
.get("Publication", "name")
.with_additional("distance") # "certainty" only supported if distance==cosine
.with_near_object(nearObject)
.with_limit(5)
.do()
)
print(result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
});
const response = await client.graphql
.get()
.withClassName('Publication')
.withFields('name _additional {certainty distance}') // certainty only supported if distance==cosine
.withNearObject({
id: '32d5a368-ace8-3bb7-ade7-9f7ff03eddb6',
distance: 0.6,
})
.withLimit(5)
.do();
console.log(response);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Publication"
fields := []graphql.Field{
{Name: "name"},
{Name: "_additional", Fields: []graphql.Field{
{Name: "certainty"}, // certainty only supported if distance==cosine
{Name: "distance"}, // distance always supported
}},
}
nearObject := client.GraphQL().NearObjectArgBuilder().WithID("32d5a368-ace8-3bb7-ade7-9f7ff03eddb6")
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(fields...).
WithNearObject(nearObject).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearObjectArgument;
import io.weaviate.client.v1.graphql.query.fields.Field;
public class App {
public static void main(String[] args) {
Config config = new Config("http", "localhost:8080");
WeaviateClient client = new WeaviateClient(config);
String className = "Publication";
Field name = Field.builder().name("name").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder().name("certainty").build(), // only supported if distance==cosine
Field.builder().name("distance").build() // always supported
}).build();
NearObjectArgument nearObject = client.graphQL().arguments().nearObjectArgBuilder()
.id("32d5a368-ace8-3bb7-ade7-9f7ff03eddb6")
.build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName(className)
.withFields(name, _additional)
.withNearObject(nearObject)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
# Note: prior to v1.14, use `certainty` instead of `distance`
# Under _additional, `certainty` is only supported if distance==cosine, but `distance` is always supported
echo '{
"query": "{
Get {
Publication(
nearObject: {
id: \"32d5a368-ace8-3bb7-ade7-9f7ff03eddb6\",
distance: 0.6
}
) {
name
_additional {
certainty
distance
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get{
Publication(
nearObject: {
id: "32d5a368-ace8-3bb7-ade7-9f7ff03eddb6", # or weaviate://localhost/32d5a368-ace8-3bb7-ade7-9f7ff03eddb6
distance: 0.6 # prior to v1.14, use certainty: 0.7
}
) {
name
_additional {
certainty # only works if distance==cosine
distance # always works
}
}
}
}
Expected response
{
"data": {
"Get": {
"Publication": [
{
"_additional": {
"distance": -1.1920929e-07
},
"name": "The New York Times Company"
},
{
"_additional": {
"distance": 0.059879005
},
"name": "New York Times"
},
{
"_additional": {
"distance": 0.09176409
},
"name": "International New York Times"
},
{
"_additional": {
"distance": 0.13954824
},
"name": "New Yorker"
},
...
]
}
}
}
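A response of this shape can be flattened into name and distance pairs. The sketch below uses the first entries of the expected response above; the helper name is hypothetical. Note that the first distance is a tiny negative number, which is floating-point noise for a distance of zero, so the sketch clamps it.

```python
response = {
    "data": {
        "Get": {
            "Publication": [
                {"_additional": {"distance": -1.1920929e-07}, "name": "The New York Times Company"},
                {"_additional": {"distance": 0.059879005}, "name": "New York Times"},
                {"_additional": {"distance": 0.09176409}, "name": "International New York Times"},
                {"_additional": {"distance": 0.13954824}, "name": "New Yorker"},
            ]
        }
    }
}


def summarize(response, collection):
    """Return (name, distance) pairs, clamping rounding noise below zero to 0.0."""
    rows = response["data"]["Get"][collection]
    return [(row["name"], max(0.0, row["_additional"]["distance"])) for row in rows]


for name, distance in summarize(response, "Publication"):
    print(f"{distance:.4f}  {name}")
```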
nearText
The nearText operator finds data objects based on their vector similarity to a natural language query.
This operator is enabled if a compatible vectorizer module is configured for the collection. Compatible vectorizer modules are:
- Any text2vec module
- Any multi2vec module
Variables
Variable | Required | Type | Description |
---|---|---|---|
concepts | yes | [string] | An array of strings that can be natural language queries, or single words. If multiple strings are used, a centroid is calculated and used. Learn more about how the concepts are parsed here. |
distance | no | float | The maximum allowed distance to the provided search input. Cannot be used together with the certainty variable. The interpretation of the value of the distance field depends on the distance metric used. |
certainty | no | float | Normalized Distance between the result item and the search vector. Normalized to be between 0 (perfect opposite) and 1 (identical vectors). Can't be used together with the distance variable. |
autocorrect | no | boolean | Autocorrect input text values. Requires the text-spellcheck module to be present & enabled. |
moveTo | no | object{} | Move your search term closer to another vector described by keywords |
moveTo{concepts} | no | [string] | An array of strings - natural language queries or single words. If multiple strings are used, a centroid is calculated and used. |
moveTo{objects} | no | [UUID] | Object IDs to move the results to. This is used to "bias" NLP search results into a certain direction in vector space. |
moveTo{force} | no | float | The force to apply to a particular movement. Must be between 0 and 1 where 0 is equivalent to no movement and 1 is equivalent to largest movement possible. |
moveAwayFrom | no | object{} | Move your search term away from another vector described by keywords |
moveAwayFrom{concepts} | no | [string] | An array of strings - natural language queries or single words. If multiple strings are used, a centroid is calculated and used. |
moveAwayFrom{objects} | no | [UUID] | Object IDs to move the results from. This is used to "bias" NLP search results into a certain direction in vector space. |
moveAwayFrom{force} | no | float | The force to apply to a particular movement. Must be between 0 and 1 where 0 is equivalent to no movement and 1 is equivalent to largest movement possible. |
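The idea behind moveTo and moveAwayFrom can be pictured as shifting the query vector along the line between itself and a target vector. The sketch below uses simple linear interpolation as an illustration only; it is an assumption, not the formula Weaviate applies, and the function names and toy vectors are hypothetical. It does match the documented end points: a force of 0 leaves the query unchanged.

```python
def move_to(query, target, force):
    """Shift the query towards the target; force=0 is no movement (illustrative formula)."""
    return [q + force * (t - q) for q, t in zip(query, target)]


def move_away_from(query, target, force):
    """Shift the query in the opposite direction of the target (illustrative formula)."""
    return [q - force * (t - q) for q, t in zip(query, target)]


# Hypothetical 2-dimensional vectors
query = [0.0, 0.0]
target = [1.0, 1.0]
print(move_to(query, target, 0.5))
print(move_away_from(query, target, 0.5))
```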
Example I
This example shows how to use the nearText operator, including how to bias results towards another search query.
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
from weaviate.classes.query import Move
import os
client = weaviate.connect_to_local()
try:
    publications = client.collections.get("Publication")
    response = publications.query.near_text(
        query="fashion",
        distance=0.6,
        move_to=Move(force=0.85, concepts="haute couture"),
        move_away=Move(force=0.45, concepts="finance"),
        return_metadata=wvc.query.MetadataQuery(distance=True),
        limit=2
    )
    for o in response.objects:
        print(o.properties)
        print(o.metadata)
finally:
    client.close()
import weaviate
client = weaviate.Client("http://localhost:8080")
nearText = {
"concepts": ["fashion"],
"distance": 0.6, # prior to v1.14 use "certainty" instead of "distance"
"moveAwayFrom": {
"concepts": ["finance"],
"force": 0.45
},
"moveTo": {
"concepts": ["haute couture"],
"force": 0.85
}
}
result = (
client.query
.get("Publication", "name")
.with_additional(["distance"])  # "certainty" can be requested instead, but only if distance==cosine
.with_near_text(nearText)
.do()
)
print(result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
});
const response = await client.graphql
.get()
.withClassName('Publication')
.withFields('name _additional{certainty distance}') // note that certainty is only supported if distance==cosine
.withNearText({
concepts: ['fashion'],
distance: 0.6, // prior to v1.14 use certainty instead of distance
moveAwayFrom: {
concepts: ['finance'],
force: 0.45,
},
moveTo: {
concepts: ['haute couture'],
force: 0.85,
},
})
.do();
console.log(response);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Publication"
name := graphql.Field{Name: "name"}
_additional := graphql.Field{
Name: "_additional", Fields: []graphql.Field{
{Name: "certainty"}, // only supported if distance==cosine
{Name: "distance"}, // always supported
},
}
concepts := []string{"fashion"}
distance := float32(0.6)
moveAwayFrom := &graphql.MoveParameters{
Concepts: []string{"finance"},
Force: 0.45,
}
moveTo := &graphql.MoveParameters{
Concepts: []string{"haute couture"},
Force: 0.85,
}
nearText := client.GraphQL().NearTextArgBuilder().
WithConcepts(concepts).
WithDistance(distance). // use WithCertainty(certainty) prior to v1.14
WithMoveTo(moveTo).
WithMoveAwayFrom(moveAwayFrom)
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(name, _additional).
WithNearText(nearText).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.argument.NearTextMoveParameters;
import io.weaviate.client.v1.graphql.query.fields.Field;
public class App {
public static void main(String[] args) {
Config config = new Config("http", "localhost:8080");
WeaviateClient client = new WeaviateClient(config);
NearTextMoveParameters moveTo = NearTextMoveParameters.builder()
.concepts(new String[]{ "haute couture" }).force(0.85f).build();
NearTextMoveParameters moveAway = NearTextMoveParameters.builder()
.concepts(new String[]{ "finance" }).force(0.45f)
.build();
NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
.concepts(new String[]{ "fashion" })
.distance(0.6f) // use .certainty(0.7f) prior to v1.14
.moveTo(moveTo)
.moveAwayFrom(moveAway)
.build();
Field name = Field.builder().name("name").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder().name("certainty").build(), // only supported if distance==cosine
Field.builder().name("distance").build(), // always supported
}).build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Publication")
.withFields(name, _additional)
.withNearText(nearText)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
# Note: Under nearText, use `certainty` instead of distance prior to v1.14
# Under _additional, `certainty` is only supported if distance==cosine, but `distance` is always supported
echo '{
"query": "{
Get {
Publication(
nearText: {
concepts: [\"fashion\"],
distance: 0.6,
moveAwayFrom: {
concepts: [\"finance\"],
force: 0.45
},
moveTo: {
concepts: [\"haute couture\"],
force: 0.85
}
}
) {
name
_additional {
certainty
distance
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get{
Publication(
nearText: {
concepts: ["fashion"],
distance: 0.6 # prior to v1.14 use "certainty" instead of "distance"
moveAwayFrom: {
concepts: ["finance"],
force: 0.45
},
moveTo: {
concepts: ["haute couture"],
force: 0.85
}
}
){
name
_additional {
certainty # only supported if distance==cosine.
distance # always supported
}
}
}
}
Example II
You can also bias results toward other data objects. For example, this query moves a search for "travelling in Asia" towards an article on food.
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
try:
    collection = client.collections.get("Article")
    response = collection.query.near_text(
        query="travelling in Asia",
        certainty=0.7,
        move_to=wvc.query.Move(
            force=0.75,
            objects="c4209549-7981-3699-9648-61a78c2124b9"
        ),
        return_metadata=wvc.query.MetadataQuery(certainty=True),
        limit=5,
    )
    for o in response.objects:
        print(o.properties)
        print(o.metadata.certainty)
finally:
    client.close()
import weaviate
client = weaviate.Client("http://localhost:8080")
nearText = {
"concepts": ["travelling in Asia"],
"certainty": 0.7,
"moveTo": {
"objects": [{"id": "c4209549-7981-3699-9648-61a78c2124b9"}],
"force": 0.85
}
}
result = (
client.query
.get("Article", ["title", "summary", "_additional { certainty }"])
.with_near_text(nearText)
.do()
)
print(result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
});
const response = await client.graphql
.get()
.withClassName('Article')
.withFields('title summary _additional { certainty }')
.withNearText({
concepts: ['travelling in Asia'],
certainty: 0.7,
moveTo: {
// this ID is of the article: "Tohoku: A Japan destination for all seasons."
objects: [{ id: 'c4209549-7981-3699-9648-61a78c2124b9' }],
force: 0.85,
},
})
.do();
console.log(response);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Article"
title := graphql.Field{Name: "title"}
summary := graphql.Field{Name: "summary"}
_additional := graphql.Field{
Name: "_additional", Fields: []graphql.Field{
{Name: "certainty"},
},
}
concepts := []string{"travelling in Asia"}
certainty := float32(0.7)
moveTo := &graphql.MoveParameters{
Objects: []graphql.MoverObject{
// this ID is of the article: "Tohoku: A Japan destination for all seasons."
{ID: "c4209549-7981-3699-9648-61a78c2124b9"},
},
Force: 0.85,
}
nearText := client.GraphQL().NearTextArgBuilder().
WithConcepts(concepts).
WithCertainty(certainty).
WithMoveTo(moveTo)
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(title, summary, _additional).
WithNearText(nearText).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.argument.NearTextMoveParameters;
import io.weaviate.client.v1.graphql.query.fields.Field;
public class App {
public static void main(String[] args) {
Config config = new Config("http", "localhost:8080");
WeaviateClient client = new WeaviateClient(config);
NearTextMoveParameters moveTo = NearTextMoveParameters.builder()
.objects(new NearTextMoveParameters.ObjectMove[]{
// this ID is of the article: "Tohoku: A Japan destination for all seasons."
NearTextMoveParameters.ObjectMove.builder().id("c4209549-7981-3699-9648-61a78c2124b9").build()
})
.force(0.85f)
.build();
NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
.concepts(new String[]{ "travelling in Asia" })
.certainty(0.7f)
.moveTo(moveTo)
.build();
Field title = Field.builder().name("title").build();
Field summary = Field.builder().name("summary").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder().name("certainty").build(),
}).build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Article")
.withFields(title, summary, _additional)
.withNearText(nearText)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
# The ID belongs to the article "Tohoku: A Japan destination for all seasons."
echo '{
"query": "{
Get {
Article(
nearText: {
concepts: [\"travelling in Asia\"],
certainty: 0.7,
moveTo: {
objects: [{
id: \"c4209549-7981-3699-9648-61a78c2124b9\"
}]
force: 0.85
}
}
) {
title
summary
_additional {
certainty
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get {
Article(
nearText: {
concepts: ["travelling in Asia"],
certainty: 0.7,
moveTo: {
objects: [{
# this ID is of the article:
# "Tohoku: A Japan destination for all seasons."
id: "c4209549-7981-3699-9648-61a78c2124b9"
}]
force: 0.85
}
}
) {
title
summary
_additional {
certainty
}
}
}
}
Expected response
{
"data": {
"Get": {
"Article": [
{
"_additional": {
"certainty": 0.9619976580142975
},
"summary": "We've scoured the planet for what we think are 50 of the most delicious foods ever created. A Hong Kong best food, best enjoyed before cholesterol checks. When you have a best food as naturally delicious as these little fellas, keep it simple. Courtesy Matt@PEK/Creative Commons/FlickrThis best food Thai masterpiece teems with shrimp, mushrooms, tomatoes, lemongrass, galangal and kaffir lime leaves. It's a result of being born in a land where the world's most delicious food is sold on nearly every street corner.",
"title": "World food: 50 best dishes"
},
{
"_additional": {
"certainty": 0.9297388792037964
},
"summary": "The look reflects the elegant ambiance created by interior designer Joyce Wang in Hong Kong, while their mixology program also reflects the original venue. MONO Hong Kong , 5/F, 18 On Lan Street, Central, Hong KongKoral, The Apurva Kempinski Bali, IndonesiaKoral's signature dish: Tomatoes Bedugul. Esterre at Palace Hotel TokyoLegendary French chef Alain Ducasse has a global portfolio of restaurants, many holding Michelin stars. John Anthony/JW Marriott HanoiCantonese cuisine from Hong Kong is again on the menu, this time at the JW Marriott in Hanoi. Stanley takes its name from the elegant Hong Kong waterside district and the design touches reflect this legacy with Chinese antiques.",
"title": "20 best new Asia-Pacific restaurants to try in 2020"
}
...
]
}
}
}
Additional information
Concept parsing
A nearText query will interpret each term in an array input as a distinct string to be vectorized. If multiple strings are passed, the query vector will be an average vector of the individual string vectors.
- ["New York Times"] = one vector position is determined based on the occurrences of the words.
- ["New", "York", "Times"] = all concepts have a similar weight.
- ["New York", "Times"] = a combination of the two above.
A practical example would be: concepts: ["beatles", "John Lennon"]
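The averaging step can be sketched in a few lines of plain Python. The vectors below are hypothetical stand-ins for the embeddings a vectorizer would produce for each string, and the function name is an assumption.

```python
def centroid(vectors):
    """Component-wise average of equally weighted vectors."""
    count = len(vectors)
    return [sum(components) / count for components in zip(*vectors)]


# Hypothetical embeddings for the concepts "beatles" and "John Lennon"
concept_vectors = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 4.0],
]
query_vector = centroid(concept_vectors)
print(query_vector)
```

Each string contributes equally to the result, which is why splitting a phrase into more array entries changes the weight of its individual words.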
Semantic Path
- Only available in the text2vec-contextionary module
The semantic path returns an array of concepts from the query to the data object. This allows you to see which steps Weaviate took and how the query and data object are interpreted.
Property | Description |
---|---|
concept | the concept that is found in this step. |
distanceToNext | the distance to the next step (null for the last step). |
distanceToPrevious | the distance to the previous step (null for the first step). |
distanceToQuery | the distance of this step to the query. |
distanceToResult | the distance of the step to this result. |
Note: Building a semantic path is only possible if a nearText: {} operator is set, as the explore term represents the beginning of the path and each search result represents the end of the path. Since nearText: {} queries are currently exclusively possible in GraphQL, the semanticPath is therefore not available in the REST API.
Example: showing a semantic path without edges.
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
try:
    # Semantic path is not yet supported by the V4 client. Please use a raw GraphQL query instead.
    response = client.graphql_raw_query(
        """
        {
          Get {
            Publication (
              nearText:{
                concepts: ["fashion"],
                distance: 0.6,
                moveAwayFrom: {
                  concepts: ["finance"],
                  force: 0.45
                },
                moveTo: {
                  concepts: ["haute couture"],
                  force: 0.85
                }
              }
            ) {
              name
              _additional {
                semanticPath {
                  path {
                    concept
                    distanceToNext
                    distanceToPrevious
                    distanceToQuery
                    distanceToResult
                  }
                }
              }
            }
          }
        }
        """
    )
finally:
    client.close()
import weaviate
client = weaviate.Client("http://localhost:8080")
near_text_operator = {
"concepts": ["fashion"],
"distance": 0.6, #prior to v1.14 use certainty: 0.7
"moveAwayFrom": {
"concepts": ["finance"],
"force": 0.45
},
"moveTo": {
"concepts": ["haute couture"],
"force": 0.85
}
}
additional_props = {
    "semanticPath": "path {concept distanceToNext distanceToPrevious distanceToQuery distanceToResult}"
}
query_result = (
client.query
.get("Publication", "name")
.with_additional(additional_props)
.with_near_text(near_text_operator)
.do()
)
print(query_result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
});
const response = await client.graphql
.get()
.withClassName('Publication')
.withFields('name _additional { semanticPath { path { concept distanceToNext distanceToPrevious distanceToQuery distanceToResult } } }')
.withNearText({
concepts: ['fashion'],
distance: 0.6, // prior to v1.14 use certainty: 0.7
moveAwayFrom: {
concepts: ['finance'],
force: 0.45,
},
moveTo: {
concepts: ['haute couture'],
force: 0.85,
},
})
.do();
console.log(response);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Publication"
fields := []graphql.Field{
{Name: "name"},
{Name: "_additional", Fields: []graphql.Field{
{Name: "semanticPath", Fields: []graphql.Field{
{Name: "path", Fields: []graphql.Field{
{Name: "concept"},
{Name: "distanceToNext"},
{Name: "distanceToPrevious"},
{Name: "distanceToQuery"},
{Name: "distanceToResult"},
}},
}},
}},
}
concepts := []string{"fashion"}
moveTo := &graphql.MoveParameters{
Concepts: []string{"haute couture"},
Force: 0.85,
}
moveAwayFrom := &graphql.MoveParameters{
Concepts: []string{"finance"},
Force: 0.45,
}
nearText := client.GraphQL().NearTextArgBuilder().
WithConcepts(concepts).
WithDistance(0.6). // prior to v1.14, use WithCertainty(0.7)
WithMoveTo(moveTo).
WithMoveAwayFrom(moveAwayFrom)
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(fields...).
WithNearText(nearText).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.argument.NearTextMoveParameters;
import io.weaviate.client.v1.graphql.query.fields.Field;
public class App {
public static void main(String[] args) {
Config config = new Config("http", "localhost:8080");
WeaviateClient client = new WeaviateClient(config);
Field name = Field.builder().name("name").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder()
.name("semanticPath")
.fields(new Field[]{
Field.builder()
.name("path")
.fields(new Field[]{
Field.builder().name("concept").build(),
Field.builder().name("distanceToNext").build(),
Field.builder().name("distanceToPrevious").build(),
Field.builder().name("distanceToQuery").build(),
Field.builder().name("distanceToResult").build()
})
.build()
}).build()
}).build();
NearTextMoveParameters moveTo = NearTextMoveParameters.builder()
.concepts(new String[]{ "haute couture" }).force(0.85f).build();
NearTextMoveParameters moveAway = NearTextMoveParameters.builder()
.concepts(new String[]{ "finance" }).force(0.45f)
.build();
NearTextArgument explore = client.graphQL().arguments().nearTextArgBuilder()
.concepts(new String[]{ "fashion" })
.distance(0.6f) // prior to v1.14, use .certainty(0.7f)
.moveTo(moveTo)
.moveAwayFrom(moveAway)
.build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Publication")
.withFields(name, _additional)
.withNearText(explore)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
# Note: Under nearText, use `certainty` instead of `distance` prior to v1.14
echo '{
"query": "{
Get {
Publication (
nearText: {
concepts: [\"fashion\"],
distance: 0.6,
moveAwayFrom: {
concepts: [\"finance\"],
force: 0.45
},
moveTo: {
concepts: [\"haute couture\"],
force: 0.85
}
}
) {
name
_additional {
semanticPath{
path {
concept
distanceToNext
distanceToPrevious
distanceToQuery
distanceToResult
}
}
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get {
Publication (
nearText:{
concepts: ["fashion"],
distance: 0.6, # prior to v1.14 use certainty: 0.7
moveAwayFrom: {
concepts: ["finance"],
force: 0.45
},
moveTo: {
concepts: ["haute couture"],
force: 0.85
}
}
) {
name
_additional {
semanticPath {
path {
concept
distanceToNext
distanceToPrevious
distanceToQuery
distanceToResult
}
}
}
}
}
}
Multimodal search
Depending on the vectorizer module, you can use additional modalities such as images, audio, or video as the query, and retrieve corresponding, compatible objects.
Some modules, such as multi2vec-clip and multi2vec-bind, allow you to search across modalities. For example, you can search for images using a text query, or search for text using an image query.
For more information, see specific module pages such as these:
hybrid
This operator combines BM25 (keyword) search and vector search, and fuses the two result sets into a single ranked list.
Variables
Variables | Required | Type | Description |
---|---|---|---|
query | yes | string | search query |
alpha | no | float | weighting for each search algorithm, default 0.75 |
vector | no | [float] | optional to supply your own vector |
properties | no | [string] | list of properties to limit the BM25 search to, default all text properties |
fusionType | no | string | the type of hybrid fusion algorithm (available from v1.20.0 ) |
Notes:
- alpha can be any number from 0 to 1, defaulting to 0.75.
  - alpha = 0 forces a pure keyword search (BM25).
  - alpha = 1 forces a pure vector search.
  - alpha = 0.5 weighs the BM25 and vector methods evenly.
- fusionType can be rankedFusion or relativeScoreFusion.
  - rankedFusion (default before v1.24) adds inverted ranks of the BM25 and vector search methods.
  - relativeScoreFusion (default in v1.24 and higher) adds normalized scores of the BM25 and vector search methods.
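The effect of alpha can be sketched in a few lines. This is a minimal illustration, assuming the keyword side is weighted by (1 - alpha) and the vector side by alpha, which matches the behavior described in the notes above; the function name is hypothetical and not part of any Weaviate client.

```python
def weighted_hybrid_score(keyword_score: float, vector_score: float, alpha: float = 0.75) -> float:
    """Blend one keyword score and one vector score (hypothetical helper).

    alpha = 0 keeps only the keyword score, alpha = 1 keeps only the vector score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * keyword_score + alpha * vector_score


# Compare the three cases called out in the notes.
for alpha in (0.0, 0.5, 1.0):
    print(alpha, weighted_hybrid_score(keyword_score=0.2, vector_score=0.8, alpha=alpha))
```

The input scores here are assumed to be already fused-ready values (rank-based or normalized), since raw BM25 and vector scores are not on a common scale.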
Fusion algorithms
Ranked fusion
The rankedFusion algorithm is Weaviate's original hybrid fusion algorithm.
In this algorithm, each object is scored according to its position in the results for that search (vector or keyword). The top-ranked objects in each search get the highest scores. Scores decrease going from top to least ranked. The total score is calculated by adding the rank-based scores from the vector and keyword searches.
Relative score fusion
Added in v1.20. Default in v1.24 and higher.
In relativeScoreFusion, the vector search and keyword search scores are scaled between 0 and 1. The highest raw score becomes 1 in the scaled scores, and the lowest is assigned 0. The remaining values are scaled proportionally between 0 and 1. The total score is a scaled sum of the normalized vector similarity and normalized BM25 scores.
Fusion scoring comparison
This example uses a small search result set to compare the ranked fusion and relative score fusion algorithms. The table shows the following information:
- document id, from 0 to 4
- keyword score, sorted
- vector search score, sorted
Search Type | (id): score | (id): score | (id): score | (id): score | (id): score |
---|---|---|---|---|---|
Keyword | (1): 5 | (0): 2.6 | (2): 2.3 | (4): 0.2 | (3): 0.09 |
Vector | (2): 0.6 | (4): 0.598 | (0): 0.596 | (1): 0.594 | (3): 0.009 |
The ranking algorithms use these scores to derive the hybrid ranking.
Ranked Fusion
The score depends only on the rank of the result. With ranks counted from 0, the score is equal to 1/(RANK + 60):
Search Type | (id): score | (id): score | (id): score | (id): score | (id): score |
---|---|---|---|---|---|
Keyword | (1): 0.016667 | (0): 0.016393 | (2): 0.016129 | (4): 0.015873 | (3): 0.015625 |
Vector | (2): 0.016667 | (4): 0.016393 | (0): 0.016129 | (1): 0.015873 | (3): 0.015625 |
As you can see, the score at each rank is identical for both search types, regardless of the input score.
Relative Score Fusion
Here, we normalize the scores: the largest score is set to 1 and the lowest to 0, and all entries in between are scaled according to their relative distance to the maximum and minimum values.
Search Type | (id): score | (id): score | (id): score | (id): score | (id): score |
---|---|---|---|---|---|
Keyword | (1): 1.0 | (0): 0.511 | (2): 0.450 | (4): 0.022 | (3): 0.0 |
Vector | (2): 1.0 | (4): 0.996 | (0): 0.993 | (1): 0.986 | (3): 0.0 |
Here, the scores reflect the relative distribution of the original scores. For example, the vector search scores of the first 4 documents were almost identical, which is still the case for the normalized scores.
Weighting & final scores
Before adding these scores up, they are weighted according to the alpha parameter. Let’s assume alpha=0.5, meaning both search types contribute equally to the final result and therefore each score is multiplied by 0.5.
Now, we can add up the scores for each document and compare the results from both fusion algorithms.
Algorithm Type | (id): score | (id): score | (id): score | (id): score | (id): score |
---|---|---|---|---|---|
Ranked | (2): 0.016398 | (1): 0.016270 | (0): 0.016261 | (4): 0.016133 | (3): 0.015625 |
Relative | (1): 0.993 | (0): 0.752 | (2): 0.725 | (4): 0.509 | (3): 0.0 |
What can we learn from this?
For the vector search, the scores for the top 4 objects (IDs 2, 4, 0, 1) were almost identical, and all of them were good results. For the keyword search, by contrast, one object (ID 1) was much better than the rest.
This is captured in the final result of relativeScoreFusion, which identified object ID 1 as the top result. This is justified because this document was the best result in the keyword search, with a big gap to the next-best score, and was in the top group of the vector search.
In contrast, for rankedFusion, object ID 2 is the top result, closely followed by objects ID 1 and ID 0.
For a fuller discussion of fusion methods, see this blog post.
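The comparison above can be reproduced with a short script. This is a sketch under stated assumptions, not Weaviate's implementation: ranks are counted from 0, the keyword side is weighted by (1 - alpha) and the vector side by alpha, and all function names are hypothetical. The raw scores are the ones from the comparison table.

```python
# Raw scores from the comparison table, keyed by document id.
keyword = {1: 5.0, 0: 2.6, 2: 2.3, 4: 0.2, 3: 0.09}
vector = {2: 0.6, 4: 0.598, 0: 0.596, 1: 0.594, 3: 0.009}


def ranked_scores(scores, k=60):
    """Score each id as 1 / (rank + k); rank 0 is the best raw score (assumption)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {doc_id: 1.0 / (rank + k) for rank, doc_id in enumerate(ordered)}


def relative_scores(scores):
    """Min-max normalization: best raw score maps to 1, worst to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(keyword_part, vector_part, alpha=0.5):
    """Weight both sides by alpha, add them per id, and sort best-first."""
    ids = keyword_part.keys() | vector_part.keys()
    fused = {
        i: (1.0 - alpha) * keyword_part.get(i, 0.0) + alpha * vector_part.get(i, 0.0)
        for i in ids
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


ranked = fuse(ranked_scores(keyword), ranked_scores(vector))
relative = fuse(relative_scores(keyword), relative_scores(vector))

print("ranked:  ", [(i, round(s, 6)) for i, s in ranked])
print("relative:", [(i, round(s, 3)) for i, s in relative])
```

The ordering matches the discussion: ranked fusion puts ID 2 first with IDs 1 and 0 close behind, while relative score fusion puts ID 1 first. Individual digits may differ slightly from the tables because of rounding.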
Additional metadata response
Hybrid search results are sorted by a score, derived as a fused combination of their BM25F score and nearText similarity (higher is more relevant). This score, and additionally the explainScore metadata, can optionally be retrieved in the response.
Example
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
try:
    collection = client.collections.get("Article")
    response = collection.query.hybrid(
        query="Fisherman that catches salmon",
        alpha=0.5,
        return_metadata=wvc.query.MetadataQuery(score=True, explain_score=True),
        limit=5,
    )
    for o in response.objects:
        print(o.properties)
        print(o.metadata.score)
        print(o.metadata.explain_score)
finally:
    client.close()
result = (
client.query
.get("Article", ["title", "summary"])
.with_additional(["score", "explainScore"])
.with_hybrid("Fisherman that catches salmon", alpha=0.5)
.do()
)
const response = await client.graphql
.get()
.withClassName('Article')
.withFields('title summary _additional { score explainScore }')
.withHybrid({
query: 'Fisherman that catches salmon',
alpha: 0.5, // optional, defaults to 0.75
})
.do();
console.log(JSON.stringify(response, null, 2));
hybrid := &HybridArgumentBuilder{}
hybrid.WithQuery("Fisherman that catches salmon").WithAlpha(0.5)
query := builder.WithClassName("Article").WithHybrid(hybrid).build()
HybridArgument hybrid = client.graphQL().arguments().hybridArgBuilder()
    .query("Fisherman that catches salmon")
    .alpha(0.5f)
    .build();
// One Field per property; adjacent string literals are not valid Java
Field title = Field.builder().name("title").build();
Field summary = Field.builder().name("summary").build();
Field _additional = Field.builder()
    .name("_additional")
    .fields(new Field[]{
        Field.builder().name("score").build(),
        Field.builder().name("explainScore").build()
    })
    .build();
Result<GraphQLResponse> result = client.graphQL().get().withClassName("Article")
    .withHybrid(hybrid)
    .withFields(title, summary, _additional).run();
echo '{
"query": "{
Get {
Article(
hybrid: {
query: \"Fisherman that catches salmon\"
alpha: 0.5
}
) {
title
summary
_additional { score explainScore }
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get {
Article (
hybrid: {
query: "Fisherman that catches salmon"
alpha: 0.5
}
) {
title
summary
_additional { score, explainScore }
}
}
}
Example with vector specified
You can optionally supply the vector query to the vector variable. This will override the query variable for the vector search component of the hybrid search.
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
try:
    collection = client.collections.get("Article")
    # Placeholder vector; replace with an embedding that has the collection's dimensionality
    query_vector = [1, 2, 3]
    response = collection.query.hybrid(
        query="Fisherman that catches salmon",
        vector=query_vector,
        alpha=0.5,
        return_metadata=wvc.query.MetadataQuery(score=True, explain_score=True),
        limit=5,
    )
    for o in response.objects:
        print(o.properties)
        print(o.metadata.score)
        print(o.metadata.explain_score)
finally:
    client.close()
result = (
client.query
.get("Article", ["title", "summary"])
.with_additional(["score"])
.with_hybrid("Fisherman that catches salmon", alpha=0.5, vector=[1, 2, 3])
.do()
)
const response = await client.graphql
.get()
.withClassName('Article')
.withFields('title summary _additional { score }')
.withHybrid({
query: 'Fisherman that catches salmon',
vector: [1, 2, 3], // optional. Not needed if Weaviate handles the vectorization.
alpha: 0.5, // optional, defaults to 0.75
})
.do();
console.log(response);
hybrid := &HybridArgumentBuilder{}
hybrid.WithQuery("Fisherman that catches salmon").WithVector([]float32{1, 2, 3}).WithAlpha(0.5)
query := builder.WithClassName("Article").WithHybrid(hybrid).build()
HybridArgument hybrid = client.graphQL().arguments().hybridArgBuilder()
    .query("Fisherman that catches salmon")
    .vector(new Float[]{1f, 2f, 3f})
    .alpha(0.5f)
    .build();
// One Field per property; adjacent string literals are not valid Java
Field title = Field.builder().name("title").build();
Field summary = Field.builder().name("summary").build();
Field _additional = Field.builder()
    .name("_additional")
    .fields(new Field[]{Field.builder().name("score").build()})
    .build();
Result<GraphQLResponse> result = client.graphQL().get().withClassName("Article")
    .withHybrid(hybrid)
    .withFields(title, summary, _additional).run();
# The `vector` below is optional. Not needed if Weaviate handles the vectorization.
# If you provide your own embeddings, put the vector query there, and make sure it has the correct number of dimensions.
echo '{
"query": "{
Get {
Article(
hybrid: {
query: \"Fisherman that catches salmon\"
alpha: 0.5
vector: [1, 2, 3]
})
{
title
summary
_additional { score }
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get {
Article (
hybrid: {
query: "Fisherman that catches salmon"
alpha: 0.5
vector: [1, 2, 3] # optional. Not needed if Weaviate handles the vectorization. If you provide your own embeddings, put the vector query here.
})
{
title
summary
_additional { score }
}
}
}
Hybrid with a conditional filter
Available from v1.18.0.
A conditional (where) filter can be used with hybrid.
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
try:
    collection = client.collections.get("Article")
    response = collection.query.hybrid(
        query="How to catch an Alaskan Pollock",
        alpha=0.5,
        filters=wvc.query.Filter.by_property("wordCount").less_than(1000),
        limit=5,
    )
    for o in response.objects:
        print(o.properties)
finally:
    client.close()
where_filter = {
    "path": ["wordCount"],
    "operator": "LessThan",
    "valueInt": 1000  # integer value, not a string
}
query_result = (
    client.query
    .get("Article", ["title", "summary"])
    .with_where(where_filter)
    .with_hybrid(query="How to catch an Alaskan Pollock", alpha=0.5)
    .do()
)
const response = await client.graphql
.get()
.withClassName('Article')
.withFields('title summary')
.withHybrid({
query: 'How to catch Alaskan Pollock',
alpha: 0.5,
})
.withWhere({
operator: 'LessThan',
path: ['wordCount'],
valueInt: 1000,
})
.do();
console.log(JSON.stringify(response, null, 2));
where := filters.Where().
    WithPath([]string{"content"}).
    WithOperator(filters.Equal).
    WithValueString("Alaskan") // All results must have "Alaskan" in the content property
name := graphql.Field{Name: "summary"}
hybrid := &graphql.HybridArgumentBuilder{}
hybrid.WithQuery("How to catch an Alaskan Pollock").WithAlpha(0.5)
resultSet, gqlErr := client.GraphQL().Get().WithClassName("Article").WithHybrid(hybrid).WithWhere(where).WithFields(name).Do(context.Background())
if gqlErr != nil {
    panic(gqlErr)
}
// Unpack the "Get" section of the response before reading the collection results
get := resultSet.Data["Get"].(map[string]interface{})
articles := get["Article"].([]interface{})
// One Field per property; adjacent string literals are not valid Java
Field title = Field.builder().name("title").build();
Field summary = Field.builder().name("summary").build();
WhereFilter where = WhereFilter.builder()
    .path(new String[]{ "wordCount" })
    .operator(Operator.LessThan)
    .valueInt(1000)
    .build();
HybridArgument hybrid = client.graphQL().arguments().hybridArgBuilder()
    .query("How to catch an Alaskan Pollock")
    .alpha(0.5f)
    .build();
Result<GraphQLResponse> result = client.graphQL().get()
    .withClassName("Article")
    .withFields(title, summary)
    .withWhere(where)
    .withHybrid(hybrid)
    .run();
echo '{
"query": "{
Get {
Article (
hybrid: { query: \"How to catch an Alaskan Pollock\", alpha: 0.5 }
where: { path: [\"wordCount\"], operator: LessThan, valueInt: 1000 }
) {
title
summary
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get {
Article (
hybrid: { query: "how to fish", alpha: 0.5 }
where: { path: ["wordCount"], operator: LessThan, valueInt: 1000 }
) {
title
summary
}
}
}
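The Curl example above sends the GraphQL query as a string inside a JSON body. When building that body programmatically, it can help to let a JSON library handle the quoting. The following is a minimal sketch using only the Python standard library; the helper name and its parameters are hypothetical, and it covers only the hybrid-plus-where query shape shown in this section.

```python
import json


def build_hybrid_payload(collection, query, alpha, where_path, less_than, fields):
    """Return a JSON body for POST /v1/graphql (hypothetical helper).

    json.dumps is used for the query text and the path so that quotes
    inside them are escaped correctly.
    """
    graphql = (
        "{ Get { %s ("
        " hybrid: { query: %s, alpha: %s }"
        " where: { path: %s, operator: LessThan, valueInt: %d }"
        " ) { %s } } }"
    ) % (
        collection,
        json.dumps(query),
        alpha,
        json.dumps(where_path),
        less_than,
        " ".join(fields),
    )
    return json.dumps({"query": graphql})


payload = build_hybrid_payload(
    "Article",
    "How to catch an Alaskan Pollock",
    0.5,
    ["wordCount"],
    1000,
    ["title", "summary"],
)
print(payload)
```

The resulting string can be passed as the request body to the /v1/graphql endpoint with the same headers as in the Curl example.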
Specify object properties for BM25 search
Available from v1.19.
A hybrid operator can accept an array of strings to limit the set of properties for the BM25 component of the search. If unspecified, all text properties will be searched.
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
try:
    collection = client.collections.get("JeopardyQuestion")
    response = collection.query.hybrid(
        query="Venus",
        alpha=0.25,
        query_properties=["question"],
        return_metadata=wvc.query.MetadataQuery(score=True),
        limit=5,
    )
    for o in response.objects:
        print(o.properties)
        print(o.metadata.score)
finally:
    client.close()
import json
result = (
    client.query
    .get("JeopardyQuestion", ["question", "answer"])
    .with_additional(["score"])
    .with_hybrid(
        "Venus",
        alpha=0.25,  # closer to pure keyword search
        properties=["question"]  # changing to "answer" will yield a different result set
    )
    .with_limit(3)
    .do()
)
print(json.dumps(result, indent=4))
const response = await client.graphql
.get()
.withClassName('JeopardyQuestion')
.withFields('question answer _additional{ score }')
.withHybrid({
query: 'Venus',
alpha: 0.25, // closer to pure keyword search
properties: ['question'], // changing to "answer" will yield a different set of results
})
.withLimit(3)
.do();
console.log(response['data']['Get']['JeopardyQuestion']);
HybridArgument hybrid = client.graphQL().arguments().hybridArgBuilder()
    .query("Venus")
    .alpha(0.25f) // closer to pure keyword search
    .properties(new String[]{"question"}) // changing to "answer" will yield a different result set
    .build();
// One Field per property; adjacent string literals are not valid Java
Field question = Field.builder().name("question").build();
Field answer = Field.builder().name("answer").build();
Field _additional = Field.builder()
    .name("_additional")
    .fields(new Field[]{Field.builder().name("score").build()})
    .build();
Result<GraphQLResponse> result = client.graphQL().get().withClassName("JeopardyQuestion")
    .withHybrid(hybrid)
    .withFields(question, answer, _additional)
    .withLimit(3)
    .run();
echo '{
"query": "{
Get {
JeopardyQuestion (
hybrid: {
query: \"Venus\"
alpha: 0.25
properties: [\"question\"]
}
limit: 3
)
{
question
answer
_additional { score }
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get {
JeopardyQuestion(
hybrid: {
query: "Venus"
alpha: 0.25 # closer to pure keyword search
properties: ["question"] # changing to "answer" will yield a different result set
}
limit: 3
) {
question
answer
_additional {
score
}
}
}
}
Oversearch with relativeScoreFusion
Available from v1.21.
When relativeScoreFusion is used as the fusionType with a small search limit, a result set can be very sensitive to the limit parameter due to the normalization of the scores.
To mitigate this effect, Weaviate automatically performs a search with a higher limit (100) and then trims the results down to the requested limit.
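The sensitivity to the limit can be illustrated with a small sketch. The scores below are hypothetical, and the functions only mimic the idea described above (normalize over a larger candidate set, then trim); they are not Weaviate's implementation.

```python
def min_max(scores):
    """Scale scores so the best is 1.0 and the worst is 0.0."""
    lo, hi = min(scores), max(scores)
    return [1.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]


def normalized_top(pool, limit, search_limit):
    """Normalize over the first max(limit, search_limit) candidates, then trim to limit."""
    candidates = pool[: max(limit, search_limit)]
    return min_max(candidates)[:limit]


# Hypothetical raw vector scores, sorted best-first.
pool = [0.60, 0.598, 0.596, 0.594, 0.40, 0.20, 0.009]

narrow = normalized_top(pool, limit=3, search_limit=3)   # normalize only the top 3
wide = normalized_top(pool, limit=3, search_limit=100)   # oversearch, then trim

print("without oversearch:", [round(s, 3) for s in narrow])
print("with oversearch:   ", [round(s, 3) for s in wide])
```

Without oversearch, the third result is normalized to 0 even though its raw score is nearly identical to the first. With a wider candidate set, the three near-identical raw scores stay close together after normalization.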
BM25
The bm25 operator performs a keyword (sparse vector) search, and uses the BM25F ranking function to score the results. BM25F (Best Match 25 with Extension to Multiple Weighted Fields) is an extended version of BM25 that applies the scoring algorithm to multiple fields (properties), producing better results.
The search is case-insensitive, and case matching does not confer a score advantage. Stop words are removed. Stemming is not supported yet.
Schema configuration
The free parameters k1 and b are configurable and optional. See the schema reference for more details.
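To show what k1 and b control, here is a sketch of the textbook single-field BM25 score for one query term. It is an illustration only: it is not BM25F and not Weaviate's implementation, the IDF variant is one common choice, and the default values 1.2 and 0.75 are commonly used values assumed here (check the schema reference for the actual defaults).

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, n_docs_with_term, k1=1.2, b=0.75):
    """Textbook BM25 score of one term in one document (illustrative helper).

    k1 controls how quickly repeated occurrences of the term saturate.
    b controls how strongly the score is normalized by document length.
    """
    idf = math.log(1.0 + (n_docs - n_docs_with_term + 0.5) / (n_docs_with_term + 0.5))
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * length_norm)


common = dict(avg_doc_len=100, n_docs=1000, n_docs_with_term=50)

# More occurrences of the term raise the score, with diminishing returns.
print(bm25_term_score(tf=1, doc_len=100, **common))
print(bm25_term_score(tf=3, doc_len=100, **common))

# With b > 0, the same term frequency counts for less in a longer document.
print(bm25_term_score(tf=3, doc_len=400, **common))

# With b = 0, document length has no effect.
print(bm25_term_score(tf=3, doc_len=400, b=0.0, **common))
```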
Variables
The bm25 operator supports the following variables:
Variables | Required | Description |
---|---|---|
query | yes | The keyword search query. |
properties | no | Array of properties (fields) to search in, defaulting to all properties in the collection. |
Specific properties can be boosted by a factor specified as a number after the caret sign, for example properties: ["title^3", "summary"].
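The caret syntax can be read as "property name, then weight". The sketch below parses such entries and applies the weights to per-property scores. Both helpers are hypothetical, and multiplying finished per-property scores by the boost is a simplification: BM25F implementations typically apply field weights inside the scoring function rather than afterwards.

```python
def parse_boosted_property(spec):
    """Split 'title^3' into ('title', 3.0); an entry without a caret gets weight 1.0."""
    name, caret, boost = spec.partition("^")
    return name, (float(boost) if caret else 1.0)


def combine_property_scores(per_property_scores, properties):
    """Weighted sum of per-property scores (simplified illustration of boosting)."""
    total = 0.0
    for spec in properties:
        name, boost = parse_boosted_property(spec)
        total += boost * per_property_scores.get(name, 0.0)
    return total


properties = ["title^3", "summary"]
print([parse_boosted_property(p) for p in properties])

# Hypothetical per-property scores for one object.
print(combine_property_scores({"title": 2.0, "summary": 1.0}, properties))
```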
Additional metadata response
The BM25F score metadata can be optionally retrieved in the response. A higher score indicates higher relevance.
Example query
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
try:
    collection = client.collections.get("Article")
    response = collection.query.bm25(
        query="fox",
        query_properties=["title"],
        return_metadata=wvc.query.MetadataQuery(score=True),
        limit=5,
    )
    for o in response.objects:
        print(o.properties)
        print(o.metadata.score)
finally:
    client.close()
import weaviate
client = weaviate.Client("http://localhost:8080")
bm25 = {
"query": "fox",
"properties": ["title"], # by default, all properties are searched
}
result = (
client.query
.get("Article", ["title", "_additional {score} "])
.with_bm25(**bm25)
.do()
)
print(result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
});
const response = await client.graphql
.get()
.withClassName('Article')
.withFields('title _additional {score}')
.withBm25({
query: 'fox',
properties: ['title'],
})
.do();
console.log(response);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Article"
title := graphql.Field{Name: "title"}
_additional := graphql.Field{
Name: "_additional", Fields: []graphql.Field{
{Name: "score"},
},
}
query := "fox"
properties := []string{"title"}
bm25 := client.GraphQL().Bm25ArgBuilder().
    WithQuery(query).
    WithProperties(properties...)
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(title, _additional).
WithBm25(bm25).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.Bm25Argument;
import io.weaviate.client.v1.graphql.query.fields.Field;
public class App {
public static void main(String[] args) {
Config config = new Config("http", "localhost:8080");
WeaviateClient client = new WeaviateClient(config);
Bm25Argument bm25 = client.graphQL().arguments().bm25ArgBuilder()
.query("fox")
.properties(new String[]{ "title" })
.build();
Field title = Field.builder().name("title").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder().name("score").build()
}).build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Article")
.withFields(title, _additional)
.withBm25(bm25)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
echo '{
"query": "{
Get {
Article(
bm25: {
query: \"fox\",
properties: [\"title\"],
}
) {
title
_additional {
score
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get {
Article(
bm25: {
query: "fox",
properties: ["title"]
}
) {
title
_additional {
score
}
}
}
}
Expected response
{
"data": {
"Get": {
"Article": [
{
"_additional": {
"certainty": null,
"distance": null,
"score": "3.4985464"
},
"title": "Tim Dowling: is the dog’s friendship with the fox sweet – or a bad omen?"
}
]
}
},
"errors": null
}
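Note that the score in the GraphQL response above is a string, not a number. A minimal sketch of reading it with the Python standard library, using the expected response from this section as input:

```python
import json

# The expected response shown above, as returned by the GraphQL endpoint.
raw = """
{
  "data": {
    "Get": {
      "Article": [
        {
          "_additional": {
            "certainty": null,
            "distance": null,
            "score": "3.4985464"
          },
          "title": "Tim Dowling: is the dog’s friendship with the fox sweet – or a bad omen?"
        }
      ]
    }
  },
  "errors": null
}
"""

response = json.loads(raw)
articles = response["data"]["Get"]["Article"]

# Convert the string score to a float before comparing or sorting.
scored = [(a["title"], float(a["_additional"]["score"])) for a in articles]
for title, score in sorted(scored, key=lambda pair: pair[1], reverse=True):
    print(f"{score:.4f}  {title}")
```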
BM25 with a conditional filter
Available from v1.18.
A conditional (where) filter can be used with bm25.
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
try:
    collection = client.collections.get("Article")
    response = collection.query.bm25(
        query="how to fish",
        return_metadata=wvc.query.MetadataQuery(score=True),
        filters=wvc.query.Filter.by_property("wordCount").less_than(1000),
        limit=5,
    )
    for o in response.objects:
        print(o.properties)
        print(o.metadata.score)
finally:
    client.close()
where_filter = {
    "path": ["wordCount"],
    "operator": "LessThan",
    "valueInt": 1000  # integer value, not a string
}
query_result = (
client.query
.get("Article", ["title", "summary"])
.with_where(where_filter)
.with_bm25(query="how to fish")
.do()
)
const response = await client.graphql
.get()
.withClassName('Article')
.withFields('title summary')
.withBm25({
query: 'how to fish',
})
.withWhere({
operator: 'LessThan',
path: ['wordCount'],
valueInt: 1000,
})
.do();
console.log(JSON.stringify(response, null, 2));
where := filters.Where().
    WithPath([]string{"wordCount"}).
    WithOperator(filters.LessThan).
    WithValueInt(1000)
name := graphql.Field{Name: "summary"} // the output field
bm25B := &graphql.BM25ArgumentBuilder{}
bm25B = bm25B.WithQuery("How to fish").WithProperties("title", "summary")
resultSet, gqlErr := client.GraphQL().Get().WithClassName("Article").WithBM25(bm25B).WithWhere(where).WithFields(name).Do(context.Background())
if gqlErr != nil {
    panic(gqlErr)
}
// Unpack the "Get" section of the response before reading the collection results
get := resultSet.Data["Get"].(map[string]interface{})
articles := get["Article"].([]interface{})
// One Field per property; adjacent string literals are not valid Java
Field title = Field.builder().name("title").build();
Field summary = Field.builder().name("summary").build();
WhereFilter where = WhereFilter.builder()
    .path(new String[]{ "wordCount" })
    .operator(Operator.LessThan)
    .valueInt(1000)
    .build();
Bm25Argument bm25 = client.graphQL().arguments().bm25ArgBuilder()
    .query("how to fish")
    .properties(new String[]{"title","summary"})
    .build();
Result<GraphQLResponse> result = client.graphQL().get()
    .withClassName("Article")
    .withFields(title, summary)
    .withWhere(where)
    .withBm25(bm25)
    .run();
echo '{
"query": "{
Get {
Article(
bm25: { query: \"how to fish\", properties: [\"title\"] }
where: { path: [\"wordCount\"], operator: LessThan, valueInt: 1000 }
) {
summary
title
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get {
Article (
bm25: { query: "how to fish", properties: ["title"] }
where: { path: ["wordCount"], operator: LessThan, valueInt: 1000 }
) {
summary
title
}
}
}
Expected response
{
"data": {
"Get": {
"Article": [
{
"summary": "Sometimes, the hardest part of setting a fishing record is just getting the fish weighed. A Kentucky fisherman has officially set a new record in the state after reeling in a 9.05-pound saugeye. While getting the fish in the boat was difficult, the angler had just as much trouble finding an officially certified scale to weigh it on. In order to qualify for a state record, fish must be weighed on an officially certified scale. The previous record for a saugeye in Kentucky ws an 8 pound, 8-ounce fish caught in 2019.",
"title": "Kentucky fisherman catches record-breaking fish, searches for certified scale"
},
{
"summary": "Unpaid last month because there wasn\u2019t enough money. Ms. Hunt picks up shifts at JJ Fish & Chicken, bartends and babysits. three daughters is subsidized,and cereal fromErica Hunt\u2019s monthly budget on $12 an hourErica Hunt\u2019s monthly budget on $12 an hourExpensesIncome and benefitsRent, $775Take-home pay, $1,400Varies based on hours worked. Daycare, $600Daycare for Ms. Hunt\u2019s three daughters is subsidized, as are her electricity and internet costs. Household goods, $300Child support, $350Ms. Hunt picks up shifts at JJ Fish & Chicken, bartends and babysits to make more money.",
"title": "Opinion | What to Tell the Critics of a $15 Minimum Wage"
},
...
]
}
}
}
ask
Enabled by the module: Question Answering.
This operator allows you to return answers to questions by running the results through a Q&A model.
Variables
Variable | Required | Type | Description |
---|---|---|---|
question | yes | string | The question to be answered. |
certainty | no | float | Desired minimal certainty or confidence of answer to the question. The higher the value, the stricter the search becomes. The lower the value, the fuzzier the search becomes. If no certainty is set, any answer that could be extracted will be returned. |
properties | no | [string] | The properties of the queried collection that contain text. If no properties are set, all are considered. |
rerank | no | boolean | If enabled, the qna module will rerank the result based on the answer score. For example, if the 3rd result - as determined by the previous (semantic) search contained the most likely answer, result 3 will be pushed to position 1, etc. Supported since v1.10.0 |
Example
- Python Client v4
- Python Client v3
- JS/TS Client v2
- Go
- Java
- Curl
- GraphQL
import weaviate
import weaviate.classes as wvc
import os
client = weaviate.connect_to_local(
headers={
"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
}
)
try:
    # QnA module use is not yet supported by the V4 client. Please use a raw GraphQL query instead.
    response = client.graphql_raw_query(
        """
        {
          Get {
            Article(
              ask: {
                question: "Who is the king of the Netherlands?",
                properties: ["summary"],
              },
              limit: 1
            ) {
              title
              _additional {
                answer {
                  hasAnswer
                  property
                  result
                  startPosition
                  endPosition
                }
              }
            }
          }
        }
        """
    )  # closing parenthesis was missing
    print(response)
finally:
    client.close()
import weaviate
import os
client = weaviate.Client(
"https://edu-demo.weaviate.network",
auth_client_secret=weaviate.auth.AuthApiKey("learn-weaviate"),
additional_headers={
"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"] # Replace with your OPENAI API key
}
)
ask = {
"question": "Who is the king of the Netherlands?",
"properties": ["summary"]
}
result = (
client.query
.get("Article", ["title", "_additional {answer {hasAnswer property result startPosition endPosition} }"])
.with_ask(ask)
.with_limit(1)
.do()
)
print(result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'https',
host: 'edu-demo.weaviate.network',
apiKey: new weaviate.ApiKey('learn-weaviate'),
headers: {
'X-OpenAI-Api-Key': process.env['OPENAI_API_KEY'],
},
});
const response = await client.graphql
.get()
.withClassName('Article')
.withAsk({
question: 'Who is the king of the Netherlands?',
properties: ['summary'],
})
.withFields('title _additional { answer { hasAnswer property result startPosition endPosition } }')
.withLimit(1)
.do();
console.log(response);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Article"
fields := []graphql.Field{
{Name: "title"},
{Name: "_additional", Fields: []graphql.Field{
{Name: "answer", Fields: []graphql.Field{
{Name: "hasAnswer"},
{Name: "certainty"},
{Name: "property"},
{Name: "result"},
{Name: "startPosition"},
{Name: "endPosition"},
}},
}},
}
ask := client.GraphQL().AskArgBuilder().
WithQuestion("Who is the king of the Netherlands?").
WithProperties([]string{"summary"})
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(fields...).
WithAsk(ask).
WithLimit(1).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.AskArgument;
import io.weaviate.client.v1.graphql.query.fields.Field;
public class App {
public static void main(String[] args) {
Config config = new Config("http", "localhost:8080");
WeaviateClient client = new WeaviateClient(config);
Field title = Field.builder().name("title").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder()
.name("answer")
.fields(new Field[]{
Field.builder().name("hasAnswer").build(),
Field.builder().name("certainty").build(),
Field.builder().name("property").build(),
Field.builder().name("result").build(),
Field.builder().name("startPosition").build(),
Field.builder().name("endPosition").build()
}).build()
}).build();
AskArgument ask = AskArgument.builder()
.question("Who is the king of the Netherlands?")
.properties(new String[]{ "summary" })
.build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Article")
.withFields(title, _additional)
.withAsk(ask)
.withLimit(1)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
echo '{
"query": "{
Get {
Article(
ask: {
question: \"Who is the king of the Netherlands?\",
properties: [\"summary\"]
},
limit: 1
) {
title
_additional {
answer {
hasAnswer
property
result
startPosition
endPosition
}
}
}
}
}
"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get {
Article(
ask: {
question: "Who is the king of the Netherlands?",
properties: ["summary"],
},
limit: 1
) {
title
_additional {
answer {
hasAnswer
property
result
startPosition
endPosition
}
}
}
}
}
Additional metadata response
The answer and a certainty can be retrieved.
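A response to the ask queries above nests the answer under _additional. The sketch below pulls out the answers from such a response. The helper name is hypothetical, and the sample response is made-up placeholder data shaped after the fields requested in the GraphQL example (hasAnswer, property, result, startPosition, endPosition); it is not real output.

```python
def extract_answers(response, collection="Article"):
    """Collect answers from objects whose `hasAnswer` flag is true (hypothetical helper)."""
    answers = []
    for obj in response["data"]["Get"][collection]:
        answer = obj["_additional"]["answer"]
        if answer.get("hasAnswer"):
            answers.append(
                {
                    "title": obj["title"],
                    "property": answer["property"],
                    "result": answer["result"],
                    "span": (answer["startPosition"], answer["endPosition"]),
                }
            )
    return answers


# Placeholder response for illustration only; the values are not real query output.
sample_response = {
    "data": {
        "Get": {
            "Article": [
                {
                    "title": "Example article title",
                    "_additional": {
                        "answer": {
                            "hasAnswer": True,
                            "property": "summary",
                            "result": "example answer",
                            "startPosition": 10,
                            "endPosition": 24,
                        }
                    },
                },
                {
                    "title": "Article without an answer",
                    "_additional": {"answer": {"hasAnswer": False}},
                },
            ]
        }
    }
}

for found in extract_answers(sample_response):
    print(found["title"], "->", found["result"], found["span"])
```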
Questions and feedback
If you have any questions or feedback, let us know in the user forum.