Question Answering - transformers
In short
- The Question and Answer (Q&A) module is a Weaviate module for answer extraction from data.
- The module depends on a text vectorization module that should be running with Weaviate.
- The module adds an ask {} operator to the GraphQL Get {} queries.
- The module returns a maximum of 1 answer in the GraphQL _additional {} field.
- The answer with the highest certainty (confidence level) will be returned.
Introduction
The Question and Answer (Q&A) module is a Weaviate module for answer extraction from data. It uses BERT-related models for finding and extracting answers. This module can be used in GraphQL Get{...} queries, as a search operator. The qna-transformers module tries to find an answer in the data objects of the specified class. If an answer is found within the given certainty range, it will be returned in the GraphQL _additional { answer { ... } } field. At most one answer is returned, and only if its certainty is above the optionally set certainty threshold. The answer with the highest certainty (confidence level) will be returned.
There are currently five different Question Answering models available (source: Hugging Face Model Hub):
- distilbert-base-uncased-distilled-squad (uncased)
- bert-large-uncased-whole-word-masking-finetuned-squad (uncased)
- distilbert-base-cased-distilled-squad (cased)
- deepset/roberta-base-squad2
- deepset/bert-large-uncased-whole-word-masking-squad2 (uncased)
Note that not all models perform well on every dataset and use case. We recommend using bert-large-uncased-whole-word-masking-finetuned-squad (uncased), which performs best on most datasets (although it is quite heavyweight).
Starting with v1.10.0, the answer score can be used as a reranking factor for the search results.
How to enable (module configuration)
Docker Compose
The Q&A module can be added as a service to the Docker Compose file. You must have a text vectorizer like text2vec-contextionary or text2vec-transformers running. An example Docker Compose file for using the qna-transformers module (bert-large-uncased-whole-word-masking-finetuned-squad (uncased)) in combination with text2vec-transformers is as follows:
---
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.28.0
    ports:
    - 8080:8080
    - 50051:50051
    restart: on-failure:0
    environment:
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
      QNA_INFERENCE_API: "http://qna-transformers:8080"
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_MODULES: 'text2vec-transformers,qna-transformers'
      CLUSTER_HOSTNAME: 'node1'
  t2v-transformers:
    image: cr.weaviate.io/semitechnologies/transformers-inference:sentence-transformers-msmarco-distilbert-base-v2
    environment:
      ENABLE_CUDA: '1'
      NVIDIA_VISIBLE_DEVICES: all
    deploy:
      resources:
        reservations:
          devices:
          - capabilities: [gpu]
  qna-transformers:
    image: cr.weaviate.io/semitechnologies/qna-transformers:bert-large-uncased-whole-word-masking-finetuned-squad
    environment:
      ENABLE_CUDA: '1'
      NVIDIA_VISIBLE_DEVICES: all
    deploy:
      resources:
        reservations:
          devices:
          - capabilities: [gpu]
...
Variable explanations:
- QNA_INFERENCE_API: where the qna module is running
- ENABLE_CUDA: if set to 1 it uses GPU (if available on the host machine)
Note: at the moment, text vectorization modules cannot be combined in a single setup. This means that you can enable either text2vec-contextionary, text2vec-transformers, or no text vectorization module.
How to use (GraphQL)
GraphQL Ask search
This module adds a search operator to GraphQL Get{...} queries: ask{}. This new operator takes the following arguments:
Field | Data Type | Required | Example value | Description |
---|---|---|---|---|
question | string | yes | "What is the name of the Dutch king?" | The question to be answered. |
certainty | float | no | 0.75 | Desired minimal certainty or confidence of the answer to the question. The higher the value, the stricter the search becomes. The lower the value, the fuzzier the search becomes. If no certainty is set, any answer that could be extracted will be returned. |
properties | list of strings | no | ["summary"] | The properties of the queried class that contain text. If no properties are set, all are considered. |
rerank | bool | no | true | If enabled, the qna module will rerank the results based on the answer score. For example, if the 3rd result, as determined by the previous (semantic) search, contained the most likely answer, result 3 will be pushed to position 1, etc. Not supported prior to v1.10.0. |
Notes:
- The GraphQL Explore { } function does support the ask searcher, but the result is only a beacon to the object containing the answer. It is thus not any different from performing a nearText semantic search with the question. No extraction is happening.
- You cannot use the 'ask' operator along with a 'nearXXX' operator!
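The arguments in the table above can be combined into a single ask { } block. The following is a minimal sketch in Python that assembles such a query as a string. The helper name build_ask_query and its parameters are assumptions made for illustration and are not part of any Weaviate client; the class name, field names and values are taken from the examples on this page.

```python
import json


def build_ask_query(class_name, question, fields, certainty=None,
                    properties=None, rerank=None, limit=1):
    """Build a GraphQL Get query string that uses the ask operator.

    Hypothetical helper for illustration only, not a Weaviate client API.
    """
    # json.dumps produces double-quoted, escaped strings, which are
    # also accepted as GraphQL string literals for typical input.
    ask_args = [f"question: {json.dumps(question)}"]
    if certainty is not None:
        ask_args.append(f"certainty: {certainty}")
    if properties:
        ask_args.append(f"properties: {json.dumps(properties)}")
    if rerank is not None:
        ask_args.append(f"rerank: {'true' if rerank else 'false'}")
    ask = ", ".join(ask_args)

    answer_fields = "hasAnswer certainty property result startPosition endPosition"
    return (
        "{ Get { "
        f"{class_name}(ask: {{ {ask} }}, limit: {limit}) "
        f"{{ {' '.join(fields)} _additional {{ answer {{ {answer_fields} }} }} }} "
        "} }"
    )


query = build_ask_query(
    "Article",
    "Who is the king of the Netherlands?",
    fields=["title"],
    certainty=0.75,
    properties=["summary"],
    rerank=True,
)
print(query)
```

Optional arguments that are left as None are omitted from the query, which matches the table: without certainty any extractable answer is returned, and without properties all are considered.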
Example query
The same query is shown for each client, in this order: Python Client v4, Python Client v3, JS/TS Client v2, Go, Java, Curl, GraphQL.
import weaviate
import os

client = weaviate.connect_to_local(
    headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
    }
)

try:
    # QnA module use is not yet supported by the V4 client. Please use a raw GraphQL query instead.
    response = client.graphql_raw_query(
        """
        {
          Get {
            Article(
              ask: {
                question: "Who is the king of the Netherlands?",
                properties: ["summary"],
              },
              limit: 1
            ) {
              title
              _additional {
                answer {
                  hasAnswer
                  property
                  result
                  startPosition
                  endPosition
                }
              }
            }
          }
        }
        """
    )
    print(response)
finally:
    # Always close the connection, even if the query fails
    client.close()
import weaviate
import os

client = weaviate.Client(
    "https://edu-demo.weaviate.network",
    auth_client_secret=weaviate.auth.AuthApiKey("learn-weaviate"),
    additional_headers={
        "X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]  # Replace with your OPENAI API key
    }
)

ask = {
    "question": "Who is the king of the Netherlands?",
    "properties": ["summary"]
}

result = (
    client.query
    .get("Article", ["title", "_additional {answer {hasAnswer property result startPosition endPosition} }"])
    .with_ask(ask)
    .with_limit(1)
    .do()
)

print(result)
import weaviate from 'weaviate-ts-client';

const client = weaviate.client({
  scheme: 'https',
  host: 'edu-demo.weaviate.network',
  apiKey: new weaviate.ApiKey('learn-weaviate'),
  headers: {
    'X-OpenAI-Api-Key': process.env['OPENAI_API_KEY'],
  },
});

const response = await client.graphql
  .get()
  .withClassName('Article')
  .withAsk({
    question: 'Who is the king of the Netherlands?',
    properties: ['summary'],
  })
  .withFields('title _additional { answer { hasAnswer property result startPosition endPosition } }')
  .withLimit(1)
  .do();
console.log(response);
package main

import (
	"context"
	"fmt"

	"github.com/weaviate/weaviate-go-client/v4/weaviate"
	"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)

func main() {
	cfg := weaviate.Config{
		Host:   "localhost:8080",
		Scheme: "http",
	}
	client, err := weaviate.NewClient(cfg)
	if err != nil {
		panic(err)
	}

	className := "Article"
	fields := []graphql.Field{
		{Name: "title"},
		{Name: "_additional", Fields: []graphql.Field{
			{Name: "answer", Fields: []graphql.Field{
				{Name: "hasAnswer"},
				{Name: "certainty"},
				{Name: "property"},
				{Name: "result"},
				{Name: "startPosition"},
				{Name: "endPosition"},
			}},
		}},
	}
	ask := client.GraphQL().AskArgBuilder().
		WithQuestion("Who is the king of the Netherlands?").
		WithProperties([]string{"summary"})
	ctx := context.Background()

	result, err := client.GraphQL().Get().
		WithClassName(className).
		WithFields(fields...).
		WithAsk(ask).
		WithLimit(1).
		Do(ctx)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%v", result)
}
package io.weaviate;

import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.AskArgument;
import io.weaviate.client.v1.graphql.query.fields.Field;

public class App {
  public static void main(String[] args) {
    Config config = new Config("http", "localhost:8080");
    WeaviateClient client = new WeaviateClient(config);

    Field title = Field.builder().name("title").build();
    Field _additional = Field.builder()
      .name("_additional")
      .fields(new Field[]{
        Field.builder()
          .name("answer")
          .fields(new Field[]{
            Field.builder().name("hasAnswer").build(),
            Field.builder().name("certainty").build(),
            Field.builder().name("property").build(),
            Field.builder().name("result").build(),
            Field.builder().name("startPosition").build(),
            Field.builder().name("endPosition").build()
          }).build()
      }).build();

    AskArgument ask = AskArgument.builder()
      .question("Who is the king of the Netherlands?")
      .properties(new String[]{ "summary" })
      .build();

    Result<GraphQLResponse> result = client.graphQL().get()
      .withClassName("Article")
      .withFields(title, _additional)
      .withAsk(ask)
      .withLimit(1)
      .run();

    if (result.hasErrors()) {
      System.out.println(result.getError());
      return;
    }
    System.out.println(result.getResult());
  }
}
echo '{
  "query": "{
    Get {
      Article(
        ask: {
          question: \"Who is the king of the Netherlands?\",
          properties: [\"summary\"]
        },
        limit: 1
      ) {
        title
        _additional {
          answer {
            hasAnswer
            property
            result
            startPosition
            endPosition
          }
        }
      }
    }
  }
  "
}' | curl \
    -X POST \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer learn-weaviate' \
    -H "X-OpenAI-Api-Key: $OPENAI_API_KEY" \
    -d @- \
    https://edu-demo.weaviate.network/v1/graphql
{
  Get {
    Article(
      ask: {
        question: "Who is the king of the Netherlands?",
        properties: ["summary"],
      },
      limit: 1
    ) {
      title
      _additional {
        answer {
          hasAnswer
          property
          result
          startPosition
          endPosition
        }
      }
    }
  }
}
GraphQL response
The answer is contained in a new GraphQL _additional property called answer. It contains the following fields:
- hasAnswer (boolean): could an answer be found?
- result (nullable string): An answer if one could be found. null if hasAnswer==false
- certainty (nullable float): The certainty of the answer returned. null if hasAnswer==false
- property (nullable string): The property which contains the answer. null if hasAnswer==false
- startPosition (int): The character offset where the answer starts. 0 if hasAnswer==false
- endPosition (int): The character offset where the answer ends. 0 if hasAnswer==false
Note: startPosition, endPosition and property in the response are not guaranteed to be present. They are calculated by a case-insensitive string matching function against the input text. If the transformer model formats the output differently (e.g. by introducing spaces between tokens which were not present in the original input), calculating the position and determining the property fails.
Example response
{
  "data": {
    "Get": {
      "Article": [
        {
          "_additional": {
            "answer": {
              "certainty": 0.73,
              "endPosition": 26,
              "hasAnswer": true,
              "property": "summary",
              "result": "king willem - alexander",
              "startPosition": 48
            }
          },
          "title": "Bruised Oranges - The Dutch royals are botching covid-19 etiquette"
        }
      ]
    }
  },
  "errors": null
}
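Because property, startPosition and endPosition are not guaranteed to be present, client code may want to read the answer defensively. The following is a minimal sketch in Python that works on an already decoded response dictionary. The helper name extract_answer is an assumption made for illustration, and the sample dictionary is a shortened variant of the example response with the position fields deliberately left out.

```python
def extract_answer(response, class_name):
    """Return the answer of the first result as a dict, or None if there is none.

    Hypothetical helper for illustration only, not a Weaviate client API.
    """
    if response.get("errors"):
        raise RuntimeError(f"GraphQL errors: {response['errors']}")

    objects = ((response.get("data") or {}).get("Get") or {}).get(class_name) or []
    if not objects:
        return None

    answer = (objects[0].get("_additional") or {}).get("answer") or {}
    if not answer.get("hasAnswer"):
        return None

    return {
        "result": answer.get("result"),
        "certainty": answer.get("certainty"),
        # These three may be missing when string matching against the input fails.
        "property": answer.get("property"),
        "startPosition": answer.get("startPosition"),
        "endPosition": answer.get("endPosition"),
    }


# Shortened variant of the example response, without position fields.
response = {
    "data": {"Get": {"Article": [{
        "_additional": {"answer": {
            "certainty": 0.73,
            "hasAnswer": True,
            "property": "summary",
            "result": "king willem - alexander",
        }},
        "title": "Bruised Oranges - The Dutch royals are botching covid-19 etiquette",
    }]}},
    "errors": None,
}

answer = extract_answer(response, "Article")
print(answer["result"] if answer else "no answer")
```

Returning None for both an empty result list and hasAnswer == false keeps the calling code to a single check; callers that need to tell the two cases apart would have to inspect the response themselves.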
Custom Q&A Transformer module
You can use the same approach as for text2vec-transformers (see that module's documentation): either pick one of the pre-built containers or build your own container from your own model using the semitechnologies/qna-transformers:custom base image. Make sure that your model is compatible with Hugging Face's transformers.AutoModelForQuestionAnswering.
How it works (under the hood)
Under the hood, the module uses a two-step approach. First it performs a semantic search to find the documents (e.g. a Sentence, Paragraph, Article, etc.) most likely to contain the answer. In a second step, a BERT-style answer extraction is performed on all text and string properties of the document. There are now three possible outcomes:
- No answer was found because the question cannot be answered,
- An answer was found, but did not meet the user-specified minimum certainty, so it was discarded (typically the case when the document is on topic, but does not contain an actual answer to the question), and
- An answer was found that matches the desired certainty. It is returned to the user.
The module performs a semantic search under the hood, so a text2vec-... module is required. It does not need to be of the same type as the qna-... module. For example, you can use a text2vec-contextionary module to perform the semantic search, and a qna-transformers module to extract the answer.
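The second step and its three outcomes can be sketched as follows. This is a simplified illustration in Python, not the module's actual implementation: the documents are assumed to be the already ranked results of the semantic search, and answer_question, toy_extract and the fixed score of 0.8 are all hypothetical stand-ins for the real extraction model.

```python
def answer_question(question, documents, extract, min_certainty=None):
    """Pick the most certain answer from documents found by a semantic search.

    documents: list of dicts (property name -> value), assumed pre-ranked.
    extract: callable(question, text) -> (answer, certainty) or None.
    """
    best = None
    for doc in documents:
        for prop, text in doc.items():
            if not isinstance(text, str):
                continue  # only text properties are considered
            extracted = extract(question, text)
            if extracted is None:
                continue  # outcome 1: no answer could be extracted
            result, certainty = extracted
            if min_certainty is not None and certainty < min_certainty:
                continue  # outcome 2: answer found, but below the threshold
            if best is None or certainty > best["certainty"]:
                # outcome 3: answer meets the desired certainty
                best = {"result": result, "certainty": certainty, "property": prop}
    return best


def toy_extract(question, text):
    """Stand-in for a BERT-style model: 'finds' an answer only if a known name occurs."""
    name = "Willem-Alexander"
    return (name, 0.8) if name in text else None


docs = [
    {"title": "Royal news", "summary": "King Willem-Alexander visited Utrecht."},
    {"title": "Weather", "summary": "Rain is expected."},
]
print(answer_question("Who is the king of the Netherlands?", docs, toy_extract, 0.75))
print(answer_question("Who is the king of the Netherlands?", docs, toy_extract, 0.9))
```

With a minimum certainty of 0.75 the toy answer is returned; with 0.9 it is discarded and the function returns None, mirroring the second outcome in the list above.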
Automatic sliding window for long documents
If a text value in a data object is longer than 512 tokens, the Q&A Transformer module automatically splits the text into smaller pieces. The module uses a sliding window, i.e. overlapping pieces of text, so that an answer lying on a boundary between pieces can still be found. Because the overlap can produce the same answer more than once, the Q&A module returns the result (answer) with the highest score.
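The idea of overlapping windows can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the module's code: the window size of 512 comes from the text above, while the stride of 256, the function names and the use of a plain list in place of real model tokens are assumptions.

```python
def sliding_windows(tokens, window_size=512, stride=256):
    """Split tokens into overlapping windows (stride < window_size gives overlap)."""
    if stride <= 0 or stride > window_size:
        raise ValueError("stride must be between 1 and window_size")
    if len(tokens) <= window_size:
        return [tokens]  # short texts need no splitting

    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window_size])
        if start + window_size >= len(tokens):
            break  # the last window reached the end of the text
        start += stride
    return windows


def best_answer(candidates):
    """Pick the (answer, score) pair with the highest score; None means no answer."""
    found = [c for c in candidates if c is not None]
    return max(found, key=lambda c: c[1]) if found else None


tokens = list(range(1000))  # stand-in for 1000 tokens
windows = sliding_windows(tokens)
print([(w[0], w[-1]) for w in windows])

# The same answer may be extracted from two overlapping windows with different scores.
print(best_answer([None, ("Willem-Alexander", 0.4), ("Willem-Alexander", 0.9)]))
```

A stride smaller than the window size means that any span shorter than the overlap is fully contained in at least one window, which is why an answer near a boundary is not lost.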
Model license(s)
The qna-transformers module is compatible with various models, each with their own license. For detailed information, see the license of the model you are using in the Hugging Face Hub.
It is your responsibility to evaluate whether the terms of its license(s), if any, are appropriate for your intended use.
Questions and feedback
If you have any questions or feedback, let us know in the user forum.