In short
- The SpellCheck module is a Weaviate module for spell checking of raw text in GraphQL queries.
- The module depends on a Python spellchecking service.
- The module adds a spellCheck {} filter to the GraphQL nearText {} search arguments.
- The module returns the spelling check result in the GraphQL _additional { spellCheck {} } field.
Introduction
The SpellCheck module is a Weaviate module for checking the spelling of raw text in GraphQL query inputs. Using a Python spellchecker as a service, the module analyzes the text, suggests a correction, and can force auto-correction.
How to enable (module configuration)
Docker-compose
The SpellCheck module can be added as a service to the Docker-compose file. You must have a text vectorizer like text2vec-contextionary or text2vec-transformers running. An example Docker-compose file for using the text-spellcheck module with text2vec-contextionary is here:
---
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:1.9.0
    ports:
    - 8080:8080
    restart: on-failure:0
    environment:
      CONTEXTIONARY_URL: contextionary:9999
      SPELLCHECK_INFERENCE_API: "http://text-spellcheck:8080"
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-contextionary'
      ENABLE_MODULES: 'text2vec-contextionary,text-spellcheck'
  contextionary:
    environment:
      OCCURRENCE_WEIGHT_LINEAR_FACTOR: 0.75
      EXTENSIONS_STORAGE_MODE: weaviate
      EXTENSIONS_STORAGE_ORIGIN: http://weaviate:8080
      NEIGHBOR_OCCURRENCE_IGNORE_PERCENTILE: 5
      ENABLE_COMPOUND_SPLITTING: 'false'
    image: semitechnologies/contextionary:en0.16.0-v1.0.2
    ports:
    - 9999:9999
  text-spellcheck:
    image: semitechnologies/text-spellcheck-model:pyspellchecker-d933122
...
Variable explanations:
- SPELLCHECK_INFERENCE_API: the URL where the spellcheck service is running
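The configuration above only takes effect when the module is both listed in ENABLE_MODULES and pointed at its inference container. As a minimal sketch, the check below takes the environment block of the weaviate service as a plain dict and reports what is missing. The helper name check_spellcheck_env is hypothetical and not part of Weaviate; it only inspects the two settings shown in the example file.

```python
from urllib.parse import urlparse


def check_spellcheck_env(env):
    """Return a list of problems found in a weaviate `environment` mapping.

    Hypothetical helper for illustration only; it looks at ENABLE_MODULES
    and SPELLCHECK_INFERENCE_API and nothing else.
    """
    problems = []

    # ENABLE_MODULES is a comma-separated string of module names.
    enabled = [m.strip() for m in env.get("ENABLE_MODULES", "").split(",") if m.strip()]
    if "text-spellcheck" not in enabled:
        problems.append("text-spellcheck is not listed in ENABLE_MODULES")

    # SPELLCHECK_INFERENCE_API must be a URL with a scheme and a host.
    url = urlparse(env.get("SPELLCHECK_INFERENCE_API", ""))
    if not (url.scheme and url.netloc):
        problems.append("SPELLCHECK_INFERENCE_API is missing or not a full URL")

    return problems


if __name__ == "__main__":
    env = {
        "SPELLCHECK_INFERENCE_API": "http://text-spellcheck:8080",
        "ENABLE_MODULES": "text2vec-contextionary,text-spellcheck",
    }
    print(check_spellcheck_env(env))
```

An empty list means both settings look consistent; it does not prove that the text-spellcheck container is actually reachable.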
How to use (GraphQL)
Use the spellchecker module to verify that user-provided search queries are spelled correctly, and to suggest alternative, correct spellings. This applies to queries in the existing nearText function (given that a text2vec module is used) and in the ask function (if the qna-transformers module is enabled). Spell-checking happens at query time.
There are two ways to use this module:
- It provides a new GraphQL _additional property which can be used to check (but not alter) the provided queries; see the query below.
Example query
{
  Get {
    Article(nearText:{
      concepts: ["houssing prices"]
    }) {
      title
      _additional{
        spellCheck{
          changes{
            corrected
            original
          }
          didYouMean
          location
          originalText
        }
      }
    }
  }
}
# Python client: request the spellCheck result alongside the article title
import weaviate

client = weaviate.Client("http://localhost:8080")

near_text = {
    "concepts": ["houssing prices"],
}

result = (
    client.query
    # The field is named "changes" (plural), matching the GraphQL query above
    .get("Article", ["title", "_additional {spellCheck { changes {corrected original} didYouMean location originalText}}"])
    .with_near_text(near_text)
    .do()
)

print(result)
// JavaScript client: request the spellCheck result alongside the article title
const weaviate = require("weaviate-client");

const client = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
});

client.graphql
  .get()
  .withClassName('Article')
  // The field is named "changes" (plural), matching the GraphQL query above
  .withFields('title _additional {spellCheck { changes {corrected original} didYouMean location originalText}}')
  .withNearText({
    concepts: ["houssing prices"],
  })
  .do()
  .then(res => {
    console.log(res)
  })
  .catch(err => {
    console.error(err)
  });
// Go client: request the spellCheck result alongside the article title
package main

import (
	"context"
	"fmt"

	"github.com/semi-technologies/weaviate-go-client/v4/weaviate"
	"github.com/semi-technologies/weaviate-go-client/v4/weaviate/graphql"
)

func main() {
	cfg := weaviate.Config{
		Host:   "localhost:8080",
		Scheme: "http",
	}
	client := weaviate.New(cfg)

	className := "Article"
	fields := []graphql.Field{
		{Name: "title"},
		{Name: "_additional", Fields: []graphql.Field{
			{Name: "spellCheck", Fields: []graphql.Field{
				// The field is named "changes" (plural), matching the GraphQL query above
				{Name: "changes", Fields: []graphql.Field{
					{Name: "corrected"},
					{Name: "original"},
				}},
				{Name: "didYouMean"},
				{Name: "location"},
				{Name: "originalText"},
			}},
		}},
	}

	concepts := []string{"houssing prices"}
	nearText := client.GraphQL().NearTextArgBuilder().
		WithConcepts(concepts)

	ctx := context.Background()
	result, err := client.GraphQL().Get().
		WithClassName(className).
		WithFields(fields...).
		WithNearText(nearText).
		Do(ctx)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%v", result)
}
// Java client: request the spellCheck result alongside the article title
package technology.semi.weaviate;

import technology.semi.weaviate.client.Config;
import technology.semi.weaviate.client.WeaviateClient;
import technology.semi.weaviate.client.base.Result;
import technology.semi.weaviate.client.v1.graphql.model.GraphQLResponse;
import technology.semi.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import technology.semi.weaviate.client.v1.graphql.query.fields.Field;

public class App {
  public static void main(String[] args) {
    Config config = new Config("http", "localhost:8080");
    WeaviateClient client = new WeaviateClient(config);

    Field title = Field.builder().name("title").build();
    Field _additional = Field.builder()
      .name("_additional")
      .fields(new Field[]{
        Field.builder()
          .name("spellCheck")
          .fields(new Field[]{
            // The field is named "changes" (plural), matching the GraphQL query above
            Field.builder()
              .name("changes")
              .fields(new Field[]{
                Field.builder().name("corrected").build(),
                Field.builder().name("original").build()
              }).build(),
            Field.builder().name("didYouMean").build(),
            Field.builder().name("location").build(),
            Field.builder().name("originalText").build()
          }).build()
      }).build();

    NearTextArgument explore = client.graphQL().arguments().nearTextArgBuilder()
      .concepts(new String[]{ "houssing prices" })
      .build();

    Result<GraphQLResponse> result = client.graphQL().get()
      .withClassName("Article")
      .withFields(title, _additional)
      .withNearText(explore)
      .run();

    if (result.hasErrors()) {
      System.out.println(result.getError());
      return;
    }
    System.out.println(result.getResult());
  }
}
$ echo '{
  "query": "{
    Get {
      Article(nearText:{
        concepts: [\"houssing prices\"]
      }) {
        title
        _additional{
          spellCheck{
            changes{
              corrected
              original
            }
            didYouMean
            location
            originalText
          }
        }
      }
    }
  }"
}' | curl \
    -X POST \
    -H 'Content-Type: application/json' \
    -d @- \
    http://localhost:8080/v1/graphql
GraphQL response
The result is contained in a new GraphQL _additional property called spellCheck. It contains the following fields:
- changes: a list with the following fields:
  - corrected (string): the corrected spelling if a correction is found
  - original (string): the original spelled word in the query
- didYouMean: the corrected full text in the query
- originalText: the original full text in the query
- location: the location of the misspelled string in the query
Example response
{
  "data": {
    "Get": {
      "Article": [
        {
          "_additional": {
            "spellCheck": [
              {
                "changes": [
                  {
                    "corrected": "housing",
                    "original": "houssing"
                  }
                ],
                "didYouMean": "housing prices",
                "location": "nearText.concepts[0]",
                "originalText": "houssing prices"
              }
            ]
          },
          "title": "..."
        }
      ]
    }
  },
  "errors": null
}
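On the client side, the spellCheck entries can be turned into a "did you mean" hint. The following is a minimal sketch that works on a trimmed copy of the example response. The function name spelling_suggestions is hypothetical, and the de-duplication step rests on an assumption: since the entries describe the query rather than the individual object, they may repeat across returned objects.

```python
import json

# Trimmed copy of the example response above.
RESPONSE = json.loads("""
{
  "data": {"Get": {"Article": [{
    "_additional": {"spellCheck": [{
      "changes": [{"corrected": "housing", "original": "houssing"}],
      "didYouMean": "housing prices",
      "location": "nearText.concepts[0]",
      "originalText": "houssing prices"
    }]},
    "title": "..."
  }]}},
  "errors": null
}
""")


def spelling_suggestions(response, class_name):
    """Collect unique (location, originalText, didYouMean) tuples.

    Hypothetical helper: entries without any changes are skipped, and
    repeated entries (assumed possible across objects) are dropped
    while keeping their first-seen order.
    """
    seen = []
    objects = (response.get("data") or {}).get("Get", {}).get(class_name) or []
    for obj in objects:
        checks = (obj.get("_additional") or {}).get("spellCheck") or []
        for check in checks:
            if not check.get("changes"):
                continue  # nothing was corrected for this input
            item = (check["location"], check["originalText"], check["didYouMean"])
            if item not in seen:
                seen.append(item)
    return seen


for location, original, suggestion in spelling_suggestions(RESPONSE, "Article"):
    print(f'{location}: "{original}" -> did you mean "{suggestion}"?')
```

Because this way of using the module only checks the query, the search itself still runs on the original, misspelled text; the hint is something the application can show to the user.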
- It extends existing text2vec modules with an autocorrect flag, which can be used to correct a misspelled query in the background:
Example query
{
  Get {
    Article(nearText:{
      concepts: ["houssing prices"],
      autocorrect: true
    }) {
      title
      _additional{
        spellCheck{
          changes{
            corrected
            original
          }
          didYouMean
          location
          originalText
        }
      }
    }
  }
}
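The same query can be assembled programmatically. The sketch below builds the JSON body for a POST to /v1/graphql with the autocorrect flag set, using only the standard library. The function name build_autocorrect_payload is hypothetical, and the commented-out request assumes a Weaviate instance listening on localhost:8080 as in the examples above.

```python
import json


def build_autocorrect_payload(class_name, concepts, autocorrect=True):
    """Build the JSON request body for a nearText query with autocorrect.

    Hypothetical helper for illustration; it mirrors the example query above.
    """
    # json.dumps yields a double-quoted, escaped list of strings, which is
    # also a valid GraphQL list literal.
    concepts_literal = json.dumps(concepts)
    flag = "true" if autocorrect else "false"
    query = (
        "{ Get { %s(nearText: {concepts: %s, autocorrect: %s}) "
        "{ title _additional { spellCheck { changes { corrected original } "
        "didYouMean location originalText } } } } }"
    ) % (class_name, concepts_literal, flag)
    # The outer json.dumps escapes the inner quotes for the HTTP body.
    return json.dumps({"query": query})


if __name__ == "__main__":
    body = build_autocorrect_payload("Article", ["houssing prices"])
    print(body)
    # To send it (assumes Weaviate is reachable on localhost:8080):
    # import urllib.request
    # req = urllib.request.Request(
    #     "http://localhost:8080/v1/graphql",
    #     data=body.encode("utf-8"),
    #     headers={"Content-Type": "application/json"},
    # )
    # print(urllib.request.urlopen(req).read().decode("utf-8"))
```

Letting json.dumps handle both layers of quoting avoids hand-escaping the concepts inside the query string.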
More resources
If you can’t find the answer to your question here, please look at the following resources:
- Frequently Asked Questions
- Knowledge base of old issues
- For questions: Stack Overflow
- For issues: GitHub
- Ask your question in the Slack channel