    spellcheck



    In short

    • The SpellCheck module is a Weaviate module for spell checking of raw text in GraphQL queries.
    • The module depends on a Python spellchecking service.
    • The module adds a spellCheck {} filter to the GraphQL nearText {} search arguments.
    • The module returns the spelling check result in the GraphQL _additional { spellCheck {} } field.

    Introduction

    The SpellCheck module is a Weaviate module for checking the spelling of raw text in GraphQL query inputs. Using a Python spellchecker as a service, the module analyzes the text, suggests corrections, and can force an auto-correction.

    How to enable (module configuration)

    Docker-compose

    The SpellCheck module can be added as a service to the Docker-compose file. You must have a text vectorizer like text2vec-contextionary or text2vec-transformers running. An example Docker-compose file for using the spellcheck module with text2vec-contextionary is shown here:

    ---
    version: '3.4'
    services:
      weaviate:
        command:
        - --host
        - 0.0.0.0
        - --port
        - '8080'
        - --scheme
        - http
        image: semitechnologies/weaviate:1.15.2
        ports:
        - 8080:8080
        restart: on-failure:0
        environment:
          CONTEXTIONARY_URL: contextionary:9999
          SPELLCHECK_INFERENCE_API: "http://text-spellcheck:8080"
          QUERY_DEFAULTS_LIMIT: 25
          AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
          PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
          DEFAULT_VECTORIZER_MODULE: 'text2vec-contextionary'
          ENABLE_MODULES: 'text2vec-contextionary,text-spellcheck'
          CLUSTER_HOSTNAME: 'node1'
      contextionary:
        environment:
          OCCURRENCE_WEIGHT_LINEAR_FACTOR: 0.75
          EXTENSIONS_STORAGE_MODE: weaviate
          EXTENSIONS_STORAGE_ORIGIN: http://weaviate:8080
          NEIGHBOR_OCCURRENCE_IGNORE_PERCENTILE: 5
          ENABLE_COMPOUND_SPLITTING: 'false'
        image: semitechnologies/contextionary:en0.16.0-v1.0.2
        ports:
        - 9999:9999
      text-spellcheck:
        image: semitechnologies/text-spellcheck-model:pyspellchecker-d933122
    ...
    

    Variable explanations:

    • SPELLCHECK_INFERENCE_API: the URL where the spellcheck service is running (in the example above, the text-spellcheck container on port 8080)
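One way to confirm that the configuration took effect is to look at what Weaviate reports about itself. The sketch below assumes that the /v1/meta endpoint lists enabled modules under a "modules" key, keyed by module name, and that the spellcheck module appears there as text-spellcheck. Both points are assumptions to verify against your own instance, and the sample payload is hypothetical and trimmed for illustration.

```python
def spellcheck_enabled(meta: dict) -> bool:
    """Return True if text-spellcheck is listed in a /v1/meta style payload.

    Assumes enabled modules are reported under the "modules" key.
    """
    modules = meta.get("modules") or {}
    return "text-spellcheck" in modules


# Hypothetical, trimmed payload; a real one would come from
# GET http://localhost:8080/v1/meta
sample_meta = {
    "hostname": "http://[::]:8080",
    "modules": {"text2vec-contextionary": {}, "text-spellcheck": {}},
    "version": "1.15.2",
}

print(spellcheck_enabled(sample_meta))
```

If the check fails, ENABLE_MODULES and SPELLCHECK_INFERENCE_API in the Docker-compose file are the first settings to review.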

    How to use (GraphQL)

    Use the spellcheck module to verify that user-provided search queries are spelled correctly, and to suggest alternative, correct spellings. It applies to the existing nearText function (when a text2vec module is used) and to the ask function (when the qna-transformers module is enabled). Spell-checking happens at query time.

    There are two ways to use this module:

    1. It provides a new GraphQL _additional property that can be used to check (but not alter) the provided queries; see the query below.

    Example query

    {
      Get {
        Article(nearText:{
          concepts: ["houssing prices"]
        }) {
          title
          _additional{
            spellCheck{
              changes{
                corrected
                original
              }
              didYouMean
              location
              originalText
            }
          }
        }
      }
    }
    
    import weaviate
    
    client = weaviate.Client("http://localhost:8080")
    
    near_text = {
      "concepts": ["houssing prices"],
    }
    
    result = (
      client.query
      .get("Article", ["title", "_additional {spellCheck { changes {corrected original} didYouMean location originalText}}"])
      .with_near_text(near_text)
      .do()
    )
    
    print(result)
    
    const weaviate = require("weaviate-client");
    
    const client = weaviate.client({
      scheme: 'http',
      host: 'localhost:8080',
    });
    
    client.graphql
          .get()
          .withClassName('Article')
          .withFields('title _additional {spellCheck { changes {corrected original} didYouMean location originalText}}')
          .withNearText({
            concepts: ["houssing prices"],
          })
          .do()
          .then(res => {
            console.log(res)
          })
          .catch(err => {
            console.error(err)
          });
    
    package main
    
    import (
      "context"
      "fmt"
    
      "github.com/semi-technologies/weaviate-go-client/v4/weaviate"
      "github.com/semi-technologies/weaviate-go-client/v4/weaviate/graphql"
    )
    
    func main() {
      cfg := weaviate.Config{
        Host:   "localhost:8080",
        Scheme: "http",
      }
      client := weaviate.New(cfg)
    
      className := "Article"
      fields := []graphql.Field{
        {Name: "title"},
        {Name: "_additional", Fields: []graphql.Field{
          {Name: "spellCheck", Fields: []graphql.Field{
            {Name: "changes", Fields: []graphql.Field{
              {Name: "corrected"},
              {Name: "original"},
            }},
            {Name: "didYouMean"},
            {Name: "location"},
            {Name: "originalText"},
          }},
        }},
      }
      concepts := []string{"houssing prices"}
      nearText := client.GraphQL().NearTextArgBuilder().
        WithConcepts(concepts)
    
      ctx := context.Background()
      result, err := client.GraphQL().Get().
        WithClassName(className).
        WithFields(fields...).
        WithNearText(nearText).
        Do(ctx)
    
      if err != nil {
        panic(err)
      }
      fmt.Printf("%v", result)
    }
    
    package technology.semi.weaviate;
    
    import technology.semi.weaviate.client.Config;
    import technology.semi.weaviate.client.WeaviateClient;
    import technology.semi.weaviate.client.base.Result;
    import technology.semi.weaviate.client.v1.graphql.model.GraphQLResponse;
    import technology.semi.weaviate.client.v1.graphql.query.argument.NearTextArgument;
    import technology.semi.weaviate.client.v1.graphql.query.fields.Field;
    
    public class App {
      public static void main(String[] args) {
        Config config = new Config("http", "localhost:8080");
        WeaviateClient client = new WeaviateClient(config);
    
        Field title = Field.builder().name("title").build();
        Field _additional = Field.builder()
          .name("_additional")
          .fields(new Field[]{
            Field.builder()
              .name("spellCheck")
              .fields(new Field[]{
                Field.builder()
                  .name("changes")
                  .fields(new Field[]{
                    Field.builder().name("corrected").build(),
                    Field.builder().name("original").build()
                  }).build(),
                Field.builder().name("didYouMean").build(),
                Field.builder().name("location").build(),
                Field.builder().name("originalText").build()
              }).build()
          }).build();
    
        NearTextArgument explore = client.graphQL().arguments().nearTextArgBuilder()
          .concepts(new String[]{ "houssing prices" })
          .build();
    
        Result<GraphQLResponse> result = client.graphQL().get()
          .withClassName("Article")
          .withFields(title, _additional)
          .withNearText(explore)
          .run();
    
        if (result.hasErrors()) {
          System.out.println(result.getError());
          return;
        }
        System.out.println(result.getResult());
      }
    }
    
    $ echo '{
      "query": "{ Get { Article(nearText: { concepts: [\"houssing prices\"] }) { title _additional { spellCheck { changes { corrected original } didYouMean location originalText } } } } }"
    }' | curl \
        -X POST \
        -H 'Content-Type: application/json' \
        -d @- \
        http://localhost:8080/v1/graphql
    

    GraphQL response

    The result is contained in a new GraphQL _additional property called spellCheck. It contains the following fields:

    • changes: a list of corrections, each with the following fields:
      • corrected (string): the corrected spelling, if a correction is found
      • original (string): the word as originally spelled in the query
    • didYouMean: the full query text with corrections applied
    • originalText: the full query text as originally provided
    • location: the location of the misspelled string in the query (for example, nearText.concepts[0])

    Example response

    {
      "data": {
        "Get": {
          "Article": [
            {
              "_additional": {
                "spellCheck": [
                  {
                    "changes": [
                      {
                        "corrected": "housing",
                        "original": "houssing"
                      }
                    ],
                    "didYouMean": "housing prices",
                    "location": "nearText.concepts[0]",
                    "originalText": "houssing prices"
                  }
                ]
              },
              "title": "..."
            }
          ]
        }
      },
      "errors": null
    }
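To show how an application might consume this result, here is a minimal sketch that walks a response shaped like the example above and collects the suggested corrections. The function name extract_corrections is hypothetical and not part of any Weaviate client. The code assumes only the field layout shown in the example response.

```python
from typing import List, Tuple


def extract_corrections(response: dict, class_name: str) -> List[Tuple[str, str, str]]:
    """Collect unique (location, original, corrected) triples from a Get response."""
    found: List[Tuple[str, str, str]] = []
    objects = ((response.get("data") or {}).get("Get") or {}).get(class_name) or []
    for obj in objects:
        checks = (obj.get("_additional") or {}).get("spellCheck") or []
        for check in checks:
            for change in check.get("changes") or []:
                triple = (check.get("location"), change["original"], change["corrected"])
                # Every returned object repeats the same spellCheck data,
                # so keep each correction only once.
                if triple not in found:
                    found.append(triple)
    return found


# The example response from this page, reduced to the relevant fields.
example_response = {
    "data": {"Get": {"Article": [{
        "_additional": {"spellCheck": [{
            "changes": [{"corrected": "housing", "original": "houssing"}],
            "didYouMean": "housing prices",
            "location": "nearText.concepts[0]",
            "originalText": "houssing prices",
        }]},
        "title": "...",
    }]}},
    "errors": None,
}

for location, original, corrected in extract_corrections(example_response, "Article"):
    print(f"{location}: {original} -> {corrected}")
```

A search UI could use such a list to render a "did you mean" hint without altering the query that was actually executed.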
    
    2. It extends existing text2vec modules with an autocorrect flag, which can be used to correct a misspelled query in the background:

    Example query

    {
      Get {
        Article(nearText:{
          concepts: ["houssing prices"],
          autocorrect: true
        }) {
          title
          _additional{
            spellCheck{
              changes{
                corrected
                original
              }
              didYouMean
              location
              originalText
            }
          }
        }
      }
    }
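The same query can be assembled programmatically. The sketch below builds the GraphQL string with the autocorrect flag from the example above and wraps it in the JSON body expected by the /v1/graphql endpoint. The helper name build_near_text_query is hypothetical. It uses only the Python standard library and does not send the request.

```python
import json


def build_near_text_query(class_name, concepts, autocorrect=False):
    """Build a nearText Get query that also requests the spellCheck result."""
    # json.dumps yields a double-quoted, escaped list such as ["houssing prices"].
    concept_list = json.dumps(list(concepts))
    flag = "true" if autocorrect else "false"
    return (
        "{ Get { %s(nearText: { concepts: %s, autocorrect: %s }) "
        "{ title _additional { spellCheck { changes { corrected original } "
        "didYouMean location originalText } } } } }"
        % (class_name, concept_list, flag)
    )


query = build_near_text_query("Article", ["houssing prices"], autocorrect=True)

# JSON body for a POST to http://localhost:8080/v1/graphql
payload = json.dumps({"query": query})
print(payload)
```

Letting json.dumps handle the quoting avoids hand-escaping the concept strings, which is easy to get wrong when the query is embedded in a JSON body.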
    


    More resources

    If you can’t find the answer to your question here, please try one of the following:

    1. The Frequently Asked Questions.
    2. The knowledge base of old issues.
    3. For questions: Stackoverflow.
    4. For issues: Github.
    5. The Slack channel, where you can ask your question directly.