Modules


💡 You are looking at older or release candidate documentation. The current Weaviate version is v1.15.2


Introduction

Weaviate is completely modularized. The core of Weaviate, without any modules attached, is a pure vector-native database and search engine. Data is stored as vectors, and these vectors are searchable using the provided vector index algorithm. Without any modules attached, Weaviate does not know how to vectorize data, i.e. how to calculate a vector from a data item. Depending on the type of data you want to store and search (text, images, etc.) and on your use case (search, question answering, classification, etc., which in turn depends on factors such as language, ML model, and training set), you can choose and attach the module that fits best.

Characteristics

Modules can be “vectorizers” (which define how the numbers in the vectors are derived from the data) or other modules that provide additional functionality, such as question answering or custom classification. Modules have the following characteristics:

  • Naming convention:
    • Vectorizer: <media>2vec-<name>-<optional>, for example text2vec-contextionary, image2vec-RESNET or text2vec-BERT-transformer-hyperfoo.
    • Other modules: <functionality>-<name>-<optional>.
    • A module name must be URL-safe, meaning it must not contain any characters that would require URL-encoding.
    • A module name is not case-sensitive: text2vec-bert is the same module as text2vec-BERT.
  • Module information is accessible through the v1/modules/<module-name>/<module-specific-endpoint> RESTful endpoint.
  • General module information (which modules are attached, version, etc.) is accessible through Weaviate’s v1/meta endpoint.
  • Modules can add additional properties in the RESTful API and _additional properties in the GraphQL API.
  • A module can add filters in GraphQL queries.
  • Which vectorizer and other modules are applied to which data classes is configured in the schema.
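
The naming rules and the module endpoint pattern listed above can be sketched in a few lines of Python. This is an illustrative sketch, not part of Weaviate: the helper names (normalize_module_name, is_vectorizer, module_endpoint) are hypothetical, "URL-safe" is assumed to mean the RFC 3986 unreserved characters, and the base URL and endpoint segment in the usage note are placeholders.

```python
import re

# Assumption: "URL-safe" is interpreted as RFC 3986 unreserved characters only.
_URL_SAFE = re.compile(r"[A-Za-z0-9._~-]+")


def normalize_module_name(name: str) -> str:
    """Return a lowercase canonical form of a module name.

    Module names are not case-sensitive, so lowercasing gives one
    comparable spelling. Raises ValueError if the name would need
    URL-encoding.
    """
    if not _URL_SAFE.fullmatch(name):
        raise ValueError(f"module name is not URL-safe: {name!r}")
    return name.lower()


def is_vectorizer(name: str) -> bool:
    """True if the name follows <media>2vec-<name>[-<optional>]."""
    media, sep, rest = normalize_module_name(name).partition("2vec-")
    return bool(media) and bool(sep) and bool(rest) and "-" not in media


def module_endpoint(base_url: str, name: str, endpoint: str) -> str:
    """Build a v1/modules/<module-name>/<module-specific-endpoint> URL."""
    module = normalize_module_name(name)
    return f"{base_url.rstrip('/')}/v1/modules/{module}/{endpoint.lstrip('/')}"
```

For example, normalize_module_name("text2vec-BERT") and normalize_module_name("text2vec-bert") yield the same string, and module_endpoint("http://localhost:8080", "text2vec-contextionary", "some-endpoint") builds a URL under v1/modules/text2vec-contextionary/. The host and "some-endpoint" are placeholders; the endpoints that actually exist depend on the module.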

Text vectorizer Contextionary

One of the provided vectorizers is text2vec-contextionary, a text vectorizer that gives context to textual data using a language model trained with fastText on Wiki data and CommonCrawl. More information can be found here.
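
As a rough illustration of how a vectorizer is assigned to a data class in the schema, the sketch below builds a class definition that names text2vec-contextionary as its vectorizer. The class name, property names and the make_class helper are hypothetical, and the payload shape ("class", "vectorizer", "properties", "dataType") is an assumption based on the v1 schema format; check the schema documentation for your version before relying on it.

```python
import json


def make_class(class_name: str, vectorizer: str, properties: dict) -> dict:
    """Build a schema class definition as a plain dict.

    `properties` maps a property name to a single data type name.
    The key names used here are assumed, not taken from this page.
    """
    return {
        "class": class_name,
        "vectorizer": vectorizer,
        "properties": [
            {"name": name, "dataType": [data_type]}
            for name, data_type in properties.items()
        ],
    }


# Hypothetical example class: an Article with a title and a body.
article = make_class(
    "Article",
    "text2vec-contextionary",
    {"title": "string", "body": "text"},
)

# Serialize to JSON, e.g. to send as the body of a schema request.
payload = json.dumps(article, indent=2)
print(payload)
```

Building the definition as a plain dict keeps the example independent of any client library; the resulting JSON could be sent with whichever HTTP or Weaviate client you already use.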

Custom modules

Custom modules will soon be supported; more information can be found here. Stay tuned!

More Resources

If you can’t find the answer to your question here, please try one of the following:

  1. Frequently Asked Questions
  2. Knowledge base of old issues
  3. For questions: Stack Overflow
  4. For issues: GitHub
  5. Ask your question in the Slack channel: Slack