
Reference - Modules

This section describes Weaviate's individual modules, including their capabilities and how to use them.

Looking for vectorizer, generative AI, or reranker integration docs?

They have moved to our model provider integrations section, for a more focussed, user-centric look at these integrations.

General

Weaviate's modules are built into the codebase and are enabled through environment variables to provide additional functionality.
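As a rough illustration, the sketch below composes a comma-separated module list of the kind passed to Weaviate through its `ENABLE_MODULES` environment variable. The helper name `enable_modules_value` and the selection of modules are assumptions made for this example. They are not part of Weaviate.

```python
def enable_modules_value(modules):
    """Build a comma-separated module list for the ENABLE_MODULES variable.

    Names are lowercased (module names are not case-sensitive) and
    duplicates are dropped while keeping first-seen order.
    """
    seen = []
    for name in modules:
        normalized = name.strip().lower()
        if normalized and normalized not in seen:
            seen.append(normalized)
    return ",".join(seen)


# Hypothetical selection of modules named elsewhere on this page.
value = enable_modules_value(
    ["text2vec-transformers", "qna-transformers", "QnA-Transformers"]
)
print(f"ENABLE_MODULES={value}")
```

The resulting string can be set in the environment of the Weaviate process, for example in a container definition.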

Module types

Weaviate modules can be divided into the following categories:

  • Vectorizers: Convert data into vector embeddings for import and vector search.
  • Rerankers: Improve search results by reordering initial search results.
  • Generative AI: Integrate generative AI models for retrieval augmented generation (RAG).
  • Backup: Facilitate backup and restore operations in Weaviate.
  • Offloading: Facilitate offloading of tenant data to external storage.
  • Others: Modules that provide additional functionalities.

Vectorizer, reranker, and generative AI integrations

For these modules, see the model provider integrations documentation. These pages are organized by the model provider (e.g. Hugging Face, OpenAI) and then the model type (e.g. vectorizer, reranker, generative AI).

Module characteristics

  • Naming convention:
    • Vectorizer (Retriever module): <media>2vec-<name>-<optional>, for example text2vec-contextionary, img2vec-neural or text2vec-transformers.
    • Other modules: <functionality>-<name>-<optional>, for example qna-transformers.
    • A module name must be URL-safe, meaning it must not contain any characters that would require URL-encoding.
    • A module name is not case-sensitive. text2vec-bert would be the same module as text2vec-BERT.
  • Module information is accessible through the v1/modules/<module-name>/<module-specific-endpoint> RESTful endpoint.
  • General module information (which modules are attached, version, etc.) is accessible through Weaviate's v1/meta endpoint.
  • Modules can add additional properties in the RESTful API and _additional properties in the GraphQL API.
  • A module can add filters in GraphQL queries.
  • The schema configures which vectorizer and other modules are applied to each data collection.
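The naming rules and endpoint layout above can be sketched in a few lines of Python. This is a minimal illustration, not Weaviate's own validation logic: the helper names (`is_url_safe`, `same_module`, `module_endpoint`) and the endpoint segment `some-endpoint` are hypothetical, and URL-safety is approximated by checking that percent-encoding leaves the name unchanged.

```python
from urllib.parse import quote


def is_url_safe(name):
    """Return True if the name is non-empty and needs no URL-encoding."""
    return bool(name) and quote(name, safe="") == name


def same_module(a, b):
    """Module names are not case-sensitive, so compare them lowercased."""
    return a.lower() == b.lower()


def module_endpoint(module_name, endpoint):
    """Build the relative path v1/modules/<module-name>/<endpoint>."""
    if not is_url_safe(module_name):
        raise ValueError(f"module name is not URL-safe: {module_name!r}")
    return f"v1/modules/{module_name}/{endpoint}"


print(is_url_safe("text2vec-contextionary"))
print(is_url_safe("text2vec bert"))
print(same_module("text2vec-bert", "text2vec-BERT"))
# "some-endpoint" is a placeholder for a module-specific endpoint.
print(module_endpoint("qna-transformers", "some-endpoint"))
```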

Backup Modules

Backup and restore operations in Weaviate are facilitated by the use of backup provider modules.

These are interchangeable storage backends which exist either internally or externally.

External provider

External backup providers coordinate the storage and retrieval of backed-up Weaviate data with external storage services.

This type of provider is ideal for production environments. This is because storing the backup data outside of a Weaviate instance decouples the availability of the backup from the Weaviate instance itself. In the event of an unreachable node, the backup is still available.

Additionally, multi-node Weaviate clusters require the use of an external provider. Storing a multi-node backup internally on a single node presents several issues, such as significantly reducing the durability and availability of the backup, and is not supported.

The supported external backup providers are:

Thanks to the extensibility of the module system, new providers can be readily added. If you are interested in an external provider other than the ones listed above, feel free to reach out via our forum, or open an issue on GitHub.

Internal provider

Internal providers coordinate the storage and retrieval of backed-up Weaviate data within a Weaviate instance. This type of provider is intended for developmental or experimental use, and is not recommended for production. Internal providers are not compatible with multi-node backups, which require the use of an external provider.

As of Weaviate v1.16, the only supported internal backup provider is the filesystem provider.
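The guidance on choosing between internal and external providers can be summarized as a small decision rule. The sketch below is only an illustration of that rule. The function `backup_provider_kind` and its parameters are hypothetical and do not correspond to any Weaviate API.

```python
def backup_provider_kind(node_count, production=False):
    """Suggest "external" or "internal" for a backup provider.

    Multi-node clusters require an external provider. On a single
    node, an internal provider is acceptable for development or
    experiments, while production setups should use an external one.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if node_count > 1:
        return "external"  # internal providers do not support multi-node backups
    return "external" if production else "internal"


print(backup_provider_kind(3))
print(backup_provider_kind(1, production=True))
print(backup_provider_kind(1))
```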

Offloading Modules

Added in v1.26

Offloading modules facilitate the offloading of tenant data to external storage. This is useful for managing resources and costs.

See the "How to configure: offloading" page for more information on how to configure and use offloading modules.

Other modules

In addition to the above, there are other modules such as:

  • qna-transformers: Question-answering (answer extraction) capability using transformers models.
  • qna-openai: Question-answering (answer extraction) capability using OpenAI models.
  • ner-transformers: Named entity recognition capability using transformers models.
  • text-spellcheck: Spell checking capability for GraphQL queries.
  • sum-transformers: Text summarization capability using transformers models.
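To give a sense of how such a module surfaces in queries, the sketch below builds a GraphQL query string that uses the `ask` search operator and the `answer` field under `_additional`, which question-answering modules such as qna-transformers add. Treat it as an assumption-laden example: the helper `build_ask_query`, the collection name `Article`, and the property `title` are hypothetical, and the exact set of available `answer` fields should be checked against the module's own documentation.

```python
import json


def build_ask_query(collection, question, return_property):
    """Build a GraphQL Get query that uses the `ask` operator.

    json.dumps is used to quote and escape the question so that it
    forms a valid GraphQL string literal.
    """
    question_literal = json.dumps(question)
    return (
        "{ Get { "
        f"{collection}(ask: {{question: {question_literal}}}, limit: 1) "
        f"{{ {return_property} _additional {{ answer {{ hasAnswer result }} }} }} "
        "} }"
    )


# "Article" and "title" are placeholder names for this example.
print(build_ask_query("Article", 'Who wrote "Dune"?', "title"))
```

The resulting string can be sent as the `query` value of a GraphQL request to a Weaviate instance that has a question-answering module enabled.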

Other third-party integrations

Weaviate integrates with third-party systems that provide a wide range of tools and services. For information on particular systems, see Integrations.

Questions and feedback

If you have any questions or feedback, let us know in the user forum.