
Model provider integrations

Weaviate integrates with a variety of self-hosted and API-based models from a range of providers.

This enables an enhanced developer experience, such as the ability to:

  • Import objects directly into Weaviate without having to manually specify embeddings, and
  • Build an integrated retrieval augmented generation (RAG) pipeline with generative AI models.

Model provider integrations

API-based

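| Model provider      | Embeddings       | Generative AI | Others   |
|---------------------|------------------|---------------|----------|
| Anthropic           | -                | Text          | -        |
| Anyscale            | -                | Text          | -        |
| AWS                 | Text             | Text          | -        |
| Cohere              | Text             | Text          | Reranker |
| Databricks          | Text             | Text          | -        |
| FriendliAI          | -                | Text          | -        |
| Google              | Text, Multimodal | Text          | -        |
| Hugging Face        | Text             | -             | -        |
| Jina AI             | Text             | -             | Reranker |
| Mistral             | Text             | Text          | -        |
| OctoAI (Deprecated) | Text             | Text          | -        |
| OpenAI              | Text             | Text          | -        |
| Azure OpenAI        | Text             | Text          | -        |
| Voyage AI           | Text             | -             | Reranker |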
Voyage AIText-Reranker

Enable all API-based modules

Experimental feature

Available starting in v1.26.0. This is an experimental feature. Use with caution.

You can enable all API-based integrations at once by setting the ENABLE_API_BASED_MODULES environment variable to true.

This makes all API-based model integrations available for use, such as those for Anthropic, Cohere, OpenAI, and so on. These modules are lightweight, so enabling them all will not significantly increase resource usage.

Read more about enabling all API-based modules.
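One way to confirm which integrations are active after setting the variable is to inspect the instance metadata. The following is a minimal sketch that reads the modules section of a /v1/meta response. The base URL http://localhost:8080 and the helper names enabled_modules and fetch_meta are assumptions for illustration, not part of the Weaviate documentation.

```python
import json
from urllib.request import urlopen


def enabled_modules(meta: dict) -> list[str]:
    """Return the sorted module names listed in a /v1/meta response body."""
    return sorted(meta.get("modules") or {})


def fetch_meta(base_url: str = "http://localhost:8080") -> dict:
    """Fetch instance metadata. The default base_url is an assumed local address."""
    with urlopen(f"{base_url}/v1/meta", timeout=10) as response:
        return json.load(response)
```

Calling enabled_modules(fetch_meta()) against a running instance should list the API-based modules, for example names such as text2vec-openai or generative-cohere, if the setting has taken effect.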

Locally hosted

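| Model provider | Embeddings              | Generative AI | Others   |
|----------------|-------------------------|---------------|----------|
| GPT4All        | Text                    | -             | -        |
| Hugging Face   | Text, Multimodal (CLIP) | -             | Reranker |
| Meta ImageBind | Multimodal              | -             | -        |
| Ollama         | Text                    | Text          | -        |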

How does Weaviate generate embeddings?

When a model provider integration for embeddings is enabled, Weaviate automatically generates embeddings for objects that are added to the database.

This is done by providing the source data to the integration provider, which then returns the embeddings to Weaviate. The embeddings are then stored in the Weaviate database.
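The round trip described above can be illustrated with a small in-memory sketch. This is not Weaviate's implementation: import_object, fake_provider, and the list used as a store are hypothetical stand-ins, and the real integration calls an external model provider rather than a local function.

```python
from typing import Callable

Vector = list[float]


def import_object(
    properties: dict[str, str],
    embed: Callable[[str], Vector],
    store: list[dict],
) -> dict:
    """Send the object's text to the provider and store the returned vector."""
    source_text = " ".join(properties.values())
    vector = embed(source_text)  # stands in for the call to the model provider
    record = {"properties": properties, "vector": vector}
    store.append(record)  # stands in for persisting to the database
    return record


def fake_provider(text: str) -> Vector:
    """Hypothetical embedding function: [character count, word count]."""
    return [float(len(text)), float(len(text.split()))]
```

The caller never supplies a vector. It only passes the object's properties, and the embedding is attached on the way into the store.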

Weaviate generates embeddings for objects as follows:

  • Selects properties with text or text[] data types unless they are configured to be skipped
  • Sorts properties in alphabetical (a-z) order before concatenating values
  • Prepends the collection name if configured
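The steps above can be sketched as a small function. This is an illustrative approximation only: the function name build_vectorization_input, its parameters, and the choice of a single space as the separator are assumptions, and the exact string Weaviate sends to a provider may differ.

```python
def build_vectorization_input(
    collection_name: str,
    properties: dict[str, object],
    skipped: frozenset[str] = frozenset(),
    prepend_collection_name: bool = False,
) -> str:
    """Approximate the text that is sent to the embedding provider."""
    parts: list[str] = []
    if prepend_collection_name:
        parts.append(collection_name)
    # Properties are visited in alphabetical (a-z) order of their names.
    for name in sorted(properties):
        if name in skipped:
            continue
        value = properties[name]
        if isinstance(value, str):  # text
            parts.append(value)
        elif isinstance(value, list) and all(isinstance(v, str) for v in value):
            parts.extend(value)  # text[]
        # Any other data type (numbers, booleans, ...) is ignored.
    # Assumption: values are joined with a single space.
    return " ".join(parts)
```

For example, an object with the properties title="Hello", body="World", and views=3 in a collection named Article would yield "Article World Hello" under these assumptions when the collection name is prepended, because body sorts before title and views is not a text property.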

Case sensitivity

For Weaviate versions before v1.27, the string created above is lowercased before being sent to the model provider. Starting in v1.27, the string is sent as is.

If you prefer the text to be lowercased, you can do so by setting the LOWERCASE_VECTORIZATION_INPUT environment variable. The text is always lowercased for the text2vec-contextionary integration.
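The lowercasing rules can be summarized as a decision function. This is a sketch of the behavior as described here, not Weaviate's code: the name should_lowercase is hypothetical, and treating the value "true" as the way to enable LOWERCASE_VECTORIZATION_INPUT is an assumption.

```python
from collections.abc import Mapping


def should_lowercase(
    version: tuple[int, int],
    module: str,
    env: Mapping[str, str],
) -> bool:
    """Decide whether the vectorization input is lowercased."""
    # text2vec-contextionary always lowercases its input.
    if module == "text2vec-contextionary":
        return True
    # Versions before v1.27 lowercase the string before sending it.
    if version < (1, 27):
        return True
    # From v1.27 on, the string is sent as is unless the variable is set.
    # Assumption: the variable is enabled with the value "true".
    return env.get("LOWERCASE_VECTORIZATION_INPUT", "").lower() == "true"
```

Passing os.environ as env checks the current process environment, while a plain dict makes the function easy to exercise in isolation.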

Questions and feedback

If you have any questions or feedback, let us know in the user forum.