Model provider integrations

Weaviate integrates with a variety of self-hosted and API-based models from a range of providers.

This enables an enhanced developer experience, such as the ability to:

Import objects directly into Weaviate without having to manually specify embeddings, and
Build an integrated retrieval augmented generation (RAG) pipeline with generative AI models.

Model provider integrations

API-based

Model provider	Embeddings	Generative AI	Others
Anthropic	-	Text	-
Anyscale	-	Text	-
AWS	Text	Text	-
Cohere	Text, Multimodal	Text	Reranker
Databricks	Text	Text	-
FriendliAI	-	Text	-
Google	Text, Multimodal	Text	-
Hugging Face	Text	-	-
Jina AI	Text, Multimodal	-	Reranker
Mistral	Text	Text	-
NVIDIA	Text, Multimodal	Text	Reranker
OctoAI (Deprecated)	Text	Text	-
OpenAI	Text	Text	-
Azure OpenAI	Text	Text	-
Voyage AI	Text, Multimodal	-	Reranker
Weaviate	Text	-	-
xAI	-	Text	-

Enable all API-based modules

Experimental feature

Available starting in v1.26.0. This is an experimental feature. Use with caution.

You can enable all API-based integrations at once by setting the ENABLE_API_BASED_MODULES environment variable to true.

This makes all API-based model integrations available for use, such as those for Anthropic, Cohere, OpenAI, and so on. These modules are lightweight, so enabling them all will not significantly increase resource usage.

Read more about enabling all API-based modules.
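
As an illustration of the setting described above, the following sketch builds a docker run command that passes ENABLE_API_BASED_MODULES=true to a Weaviate container. The helper name weaviate_docker_command, the image tag, and the published ports are assumptions made for this example, not a recommended deployment; adapt them to your own setup.

```python
import shlex


def weaviate_docker_command(
    image="cr.weaviate.io/semitechnologies/weaviate:1.26.0",  # assumed image tag
    extra_env=None,
):
    """Return a `docker run` argument list that enables all API-based modules.

    Hypothetical helper for illustration only.
    """
    env = {"ENABLE_API_BASED_MODULES": "true"}
    env.update(extra_env or {})

    # Assumed ports: 8080 for HTTP and 50051 for gRPC.
    cmd = ["docker", "run", "-p", "8080:8080", "-p", "50051:50051"]
    for key, value in sorted(env.items()):
        cmd += ["-e", f"{key}={value}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(weaviate_docker_command()))
```

The same variable can be set in any other deployment method that lets you pass environment variables to the Weaviate process.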

Locally hosted

Model provider	Embeddings	Generative AI	Others
GPT4All	Text	-	-
Hugging Face	Text, Multimodal (CLIP)	-	Reranker
Meta ImageBind	Multimodal	-	-
Ollama	Text	Text	-

How does Weaviate generate embeddings?

When a model provider integration for embeddings is enabled, Weaviate automatically generates embeddings for objects that are added to the database.

To do this, Weaviate sends the source data to the integration provider, which returns the embeddings. Weaviate then stores the embeddings in the database.

Weaviate generates embeddings for objects as follows:

Selects properties with text or text[] data types unless they are configured to be skipped
Sorts properties in alphabetical (a-z) order before concatenating values
Prepends the collection name if configured
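
The steps above can be sketched in a few lines of Python. This is a simplified illustration, not Weaviate's actual implementation: the function name build_vectorization_input, its parameters, and the choice to join values with a single space are assumptions, and Python's default string sort stands in for the alphabetical (a-z) ordering.

```python
def build_vectorization_input(
    collection_name,
    properties,
    skipped=(),
    vectorize_collection_name=False,
):
    """Build the text that would be sent to the model provider.

    Hypothetical helper. `properties` maps property names to values; only
    str (text) and list-of-str (text[]) values are used.
    """
    parts = []

    # Prepend the collection name if configured.
    if vectorize_collection_name:
        parts.append(collection_name)

    # Visit properties in alphabetical order of their names.
    for name in sorted(properties):
        if name in skipped:
            continue  # property configured to be skipped
        value = properties[name]
        if isinstance(value, str):
            parts.append(value)
        elif isinstance(value, list) and all(isinstance(v, str) for v in value):
            parts.extend(value)
        # Any other data type (numbers, booleans, ...) is ignored.

    # Assumption: values are joined with a single space.
    return " ".join(parts)


article = {
    "title": "Vector search",
    "body": "Embeddings map text to vectors.",
    "tags": ["search", "ml"],
    "views": 120,
}
print(build_vectorization_input("Article", article, skipped={"tags"},
                                vectorize_collection_name=True))
# prints: Article Embeddings map text to vectors. Vector search
```

In this example, body comes before title because of the alphabetical ordering, tags is skipped by configuration, and views is ignored because it is not a text property.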

Case sensitivity

For Weaviate versions before v1.27, the string created above is lowercased before being sent to the model provider. Starting in v1.27, the string is sent as is.

If you prefer the text to be lowercased, set the LOWERCASE_VECTORIZATION_INPUT environment variable. The text is always lowercased for the text2vec-contextionary integration.
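
The case-handling rules in this section can be summarized as a small decision function. This is a sketch for reasoning about the behavior, not Weaviate code: the function name apply_case_policy and its parameters are hypothetical, and treating the value "true" as the way to switch on LOWERCASE_VECTORIZATION_INPUT is an assumption.

```python
import os


def apply_case_policy(text, weaviate_version, module, env=None):
    """Return the text as it would be sent to the model provider.

    Hypothetical helper. `weaviate_version` is a (major, minor) tuple,
    for example (1, 26). `env` defaults to the process environment.
    """
    env = os.environ if env is None else env

    # text2vec-contextionary always lowercases its input.
    if module == "text2vec-contextionary":
        return text.lower()

    # Before v1.27 the string is always lowercased.
    if tuple(weaviate_version[:2]) < (1, 27):
        return text.lower()

    # From v1.27 the string is sent as is, unless lowercasing is requested.
    # Assumption: the variable is enabled by the value "true".
    if env.get("LOWERCASE_VECTORIZATION_INPUT", "").lower() == "true":
        return text.lower()
    return text


print(apply_case_policy("Hello World", (1, 26), "text2vec-openai", env={}))
# prints: hello world
print(apply_case_policy("Hello World", (1, 27), "text2vec-openai", env={}))
# prints: Hello World
```

Because many embedding models are case sensitive, the same object can produce different vectors before and after v1.27 unless lowercasing is kept consistent.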

Questions and feedback

If you have any questions or feedback, let us know in the user forum.