The text2vec-cohere module lets you use Cohere embeddings directly in the Weaviate vector search engine as a vectorization module. When you create a Weaviate class that is set to use this module, Weaviate automatically vectorizes your data using Cohere's models.

  • Note: this module uses a third-party API and may incur costs.
  • Note: make sure to check the Cohere pricing page before vectorizing large amounts of data.
  • Note: Weaviate automatically parallelizes requests to the Cohere API when you use the batch endpoint, so keep the previous note on pricing in mind.

How to enable

Request a Cohere API key via the Cohere dashboard.

Weaviate Cloud Service

This module is enabled by default on the Weaviate Cloud Service (WCS).

Weaviate open source

The example Docker Compose file below spins up Weaviate with the Cohere module.

version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.17.2
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      ENABLE_MODULES: text2vec-cohere
      COHERE_APIKEY: sk-foobar # optional; request a key from Cohere, or provide the API key at runtime instead
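
Once the container is up, it can be useful to confirm that the module was actually loaded. The sketch below assumes Weaviate is reachable at http://localhost:8080 (the port mapped in the Compose file) and that the `/v1/meta` endpoint reports enabled modules under a `modules` key. The helper names `fetch_meta` and `module_enabled` are made up for this example.

```python
import json
from urllib.request import urlopen


def fetch_meta(base_url: str = "http://localhost:8080") -> dict:
    """Fetch instance metadata from a running Weaviate (assumed endpoint: /v1/meta)."""
    with urlopen(f"{base_url}/v1/meta", timeout=10) as response:
        return json.load(response)


def module_enabled(meta: dict, name: str = "text2vec-cohere") -> bool:
    """Return True if `name` appears among the modules listed in the metadata."""
    # Assumption: enabled modules are keys of the "modules" object.
    return name in (meta.get("modules") or {})


# Usage against a running instance (not executed here):
#   module_enabled(fetch_meta())
```

Keeping the check separate from the HTTP call makes it easy to try the logic on a saved metadata response without a running instance.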

How to configure

In your Weaviate schema, you must define how you want this module to vectorize your data. If you are new to Weaviate schemas, you might want to check out the getting started guide on the Weaviate schema first.

The following schema configuration tells Weaviate to vectorize the Document class with text2vec-cohere, using the multilingual-22-12 model and letting the Cohere API truncate over-long input from the right.

The multilingual models use dot product, and the English model uses cosine. Make sure to set the distance metric accordingly in your Weaviate schema. You can see supported distance metrics here.

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-cohere",
      "vectorIndexConfig": {
        "distance": "dot" // <== the multilingual Cohere models use dot product instead of the Weaviate default cosine
      },
      "moduleConfig": {
        "text2vec-cohere": {
          "model": "multilingual-22-12", // <== defaults to multilingual-22-12 if not set
          "truncate": "RIGHT" // <== defaults to RIGHT if not set
        }
      },
      "properties": [
        {
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-cohere": {
              "skip": false,
              "vectorizePropertyName": false
            }
          },
          "name": "content"
        }
      ]
    }
  ]
}

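The same class definition can be assembled programmatically. This is a minimal sketch: `build_document_class` is a hypothetical helper, the `text` data type is an assumption, and the code assumes that every multilingual model name starts with `multilingual`, so that the distance metric can follow the rule stated above (dot product for multilingual models, cosine otherwise).

```python
import json


def build_document_class(model: str = "multilingual-22-12", truncate: str = "RIGHT") -> dict:
    """Build a Document class definition for text2vec-cohere."""
    if truncate not in ("RIGHT", "NONE"):
        raise ValueError("truncate must be 'RIGHT' or 'NONE'")
    # Assumption: multilingual model names start with "multilingual".
    distance = "dot" if model.startswith("multilingual") else "cosine"
    return {
        "class": "Document",
        "description": "A class called document",
        "vectorizer": "text2vec-cohere",
        "vectorIndexConfig": {"distance": distance},
        "moduleConfig": {
            "text2vec-cohere": {"model": model, "truncate": truncate}
        },
        "properties": [
            {
                "name": "content",
                "dataType": ["text"],  # assumption: a plain text property
                "description": "Content that will be vectorized",
                "moduleConfig": {
                    "text2vec-cohere": {"skip": False, "vectorizePropertyName": False}
                },
            }
        ],
    }


# Print the schema as JSON, wrapped in the top-level "classes" list.
print(json.dumps({"classes": [build_document_class()]}, indent=2))
```

Deriving the distance metric from the model name keeps the two settings from drifting apart when the model is changed later.
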
How to use

  • If the Cohere API key is not set in the text2vec-cohere module, you can set it at query time by adding the following HTTP header: X-Cohere-Api-Key: <cohere-api-key>.
  • Using this module enables the GraphQL vector search parameters in Weaviate. They can be found here.


nearText: {
  concepts: ["fashion"],
  distance: 0.6, # prior to v1.14 use "certainty" instead of "distance"
  moveAwayFrom: {
    concepts: ["finance"],
    force: 0.45
  },
  moveTo: {
    concepts: ["haute couture"],
    force: 0.85
  }
}

_additional {
  certainty # only supported if distance==cosine.
  distance # always supported
}

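
To show how the API key header and the `nearText` parameters fit together, the sketch below builds a GraphQL request with the Python standard library. It assumes the `Document` class and `content` property from the schema above, a local instance at `http://localhost:8080`, and the `/v1/graphql` endpoint. `build_near_text_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
from urllib.request import Request

# Only "distance" is requested, since "certainty" is limited to cosine distance.
QUERY = """
{
  Get {
    Document(
      nearText: {
        concepts: ["fashion"],
        distance: 0.6,
        moveAwayFrom: {concepts: ["finance"], force: 0.45},
        moveTo: {concepts: ["haute couture"], force: 0.85}
      }
    ) {
      content
      _additional {
        distance
      }
    }
  }
}
"""


def build_near_text_request(cohere_api_key: str,
                            base_url: str = "http://localhost:8080") -> Request:
    """Build (but do not send) a POST request carrying the Cohere key header."""
    body = json.dumps({"query": QUERY}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Cohere-Api-Key": cohere_api_key,
    }
    return Request(f"{base_url}/v1/graphql", data=body, headers=headers, method="POST")


request = build_near_text_request("<cohere-api-key>")
print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the query to a running instance.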

Additional information

Available models

Weaviate defaults to Cohere's multilingual-22-12 embedding model unless specified otherwise.

For example, the following schema configuration sets Weaviate to vectorize the Document class with text2vec-cohere using the multilingual-22-12 model.

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-cohere",
      "vectorIndexConfig": {
        "distance": "dot"
      },
      "moduleConfig": {
        "text2vec-cohere": {
          "model": "multilingual-22-12"
        }
      }
    }
  ]
}


Truncating

If the input text contains too many tokens and is not truncated, the API will throw an error. The Cohere API can be set to automatically truncate your input text.

You can set the truncate parameter to RIGHT or NONE. With RIGHT, the API discards the right side of the input so that the remaining input is exactly the maximum input token length for the model. With NONE, the input is not truncated, and over-long input results in an error.

  • The upside of truncating (i.e., RIGHT) is that a batch import always succeeds.
  • The downside of truncating is that a large text will be only partially vectorized without the user being made aware of the truncation.
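
The trade-off can be illustrated with a small local simulation. This does not call the Cohere API: whitespace-separated words stand in for real tokens, and `apply_truncate` and `max_tokens` are hypothetical names chosen for the example.

```python
def apply_truncate(text: str, max_tokens: int, truncate: str = "RIGHT") -> str:
    """Mimic the two truncation settings on whitespace-separated "tokens"."""
    tokens = text.split()
    if len(tokens) <= max_tokens:
        return text  # short enough, nothing to do
    if truncate == "RIGHT":
        # Keep the leftmost max_tokens tokens; the rest is silently dropped.
        return " ".join(tokens[:max_tokens])
    if truncate == "NONE":
        # No truncation: over-long input is rejected with an error.
        raise ValueError(f"input has {len(tokens)} tokens, limit is {max_tokens}")
    raise ValueError("truncate must be 'RIGHT' or 'NONE'")


long_text = "one two three four five six"
print(apply_truncate(long_text, max_tokens=4, truncate="RIGHT"))  # one two three four
try:
    apply_truncate(long_text, max_tokens=4, truncate="NONE")
except ValueError as error:
    print("rejected:", error)
```

With RIGHT the caller gets a result but no signal that two words were dropped. With NONE the failure is explicit and the caller has to shorten or split the text.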

Cohere Rate Limits

Because you get embeddings with your own API key, the rate limits applied to your Cohere account apply to your imports. More information about Cohere rate limits can be found here.

Throttle the import inside your application

If you run into rate limits, you can throttle the import in your application.

For example, in Python using the Weaviate client:

from weaviate import Client
import time


def configure_batch(client: Client, batch_size: int, batch_target_rate: int):
    """
    Configure the weaviate client's batch so it creates objects at `batch_target_rate`.

    Parameters
    ----------
    client : Client
        The Weaviate client instance.
    batch_size : int
        The batch size.
    batch_target_rate : int
        The batch target rate as # of objects per second.
    """

    def callback(batch_results: dict) -> None:
        # you could print batch errors here
        time_took_to_create_batch = batch_size * (
            client.batch.creation_time / client.batch.recommended_num_objects
        )
        # sleep long enough to stay at or below the target rate
        time.sleep(
            max(batch_size / batch_target_rate - time_took_to_create_batch + 1, 0)
        )

    # register the callback so it runs after every batch
    client.batch.configure(batch_size=batch_size, callback=callback)
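
To make the sleep formula concrete, the sketch below pulls it out into a pure function and evaluates it for sample numbers. `throttle_delay` is a hypothetical helper. Its arguments mirror `batch_size`, `batch_target_rate`, `client.batch.creation_time` and `client.batch.recommended_num_objects` from the snippet above, and the sample values are made up.

```python
def throttle_delay(batch_size: int, batch_target_rate: int,
                   creation_time: float, recommended_num_objects: int) -> float:
    """Seconds to sleep after a batch so imports stay near the target rate."""
    # Estimated time the batch took, scaled from the client's own measurements.
    time_took = batch_size * (creation_time / recommended_num_objects)
    # Time budget for the batch at the target rate, minus time already spent,
    # plus the one-second margin used in the callback; never negative.
    return max(batch_size / batch_target_rate - time_took + 1, 0)


# 100 objects at a target of 20 objects/s gives a 5 s budget per batch.
# If creating them took about 100 * (2.0 / 50) = 4 s, the delay is 5 - 4 + 1 = 2 s.
print(throttle_delay(100, 20, creation_time=2.0, recommended_num_objects=50))

# A slow batch already exceeds its budget, so no extra sleep is added.
print(throttle_delay(100, 20, creation_time=10.0, recommended_num_objects=50))
```

The `max(..., 0)` clamp matters because `time.sleep` rejects negative values.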


More resources

If you can't find the answer to your question here, please look at the:

  1. Frequently Asked Questions. Or,
  2. Knowledge base of old issues. Or,
  3. For questions: Stack Overflow. Or,
  4. For issues: GitHub. Or,
  5. Ask your question in the Slack channel: Slack.