
text2vec-jinaai

Overview

The text2vec-jinaai module enables Weaviate to obtain vectors using JinaAI Embeddings.

Key notes:

  • This module calls the third-party JinaAI API, so you need a JinaAI API key. You can obtain one here.
  • Its usage may incur costs.
  • When you enable the text2vec-jinaai module, you can use the nearText search operator.
  • The default model is jina-embeddings-v2-base-en.
Added in v1.22.3

Weaviate instance configuration

Docker Compose file

To use text2vec-jinaai, you must enable it in your Docker Compose file (docker-compose.yml). You can edit the Docker Compose file manually, or use the Weaviate configuration tool to create a custom file.

Parameters

Parameter | Required | Purpose
ENABLE_MODULES | Yes | The modules to enable. Include text2vec-jinaai to enable the module.
DEFAULT_VECTORIZER_MODULE | No | The default vectorizer module. To make text2vec-jinaai the default for all classes, set it here.
JINAAI_APIKEY | No | Your JinaAI API key. You can also provide the key at query time.

Example

This Docker Compose file shows how to use JinaAI as the vectorizer.

  • It enables text2vec-jinaai.
  • It sets text2vec-jinaai as the default vectorizer.
  • It sets a JinaAI API key.
---
version: '3.4'
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.24.10
    restart: on-failure:0
    ports:
      - 8080:8080
      - 50051:50051
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      ENABLE_MODULES: 'text2vec-jinaai'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-jinaai'
      JINAAI_APIKEY: 'YOUR_JINAAI_API_KEY'  # Optional. You can also provide the key at query time.
      CLUSTER_HOSTNAME: 'node1'
...
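
Once the container is running, one way to confirm that the module was loaded is to read the instance's /v1/meta endpoint and look for text2vec-jinaai among the reported modules. The following is a minimal sketch using only the Python standard library. The helper names fetch_meta and module_enabled are hypothetical, and the sketch assumes that the instance listens on http://localhost:8080 (as in the Compose file above) and that the meta response lists enabled modules under a "modules" key.

```python
import json
from urllib.request import urlopen


def fetch_meta(base_url: str = "http://localhost:8080") -> dict:
    """Return the parsed JSON body of the instance's /v1/meta endpoint."""
    with urlopen(f"{base_url}/v1/meta", timeout=10) as resp:
        return json.load(resp)


def module_enabled(meta: dict, name: str = "text2vec-jinaai") -> bool:
    """Check whether a module name appears under the 'modules' key.

    The 'modules' layout is an assumption about the meta response.
    """
    return name in (meta.get("modules") or {})


# Usage against a running instance (performs an HTTP request):
#   print(module_enabled(fetch_meta()))
```

If the check returns False, the ENABLE_MODULES value in the Compose file is the first thing to review.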

Collection configuration

To configure how the module behaves in each collection, update the Weaviate schema.

API settings

Parameters

Parameter | Required | Default | Purpose
model | No | jina-embeddings-v2-base-en | A model name, e.g. jina-embeddings-v2-small-en.

Example

The following example configures the Document collection by setting the vectorizer to text2vec-jinaai and the model to jina-embeddings-v2-small-en:

{
  "classes": [
    {
      "class": "Document",
      "description": "A collection called document",
      "vectorizer": "text2vec-jinaai",
      "moduleConfig": {
        "text2vec-jinaai": {
          "model": "jina-embeddings-v2-small-en"
        }
      }
    }
  ]
}
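
If you generate collection definitions programmatically, building them as Python dictionaries and serializing with the json module avoids hand-written syntax slips such as trailing commas. The sketch below is one possible approach, not part of the module itself: the helper name jinaai_class_definition is hypothetical, and the model check only knows the two model names listed on this page, so it may need updating if more models become available.

```python
import json

# Model names taken from the "Available models" list on this page.
AVAILABLE_MODELS = ("jina-embeddings-v2-base-en", "jina-embeddings-v2-small-en")


def jinaai_class_definition(name: str, model: str = AVAILABLE_MODELS[0]) -> dict:
    """Build a collection definition that uses text2vec-jinaai (hypothetical helper)."""
    if model not in AVAILABLE_MODELS:
        raise ValueError(f"Unknown model {model!r}; expected one of {AVAILABLE_MODELS}")
    return {
        "class": name,
        "vectorizer": "text2vec-jinaai",
        "moduleConfig": {"text2vec-jinaai": {"model": model}},
    }


schema = {"classes": [jinaai_class_definition("Document", "jina-embeddings-v2-small-en")]}

# json.dumps always emits valid JSON, so the result can be sent to Weaviate as-is.
print(json.dumps(schema, indent=2))
```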

Vectorization settings

You can set vectorizer behavior using the moduleConfig section under each collection and property:

Collection level settings

Parameter | Type | Default | Purpose
vectorizer | string | - | Sets the module for vectorization.
vectorizeClassName | boolean | true | Whether to include the class name during vectorization.

Property level settings

Parameter | Type | Default | Purpose
skip | boolean | false | When true, does not include the property during vectorization.
vectorizePropertyName | boolean | false | Whether to include the property name during vectorization.

Example

{
  "classes": [
    {
      "class": "Document",
      "description": "A collection called document",
      "vectorizer": "text2vec-jinaai",
      "moduleConfig": {
        "text2vec-jinaai": {
          "model": "jina-embeddings-v2-small-en",
          "vectorizeClassName": false
        }
      },
      "properties": [
        {
          "name": "content",
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-jinaai": {
              "skip": false,
              "vectorizePropertyName": false
            }
          }
        }
      ]
    }
  ]
}
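
To review a definition before creating the collection, it can help to list which properties are not marked with skip. The sketch below only inspects the skip flag described in the table above and treats a missing flag as the default (false). It does not account for data types or any other rule Weaviate applies, and the helper name vectorized_properties and the internal_id property are hypothetical.

```python
MODULE = "text2vec-jinaai"


def vectorized_properties(class_def: dict) -> list[str]:
    """Return names of properties whose module config does not set skip to true."""
    names = []
    for prop in class_def.get("properties", []):
        settings = prop.get("moduleConfig", {}).get(MODULE, {})
        if not settings.get("skip", False):  # skip defaults to false
            names.append(prop["name"])
    return names


document = {
    "class": "Document",
    "vectorizer": MODULE,
    "properties": [
        {
            "name": "content",
            "dataType": ["text"],
            "moduleConfig": {MODULE: {"skip": False, "vectorizePropertyName": False}},
        },
        {
            # Hypothetical property that is excluded from vectorization.
            "name": "internal_id",
            "dataType": ["text"],
            "moduleConfig": {MODULE: {"skip": True}},
        },
    ],
}

print(vectorized_properties(document))  # ['content']
```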

Query-time parameters

API key

You can supply the API key at query time by adding it to the HTTP header.

HTTP Header | Value | Purpose
X-Jinaai-Api-Key | YOUR-JINAAI-API-KEY | JinaAI API key
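
Rather than hard-coding the key in client code, you may prefer to read it from an environment variable and build the header from that. This is a minimal sketch: the helper name jinaai_headers is hypothetical, and reusing the name JINAAI_APIKEY for a client-side environment variable is an assumption made here for consistency with the Docker Compose parameter.

```python
import os


def jinaai_headers(env_var: str = "JINAAI_APIKEY") -> dict:
    """Build the query-time header from an environment variable (hypothetical helper)."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"X-Jinaai-Api-Key": api_key}


# Usage with the Python client (requires a running instance):
#   client = weaviate.connect_to_local(headers=jinaai_headers())
```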

Usage example

This is an example of a nearText query that uses text2vec-jinaai.

import weaviate
from weaviate.classes.query import MetadataQuery, Move

# Connect to a local instance and pass the JinaAI key as a request header
client = weaviate.connect_to_local(
    headers={
        "X-Jinaai-Api-Key": "YOUR_JINAAI_APIKEY",
    }
)

publications = client.collections.get("Publication")

# Search for objects near "fashion", steering toward and away from related concepts
response = publications.query.near_text(
    query="fashion",
    distance=0.6,
    move_to=Move(force=0.85, concepts="haute couture"),
    move_away=Move(force=0.45, concepts="finance"),
    return_metadata=MetadataQuery(distance=True),
    limit=2
)

for o in response.objects:
    print(o.properties)
    print(o.metadata)

client.close()

Additional information

Available models

The following models are available:

  • jina-embeddings-v2-base-en (Default)
  • jina-embeddings-v2-small-en

Questions and feedback

If you have any questions or feedback, let us know in our user forum.