NVIDIA Reranker Models with Weaviate
Added in v1.28.5 and v1.29.0.
Weaviate's integration with NVIDIA's APIs allows you to access their models' capabilities directly from Weaviate.
Configure a Weaviate collection to use an NVIDIA reranker model, and Weaviate will use the specified model and your NVIDIA NIM API key to rerank search results.
This two-step process involves Weaviate first performing a search and then reranking the results using the specified model.
Requirements
Weaviate configuration
Your Weaviate instance must be configured with the NVIDIA reranker integration (reranker-nvidia) module.
For Weaviate Cloud (WCD) users
This integration is enabled by default on Weaviate Cloud (WCD) serverless instances.
For self-hosted users
- Check the cluster metadata to verify if the module is enabled.
- Follow the how-to configure modules guide to enable the module in Weaviate.
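As a rough illustration of the metadata check, the sketch below looks for the module name in a metadata dictionary. It assumes the metadata has a `modules` mapping keyed by module name, as returned by the `/v1/meta` endpoint (or `client.get_meta()` in the Python client v4). The helper name `is_module_enabled` and the sample values are hypothetical.

```python
def is_module_enabled(meta: dict, module_name: str = "reranker-nvidia") -> bool:
    """Return True if `module_name` appears in the metadata's module mapping."""
    modules = meta.get("modules") or {}
    return module_name in modules


# Hypothetical metadata, shaped like a response from the `/v1/meta` endpoint
sample_meta = {
    "hostname": "http://[::]:8080",
    "modules": {"reranker-nvidia": {}, "text2vec-nvidia": {}},
    "version": "1.29.0",
}

print(is_module_enabled(sample_meta))                     # True
print(is_module_enabled(sample_meta, "reranker-cohere"))  # False
```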
API credentials
You must provide a valid NVIDIA NIM API key to Weaviate for this integration. Go to NVIDIA to sign up and obtain an API key.
Provide the API key to Weaviate using one of the following methods:
- Set the NVIDIA_APIKEY environment variable that is available to Weaviate.
- Provide the API key at runtime, as shown in the examples below.
Python API v4:

import os

import weaviate
from weaviate.classes.init import Auth

# Recommended: save sensitive data as environment variables
nvidia_key = os.getenv("NVIDIA_APIKEY")
# Example variable names; use whichever variables hold your Weaviate URL and API key
weaviate_url = os.getenv("WEAVIATE_URL")
weaviate_key = os.getenv("WEAVIATE_API_KEY")

headers = {
    "X-NVIDIA-Api-Key": nvidia_key,
}

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,  # `weaviate_url`: your Weaviate URL
    auth_credentials=Auth.api_key(weaviate_key),  # `weaviate_key`: your Weaviate API key
    headers=headers,
)

# Work with Weaviate

client.close()
JS/TS API v3:

import weaviate from 'weaviate-client'

const nvidiaApiKey = process.env.NVIDIA_APIKEY || '';  // Replace with your inference API key

const client = await weaviate.connectToWeaviateCloud(
  'WEAVIATE_INSTANCE_URL',  // Replace with your instance URL
  {
    authCredentials: new weaviate.ApiKey('WEAVIATE_INSTANCE_APIKEY'),
    headers: {
      'X-NVIDIA-Api-Key': nvidiaApiKey,
    }
  }
)

// Work with Weaviate

client.close()
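Because `os.getenv` returns `None` when a variable is unset, a missing key may only surface later as a failed rerank request. One possible guard, sketched here with a hypothetical helper named `build_nvidia_headers`, is to validate the key before connecting. The environment is passed in as a mapping so the function can be tried without touching real credentials.

```python
import os
from typing import Mapping, Optional


def build_nvidia_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the request headers, raising early if the key is missing or empty."""
    env = os.environ if env is None else env
    key = env.get("NVIDIA_APIKEY", "").strip()
    if not key:
        raise RuntimeError("NVIDIA_APIKEY is not set")
    return {"X-NVIDIA-Api-Key": key}


# Example with a placeholder key; call build_nvidia_headers() to read os.environ
print(build_nvidia_headers({"NVIDIA_APIKEY": "example-key"}))
```

The returned dictionary can be passed as the `headers` argument when connecting.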
Configure the reranker
A collection's reranker model integration configuration is mutable from v1.25.23, v1.26.8 and v1.27.1. See this section for details on how to update the collection configuration.
Configure a Weaviate collection to use an NVIDIA reranker model as follows:
- Python API v4
- JS/TS API v3
Select a model
You can specify one of the available models for Weaviate to use, as shown in the following configuration example:
- Python API v4
- JS/TS API v3
The default model is used if no model is specified.
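As a minimal sketch of the two cases above (default model and explicit model), the helper below builds a raw collection definition in Weaviate's schema format, with the reranker settings placed under `moduleConfig` and keyed by the module name. The helper name `nvidia_reranker_definition` is hypothetical, and the `model` settings key is an assumption, so check the module reference before relying on it.

```python
import json
from typing import Optional


def nvidia_reranker_definition(name: str, model: Optional[str] = None) -> dict:
    """Build a collection definition that enables the reranker-nvidia module."""
    module_config = {}
    if model is not None:
        # Assumed settings key; omit it to let Weaviate use the default model
        module_config["model"] = model
    return {"class": name, "moduleConfig": {"reranker-nvidia": module_config}}


default_definition = nvidia_reranker_definition("DemoCollection")
explicit_definition = nvidia_reranker_definition(
    "DemoCollection", model="nvidia/rerank-qa-mistral-4b"
)
print(json.dumps(explicit_definition, indent=2))
```

A definition like this could then be passed to the Python client's `client.collections.create_from_dict(...)`, assuming a connected `client`.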
Reranking query
Once the reranker is configured, Weaviate performs reranking operations using the specified NVIDIA model.
More specifically, Weaviate performs an initial search, then reranks the results using the specified model.
Any search in Weaviate can be combined with a reranker to perform reranking operations.
Python API v4:

from weaviate.classes.query import Rerank

collection = client.collections.get("DemoCollection")

response = collection.query.near_text(
    query="A holiday film",  # The model provider integration will automatically vectorize the query
    limit=2,
    rerank=Rerank(
        prop="title",  # The property to rerank on
        query="A melodic holiday film"  # If not provided, the original query will be used
    )
)

for obj in response.objects:
    print(obj.properties["title"])
JS/TS API v3:

let myCollection = client.collections.get('DemoCollection');

const results = await myCollection.query.nearText(
  ['A holiday film'],
  {
    limit: 2,
    rerank: {
      property: 'title',  // The property to rerank on
      query: 'A melodic holiday film'  // If not provided, the original query will be used
    }
  }
);

for (const obj of results.objects) {
  console.log(obj.properties['title']);
}
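The search-then-rerank flow can be pictured with a toy example that needs no Weaviate instance. In this sketch a simple word-overlap score stands in for both the initial search and the reranker model; it is not how the NVIDIA model scores results, and the function names and sample titles are hypothetical.

```python
def overlap_score(query: str, text: str) -> int:
    """Count the distinct query words that also appear in the text (toy score)."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def search_then_rerank(titles, search_query, rerank_query, limit):
    # Step 1: the initial search keeps the top `limit` candidates
    candidates = sorted(
        titles, key=lambda t: overlap_score(search_query, t), reverse=True
    )[:limit]
    # Step 2: only those candidates are reordered against the rerank query
    return sorted(
        candidates, key=lambda t: overlap_score(rerank_query, t), reverse=True
    )


titles = [
    "A quiet holiday film",
    "A melodic holiday film with songs",
    "A documentary about trains",
]
result = search_then_rerank(titles, "A holiday film", "A melodic holiday film", limit=2)
print(result)
# ['A melodic holiday film with songs', 'A quiet holiday film']
```

Note that reranking only reorders the results returned by the initial search, so an object excluded by `limit` in the first step cannot reappear in the second.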
References
Available models
You can use any reranker model on NVIDIA NIM APIs with Weaviate.
The default model is nvidia/rerank-qa-mistral-4b.
Further resources
Other integrations
- NVIDIA text embedding models + Weaviate.
- NVIDIA multimodal embedding models + Weaviate.
- NVIDIA generative models + Weaviate.
Code examples
Once the integrations are configured for the collection, data management and search operations in Weaviate work the same way as for any other collection. See the following model-agnostic examples:
- The how-to: manage data guides show how to perform data operations (i.e. create, update, delete).
- The how-to: search guides show how to perform search operations (i.e. vector, keyword, hybrid) as well as retrieval augmented generation.
Questions and feedback
If you have any questions or feedback, let us know in the user forum.