NVIDIA Reranker Models with Weaviate

Added in v1.28.5, v1.29.0

Weaviate's integration with NVIDIA's APIs allows you to access their models' capabilities directly from Weaviate.

Configure a Weaviate collection to use an NVIDIA reranker model, and Weaviate will use the specified model and your NVIDIA NIM API key to rerank search results.

This two-step process involves Weaviate first performing a search and then reranking the results using the specified model.

Reranker integration illustration

Requirements

Weaviate configuration

Your Weaviate instance must be configured with the NVIDIA reranker integration (reranker-nvidia) module.

For Weaviate Cloud (WCD) users

This integration is enabled by default on Weaviate Cloud (WCD) serverless instances.

For self-hosted users

Check the cluster metadata to verify if the module is enabled.
Follow the how-to configure modules guide to enable the module in Weaviate.
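
The metadata check above can be sketched as a small helper. This is a minimal sketch, assuming the cluster metadata is a dictionary with a top-level `modules` mapping keyed by module name, as returned by the Python client's `client.get_meta()`. The helper name `is_module_enabled` and the sample metadata are illustrative, not part of Weaviate.

```python
from typing import Any, Mapping

MODULE_NAME = "reranker-nvidia"


def is_module_enabled(meta: Mapping[str, Any], module: str = MODULE_NAME) -> bool:
    """Return True if `module` appears in the metadata's `modules` mapping."""
    modules = meta.get("modules") or {}
    return module in modules


# Illustrative metadata only; in practice, pass the result of `client.get_meta()`.
sample_meta = {
    "version": "1.29.0",
    "modules": {"reranker-nvidia": {}, "text2vec-nvidia": {}},
}

print(is_module_enabled(sample_meta))
print(is_module_enabled({"version": "1.29.0", "modules": {}}))
```

If the helper returns `False` for your instance, the module still needs to be enabled before a collection can use the NVIDIA reranker.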

API credentials

You must provide a valid NVIDIA NIM API key to Weaviate for this integration. Go to NVIDIA to sign up and obtain an API key.

Provide the API key to Weaviate using one of the following methods:

Set the NVIDIA_APIKEY environment variable that is available to Weaviate.
Provide the API key at runtime, as shown in the examples below.

Python API v4
JS/TS API v3

import weaviate
from weaviate.classes.init import Auth
import os

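# Recommended: save sensitive data as environment variables
# `WEAVIATE_URL` and `WEAVIATE_API_KEY` are assumed variable names; use your own
weaviate_url = os.getenv("WEAVIATE_URL")
weaviate_key = os.getenv("WEAVIATE_API_KEY")
nvidia_key = os.getenv("NVIDIA_APIKEY")
headers = {
    "X-NVIDIA-Api-Key": nvidia_key,
}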

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,                       # `weaviate_url`: your Weaviate URL
    auth_credentials=Auth.api_key(weaviate_key),      # `weaviate_key`: your Weaviate API key
    headers=headers
)

# Work with Weaviate

client.close()

import weaviate from 'weaviate-client'

const nvidiaApiKey = process.env.NVIDIA_APIKEY || '';  // Replace with your inference API key

const client = await weaviate.connectToWeaviateCloud(
  'WEAVIATE_INSTANCE_URL',  // Replace with your instance URL
  {
    authCredentials: new weaviate.ApiKey('WEAVIATE_INSTANCE_APIKEY'),
    headers: {
      'X-NVIDIA-Api-Key': nvidiaApiKey,
    }
  }
)

// Work with Weaviate

client.close()
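
When the key is supplied at runtime, a missing environment variable silently produces an empty header value. One possible guard is sketched below, using only the standard library. The helper name `nvidia_headers` is hypothetical; the header name and the `NVIDIA_APIKEY` variable are the ones used in the examples above.

```python
import os
from typing import Dict, Mapping, Optional


def nvidia_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build the runtime header that carries the NVIDIA API key.

    Raises early instead of sending an empty key to Weaviate.
    """
    env = os.environ if env is None else env
    key = env.get("NVIDIA_APIKEY", "").strip()
    if not key:
        raise RuntimeError("NVIDIA_APIKEY is not set or is empty")
    return {"X-NVIDIA-Api-Key": key}


# Passing a mapping explicitly keeps the example independent of the real environment.
print(nvidia_headers({"NVIDIA_APIKEY": "example-key"}))
```

The returned dictionary can be passed as the `headers` argument when connecting to Weaviate.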

Configure the reranker

Reranker model integration mutable from v1.25.23, v1.26.8 and v1.27.1

A collection's reranker model integration configuration is mutable from v1.25.23, v1.26.8 and v1.27.1. See this section for details on how to update the collection configuration.

Configure a Weaviate collection to use an NVIDIA reranker model as follows:

Python API v4
JS/TS API v3

from weaviate.classes.config import Configure

client.collections.create(
    "DemoCollection",
    reranker_config=Configure.Reranker.nvidia()
    # Additional parameters not shown
)

// Coming soon

Select a model

You can specify one of the available models for Weaviate to use, as shown in the following configuration example:

Python API v4
JS/TS API v3

from weaviate.classes.config import Configure

client.collections.create(
    "DemoCollection",
    reranker_config=Configure.Reranker.nvidia(
        model="nvidia/llama-3.2-nv-rerankqa-1b-v2",
        base_url="https://integrate.api.nvidia.com/v1",
    )
    # Additional parameters not shown
)

// Coming soon

The default model is used if no model is specified.
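
One way to rely on that default is to pass only the options that were explicitly set. The sketch below is an assumption about how you might organize this in your own code; the helper name `nvidia_reranker_options` is hypothetical, and the `model` and `base_url` keys mirror the parameters in the configuration example above.

```python
from typing import Dict, Optional


def nvidia_reranker_options(
    model: Optional[str] = None, base_url: Optional[str] = None
) -> Dict[str, str]:
    """Collect only the options that were explicitly set."""
    options = {"model": model, "base_url": base_url}
    return {name: value for name, value in options.items() if value is not None}


# No arguments: nothing is passed on, so the default model applies.
print(nvidia_reranker_options())
# Only the model is set; the base URL is left out.
print(nvidia_reranker_options(model="nvidia/llama-3.2-nv-rerankqa-1b-v2"))
```

The result can then be unpacked into the configuration call, for example `Configure.Reranker.nvidia(**nvidia_reranker_options(model=...))`.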

Reranking query

Once the reranker is configured, Weaviate performs reranking operations using the specified NVIDIA model.

More specifically, Weaviate performs an initial search, then reranks the results using the specified model.

Any search in Weaviate can be combined with a reranker to perform reranking operations.
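
The two-step flow can be illustrated without a running instance. The sketch below is a toy model of the idea only: `token_overlap` is a stand-in scoring function, not the NVIDIA reranker, and `search_then_rerank` and the sample data are made up for illustration.

```python
from typing import Callable, Dict, List

Doc = Dict[str, str]


def token_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens that appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def search_then_rerank(
    docs: List[Doc],
    search_query: str,
    rerank_query: str,
    prop: str,
    limit: int,
    score: Callable[[str, str], float] = token_overlap,
) -> List[Doc]:
    # Step 1: initial search. Keep the top `limit` candidates.
    candidates = sorted(
        docs, key=lambda d: score(search_query, d[prop]), reverse=True
    )[:limit]
    # Step 2: rerank only those candidates against the rerank query.
    return sorted(candidates, key=lambda d: score(rerank_query, d[prop]), reverse=True)


movies = [
    {"title": "A quiet holiday film"},
    {"title": "A melodic holiday film with songs"},
    {"title": "Space documentary"},
]

for movie in search_then_rerank(
    movies, "A holiday film", "A melodic holiday film", prop="title", limit=2
):
    print(movie["title"])
```

Note that the reranker only reorders the candidates returned by the initial search. An object that falls outside `limit` in step 1 is never seen in step 2.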

Python API v4
JS/TS API v3

from weaviate.classes.query import Rerank

collection = client.collections.get("DemoCollection")

response = collection.query.near_text(
    query="A holiday film",  # The model provider integration will automatically vectorize the query
    limit=2,
    rerank=Rerank(
        prop="title",                   # The property to rerank on
        query="A melodic holiday film"  # If not provided, the original query will be used
    )
)

for obj in response.objects:
    print(obj.properties["title"])

let myCollection = client.collections.get('DemoCollection');

const results = await myCollection.query.nearText(
  ['A holiday film'],
  {
    limit: 2,
    rerank: {
      property: 'title',                // The property to rerank on
      query: 'A melodic holiday film'   // If not provided, the original query will be used
    }
  }
);

for (const obj of results.objects) {
  console.log(obj.properties['title']);
}

References

Available models

You can use any reranker model on NVIDIA NIM APIs with Weaviate.

The default model is nvidia/rerank-qa-mistral-4b.

Further resources

Other integrations

Code examples

Once the integration is configured on a collection, data management and search operations in Weaviate work the same as for any other collection. See the following model-agnostic examples:

The how-to: manage data guides show how to perform data operations (e.g. create, update, delete).
The how-to: search guides show how to perform search operations (e.g. vector, keyword, hybrid) as well as retrieval augmented generation.

References

NVIDIA NIM API documentation

Questions and feedback

If you have any questions or feedback, let us know in the user forum.
