
RAG: Overview

Motivation

Retrieval augmented generation (RAG) combines the retrieval capabilities of semantic search with the generation capabilities of AI models such as large language models. With RAG, you can retrieve objects from a Weaviate instance and then generate outputs based on those retrieved objects.

Setup

When we created the collection, we set the generative parameter as shown here.

  generative: configure.generative.cohere(),

This selects the generative module that Weaviate uses to generate outputs from the retrieved objects. In this case, we're using the cohere module, which gives access to Cohere's command family of large language models.

As with the vectorizer module, you need an API key from the provider of the generative module. In this case, that means an API key from Cohere.
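As a minimal sketch of how the key might be handled, the helper below builds the extra request header that carries the Cohere key. The header name `X-Cohere-Api-Key` is the one Weaviate expects for Cohere, to the best of our knowledge; the function name `cohereHeaders` is hypothetical and not part of the Weaviate client.

```typescript
// Hypothetical helper (not part of the Weaviate client): builds the extra
// request header that carries the generative provider's API key.
// Assumption: 'X-Cohere-Api-Key' is the header Weaviate reads for Cohere.
function cohereHeaders(apiKey: string | undefined): Record<string, string> {
  // Fail early with a clear message instead of sending an empty key.
  if (apiKey === undefined || apiKey.trim() === '') {
    throw new Error('Missing Cohere API key');
  }
  return { 'X-Cohere-Api-Key': apiKey.trim() };
}
```

You would typically read the key from an environment variable and pass the resulting object as the `headers` option when you connect the client, in the same way as the vectorizer key.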

RAG queries

RAG queries are also called 'generative' queries in Weaviate. You can access these functions through the generate submodule of the collection object.

Each generative query builds on a regular search query: the search runs first, and a RAG query is then performed on each retrieved object.
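The search-then-generate flow can be pictured with a small conceptual model. This is a sketch only: every name in it (`RetrievedObject`, `fillPrompt`, `generatePerObject`, the `Generator` type) is hypothetical and stands in for what Weaviate and the generative module do internally. It assumes the prompt refers to object properties with `{property}` placeholders.

```typescript
// Conceptual model only: none of these names come from the Weaviate client.
interface RetrievedObject {
  properties: Record<string, string>;
}

interface GeneratedObject extends RetrievedObject {
  generated: string;
}

// Stand-in for a call to the generative model (for example, a Cohere LLM).
type Generator = (prompt: string) => Promise<string>;

// Replace each {placeholder} with the matching property value.
// Placeholders with no matching property are left unchanged.
function fillPrompt(template: string, props: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string): string => {
    const value = props[key];
    return typeof value === 'string' ? value : match;
  });
}

// Run one generation per retrieved object and attach the output to it.
async function generatePerObject(
  objects: RetrievedObject[],
  template: string,
  generate: Generator,
): Promise<GeneratedObject[]> {
  return Promise.all(
    objects.map(async (obj) => ({
      ...obj,
      generated: await generate(fillPrompt(template, obj.properties)),
    })),
  );
}
```

In this model, the regular search supplies `objects`, and each result comes back with its own generated text alongside the original properties.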

Questions and feedback

If you have any questions or feedback, let us know in the user forum.