RAG: Overview
Motivation
Retrieval-augmented generation (RAG) combines the retrieval capabilities of semantic search with the generation capabilities of AI models such as large language models. This allows you to retrieve objects from a Weaviate instance and then generate outputs based on those objects.
Setup
When we created the collection, we specified the generative parameter as shown here.
generative: configure.generative.cohere(),
This selects the generative module that will be used to generate outputs based on the retrieved objects. In this case, we're using the cohere module and the command family of large language models.
As with the vectorizer module, you will need an API key from the provider of the generative module. In this case, that means an API key from Cohere.
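As a minimal sketch of how the Cohere key might be supplied, the helper below builds the extra request header that carries it. The helper name cohereHeaders and the COHERE_API_KEY variable name are assumptions made for illustration; X-Cohere-Api-Key is the header Weaviate reads the Cohere key from.

```typescript
// Hypothetical helper: builds the extra request headers that carry the
// Cohere API key. The env parameter is a plain key/value map, so the sketch
// does not depend on any runtime-specific globals.
function cohereHeaders(
  env: Record<string, string | undefined>
): Record<string, string> {
  // The variable name COHERE_API_KEY is an assumption, not a requirement.
  const key = env["COHERE_API_KEY"];
  if (!key) {
    // Fail early so a missing key surfaces at startup, not at query time.
    throw new Error("COHERE_API_KEY is not set");
  }
  return { "X-Cohere-Api-Key": key };
}
```

In a Node.js script you would typically call this with process.env and pass the result as the headers option when connecting the client, in the same way as the vectorizer key.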
RAG queries
RAG queries are also called 'generative' queries in Weaviate. You can access these functions through the generate submodule of the collection object.
Each generative query builds on a regular search query: Weaviate first retrieves the matching objects, then passes them to the generative model to produce the output.
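The sketch below illustrates the shape of such a query. To keep it runnable without a Weaviate server, it declares a small stand-in interface (RagCollection, a hypothetical name) that is assumed to mirror the generate.nearText call of the TypeScript client; the helper generatePerObject is likewise hypothetical. Treat it as an illustration of the call pattern rather than the client's exact types.

```typescript
// Hypothetical structural types: a minimal stand-in for the parts of a
// collection object this sketch touches (assumed, not the client's own types).
interface GeneratedObject {
  properties: Record<string, unknown>;
  generated?: string; // text generated for this object from the prompt
}

interface GenerateResult {
  objects: GeneratedObject[];
}

interface RagCollection {
  generate: {
    nearText(
      query: string,
      generate: { singlePrompt: string },
      options?: { limit?: number }
    ): Promise<GenerateResult>;
  };
}

// Run a semantic search through the generate submodule, then collect the
// text generated for each retrieved object.
async function generatePerObject(
  collection: RagCollection,
  query: string,
  singlePrompt: string,
  limit = 3
): Promise<string[]> {
  const result = await collection.generate.nearText(
    query,
    { singlePrompt },
    { limit }
  );
  // Objects without generated text map to an empty string.
  return result.objects.map((obj) => obj.generated ?? "");
}
```

With a real client you would call generate.nearText directly on the collection object. The interface here exists only so that the search-then-generate flow can be exercised against a stub.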
Questions and feedback
If you have any questions or feedback, let us know in the user forum.