
Generative Search - Google

caution
This section of the documentation is deprecated and will be removed in the future.
See the relevant model provider integration page for the most up-to-date information.

Overview

  • The generative-palm module performs retrieval augmented generation, or RAG, using the data stored in your Weaviate instance.
  • The module can generate a response for each returned object, or a single response for a group of objects.
  • The module enables generative search operations on the Weaviate instance.
  • You need an API key for a Google generative model API to use this module.
  • You may incur costs when you use this module.
    • Check the vendor pricing.
  • You can use this module with Google Cloud Vertex AI, or with Google AI Studio.
Releases and versions

AI Studio (previously called MakerSuite) support was added in version 1.22.4.

Configuring generative-palm for VertexAI or AI Studio

The module can be used with either Google Cloud Vertex AI or AI Studio. The configurations vary slightly for each.

API key headers

Starting from v1.25.1 and v1.24.14, there are separate headers, X-Google-Vertex-Api-Key and X-Google-Studio-Api-Key, for Vertex AI and AI Studio users respectively.


Prior to Weaviate v1.25.1 or v1.24.14, there was one header for both Vertex AI users and AI Studio, specified with either X-Google-Api-Key or X-PaLM-Api-Key. We recommend using the new headers for clarity and future compatibility.

Google Cloud Vertex AI

You must enable the Vertex AI API on your Google Cloud project. To enable the API, follow these instructions.

API key for Vertex AI users

The API key for Vertex AI users is called an access token in Google Cloud.

If you have the Google Cloud CLI tool installed and set up, you can view your token by running the following command:

gcloud auth print-access-token

Token expiration for Vertex AI users

Important

By default, Google Cloud's OAuth 2.0 access tokens have a lifetime of 1 hour. You can create tokens that last up to 12 hours. To create longer-lasting tokens, follow the instructions in the Google Cloud IAM Guide.

Since the OAuth token is only valid for a limited time, you must periodically replace the token with a new one. After you generate the new token, you have to re-instantiate your Weaviate client to use it.

You can update the OAuth token manually, but manual updates may not be appropriate for your use case.

You can also automate the OAuth token update. Weaviate does not control the OAuth token update procedure. However, here are some automation options:

With Google Cloud CLI

If you are using the Google Cloud CLI, write a script that periodically refreshes the token and extracts the result.


Python code to extract the token looks like this:

client = re_instantiate_weaviate()

This is the re_instantiate_weaviate function:

import subprocess
import weaviate

def refresh_token() -> str:
    result = subprocess.run(["gcloud", "auth", "print-access-token"], capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Error refreshing token: {result.stderr}")
        return None
    return result.stdout.strip()

def re_instantiate_weaviate() -> weaviate.Client:
    token = refresh_token()

    client = weaviate.Client(
        url="https://WEAVIATE_INSTANCE_URL",  # Replace WEAVIATE_INSTANCE_URL with the URL
        additional_headers={
            "X-Google-Vertex-Api-Key": token,
        }
    )
    return client

# Run this every ~60 minutes
client = re_instantiate_weaviate()

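To run the re-instantiation on a schedule, any scheduler works. As an illustrative sketch (the helper name and interval are not part of Weaviate), the standard library's threading module can invoke a refresh callback on a fixed interval:

```python
import threading

def schedule_refresh(refresh_fn, interval_s: float) -> threading.Event:
    """Invoke refresh_fn every interval_s seconds until the returned event is set."""
    stop_event = threading.Event()

    def loop():
        # wait() returns False on timeout, True once stop_event is set
        while not stop_event.wait(interval_s):
            refresh_fn()

    threading.Thread(target=loop, daemon=True).start()
    return stop_event

# Example: refresh roughly every 60 minutes
# stop = schedule_refresh(lambda: re_instantiate_weaviate(), 60 * 60)
# ... later, to stop refreshing: stop.set()
```

Setting the returned event stops the loop; the daemon thread also exits when the process does.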
With google-auth

Another way is through Google's own authentication library google-auth.


See the links to google-auth in Python and Node.js libraries.


You can then periodically call the refresh function (see the Python docs) to obtain a renewed token and re-instantiate the Weaviate client.

For example, you could periodically run:

client = re_instantiate_weaviate()

Where re_instantiate_weaviate is something like:

from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import weaviate
import os


def get_credentials() -> Credentials:
    credentials = Credentials.from_service_account_file(
        "path/to/your/service-account.json",
        scopes=[
            "https://www.googleapis.com/auth/generative-language",
            "https://www.googleapis.com/auth/cloud-platform",
        ],
    )
    request = Request()
    credentials.refresh(request)
    return credentials


def re_instantiate_weaviate() -> weaviate.WeaviateClient:
    credentials = get_credentials()
    token = credentials.token

    client = weaviate.connect_to_wcs(  # e.g. if you use the Weaviate Cloud Service
        cluster_url="https://WEAVIATE_INSTANCE_URL",  # Replace WEAVIATE_INSTANCE_URL with the URL
        auth_credentials=weaviate.auth.AuthApiKey(os.getenv("WCD_DEMO_RO_KEY")),  # Replace with your Weaviate Cloud key
        headers={
            "X-Google-Vertex-Api-Key": token,
        },
    )
    return client


# Run this every ~60 minutes
client = re_instantiate_weaviate()

The service account key shown above can be generated by following this guide.

AI Studio

API key for AI Studio users

You can obtain an API key from this page. This is the key to pass on to Weaviate. This key does not have an expiration date.

apiEndpoint for AI Studio users

In the Weaviate schema configuration, set the apiEndpoint to generativelanguage.googleapis.com.
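For example, an AI Studio collection configuration could look like the following sketch (the class name is illustrative):

```json
{
  "class": "Document",
  "moduleConfig": {
    "generative-palm": {
      "apiEndpoint": "generativelanguage.googleapis.com",
      "modelId": "chat-bison-001"
    }
  }
}
```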

Introduction

generative-palm performs retrieval augmented generation, or RAG, based on the data stored in your Weaviate instance.

The module works in two steps:

  1. Run a search query in Weaviate to find relevant objects.
  2. Use a PaLM or Gemini model to generate a response. The response is based on the results of the previous step and a prompt or task that you provide.
note

You can use the generative-palm module with any upstream modules. For example, you could use text2vec-openai, text2vec-cohere, or text2vec-huggingface to vectorize and query your data. Then, you can pass the query results to the generative-palm module to generate a response.

The generative module provides results for individual objects or groups of objects:

  • singlePrompt returns a response for each object.
  • groupedTask groups the results to return a single response.

You need to input both a query and a prompt (for individual responses) or a task (for all responses).

Inference API key

Important: Provide the Google API key to Weaviate

Because generative-palm uses a Google API endpoint, you must provide a valid Google API key to Weaviate.

Provide the key to Weaviate

To provide your Google API key, use the "X-Google-Vertex-Api-Key" or "X-Google-Studio-Api-Key" request header as appropriate. If you use a Weaviate client, follow these examples:

import weaviate

client = weaviate.Client(
    url="https://WEAVIATE_INSTANCE_URL",  # Replace WEAVIATE_INSTANCE_URL with the URL
    additional_headers={
        "X-Google-Vertex-Api-Key": "YOUR-VERTEX-API-KEY",  # Replace with your API key
        "X-Google-Studio-Api-Key": "YOUR-AI-STUDIO-API-KEY",  # Replace with your API key
    }
)

Module configuration

tip

If you use Weaviate Cloud (WCD), this module is already enabled and pre-configured. You cannot edit the configuration in WCD.

Docker Compose file (Weaviate open source only)

You can enable the Generative Palm module in your Docker Compose file (e.g. docker-compose.yml). Add the generative-palm module (alongside any other module you may need) to the ENABLE_MODULES property, like this:

ENABLE_MODULES: 'text2vec-palm,generative-palm'
See a full example of a Docker configuration with generative-palm

Here is a full example of a Docker configuration that uses the generative-palm module in combination with text2vec-palm. The configuration also provides the API key:

---
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.25.8
    ports:
    - 8080:8080
    - 50051:50051
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-palm'
      ENABLE_MODULES: 'text2vec-palm,generative-palm'
      CLUSTER_HOSTNAME: 'node1'

Schema configuration

To configure how the module behaves in a collection, see Weaviate schema.

Note that the projectId parameter is required for Vertex AI.

See this page for code examples on how to specify a generative module.

Example schema

This schema configuration sets the Google API information, as well as some optional parameters.

Parameter | Purpose | Example
"projectId" | Only required with Vertex AI | "cloud-large-language-models"
"apiEndpoint" | Optional | "us-central1-aiplatform.googleapis.com"
"modelId" | Optional | "chat-bison" (Vertex AI) or "chat-bison-001" (AI Studio)
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      ...,
      "moduleConfig": {
        "generative-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",  // Only required if using Vertex AI. Replace with your value (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT",           // Optional. Defaults to "us-central1-aiplatform.googleapis.com"
          "modelId": "YOUR-GOOGLE-CLOUD-ENDPOINT-ID",   // Optional. Defaults to "chat-bison" for Vertex AI and "chat-bison-001" for AI Studio.
          "temperature": 0.2,                           // Optional
          "maxOutputTokens": 512,                       // Optional
          "topK": 3,                                    // Optional
          "topP": 0.95                                  // Optional
        }
      }
    }
  ]
}

See the relevant Google API documentation for further details on these parameters.

New to Weaviate Schemas?

If you are new to Weaviate, check out the Weaviate schema tutorial.

How to use the module

This module extends the _additional {...} property with a generate operator.

generate takes the following arguments:

Field | Data Type | Required | Example | Description
singleResult {prompt} | string | no | Summarize the following in a tweet: {summary} | Generates a response for each individual search result. You need to include at least one result field in the prompt, between braces.
groupedResult {task} | string | no | Explain why these results are similar to each other | Generates a single response for all search results.

Example of properties in the prompt

When you pipe query results to the prompt, the query must pass at least one field. If your results don't pass any fields, Weaviate throws an error.

For example, assume your schema looks like this:

{
  Article {
    title
    summary
  }
}

You can add both title and summary to the prompt by enclosing them in curly brackets:

{
  Get {
    Article {
      title
      summary
      _additional {
        generate(
          singleResult: {
            prompt: """
              Summarize the following in a tweet:

              {title} - {summary}
            """
          }
        ) {
          singleResult
          error
        }
      }
    }
  }
}
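Conceptually, Weaviate fills each {field} placeholder with the corresponding property value of the retrieved object. The snippet below is a minimal Python illustration of that substitution, not Weaviate's internal code:

```python
def fill_prompt(template: str, properties: dict) -> str:
    """Replace each {field} placeholder with the object's property value."""
    result = template
    for name, value in properties.items():
        result = result.replace("{" + name + "}", str(value))
    return result

# Illustrative object properties, matching the Article schema above
article = {
    "title": "How this fruit became the star of Italian cooking",
    "summary": "Tomatoes were not always linked to Italian food.",
}
prompt = fill_prompt("Summarize the following in a tweet:\n\n{title} - {summary}", article)
```

Any placeholder that does not match a returned field is left as-is, which is why the query must retrieve every field named in the prompt.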

Example - single result

Here is an example of a single result query:

  • A vector search (with nearText) finds articles about "Italian food."
  • The generator module describes each result as a Facebook ad.
    • The query asks for the summary field.
    • The query adds the summary field to the prompt for the generate operator.
import weaviate
import os

client = weaviate.connect_to_local(
    headers={
        "X-Google-Vertex-Api-Key": "YOUR-VERTEX-API-KEY",
        "X-Google-Studio-Api-Key": "YOUR-AI-STUDIO-API-KEY",
    }
)

try:
    reviews = client.collections.get("WineReview")

    # Instruction for the generative module
    generate_prompt = "Describe the following as a Facebook Ad: {review_body}"

    response = reviews.generate.near_text(
        query="fruity white wine",
        single_prompt=generate_prompt,
        limit=3
    )

    for o in response.objects:
        print(o.generated)    # "Single prompt" generations are attributes of each object
        print(o.properties)   # To inspect the retrieved object
finally:
    client.close()

Example response - single result

{
  "data": {
    "Get": {
      "Article": [
        {
          "_additional": {
            "generate": {
              "error": null,
              "singleResult": "This Facebook Ad will explore the fascinating history of Italian food and how it has evolved over time. Learn from Dr Eva Del Soldato and Diego Zancani, two experts in Italian food history, about how even the emoji for pasta isn't just pasta -- it's a steaming plate of spaghetti heaped with tomato sauce on top. Discover how Italy's complex history has shaped the Italian food we know and love today."
            }
          },
          "summary": "Even the emoji for pasta isn't just pasta -- it's a steaming plate of spaghetti heaped with tomato sauce on top. But while today we think of tomatoes as inextricably linked to Italian food, that hasn't always been the case. \"People tend to think Italian food was always as it is now -- that Dante was eating pizza,\" says Dr Eva Del Soldato , associate professor of romance languages at the University of Pennsylvania, who leads courses on Italian food history. In fact, she says, Italy's complex history -- it wasn't unified until 1861 -- means that what we think of Italian food is, for the most part, a relatively modern concept. Diego Zancani, emeritus professor of medieval and modern languages at Oxford University and author of \"How We Fell in Love with Italian Food,\" agrees.",
          "title": "How this fruit became the star of Italian cooking"
        }
      ]
    }
  }
}

Example - grouped result

Here is an example of a grouped result query:

  • A vector search (with nearText) finds publications about finance.
  • The generator module explains why these articles are about finance.
import weaviate
import os

client = weaviate.connect_to_local(
    headers={
        "X-Google-Vertex-Api-Key": "YOUR-VERTEX-API-KEY",
        "X-Google-Studio-Api-Key": "YOUR-AI-STUDIO-API-KEY",
    }
)

try:
    reviews = client.collections.get("WineReview")

    # Instruction for the generative module
    generate_prompt = "Explain what occasion these wines might be good for."

    response = reviews.generate.near_text(
        query="dry red wine",
        grouped_task=generate_prompt,
        limit=5
    )

    print(response.generated)  # "Grouped task" generations are attributes of the entire response
    for o in response.objects:
        print(o.properties)    # To inspect the retrieved object
finally:
    client.close()

Example response - grouped result

{
  "data": {
    "Get": {
      "Publication": [
        {
          "_additional": {
            "generate": {
              "error": null,
              "groupedResult": "The Financial Times, Wall Street Journal, and The New York Times Company are all about finance because they provide news and analysis on the latest financial markets, economic trends, and business developments. They also provide advice and commentary on personal finance, investments, and other financial topics."
            }
          },
          "name": "Financial Times"
        },
        {
          "_additional": {
            "generate": null
          },
          "name": "Wall Street Journal"
        },
        {
          "_additional": {
            "generate": null
          },
          "name": "The New York Times Company"
        }
      ]
    }
  }
}

Additional information

Supported models

You can specify the model as a part of the schema as shown earlier. Available models and names differ between Vertex AI and AI Studio.

Vertex AI:

  • chat-bison (default)
  • chat-bison-32k (from Weaviate v1.24.9)
  • chat-bison@002 (from Weaviate v1.24.9)
  • chat-bison-32k@002 (from Weaviate v1.24.9)
  • chat-bison@001 (from Weaviate v1.24.9)
  • gemini-1.5-pro-preview-0514 (from Weaviate v1.25.1)
  • gemini-1.5-pro-preview-0409 (from Weaviate v1.25.1)
  • gemini-1.5-flash-preview-0514 (from Weaviate v1.25.1)
  • gemini-1.0-pro-002 (from Weaviate v1.25.1)
  • gemini-1.0-pro-001 (from Weaviate v1.25.1)
  • gemini-1.0-pro (from Weaviate v1.25.1)

AI Studio:

  • chat-bison-001 (default)
  • gemini-pro
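For instance, to select a Gemini model on Vertex AI, you could set modelId in the collection's moduleConfig, as in this sketch (pair it with your own projectId):

```json
"moduleConfig": {
  "generative-palm": {
    "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",
    "modelId": "gemini-1.0-pro"
  }
}
```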

Questions and feedback

If you have any questions or feedback, let us know in the user forum.