
Generative Search - Google

Overview

  • The generative-palm module performs retrieval augmented generation, or RAG, using the data stored in your Weaviate instance.
  • The module can generate a response for each returned object, or a single response for a group of objects.
  • The module enables generative search operations on the Weaviate instance.
  • You need an API key for a Google generative model API to use this module.
  • You may incur costs when you use this module.
    • Please check the vendor pricing.
  • You can use this module with Google Cloud Vertex AI or with Google AI Studio.
Releases and versions

AI Studio (previously called MakerSuite) support was added in version 1.22.4.

Configuring generative-palm for Vertex AI or AI Studio

The module can be used with either Google Cloud Vertex AI or AI Studio. The configurations vary slightly for each.

Google Cloud Vertex AI

You must enable the Vertex AI API on your Google Cloud project. To enable the API, follow these instructions.

API key for Vertex AI users

The API key for Vertex AI users is called an access token in Google Cloud.

If you have the Google Cloud CLI tool installed and set up, you can view your token by running the following command:

gcloud auth print-access-token

Token expiration for Vertex AI users

Important

By default, Google Cloud's OAuth 2.0 access tokens have a lifetime of 1 hour. You can create tokens that last up to 12 hours. To create longer lasting tokens, follow the instructions in the Google Cloud IAM Guide.

Since the OAuth token is only valid for a limited time, you must periodically replace the token with a new one. After you generate the new token, you have to re-instantiate your Weaviate client to use it.

You can update the OAuth token manually, but manual updates may not be appropriate for your use case.

You can also automate the OAuth token update. Weaviate does not control the OAuth token update procedure. However, here are some automation options:

With Google Cloud CLI

If you are using the Google Cloud CLI, write a script that periodically refreshes the token and extracts the result.


For example, you could periodically run the following Python code:

client = re_instantiate_weaviate()

This is the re_instantiate_weaviate function:

import subprocess
import weaviate

def refresh_token() -> str:
    result = subprocess.run(["gcloud", "auth", "print-access-token"], capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Error refreshing token: {result.stderr}")
        return None
    return result.stdout.strip()

def re_instantiate_weaviate() -> weaviate.Client:
    token = refresh_token()

    client = weaviate.Client(
        url="https://WEAVIATE_INSTANCE_URL",  # Replace WEAVIATE_INSTANCE_URL with the URL
        additional_headers={
            "X-PaLM-Api-Key": token,
        },
    )
    return client

# Run this every ~60 minutes
client = re_instantiate_weaviate()
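
To automate the schedule itself, a minimal sketch using only the Python standard library (and the re_instantiate_weaviate function defined above) could look like this. The 55-minute interval is an assumption, chosen to stay under the default one-hour token lifetime:

import time

# Hypothetical scheduling loop: re-instantiate the client a few minutes
# before the default one-hour token lifetime expires.
REFRESH_INTERVAL_SECONDS = 55 * 60

while True:
    client = re_instantiate_weaviate()   # defined in the snippet above
    # ... run your queries with the refreshed client here ...
    time.sleep(REFRESH_INTERVAL_SECONDS)
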
With google-auth

Another way is to use Google's own authentication library, google-auth.


See the google-auth documentation for the Python and Node.js libraries.


You can then periodically call the refresh function (see the Python docs) to obtain a renewed token, and re-instantiate the Weaviate client.

For example, you could periodically run:

client = re_instantiate_weaviate()

Where re_instantiate_weaviate is something like:

from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import weaviate
import os


def get_credentials() -> Credentials:
    credentials = Credentials.from_service_account_file(
        "path/to/your/service-account.json",
        scopes=[
            "https://www.googleapis.com/auth/generative-language",
            "https://www.googleapis.com/auth/cloud-platform",
        ],
    )
    request = Request()
    credentials.refresh(request)
    return credentials


def re_instantiate_weaviate() -> weaviate.WeaviateClient:
    credentials = get_credentials()
    token = credentials.token

    client = weaviate.connect_to_wcs(  # e.g. if you use the Weaviate Cloud Service
        cluster_url="https://WEAVIATE_INSTANCE_URL",  # Replace WEAVIATE_INSTANCE_URL with the URL
        auth_credentials=weaviate.auth.AuthApiKey(os.getenv("WCS_DEMO_RO_KEY")),  # Replace with your WCS key
        headers={
            "X-PaLM-Api-Key": token,
        },
    )
    return client


# Run this every ~60 minutes
client = re_instantiate_weaviate()

The service account key shown above can be generated by following this guide.

AI Studio

AI Studio may not be available in all regions. See this page for the latest information.

API key for AI Studio users

To obtain an API key, log in to your AI Studio account and create one there. This is the key to pass to Weaviate. The key does not expire.

apiEndpoint for AI Studio users

In the Weaviate schema configuration, set the apiEndpoint to generativelanguage.googleapis.com.
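
For example, a minimal AI Studio setup sketch could look like this. It reuses the X-PaLM-Api-Key header documented later on this page; the URL and key values are placeholders:

import weaviate

# Minimal sketch for AI Studio: pass the AI Studio API key in the same
# "X-PaLM-Api-Key" header that is used for Vertex AI access tokens.
client = weaviate.Client(
    url="https://WEAVIATE_INSTANCE_URL",          # Replace with your Weaviate URL
    additional_headers={
        "X-PaLM-Api-Key": "YOUR-AI-STUDIO-KEY",   # Replace with your AI Studio API key
    },
)

# In the collection's moduleConfig (see the example schema below), set
# "apiEndpoint": "generativelanguage.googleapis.com" so the module calls AI Studio.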

Introduction

generative-palm performs retrieval augmented generation, or RAG, based on the data stored in your Weaviate instance.

The module works in two steps:

  1. Run a search query in Weaviate to find relevant objects.
  2. Use a PaLM or Gemini model to generate a response. The response is based on the results of the previous step and a prompt or task that you provide.
note

You can use the generative-palm module with any upstream modules. For example, you could use text2vec-openai, text2vec-cohere, or text2vec-huggingface to vectorize and query your data. Then, you can pass the query results to the generative-palm module to generate a response.

The generative module provides results for individual objects or groups of objects:

  • singlePrompt returns a response for each object.
  • groupedTask groups the results to return a single response.

You must provide a query, plus either a prompt (for individual responses) or a task (for a grouped response).
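
As a minimal sketch (assuming a WineReview collection like the one used in the examples further down this page, and the Python client v4 syntax shown there; combining both parameters in one call may depend on your client version), a single query can supply both a per-object prompt and a grouped task:

import weaviate

client = weaviate.connect_to_local(
    headers={"X-PaLM-Api-Key": "YOUR_PALM_APIKEY"}  # Replace with your key
)

try:
    reviews = client.collections.get("WineReview")   # hypothetical collection name
    response = reviews.generate.near_text(
        query="fruity white wine",
        single_prompt="Summarize this review in a tweet: {review_body}",   # one response per object
        grouped_task="What do these reviews have in common?",              # one response for the group
        limit=3,
    )
    print(response.generated)      # grouped response
    for o in response.objects:
        print(o.generated)         # per-object response
finally:
    client.close()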

Inference API key

Important: Provide the google API key to Weaviate

Because generative-palm uses a Google API endpoint, you must provide a valid Google API key to Weaviate.

Provide the key to Weaviate

To provide your Google API key, use the "X-PaLM-Api-Key" request header. If you use a Weaviate client, follow these examples:

import weaviate

client = weaviate.Client(
    url="https://WEAVIATE_INSTANCE_URL",  # Replace WEAVIATE_INSTANCE_URL with the URL
    additional_headers={
        "X-PaLM-Api-Key": "YOUR-PALM-API-KEY",  # Replace with your API key
    },
)

Optionally (not recommended), you can provide the Google API key as an environment variable.

How to provide the Google API key as an environment variable

When you configure your Docker instance, add PALM_APIKEY under environment in your Docker Compose file, like this:

environment:
  PALM_APIKEY: 'your-key-goes-here'  # Setting this parameter is optional; you can also provide the key at runtime.
  ...

Module configuration

tip

If you use Weaviate Cloud Services (WCS), this module is already enabled and pre-configured. You cannot edit the configuration in WCS.

Docker Compose file (Weaviate open source only)

You can enable the Generative Palm module in your Docker Compose file (e.g. docker-compose.yml). Add the generative-palm module (alongside any other module you may need) to the ENABLE_MODULES property, like this:

ENABLE_MODULES: 'text2vec-palm,generative-palm'
See a full example of a Docker configuration with generative-palm

Here is a full example of a Docker configuration that uses the generative-palm module in combination with text2vec-palm. The configuration also provides the API key:

---
version: '3.4'
services:
  weaviate:
    command:
      - --host
      - 0.0.0.0
      - --port
      - '8080'
      - --scheme
      - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.24.10
    ports:
      - 8080:8080
      - 50051:50051
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-palm'
      ENABLE_MODULES: 'text2vec-palm,generative-palm'
      PALM_APIKEY: sk-yourKeyGoesHere  # This parameter is optional; you can also provide the key at runtime.
      CLUSTER_HOSTNAME: 'node1'

Schema configuration

To configure how the module behaves in a collection, see Weaviate schema.

Note that the projectId parameter is required for Vertex AI.

See this page for code examples on how to specify a generative module.

Example schema

This schema configuration sets the Google API information, as well as some optional parameters.

Parameter       Purpose                         Example
"projectId"     Only required with Vertex AI    "cloud-large-language-models"
"apiEndpoint"   Optional                        "us-central1-aiplatform.googleapis.com"
"modelId"       Optional                        "chat-bison" (Vertex AI), "chat-bison-001" (AI Studio)
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      ...,
      "moduleConfig": {
        "generative-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",  // Only required if using Vertex AI. Replace with your value (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT",           // Optional. Defaults to "us-central1-aiplatform.googleapis.com"
          "modelId": "YOUR-GOOGLE-CLOUD-ENDPOINT-ID",   // Optional. Defaults to "chat-bison" for Vertex AI and "chat-bison-001" for AI Studio.
          "temperature": 0.2,                           // Optional
          "maxOutputTokens": 512,                       // Optional
          "topK": 3,                                    // Optional
          "topP": 0.95                                  // Optional
        }
      }
    }
  ]
}

See the relevant Google API documentation for further details on these parameters.

New to Weaviate Schemas?

If you are new to Weaviate, check out the Weaviate schema tutorial.

How to use the module

This module extends the _additional {...} property with a generate operator.

generate takes the following arguments:

  • singleResult {prompt} (string, optional). Example: "Summarize the following in a tweet: {summary}". Generates a response for each individual search result. You need to include at least one result field in the prompt, between braces.
  • groupedResult {task} (string, optional). Example: "Explain why these results are similar to each other". Generates a single response for all search results.

Example of properties in the prompt

When you pipe query results to the prompt, the query must pass at least one field to the prompt. If your results don't pass any fields, Weaviate throws an error.

For example, assume your schema looks like this:

{
  Article {
    title
    summary
  }
}

You can add both title and summary to the prompt by enclosing them in curly brackets:

{
  Get {
    Article {
      title
      summary
      _additional {
        generate(
          singleResult: {
            prompt: """
            Summarize the following in a tweet:

            {title} - {summary}
            """
          }
        ) {
          singleResult
          error
        }
      }
    }
  }
}

Example - single result

Here is an example of a single result query:

  • A vector search (with nearText) finds articles about "Italian food."
  • The generator module describes each result as a Facebook ad.
    • The query asks for the summary field.
    • The query adds the summary field to the prompt for the generate operator.
import weaviate
import os

client = weaviate.connect_to_local(
    headers={
        "X-PaLM-Api-Key": "YOUR_PALM_APIKEY",
    }
)

try:
    reviews = client.collections.get("WineReview")

    # instruction for the generative module
    generate_prompt = "Describe the following as a Facebook Ad: {review_body}"

    response = reviews.generate.near_text(
        query="fruity white wine",
        single_prompt=generate_prompt,
        limit=3
    )

    for o in response.objects:
        print(o.generated)    # "Single prompt" generations are attributes of each object
        print(o.properties)   # To inspect the retrieved object
finally:
    client.close()

Example response - single result

{
  "data": {
    "Get": {
      "Article": [
        {
          "_additional": {
            "generate": {
              "error": null,
              "singleResult": "This Facebook Ad will explore the fascinating history of Italian food and how it has evolved over time. Learn from Dr Eva Del Soldato and Diego Zancani, two experts in Italian food history, about how even the emoji for pasta isn't just pasta -- it's a steaming plate of spaghetti heaped with tomato sauce on top. Discover how Italy's complex history has shaped the Italian food we know and love today."
            }
          },
          "summary": "Even the emoji for pasta isn't just pasta -- it's a steaming plate of spaghetti heaped with tomato sauce on top. But while today we think of tomatoes as inextricably linked to Italian food, that hasn't always been the case. \"People tend to think Italian food was always as it is now -- that Dante was eating pizza,\" says Dr Eva Del Soldato , associate professor of romance languages at the University of Pennsylvania, who leads courses on Italian food history. In fact, she says, Italy's complex history -- it wasn't unified until 1861 -- means that what we think of Italian food is, for the most part, a relatively modern concept. Diego Zancani, emeritus professor of medieval and modern languages at Oxford University and author of \"How We Fell in Love with Italian Food,\" agrees.",
          "title": "How this fruit became the star of Italian cooking"
        }
      ]
    }
  }
}

Example - grouped result

Here is an example of a grouped result query:

  • A vector search (with nearText) finds publications about finance.
  • The generator module explains why these articles are about finance.
import weaviate
import os

client = weaviate.connect_to_local(
    headers={
        "X-PaLM-Api-Key": "YOUR_PALM_APIKEY",
    }
)

try:
    reviews = client.collections.get("WineReview")

    # instruction for the generative module
    generate_prompt = "Explain what occasion these wines might be good for."

    response = reviews.generate.near_text(
        query="dry red wine",
        grouped_task=generate_prompt,
        limit=5
    )

    print(response.generated)   # "Grouped task" generations are attributes of the entire response
    for o in response.objects:
        print(o.properties)     # To inspect the retrieved object
finally:
    client.close()

Example response - grouped result

{
  "data": {
    "Get": {
      "Publication": [
        {
          "_additional": {
            "generate": {
              "error": null,
              "groupedResult": "The Financial Times, Wall Street Journal, and The New York Times Company are all about finance because they provide news and analysis on the latest financial markets, economic trends, and business developments. They also provide advice and commentary on personal finance, investments, and other financial topics."
            }
          },
          "name": "Financial Times"
        },
        {
          "_additional": {
            "generate": null
          },
          "name": "Wall Street Journal"
        },
        {
          "_additional": {
            "generate": null
          },
          "name": "The New York Times Company"
        }
      ]
    }
  }
}

Multi-modality

Added in v1.24.2

Weaviate can leverage the multimodal capabilities of the gemini-pro-vision model. The input passed to gemini-pro-vision can therefore be a combination of text and images, where images are represented as base64-encoded strings.
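
As a minimal sketch of the image representation, you could base64-encode an image in Python like this; how the encoded string is then stored or referenced in your prompts depends on your collection setup:

import base64

# Encode an image file as a base64 string, the representation used for
# image inputs to gemini-pro-vision. The file path is a placeholder.
with open("image.jpg", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("utf-8")

print(base64_image[:60], "...")  # preview the start of the encoded string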

Additional information

Supported models

You can specify the model as a part of the schema as shown earlier. Available models and names differ between Vertex AI and AI Studio.

Vertex AI:

  • chat-bison (default)
  • gemini-pro
  • gemini-pro-vision (from Weaviate v1.24.2)
  • chat-bison-32k (from Weaviate v1.24.9)
  • chat-bison@002 (from Weaviate v1.24.9)
  • chat-bison-32k@002 (from Weaviate v1.24.9)
  • chat-bison@001 (from Weaviate v1.24.9)

AI Studio:

  • chat-bison-001 (default)
  • gemini-pro
  • gemini-pro-vision (from Weaviate v1.24.2)

Questions and feedback

If you have any questions or feedback, let us know in our user forum.