Generative Search - PaLM
In short
- The Generative PaLM (generative-palm) module generates responses based on the data stored in your Weaviate instance.
- The module can generate a response for each returned object, or a single response for a group of objects.
- The module adds a generate {} operator to the GraphQL _additional {} property of the Get {} queries.
- Added in Weaviate v1.19.1.
- You need an API key for a PaLM API to use this module. The module uses the Google Cloud access token.
- Its usage may incur costs.
  - Please check the vendor pricing (e.g. check Google Vertex AI pricing).
- The default model is chat-bison.
As of the time of writing (September 2023), you must manually enable the Vertex AI API on your Google Cloud project. You can do so by following the instructions here.
Introduction
generative-palm generates responses based on the data stored in your Weaviate instance.
The module works in two steps:
- (Weaviate) Run a search query in Weaviate to find relevant objects.
- (PaLM) Use a PaLM model to generate a response based on the results (from the previous step) and the provided prompt or task.
You can use the Generative PaLM module with non-PaLM upstream modules. For example, you could use text2vec-openai, text2vec-cohere or text2vec-huggingface to vectorize and query your data, but then rely on the generative-palm module to generate a response.
The generative module can provide results for:
- each returned object, using singleResult{ prompt }
- the group of all results together, using groupedResult{ task }
You need to input both a query and a prompt (for individual responses) or a task (for all responses).
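As a rough illustration of the two forms above, the following sketch assembles the generate fragment of a GraphQL query as a plain string. The helper name build_generate_clause is hypothetical and is not part of any Weaviate client. It also assumes the prompt or task text does not itself contain a triple-quote sequence.

```python
from typing import Optional


def build_generate_clause(
    single_prompt: Optional[str] = None,
    grouped_task: Optional[str] = None,
) -> str:
    """Build the `_additional { generate(...) {...} }` fragment (hypothetical helper)."""
    if single_prompt is None and grouped_task is None:
        raise ValueError("Provide a prompt (singleResult) or a task (groupedResult).")

    arguments = []
    result_fields = []
    if single_prompt is not None:
        # One generated response per returned object
        arguments.append('singleResult: { prompt: """' + single_prompt + '""" }')
        result_fields.append("singleResult")
    if grouped_task is not None:
        # One generated response for the whole group of results
        arguments.append('groupedResult: { task: """' + grouped_task + '""" }')
        result_fields.append("groupedResult")
    result_fields.append("error")

    return (
        "_additional { generate("
        + " ".join(arguments)
        + ") { "
        + " ".join(result_fields)
        + " } }"
    )


print(build_generate_clause(single_prompt="Summarize the following in a tweet: {summary}"))
```

The resulting string still has to be placed inside a Get query that also runs a search and requests the fields used in the prompt.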
Inference API key
Because the generative-palm module uses a PaLM API endpoint, you must provide a valid PaLM API key to Weaviate.
For Google Cloud users
This is called an access token in Google Cloud.
If you have the Google Cloud CLI tool installed and set up, you can view your token by running the following command:
gcloud auth print-access-token
Providing the key to Weaviate
You can provide your PaLM API key through the "X-Palm-Api-Key" request header. If you use the Weaviate client, you can do so like this:
- Python
- JavaScript/TypeScript
- Go
- Java
import weaviate
client = weaviate.Client(
url = "https://some-endpoint.weaviate.network/",
additional_headers = {
"X-Palm-Api-Key": "YOUR-PALM-API-KEY", # Replace with your API key
}
)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'https',
host: 'some-endpoint.weaviate.network',
// Replace with your API key
headers: {
'X-Palm-Api-Key': 'YOUR-PALM-API-KEY',
},
});
package main
import (
    "github.com/weaviate/weaviate-go-client/v4/weaviate"
)
func main() {
    cfg := weaviate.Config{
        Host:   "some-endpoint.weaviate.network", // Replace with your endpoint
        Scheme: "https",
        // Replace with your API key
        Headers: map[string]string{
            "X-Palm-Api-Key": "YOUR-PALM-API-KEY",
        },
    }
    client, err := weaviate.NewClient(cfg)
    if err != nil {
        panic(err)
    }
    _ = client // Use the client to run your queries
}
package io.weaviate;
import java.util.HashMap;
import java.util.Map;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
public class App {
    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<String, String>() { {
            // Replace with your API key
            put("X-Palm-Api-Key", "YOUR-PALM-API-KEY");
        } };
        Config config = new Config("https", "some-endpoint.weaviate.network", headers);
        WeaviateClient client = new WeaviateClient(config);
    }
}
Optionally (not recommended), you can provide the PaLM API key as an environment variable.
How to provide the PaLM API key as an environment variable
You can provide the key while configuring your Docker instance by adding PALM_APIKEY under environment in your Docker Compose file, like this:
environment:
  PALM_APIKEY: 'your-key-goes-here'  # Setting this parameter is optional; you can also provide the key at runtime.
  ...
Token expiry for Google Cloud users
Google Cloud's OAuth 2.0 access tokens are configured to have a standard lifetime of 1 hour.
Therefore, you must periodically replace the token with a valid one and supply it to Weaviate by re-instantiating the client with the new key.
You can do this manually.
Automating this is a complex, advanced process that is outside the scope of our control. However, here are a couple of possible options for doing so:
With Google Cloud CLI
If you are using the Google Cloud CLI, you could run the command from your preferred programming language and extract the token from its output.
For example, you could periodically run:
client = re_instantiate_weaviate()
Where re_instantiate_weaviate is something like:
import subprocess
import weaviate
def refresh_token() -> str:
    # Ask the Google Cloud CLI for a fresh access token
    result = subprocess.run(["gcloud", "auth", "print-access-token"], capture_output=True, text=True)
    if result.returncode != 0:
        # Fail loudly instead of passing an empty token on to Weaviate
        raise RuntimeError(f"Error refreshing token: {result.stderr}")
    return result.stdout.strip()
def re_instantiate_weaviate() -> weaviate.Client:
    token = refresh_token()
    client = weaviate.Client(
        url = "https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers = {
            "X-Palm-Api-Key": token,
        }
    )
    return client
# Run this every ~60 minutes
client = re_instantiate_weaviate()
With google-auth
Another way is through Google's own authentication library, google-auth.
See the links to google-auth in Python and Node.js libraries.
You can then periodically call the refresh function (see Python docs) to obtain a renewed token, and re-instantiate the Weaviate client.
For example, you could periodically run:
client = re_instantiate_weaviate()
Where re_instantiate_weaviate is something like:
from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import weaviate
def get_credentials() -> Credentials:
    credentials = Credentials.from_service_account_file('path/to/your/service-account.json', scopes=['openid'])
    request = Request()
    credentials.refresh(request)
    return credentials
def re_instantiate_weaviate() -> weaviate.Client:
    credentials = get_credentials()
    token = credentials.token
    client = weaviate.Client(
        url = "https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers = {
            "X-Palm-Api-Key": token,
        }
    )
    return client
# Run this every ~60 minutes
client = re_instantiate_weaviate()
The service account key shown above can be generated by following this guide.
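The periodic refresh described in this section can also be driven by the token's age rather than by a fixed timer. The following is a minimal sketch under stated assumptions: the TokenCache class, its parameter names, and the 5-minute safety margin are hypothetical, and only the 1-hour lifetime comes from the text above. The function that actually obtains a token is passed in, so either approach shown earlier could supply it.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches an access token and fetches a new one shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        lifetime_seconds: float = 3600.0,       # standard 1-hour token lifetime
        safety_margin_seconds: float = 300.0,   # hypothetical buffer before expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._lifetime = lifetime_seconds
        self._margin = safety_margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def is_stale(self) -> bool:
        """True if no token was fetched yet or the cached one is close to expiry."""
        if self._token is None:
            return True
        age = self._clock() - self._fetched_at
        return age >= self._lifetime - self._margin

    def get(self) -> str:
        """Return a usable token, fetching a new one only when needed."""
        if self.is_stale():
            self._token = self._fetch_token()
            self._fetched_at = self._clock()
        return self._token


# Example with a stand-in fetcher; replace it with a real token source.
cache = TokenCache(fetch_token=lambda: "example-token")
headers = {"X-Palm-Api-Key": cache.get()}
print(headers)
```

Because the header is set when the client is created, a new token still means re-instantiating the Weaviate client. Checking is_stale() before each batch of queries is one way to decide when to do so.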
Module configuration
This module is enabled and pre-configured on Weaviate Cloud Services.
Docker Compose file (Weaviate open source only)
You can enable the Generative PaLM module in your Docker Compose file (e.g. docker-compose.yml). Add the generative-palm module (alongside any other module you may need) to the ENABLE_MODULES property, like this:
ENABLE_MODULES: 'text2vec-palm,generative-palm'
See a full example of a Docker configuration with generative-palm
Here is a full example of a Docker configuration, which uses the generative-palm module in combination with text2vec-palm, and provides the API key:
---
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:1.21.3
    ports:
    - 8080:8080
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-palm'
      ENABLE_MODULES: 'text2vec-palm,generative-palm'
      PALM_APIKEY: sk-foobar  # Setting this parameter is optional; you can also provide the key at runtime.
      CLUSTER_HOSTNAME: 'node1'
Schema configuration
You can define settings for this module in the schema, including the API endpoint and project information, as well as optional model parameters.
Note that the projectId parameter is required.
Example schema
For example, the following schema configuration will set the PaLM API information, as well as the optional parameters.
- The "projectId" is REQUIRED, and may be something like "cloud-large-language-models",
- The "apiEndpoint" is optional, and may be something like "us-central1-aiplatform.googleapis.com", and
- The "modelId" is optional, and may be something like "chat-bison".
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      ...,
      "moduleConfig": {
        "generative-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",  // Required. Replace with your value: (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT",            // Optional. Defaults to "us-central1-aiplatform.googleapis.com"
          "modelId": "YOUR-GOOGLE-CLOUD-ENDPOINT-ID",    // Optional. Defaults to "chat-bison"
          "temperature": 0.2,                            // Optional
          "maxOutputTokens": 512,                        // Optional
          "topK": 3,                                     // Optional
          "topP": 0.95                                   // Optional
        }
      }
    }
  ]
}
See the relevant PaLM API documentation for further details on these parameters.
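If you assemble the class definition in code, a small helper can enforce the required projectId and keep optional settings out of the configuration unless they are set. This is only a sketch: the function name build_generative_palm_config and its argument names are hypothetical, while the configuration keys are the ones shown in the example schema above.

```python
from typing import Any, Dict, Optional

# Optional model parameters listed in the example schema
OPTIONAL_MODEL_PARAMS = {"temperature", "maxOutputTokens", "topK", "topP"}


def build_generative_palm_config(
    project_id: str,
    api_endpoint: Optional[str] = None,
    model_id: Optional[str] = None,
    **model_params: Any,
) -> Dict[str, Dict[str, Any]]:
    """Build the moduleConfig entry for generative-palm (hypothetical helper)."""
    if not project_id:
        raise ValueError("projectId is required")

    unknown = set(model_params) - OPTIONAL_MODEL_PARAMS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")

    config: Dict[str, Any] = {"projectId": project_id}
    if api_endpoint is not None:
        config["apiEndpoint"] = api_endpoint
    if model_id is not None:
        config["modelId"] = model_id
    config.update(model_params)
    return {"generative-palm": config}


document_class = {
    "class": "Document",
    "description": "A class called document",
    "moduleConfig": build_generative_palm_config(
        "YOUR-GOOGLE-CLOUD-PROJECT-ID",  # Replace with your value
        model_id="chat-bison",
        temperature=0.2,
    ),
}
print(document_class)
```

Settings left out of the dictionary fall back to the module defaults.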
New to Weaviate Schemas?
If you are new to Weaviate, check out the Weaviate schema tutorial.
How to use
This module extends the _additional {...} property with a generate operator.
generate takes the following arguments:
Field | Data Type | Required | Example | Description |
---|---|---|---|---|
singleResult {prompt} | string | no | Summarize the following in a tweet: {summary} | Generates a response for each individual search result. You need to include at least one result field in the prompt, between braces. |
groupedResult {task} | string | no | Explain why these results are similar to each other | Generates a single response for all search results |
Example of properties in the prompt
When piping the results to the prompt, at least one field returned by the query must be added to the prompt. If you don't add any fields, Weaviate will throw an error.
For example, assume your schema looks like this:
{
Article {
title
summary
}
}
You can add both title and summary to the prompt by enclosing them in curly brackets:
{
Get {
Article {
title
summary
_additional {
generate(
singleResult: {
prompt: """
Summarize the following in a tweet:
{title} - {summary}
"""
}
) {
singleResult
error
}
}
}
}
}
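To catch a prompt without any usable field before sending the query, you could run a client-side check like the one sketched below. It is an illustration only: the function names are hypothetical, the regular expression assumes property names consist of letters, digits and underscores, and Weaviate still performs its own validation.

```python
import re
from typing import List

# Matches placeholders such as {title} or {summary}
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fields_used_in_prompt(prompt: str, returned_fields: List[str]) -> List[str]:
    """Return the placeholders in the prompt that match fields returned by the query."""
    return [name for name in PLACEHOLDER.findall(prompt) if name in returned_fields]


def validate_prompt(prompt: str, returned_fields: List[str]) -> List[str]:
    """Raise if the prompt references none of the returned fields."""
    used = fields_used_in_prompt(prompt, returned_fields)
    if not used:
        raise ValueError(
            "The prompt must include at least one returned field in curly brackets, "
            f"e.g. one of: {returned_fields}"
        )
    return used


prompt = "Summarize the following in a tweet:\n{title} - {summary}"
print(validate_prompt(prompt, ["title", "summary"]))
```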
Example - single result
Here is an example of a query where:
- we run a vector search (with nearText) to find articles about "Italian food",
- then we ask the generator module to describe each result as a Facebook ad.
  - the query asks for the summary field, which it then includes in the prompt argument of the generate operator.
- GraphQL
- Python
- JavaScript/TypeScript
- Go
- Java
- Curl
{
Get {
Article(
nearText: {
concepts: ["Italian food"]
}
limit: 1
) {
title
summary
_additional {
generate(
singleResult: {
prompt: """
Describe the following as a Facebook Ad: {summary}
"""
}
) {
singleResult
error
}
}
}
}
}
import weaviate
client = weaviate.Client(
url = "https://some-endpoint.weaviate.network/",
additional_headers={
"X-Palm-Api-Key": "YOUR-PALM-API-KEY"
}
)
# instruction for the generative module
generatePrompt = "Describe the following as a Facebook Ad: {summary}"
result = (
client.query
.get("Article", ["title", "summary"])
.with_generate(single_prompt=generatePrompt)
.with_near_text({
"concepts": ["Italian food"]
})
.with_limit(5)
).do()
print(result)
import weaviate, { ApiKey } from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'https',
host: 'some-endpoint.weaviate.network', // Replace with your endpoint
apiKey: new ApiKey('YOUR-WEAVIATE-API-KEY'), // Replace with your Weaviate API key
headers: { 'X-Palm-Api-Key': process.env['PALM_API_KEY'] }, // Replace with your PALM API key
});
// instruction for the generative module
const generatePrompt = 'Describe the following as a Facebook Ad: {summary}';
const response = await client.graphql
.get()
.withClassName('Article')
.withFields('title summary')
.withNearText({
concepts: ['Italian food'],
})
.withGenerate({
singlePrompt: generatePrompt,
})
.withLimit(5)
.do();
console.log(JSON.stringify(response, null, 2));
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "some-endpoint.weaviate.network",
Scheme: "https",
Headers: map[string]string{"X-Palm-Api-Key": "YOUR-PALM-API-KEY"},
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
ctx := context.Background()
fields := []graphql.Field{
{Name: "title"},
{Name: "summary"},
}
concepts := []string{"Italian food"}
nearText := client.GraphQL().NearTextArgBuilder().
WithConcepts(concepts)
gs := graphql.NewGenerativeSearch().SingleResult("\"Describe the following as a Facebook Ad: {summary}\"")
result, err := client.GraphQL().Get().
WithClassName("Article").
WithFields(fields...).
WithNearText(nearText).
WithGenerativeSearch(gs).
WithLimit(5).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import java.util.HashMap;
import java.util.Map;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.fields.Field;
import io.weaviate.client.v1.graphql.query.fields.GenerativeSearchBuilder;
public class App {
public static void main(String[] args) {
Map<String, String> headers = new HashMap<String, String>() {
{put("X-Palm-Api-Key", "YOUR-PALM-API-KEY");}
};
Config config = new Config("https", "some-endpoint.weaviate.network", headers);
WeaviateClient client = new WeaviateClient(config);
// instruction for the generative module
GenerativeSearchBuilder generativeSearch = GenerativeSearchBuilder.builder()
.singleResultPrompt("\"Describe the following as a Facebook Ad: {summary}\"")
.build();
Field title = Field.builder().name("title").build();
Field summary = Field.builder().name("summary").build();
NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
.concepts(new String[]{ "Italian food" })
.build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Article")
.withFields(title, summary)
.withGenerativeSearch(generativeSearch)
.withNearText(nearText)
.withLimit(5)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
echo '{
"query": "{
Get {
Article(
nearText: {
concepts: [\"Italian food\"]
}
limit: 5
) {
title
summary
_additional {
generate(
singleResult: {
prompt: \"\"\"
Describe the following as a Facebook Ad: {summary}
\"\"\"
}
) {
singleResult
error
}
}
}
}
}
"
}' | tr -d "\n" | curl \
-X POST \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer $WEAVIATE_API_KEY" \
-H "X-Palm-Api-Key: $PALM_API_KEY" \
-d @- \
https://some-endpoint.weaviate.network/v1/graphql
Example response - single result
{
"data": {
"Get": {
"Article": [
{
"_additional": {
"generate": {
"error": null,
"singleResult": "This Facebook Ad will explore the fascinating history of Italian food and how it has evolved over time. Learn from Dr Eva Del Soldato and Diego Zancani, two experts in Italian food history, about how even the emoji for pasta isn't just pasta -- it's a steaming plate of spaghetti heaped with tomato sauce on top. Discover how Italy's complex history has shaped the Italian food we know and love today."
}
},
"summary": "Even the emoji for pasta isn't just pasta -- it's a steaming plate of spaghetti heaped with tomato sauce on top. But while today we think of tomatoes as inextricably linked to Italian food, that hasn't always been the case. \"People tend to think Italian food was always as it is now -- that Dante was eating pizza,\" says Dr Eva Del Soldato , associate professor of romance languages at the University of Pennsylvania, who leads courses on Italian food history. In fact, she says, Italy's complex history -- it wasn't unified until 1861 -- means that what we think of Italian food is, for the most part, a relatively modern concept. Diego Zancani, emeritus professor of medieval and modern languages at Oxford University and author of \"How We Fell in Love with Italian Food,\" agrees.",
"title": "How this fruit became the star of Italian cooking"
}
]
}
}
}
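A response shaped like the one above can be unpacked with a few lines of code. The sketch below assumes the response is available as a Python dictionary with exactly this structure. The helper name collect_single_results and the keys of the returned rows are hypothetical.

```python
from typing import Any, Dict, List


def collect_single_results(response: Dict[str, Any], class_name: str) -> List[Dict[str, Any]]:
    """Pair each returned object with its generated text and error, if any."""
    rows = []
    for obj in response["data"]["Get"][class_name]:
        # "generate" may be missing or null, so fall back to an empty dict
        generate = (obj.get("_additional") or {}).get("generate") or {}
        properties = {key: value for key, value in obj.items() if key != "_additional"}
        rows.append(
            {
                "properties": properties,
                "generated": generate.get("singleResult"),
                "error": generate.get("error"),
            }
        )
    return rows


example_response = {
    "data": {
        "Get": {
            "Article": [
                {
                    "_additional": {"generate": {"error": None, "singleResult": "An ad text"}},
                    "title": "How this fruit became the star of Italian cooking",
                }
            ]
        }
    }
}

for row in collect_single_results(example_response, "Article"):
    if row["error"]:
        print("Generation failed:", row["error"])
    else:
        print(row["properties"]["title"], "->", row["generated"])
```

Checking the error field for each object matters here because generation runs once per result, so some objects may fail while others succeed.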
Example - grouped result
Here is an example of a query where:
- we run a vector search (with nearText) to find publications about finance,
- then we ask the generator module to explain why these articles are about finance.
- GraphQL
- Python
- JavaScript/TypeScript
- Go
- Java
- Curl
{
Get {
Publication(
nearText: {
concepts: ["magazine or newspaper about finance"]
certainty: 0.75
}
) {
name
_additional {
generate(
groupedResult: {
task: "Explain why these magazines or newspapers are about finance"
}
) {
groupedResult
error
}
}
}
}
}
import weaviate
client = weaviate.Client(
url = "https://some-endpoint.weaviate.network/",
additional_headers={
"X-Palm-Api-Key": "YOUR-PALM-API-KEY"
}
)
# instruction for the generative module
generateTask = "Explain why these magazines or newspapers are about finance"
result = (
client.query
.get("Publication", ["name"])
.with_generate(grouped_task=generateTask)
.with_near_text({
"concepts": ["magazine or newspaper about finance"]
})
.with_limit(5)
).do()
print(result)
import weaviate, { ApiKey } from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'https',
host: 'some-endpoint.weaviate.network', // Replace with your endpoint
apiKey: new ApiKey('YOUR-WEAVIATE-API-KEY'), // Replace with your Weaviate API key
headers: { 'X-Palm-Api-Key': process.env['PALM_API_KEY'] }, // Replace with your PALM API key
});
// instruction for the generative module
const generateTask = 'Explain why these magazines or newspapers are about finance';
const response = await client.graphql
.get()
.withClassName('Publication')
.withFields('name')
.withNearText({
concepts: ['magazine or newspaper about finance'],
})
.withGenerate({
groupedTask: generateTask,
})
.withLimit(5)
.do();
console.log(JSON.stringify(response, null, 2));
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "some-endpoint.weaviate.network",
Scheme: "https",
Headers: map[string]string{"X-Palm-Api-Key": "YOUR-PALM-API-KEY"},
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
ctx := context.Background()
name := graphql.Field{Name: "name"}
concepts := []string{"magazine or newspaper about finance"}
nearText := client.GraphQL().NearTextArgBuilder().
WithConcepts(concepts)
gs := graphql.NewGenerativeSearch().GroupedResult("Explain why these magazines or newspapers are about finance")
result, err := client.GraphQL().Get().
WithClassName("Publication").
WithFields(name).
WithGenerativeSearch(gs).
WithNearText(nearText).
WithLimit(5).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import java.util.HashMap;
import java.util.Map;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.fields.Field;
import io.weaviate.client.v1.graphql.query.fields.GenerativeSearchBuilder;
public class App {
public static void main(String[] args) {
Map<String, String> headers = new HashMap<String, String>() { {
put("X-Palm-Api-Key", "YOUR-PALM-API-KEY");
} };
Config config = new Config("https", "some-endpoint.weaviate.network", headers);
WeaviateClient client = new WeaviateClient(config);
// instruction for the generative module
GenerativeSearchBuilder generativeSearch = GenerativeSearchBuilder.builder()
.groupedResultTask("Explain why these magazines or newspapers are about finance")
.build();
Field name = Field.builder().name("name").build();
NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
.concepts(new String[]{ "magazine or newspaper about finance" })
.build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Publication")
.withFields(name)
.withGenerativeSearch(generativeSearch)
.withNearText(nearText)
.withLimit(5)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
echo '{
"query": "{
Get {
Publication(
nearText: {
concepts: [\"magazine or newspaper about finance\"]
}
limit: 5
) {
name
_additional {
generate(
groupedResult: {
task: \"Explain why these magazines or newspapers are about finance\"
}
) {
groupedResult
error
}
}
}
}
}
"
}' | tr -d "\n" | curl \
-X POST \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer $WEAVIATE_API_KEY" \
-H "X-Palm-Api-Key: $PALM_API_KEY" \
-d @- \
https://some-endpoint.weaviate.network/v1/graphql
Example response - grouped result
{
"data": {
"Get": {
"Publication": [
{
"_additional": {
"generate": {
"error": null,
"groupedResult": "The Financial Times, Wall Street Journal, and The New York Times Company are all about finance because they provide news and analysis on the latest financial markets, economic trends, and business developments. They also provide advice and commentary on personal finance, investments, and other financial topics."
}
},
"name": "Financial Times"
},
{
"_additional": {
"generate": null
},
"name": "Wall Street Journal"
},
{
"_additional": {
"generate": null
},
"name": "The New York Times Company"
}
]
}
}
}
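In the example response above, the generated text is attached to the first object only, and the remaining objects have a null generate field. The sketch below reads the grouped result under that assumption, given the response as a Python dictionary. The helper name extract_grouped_result is hypothetical.

```python
from typing import Any, Dict, Optional


def extract_grouped_result(response: Dict[str, Any], class_name: str) -> Optional[str]:
    """Return the grouped result from the first object that carries one."""
    for obj in response["data"]["Get"][class_name]:
        generate = (obj.get("_additional") or {}).get("generate")
        if not generate:
            continue  # objects without generated output have "generate": null
        if generate.get("error"):
            raise RuntimeError(f"Generation failed: {generate['error']}")
        return generate.get("groupedResult")
    return None


example_response = {
    "data": {
        "Get": {
            "Publication": [
                {
                    "_additional": {
                        "generate": {"error": None, "groupedResult": "They all cover finance."}
                    },
                    "name": "Financial Times",
                },
                {"_additional": {"generate": None}, "name": "Wall Street Journal"},
            ]
        }
    }
}

print(extract_grouped_result(example_response, "Publication"))
```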
Additional information
Supported models
The chat-bison model is used by default. The model has the following properties:
- Max input tokens: 8,192
- Max output tokens: 1,024
- Training data: Up to Feb 2023
More resources
For additional information, try these sources.