text2vec-palm
Overview
The text2vec-palm
module enables Weaviate to obtain vectors using a Google API. You can use this with Google Cloud Vertex AI, or with Google Google AI Studio.
AI Studio (previously called MakerSuite) support was added in version 1.22.4
.
Key notes:
- As it uses a third-party API, you will need an API key.
- Its usage may incur costs.
- Please check the vendor pricing (e.g. check Google Vertex AI pricing), especially before vectorizing large amounts of data.
- This module is available on Weaviate Cloud Services (WCS).
- Enabling this module will enable the
nearText
search operator. - Model names differ between Vertex AI and AI Studio.
- The default model for Vertex AI is
textembedding-gecko@001
. - The default model for AI Studio
embedding-001
.
- The default model for Vertex AI is
Configuring text2vec-palm
for VertexAI or AI Studio
The module can be used with either Google Cloud Vertex AI or AI Studio. The configurations vary slightly for each.
Google Cloud Vertex AI
As of the time of writing (September 2023), you must manually enable the Vertex AI API on your Google Cloud project. You can do so by following the instructions here.
API key for Vertex AI users
This is called an access token
in Google Cloud.
If you have the Google Cloud CLI tool installed and set up, you can view your token by running the following command:
gcloud auth print-access-token
Token expiry for Vertex AI users
By default, Google Cloud's OAuth 2.0 access tokens have a lifetime of 1 hour. You can create tokens that last up to 12 hours. To create longer lasting tokens, follow the instructions in the Google Cloud IAM Guide.
Since the OAuth token is only valid for a limited time, you must periodically replace the token with a new one. After you generate the new token, you have to re-instantiate your Weaviate client to use it.
You can update the OAuth token manually, but manual updates may not be appropriate for your use case.
You can also automate the OAth token update. Weaviate does not control the OAth token update procedure. However, here are some automation options:
With Google Cloud CLI
If you are using the Google Cloud CLI, write a script to periodically update the token and extract the results.
Python code to extract the token looks like this:
client = re_instantiate_weaviate()
This is the re_instantiate_weaviate
function:
import subprocess
import weaviate
def refresh_token() -> str:
result = subprocess.run(["gcloud", "auth", "print-access-token"], capture_output=True, text=True)
if result.returncode != 0:
print(f"Error refreshing token: {result.stderr}")
return None
return result.stdout.strip()
def re_instantiate_weaviate() -> weaviate.Client:
token = refresh_token()
client = weaviate.Client(
url = "https://WEAVIATE_INSTANCE_URL", # Replace WEAVIATE_INSTANCE_URL with the URL
additional_headers = {
"X-PaLM-Api-Key": token,
}
)
return client
# Run this every ~60 minutes
client = re_instantiate_weaviate()
With google-auth
Another way is through Google's own authentication library google-auth
.
See the links to google-auth
in Python and Node.js libraries.
You can, then, periodically the refresh
function (see Python docs) to obtain a renewed token, and re-instantiate the Weaviate client.
For example, you could periodically run:
client = re_instantiate_weaviate()
Where re_instantiate_weaviate
is something like:
from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import weaviate
import os
def get_credentials() -> Credentials:
credentials = Credentials.from_service_account_file(
"path/to/your/service-account.json",
scopes=[
"https://www.googleapis.com/auth/generative-language",
"https://www.googleapis.com/auth/cloud-platform",
],
)
request = Request()
credentials.refresh(request)
return credentials
def re_instantiate_weaviate() -> weaviate.Client:
credentials = get_credentials()
token = credentials.token
client = weaviate.connect_to_wcs( # e.g. if you use the Weaviate Cloud Service
cluster_url="https://WEAVIATE_INSTANCE_URL", # Replace WEAVIATE_INSTANCE_URL with the URL
auth_credentials=weaviate.auth.AuthApiKey(os.getenv("WCS_DEMO_RO_KEY")), # Replace with your WCS key
headers={
"X-PaLM-Api-Key": token,
},
)
return client
# Run this every ~60 minutes
client = re_instantiate_weaviate()
The service account key shown above can be generated by following this guide.
AI Studio
At the time of writing (November 2023), AI Studio is not available in all regions. See this page for the latest information.
API key for AI Studio users
You can obtain an API key by logging in to your AI Studio account and creating an API key. This is the key to pass on to Weaviate. This key does not have an expiry date.
apiEndpoint
for AI Studio users
In the Weaviate class configuration, set the apiEndpoint
to generativelanguage.googleapis.com
.
Weaviate instance configuration
If you use Weaviate Cloud Services (WCS), this module is already enabled and pre-configured. You cannot edit the configuration in WCS.
Docker Compose file
To use text2vec-palm
, you must enable it in your Docker Compose file (docker-compose.yml
). You can do so manually, or create one using the Weaviate configuration tool.
Parameters
ENABLE_MODULES
(Required): The modules to enable. Includetext2vec-palm
to enable the module.DEFAULT_VECTORIZER_MODULE
(Optional): The default vectorizer module. You can set this totext2vec-palm
to make it the default for all classes.PALM_APIKEY
(Optional): Your Google API key. You can also provide the key at query time.
---
version: '3.4'
services:
weaviate:
image: cr.weaviate.io/semitechnologies/weaviate:1.24.10
restart: on-failure:0
ports:
- 8080:8080
- 50051:50051
environment:
QUERY_DEFAULTS_LIMIT: 20
AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
PERSISTENCE_DATA_PATH: "./data"
ENABLE_MODULES: text2vec-palm
DEFAULT_VECTORIZER_MODULE: text2vec-palm
PALM_APIKEY: sk-foobar # Optional; you can also provide the key at query time.
CLUSTER_HOSTNAME: 'node1'
...
Class configuration
You can configure how the module will behave in each class through the Weaviate schema.
API settings
Parameters
projectId
(Only required if using Vertex AI): e.g.cloud-large-language-models
apiEndpoint
(Optional): e.g.us-central1-aiplatform.googleapis.com
modelId
(Optional): e.g.textembedding-gecko@001
(Vertex AI) orembedding-001
(AI Studio)titleProperty
(Optional): The Weaviate property name for thegecko-002
orgecko-003
model to use as the title.
Example
{
"classes": [
{
"class": "Document",
"description": "A class called document",
"vectorizer": "text2vec-palm",
"moduleConfig": {
"text2vec-palm": {
"projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID", // Only required if using Vertex AI. Replace with your value: (e.g. "cloud-large-language-models")
"apiEndpoint": "YOUR-API-ENDPOINT", // Optional. Defaults to "us-central1-aiplatform.googleapis.com".
"modelId": "YOUR-MODEL-ID", // Optional.
"titleProperty": "YOUR-TITLE-PROPERTY" // Optional (e.g. "title")
},
},
}
]
}
Vectorization settings
You can set vectorizer behavior using the moduleConfig
section under each class and property:
Class-level
vectorizer
- what module to use to vectorize the data.vectorizeClassName
– whether to vectorize the class name. Default:true
.
Property-level
skip
– whether to skip vectorizing the property altogether. Default:false
vectorizePropertyName
– whether to vectorize the property name. Default:false
Example
{
"classes": [
{
"class": "Document",
"description": "A class called document",
"vectorizer": "text2vec-palm",
"moduleConfig": {
"text2vec-palm": {
"projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID", // Only required if using Vertex AI. Replace with your value: (e.g. "cloud-large-language-models")
"apiEndpoint": "YOUR-API-ENDPOINT", // Optional. Defaults to "us-central1-aiplatform.googleapis.com".
"modelId": "YOUR-MODEL-ID", // Optional.
"titleProperty": "YOUR-TITLE-PROPERTY" // Optional (e.g. "title")
"vectorizeClassName": false
},
},
"properties": [
{
"name": "content",
"dataType": ["text"],
"description": "Content that will be vectorized",
"moduleConfig": {
"text2vec-palm": {
"skip": false,
"vectorizePropertyName": false
}
}
}
]
}
]
}
Query-time parameters
API key
You can supply the API key at query time by adding it to the HTTP header:
"X-PaLM-Api-Key": "YOUR-PALM-API-KEY"
Additional information
Available models
You can specify the model as a part of the schema as shown earlier. Model names differ between Vertex AI and AI Studio.
The available models for Vertex AI are:
textembedding-gecko@001
(default)textembedding-gecko@002
textembedding-gecko@003
textembedding-gecko@latest
textembedding-gecko-multilingual@001
textembedding-gecko-multilingual@latest
text-embedding-preview-0409
text-multilingual-embedding-preview-0409
The available models for AI Studio are:
embedding-001
(default)text-embedding-004
For legacy embedding-gecko-001
model users
The embedding-gecko-001
has been deprecated by Google, and will not be available for new users. However, Google may deprecate the endpoint for existing users in the future.
Task type
The Google API requires a task_type
parameter at the time of vectorization for some models.
This is not required with the text2vec-palm
module, as Weaviate determines the task_type
Google API parameter based on the usage context.
During object creation, Weaviate supplies RETRIEVAL_DOCUMENT
as the task type. During search, Weaviate supplies RETRIEVAL_QUERY
as the task type.
Note
For more information, please see the official documentation.
API rate limits
Since this module uses your API key, your account's corresponding rate limits will also apply to the module. Weaviate will output any rate-limit related error messages generated by the API.
If you exceed your rate limit, Weaviate will output the error message generated by the API. If this persists, we suggest requesting to increase your rate limit by contacting Vertex AI support describing your use case with Weaviate.
Import throttling
One potential solution to rate limiting would be to throttle the import within your application. We include an example below.
See code example
- Python
- Go
from weaviate import Client
import time
def configure_batch(client: Client, batch_size: int, batch_target_rate: int):
"""
Configure the weaviate client's batch so it creates objects at `batch_target_rate`.
Parameters
----------
client : Client
The Weaviate client instance.
batch_size : int
The batch size.
batch_target_rate : int
The batch target rate as # of objects per second.
"""
def callback(batch_results: dict) -> None:
# you could print batch errors here
time_took_to_create_batch = batch_size * (client.batch.creation_time/client.batch.recommended_num_objects)
time.sleep(
max(batch_size/batch_target_rate - time_took_to_create_batch + 1, 0)
)
client.batch.configure(
batch_size=batch_size,
timeout_retries=5,
callback=callback,
)
package main
import (
"context"
"time"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate/entities/models"
)
var (
// adjust to your liking
targetRatePerMin = 600
batchSize = 50
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
// replace those 10000 empty objects with your actual data
objects := make([]*models.Object, 10000)
// we aim to send one batch every tickInterval second.
tickInterval := time.Duration(batchSize/targetRatePerMinute) * time.Minute
t := time.NewTicker(tickInterval)
before := time.Now()
for i := 0; i < len(objects); i += batchSize {
// create a fresh batch
batch := client.Batch().ObjectsBatcher()
// add batchSize objects to the batch
for j := i; j < i+batchSize; j++ {
batch = batch.WithObject(objects[i+j])
}
// send off batch
res, err := batch.Do(context.Background())
// TODO: inspect result for individual errors
_ = res
// TODO: check request error
_ = err
// we wait for the next tick. If the previous batch took longer than
// tickInterval, we won't need to wait, effectively making this an
// unthrottled import.
<-t.C
}
}
Usage example
This is an example of a nearText
query with text2vec-palm
.
- Python (v4)
- Python (v3)
- JavaScript/TypeScript
- Go
- Java
- Curl
- GraphQL
import weaviate
from weaviate.classes.query import MetadataQuery, Move
import os
client = weaviate.connect_to_local(
headers={
"X-PaLM-Api-Key": "YOUR_PALM_APIKEY",
}
)
publications = client.collections.get("Publication")
response = publications.query.near_text(
query="fashion",
distance=0.6,
move_to=Move(force=0.85, concepts="haute couture"),
move_away=Move(force=0.45, concepts="finance"),
return_metadata=MetadataQuery(distance=True),
limit=2
)
for o in response.objects:
print(o.properties)
print(o.metadata)
client.close()
import weaviate
client = weaviate.Client(
url="http://localhost:8080",
additional_headers={
"X-PaLM-Api-Key": "YOUR-PALM-API-KEY"
}
)
nearText = {
"concepts": ["fashion"],
"distance": 0.6, # prior to v1.14 use "certainty" instead of "distance"
"moveAwayFrom": {
"concepts": ["finance"],
"force": 0.45
},
"moveTo": {
"concepts": ["haute couture"],
"force": 0.85
}
}
result = (
client.query
.get("Publication", "name")
.with_additional(["certainty OR distance"]) # note that certainty is only supported if distance==cosine
.with_near_text(nearText)
.do()
)
print(result)
import weaviate from 'weaviate-ts-client';
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
headers: { 'X-PaLM-Api-Key': 'YOUR-PALM-API-KEY' },
});
const response = await client.graphql
.get()
.withClassName('Publication')
.withFields('name _additional { certainty distance }') // note that certainty is only supported if distance==cosine
.withNearText({
concepts: ['fashion'],
distance: 0.6, // prior to v1.14 use certainty instead of distance
moveAwayFrom: {
concepts: ['finance'],
force: 0.45,
},
moveTo: {
concepts: ['haute couture'],
force: 0.85,
},
})
.do();
console.log(response);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
Headers: map[string]string{"X-PaLM-Api-Key": "YOUR-PALM-API-KEY"},
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Publication"
name := graphql.Field{Name: "name"}
_additional := graphql.Field{
Name: "_additional", Fields: []graphql.Field{
{Name: "certainty"}, // only supported if distance==cosine
{Name: "distance"}, // always supported
},
}
concepts := []string{"fashion"}
distance := float32(0.6)
moveAwayFrom := &graphql.MoveParameters{
Concepts: []string{"finance"},
Force: 0.45,
}
moveTo := &graphql.MoveParameters{
Concepts: []string{"haute couture"},
Force: 0.85,
}
nearText := client.GraphQL().NearTextArgBuilder().
WithConcepts(concepts).
WithDistance(distance). // use WithCertainty(certainty) prior to v1.14
WithMoveTo(moveTo).
WithMoveAwayFrom(moveAwayFrom)
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(name, _additional).
WithNearText(nearText).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.argument.NearTextMoveParameters;
import io.weaviate.client.v1.graphql.query.fields.Field;
import java.util.HashMap;
import java.util.Map;
public class App {
public static void main(String[] args) {
Map<String, String> headers = new HashMap<String, String>() { {
put("X-PaLM-Api-Key", "YOUR-PALM-API-KEY");
} };
Config config = new Config("http", "localhost:8080", headers);
WeaviateClient client = new WeaviateClient(config);
NearTextMoveParameters moveTo = NearTextMoveParameters.builder()
.concepts(new String[]{ "haute couture" }).force(0.85f).build();
NearTextMoveParameters moveAway = NearTextMoveParameters.builder()
.concepts(new String[]{ "finance" }).force(0.45f)
.build();
NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
.concepts(new String[]{ "fashion" })
.distance(0.6f) // use .certainty(0.7f) prior to v1.14
.moveTo(moveTo)
.moveAwayFrom(moveAway)
.build();
Field name = Field.builder().name("name").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder().name("certainty").build(), // only supported if distance==cosine
Field.builder().name("distance").build(), // always supported
}).build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Publication")
.withFields(name, _additional)
.withNearText(nearText)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
# Note: Under nearText, use `certainty` instead of distance prior to v1.14
# Under _additional, `certainty` is only supported if distance==cosine, but `distance` is always supported
echo '{
"query": "{
Get{
Publication(
nearText: {
concepts: [\"fashion\"],
distance: 0.6,
moveAwayFrom: {
concepts: [\"finance\"],
force: 0.45
},
moveTo: {
concepts: [\"haute couture\"],
force: 0.85
}
}
){
name
_additional {
certainty
distance
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer learn-weaviate' \
-H "X-PaLM-Api-Key: $PALM_API_KEY" \
-d @- \
https://edu-demo.weaviate.network/v1/graphql
{
Get{
Publication(
nearText: {
concepts: ["fashion"],
distance: 0.6 # prior to v1.14 use "certainty" instead of "distance"
moveAwayFrom: {
concepts: ["finance"],
force: 0.45
},
moveTo: {
concepts: ["haute couture"],
force: 0.85
}
}
){
name
_additional {
certainty # only supported if distance==cosine.
distance # always supported
}
}
}
}
Questions and feedback
If you have any questions or feedback, let us know in our user forum.