text2vec-palm

Overview

The text2vec-palm module enables Weaviate to obtain vectors using PaLM embeddings. You can use this with Google Cloud Vertex AI, or with Google AI Studio.

Releases and versions

AI Studio (previously called MakerSuite) support was added in version 1.22.4.

Key notes:

  • As it uses a third-party API, you will need an API key.
  • Its usage may incur costs.
    • Please check the vendor pricing (e.g. check Google Vertex AI pricing), especially before vectorizing large amounts of data.
  • This module is available on Weaviate Cloud Services (WCS).
  • Enabling this module will enable the nearText search operator.
  • Model names differ between Vertex AI and AI Studio.
    • The default model for Vertex AI is textembedding-gecko@001.
    • The default model for AI Studio is embedding-gecko-001.
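For illustration, the defaults above can be captured in a small lookup. This is a hypothetical helper, not part of the Weaviate API, and the platform labels are our own:

```python
# Hypothetical mapping from platform label to default model name,
# per the defaults listed above. Not part of the Weaviate API.
DEFAULT_MODELS = {
    "vertex-ai": "textembedding-gecko@001",
    "ai-studio": "embedding-gecko-001",
}

def default_model(platform: str) -> str:
    """Look up the default embedding model; raises KeyError if unknown."""
    return DEFAULT_MODELS[platform]
```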

Configuring text2vec-palm for VertexAI or AI Studio

The module can be used with either Google Cloud Vertex AI or AI Studio. The configurations vary slightly for each.

Google Cloud Vertex AI

As of the time of writing (September 2023), you must manually enable the Vertex AI API on your Google Cloud project. You can do so by following the instructions here.

API key for Vertex AI users

This is called an access token in Google Cloud.

If you have the Google Cloud CLI tool installed and set up, you can view your token by running the following command:

gcloud auth print-access-token

Token expiry for Vertex AI users

Important

By default, Google Cloud's OAuth 2.0 access tokens have a lifetime of 1 hour. You can create tokens that last up to 12 hours. To create longer-lasting tokens, follow the instructions in the Google Cloud IAM Guide.

Since the OAuth token is only valid for a limited time, you must periodically replace the token with a new one. After you generate the new token, you have to re-instantiate your Weaviate client to use it.

You can update the OAuth token manually, but manual updates may not be appropriate for your use case.
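As a rough illustration of the timing involved, here is a minimal sketch that decides when a token issued at a given time should be replaced. It is not part of Weaviate or Google's APIs, and the 5-minute safety margin is our own assumption:

```python
import time
from typing import Optional

# Assumptions for illustration: Google Cloud's default 1-hour token
# lifetime, plus an arbitrary 5-minute safety margin before expiry.
TOKEN_LIFETIME_S = 3600
SAFETY_MARGIN_S = 300

def needs_refresh(issued_at: float, now: Optional[float] = None) -> bool:
    """Return True once the token is within the safety margin of expiry."""
    now = time.time() if now is None else now
    return now - issued_at >= TOKEN_LIFETIME_S - SAFETY_MARGIN_S
```

In a long-running application, you could check `needs_refresh` before each batch of requests and re-instantiate the client when it returns True.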

You can also automate the OAuth token update. Automating this procedure depends on your setup and is outside the scope of this documentation, but here are some automation options:

With Google Cloud CLI

If you are using the Google Cloud CLI, you could run the command through your preferred programming language and extract the result.


For example, you could periodically run:

client = re_instantiate_weaviate()

Where re_instantiate_weaviate is something like:

import subprocess
from typing import Optional

import weaviate

def refresh_token() -> Optional[str]:
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"], capture_output=True, text=True
    )
    if result.returncode != 0:
        print(f"Error refreshing token: {result.stderr}")
        return None
    return result.stdout.strip()

def re_instantiate_weaviate() -> weaviate.Client:
    token = refresh_token()

    client = weaviate.Client(
        url="https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers={
            "X-PaLM-Api-Key": token,
        },
    )
    return client

# Run this every ~60 minutes
client = re_instantiate_weaviate()

With google-auth

Another way is through Google's own authentication library google-auth.


See the links to google-auth in Python and Node.js libraries.


You can then periodically call the refresh function (see the Python docs) to obtain a renewed token, and re-instantiate the Weaviate client.

For example, you could periodically run:

client = re_instantiate_weaviate()

Where re_instantiate_weaviate is something like:

from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import weaviate
import os


def get_credentials() -> Credentials:
    credentials = Credentials.from_service_account_file(
        "path/to/your/service-account.json",
        scopes=[
            "https://www.googleapis.com/auth/generative-language",
            "https://www.googleapis.com/auth/cloud-platform",
        ],
    )
    request = Request()
    credentials.refresh(request)
    return credentials


def re_instantiate_weaviate() -> weaviate.WeaviateClient:
    credentials = get_credentials()
    token = credentials.token

    client = weaviate.connect_to_wcs(  # e.g. if you use the Weaviate Cloud Service
        cluster_url="https://some-endpoint.weaviate.network",  # Replace with your WCS URL
        auth_credentials=weaviate.auth.AuthApiKey(os.getenv("WCS_DEMO_RO_KEY")),  # Replace with your WCS key
        headers={
            "X-PaLM-Api-Key": token,
        },
    )
    return client


# Run this every ~60 minutes
client = re_instantiate_weaviate()

The service account key shown above can be generated by following this guide.

AI Studio

At the time of writing (November 2023), AI Studio is not available in all regions. See this page for the latest information.

API key for AI Studio users

You can obtain an API key by logging in to your AI Studio account and creating an API key. This is the key to pass on to Weaviate. This key does not have an expiry date.

apiEndpoint for AI Studio users

In the Weaviate class configuration, set the apiEndpoint to generativelanguage.googleapis.com.

Weaviate instance configuration

Not applicable to WCS

This module is enabled and pre-configured on Weaviate Cloud Services.

Docker Compose file

To use text2vec-palm, you must enable it in your Docker Compose file (docker-compose.yml). You can do so manually, or create one using the Weaviate configuration tool.

Parameters

  • ENABLE_MODULES (Required): The modules to enable. Include text2vec-palm to enable the module.
  • DEFAULT_VECTORIZER_MODULE (Optional): The default vectorizer module. You can set this to text2vec-palm to make it the default for all classes.
  • PALM_APIKEY (Optional): Your Google API key. You can also provide the key at query time.
---
version: '3.4'
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.24.8
    restart: on-failure:0
    ports:
      - 8080:8080
      - 50051:50051
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      ENABLE_MODULES: text2vec-palm
      DEFAULT_VECTORIZER_MODULE: text2vec-palm
      PALM_APIKEY: sk-foobar  # Optional; you can also provide the key at query time.
      CLUSTER_HOSTNAME: 'node1'
...

Class configuration

You can configure how the module will behave in each class through the Weaviate schema.

API settings

Parameters

  • projectId (Only required if using Vertex AI): e.g. cloud-large-language-models
  • apiEndpoint (Optional): e.g. us-central1-aiplatform.googleapis.com
  • modelId (Optional): e.g. textembedding-gecko@001 (Vertex AI) or embedding-gecko-001 (AI Studio)
  • titleProperty (Optional): The name of the Weaviate property to use as the title, for the gecko-002 or gecko-003 models.

Example

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",  // Only required if using Vertex AI (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT",           // Optional. Defaults to "us-central1-aiplatform.googleapis.com".
          "modelId": "YOUR-MODEL-ID",                   // Optional.
          "titleProperty": "YOUR-TITLE-PROPERTY"        // Optional (e.g. "title")
        }
      }
    }
  ]
}

Vectorization settings

You can set vectorizer behavior using the moduleConfig section under each class and property:

Class-level

  • vectorizer - what module to use to vectorize the data.
  • vectorizeClassName – whether to vectorize the class name. Default: true.

Property-level

  • skip – whether to skip vectorizing the property altogether. Default: false
  • vectorizePropertyName – whether to vectorize the property name. Default: false

Example

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",  // Only required if using Vertex AI (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT",           // Optional. Defaults to "us-central1-aiplatform.googleapis.com".
          "modelId": "YOUR-MODEL-ID",                   // Optional.
          "titleProperty": "YOUR-TITLE-PROPERTY",       // Optional (e.g. "title")
          "vectorizeClassName": false
        }
      },
      "properties": [
        {
          "name": "content",
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-palm": {
              "skip": false,
              "vectorizePropertyName": false
            }
          }
        }
      ]
    }
  ]
}

Query-time parameters

API key

You can supply the API key at query time by adding it to the HTTP header:

  • "X-PaLM-Api-Key": "YOUR-PALM-API-KEY"
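For example, a trivial helper that builds this header. This function is hypothetical, not part of any client library:

```python
# Hypothetical convenience function: wraps an API key in the header
# name Weaviate expects for this module.
def palm_headers(api_key: str) -> dict:
    return {"X-PaLM-Api-Key": api_key}
```

In the Python client, such a dict can be passed as `additional_headers` (v3 `weaviate.Client`) or `headers` (v4 connection helpers).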

Additional information

Available models

You can specify the model as a part of the schema as shown earlier. Model names differ between Vertex AI and AI Studio.

The available models for Vertex AI are:

  • textembedding-gecko@001 (stable) (default)
  • textembedding-gecko@002 (stable)
  • textembedding-gecko@003 (stable)
  • textembedding-gecko@latest (public preview: an embeddings model with enhanced AI quality)
  • textembedding-gecko-multilingual@001 (stable)
  • textembedding-gecko-multilingual@latest (public preview: an embeddings model designed to use a wide range of non-English languages.)

The available model for AI Studio is:

  • embedding-gecko-001 (stable) (default)

Task type

The Google API requires a task_type parameter at the time of vectorization for some models.

This is not required with the text2vec-palm module, as Weaviate determines the task_type Google API parameter based on the usage context.

During object creation, Weaviate supplies RETRIEVAL_DOCUMENT as the task type. During search, Weaviate supplies RETRIEVAL_QUERY as the task type.
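That behavior can be summarized in a one-line sketch. This is illustrative only; it mirrors what Weaviate does internally and is not something you call yourself:

```python
# Illustration of the module's internal choice of Google's task_type
# parameter: documents at import time, queries at search time.
def task_type(is_query: bool) -> str:
    return "RETRIEVAL_QUERY" if is_query else "RETRIEVAL_DOCUMENT"
```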

Note

For more information, please see the official documentation.

API rate limits

Since this module uses your API key, your account's corresponding rate limits will also apply to the module. Weaviate will output any rate-limit related error messages generated by the API.

If you exceed your rate limit, Weaviate will output the error message generated by the API. If this persists, consider requesting a rate limit increase by contacting Vertex AI support and describing your use case with Weaviate.
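Besides contacting support, a common client-side mitigation is to retry with exponential backoff. Here is a generic sketch, not Weaviate-specific; the retry count and base delay are arbitrary assumptions:

```python
import time

def with_backoff(fn, retries: int = 5, base_delay: float = 1.0, sleep=time.sleep):
    """Call fn(), retrying on any exception with exponentially growing waits.

    The injectable `sleep` argument makes the helper easy to test.
    """
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between attempts.
            sleep(base_delay * (2 ** attempt))
```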

Import throttling

One potential solution to rate limiting would be to throttle the import within your application. We include an example below.

from weaviate import Client
import time

def configure_batch(client: Client, batch_size: int, batch_target_rate: int):
    """
    Configure the weaviate client's batch so it creates objects at `batch_target_rate`.

    Parameters
    ----------
    client : Client
        The Weaviate client instance.
    batch_size : int
        The batch size.
    batch_target_rate : int
        The batch target rate as # of objects per second.
    """

    def callback(batch_results: dict) -> None:
        # you could print batch errors here
        time_took_to_create_batch = batch_size * (client.batch.creation_time / client.batch.recommended_num_objects)
        time.sleep(
            max(batch_size / batch_target_rate - time_took_to_create_batch + 1, 0)
        )

    client.batch.configure(
        batch_size=batch_size,
        timeout_retries=5,
        callback=callback,
    )

Usage example

This is an example of a nearText query with text2vec-palm.

import weaviate
from weaviate.classes.query import MetadataQuery, Move

client = weaviate.connect_to_local(
    headers={
        "X-PaLM-Api-Key": "YOUR_PALM_APIKEY",
    }
)

publications = client.collections.get("Publication")

response = publications.query.near_text(
    query="fashion",
    distance=0.6,
    move_to=Move(force=0.85, concepts="haute couture"),
    move_away=Move(force=0.45, concepts="finance"),
    return_metadata=MetadataQuery(distance=True),
    limit=2,
)

for o in response.objects:
    print(o.properties)
    print(o.metadata)

client.close()

Questions and feedback

If you have any questions or feedback, please let us know on our forum.