text2vec-palm

Overview

The text2vec-palm module enables Weaviate to obtain vectors using PaLM embeddings.

Available from version v1.19.1

Key notes:

  • As it uses a third-party API, you will need an API key. The module uses the Google Cloud access token.
  • Its usage may incur costs.
    • Please check the vendor pricing (e.g. check Google Vertex AI pricing), especially before vectorizing large amounts of data.
  • This module is available on Weaviate Cloud Services (WCS).
  • Enabling this module will enable the nearText search operator.
  • The default model is textembedding-gecko@001.
Ensure PaLM API is enabled on your Google Cloud project

As of the time of writing (September 2023), you must manually enable the Vertex AI API on your Google Cloud project. You can do so by following the instructions here.

Weaviate instance configuration

Not applicable to WCS

This module is enabled and pre-configured on Weaviate Cloud Services.

Docker Compose file

To use text2vec-palm, you must enable it in your Docker Compose file (docker-compose.yml). You can do so manually, or create one using the Weaviate configuration tool.

Parameters

  • ENABLE_MODULES (Required): The modules to enable. Include text2vec-palm to enable the module.
  • DEFAULT_VECTORIZER_MODULE (Optional): The default vectorizer module. You can set this to text2vec-palm to make it the default for all classes.
  • PALM_APIKEY (Optional): Your PaLM API key. You can also provide the key at query time.
---
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.21.3
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      ENABLE_MODULES: text2vec-palm
      DEFAULT_VECTORIZER_MODULE: text2vec-palm
      PALM_APIKEY: sk-foobar  # Optional; you can also provide the key at query time.
      CLUSTER_HOSTNAME: 'node1'
...

Class configuration

You can configure how the module will behave in each class through the Weaviate schema.

API settings

Parameters

  • projectId (Required): e.g. cloud-large-language-models
  • apiEndpoint (Optional): e.g. us-central1-aiplatform.googleapis.com
  • modelId (Optional): e.g. textembedding-gecko@001 or textembedding-gecko-multilingual@latest

Example

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID", // Required. Replace with your value (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT", // Optional. Defaults to "us-central1-aiplatform.googleapis.com"
          "modelId": "YOUR-GOOGLE-CLOUD-MODEL-ID" // Optional. Defaults to "textembedding-gecko@001"
        }
      }
    }
  ]
}

Vectorization settings

You can set vectorizer behavior using the moduleConfig section under each class and property:

Class-level

  • vectorizer - what module to use to vectorize the data.
  • vectorizeClassName – whether to vectorize the class name. Default: true.

Property-level

  • skip – whether to skip vectorizing the property altogether. Default: false
  • vectorizePropertyName – whether to vectorize the property name. Default: true

Example

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID", // Required. Replace with your value (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT", // Optional. Defaults to "us-central1-aiplatform.googleapis.com"
          "modelId": "YOUR-GOOGLE-CLOUD-MODEL-ID", // Optional. Defaults to "textembedding-gecko@001"
          "vectorizeClassName": false
        }
      },
      "properties": [
        {
          "name": "content",
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-palm": {
              "skip": false,
              "vectorizePropertyName": false
            }
          }
        }
      ]
    }
  ]
}

Query-time parameters

API key

You can supply the API key at query time by adding it to the HTTP header:

  • "X-Palm-Api-Key": "YOUR-PALM-API-KEY"
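As a minimal sketch, the header can be assembled in Python and passed to the client via `additional_headers` (the token value here is a placeholder; the `palm_headers` helper is purely illustrative):

```python
# Build the extra HTTP header that Weaviate forwards to the PaLM module.
# `token` would be your Google Cloud access token.
def palm_headers(token: str) -> dict:
    return {"X-Palm-Api-Key": token}

# Pass the result when instantiating a client, e.g.:
# client = weaviate.Client(url="http://localhost:8080",
#                          additional_headers=palm_headers(token))
headers = palm_headers("YOUR-PALM-API-KEY")
```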

API key on Google Cloud

This is called an access token in Google Cloud.

If you have the Google Cloud CLI tool installed and set up, you can view your token by running the following command:

gcloud auth print-access-token

Token expiry for Google Cloud users

Important

Google Cloud's OAuth 2.0 access tokens are configured to have a standard lifetime of 1 hour.

Therefore, you must periodically replace the token with a valid one and supply it to Weaviate by re-instantiating the client with the new key.

You can do this manually.

Automating this is a complex, advanced process that is outside the scope of this documentation. However, here are a couple of possible options for doing so:

With Google Cloud CLI

If you are using the Google Cloud CLI, you could invoke it from your preferred programming language and extract the token from its output.


For example, you could periodically run:

client = re_instantiate_weaviate()

Where re_instantiate_weaviate is something like:

import subprocess
import weaviate

def refresh_token() -> str:
    result = subprocess.run(["gcloud", "auth", "print-access-token"], capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Error refreshing token: {result.stderr}")
        return None
    return result.stdout.strip()

def re_instantiate_weaviate() -> weaviate.Client:
    token = refresh_token()

    client = weaviate.Client(
        url="https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers={
            "X-Palm-Api-Key": token,
        },
    )
    return client

# Run this every ~60 minutes
client = re_instantiate_weaviate()

With google-auth

Another way is to use Google's own authentication library, google-auth.

See the google-auth libraries for Python and Node.js.

You can then periodically call the refresh function (see the Python docs) to obtain a renewed token, and re-instantiate the Weaviate client.

For example, you could periodically run:

client = re_instantiate_weaviate()

Where re_instantiate_weaviate is something like:

from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import weaviate

def get_credentials() -> Credentials:
    credentials = Credentials.from_service_account_file('path/to/your/service-account.json', scopes=['openid'])
    request = Request()
    credentials.refresh(request)
    return credentials

def re_instantiate_weaviate() -> weaviate.Client:
    credentials = get_credentials()
    token = credentials.token

    client = weaviate.Client(
        url="https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers={
            "X-Palm-Api-Key": token,
        },
    )
    return client

# Run this every ~60 minutes
client = re_instantiate_weaviate()

The service account key shown above can be generated by following this guide.

Additional information

Available models

You can specify the model as a part of the schema as shown earlier.

The available models are:

  • textembedding-gecko@001 (stable)
  • textembedding-gecko@latest (public preview: an embeddings model with enhanced AI quality)
  • textembedding-gecko-multilingual@latest (public preview: an embeddings model designed for a wide range of non-English languages)

At the time of writing, the textembedding-gecko models accept a maximum of 3,072 input tokens and output 768-dimensional vector embeddings. For more information, please see the official documentation.
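For example, selecting the multilingual model is just a matter of setting modelId in the class's moduleConfig. A sketch of such a schema fragment (the project ID is a placeholder, and the commented-out call assumes a configured weaviate.Client):

```python
# Hypothetical schema fragment selecting the multilingual embedding model.
document_class = {
    "class": "Document",
    "vectorizer": "text2vec-palm",
    "moduleConfig": {
        "text2vec-palm": {
            "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",
            "modelId": "textembedding-gecko-multilingual@latest",
        }
    },
}
# client.schema.create_class(document_class)  # with a connected weaviate.Client
```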

API rate limits

Since this module uses your API key, your account's corresponding rate limits will also apply to the module. Weaviate will output any rate-limit related error messages generated by the API.

If you exceed your rate limit, Weaviate will output the error message generated by the PaLM API. If this persists, we suggest requesting to increase your rate limit by contacting Vertex AI support describing your use case with Weaviate.

Import throttling

One potential solution to rate limiting would be to throttle the import within your application. We include an example below.

from weaviate import Client
import time

def configure_batch(client: Client, batch_size: int, batch_target_rate: int):
    """
    Configure the weaviate client's batch so it creates objects at `batch_target_rate`.

    Parameters
    ----------
    client : Client
        The Weaviate client instance.
    batch_size : int
        The batch size.
    batch_target_rate : int
        The batch target rate as # of objects per second.
    """

    def callback(batch_results: dict) -> None:
        # you could print batch errors here
        time_took_to_create_batch = batch_size * (client.batch.creation_time / client.batch.recommended_num_objects)
        time.sleep(
            max(batch_size / batch_target_rate - time_took_to_create_batch + 1, 0)
        )

    client.batch.configure(
        batch_size=batch_size,
        timeout_retries=5,
        callback=callback,
    )

Usage example

The example below shows how to use a nearText query with text2vec-palm.

import weaviate

client = weaviate.Client(
    url="http://localhost:8080",
    additional_headers={
        "X-Palm-Api-Key": "YOUR-PALM-API-KEY"
    }
)

nearText = {
    "concepts": ["fashion"],
    "distance": 0.6,  # prior to v1.14 use "certainty" instead of "distance"
    "moveAwayFrom": {
        "concepts": ["finance"],
        "force": 0.45
    },
    "moveTo": {
        "concepts": ["haute couture"],
        "force": 0.85
    }
}

result = (
    client.query
    .get("Publication", "name")
    .with_additional(["certainty OR distance"])  # note that certainty is only supported if distance==cosine
    .with_near_text(nearText)
    .do()
)

print(result)

More resources

For additional information, try these sources.

  1. Frequently Asked Questions
  2. Weaviate Community Forum
  3. Knowledge base of old issues
  4. Stack Overflow
  5. Weaviate Slack channel