text2vec-palm

In short

  • This module uses a third-party API and may incur costs.
  • Check the vendor pricing (e.g. check Google Vertex AI pricing) before vectorizing large amounts of data.
  • Weaviate automatically parallelizes requests to the API when using the batch endpoint.
  • Added in Weaviate v1.19.1.
  • You need an API key for a PaLM API to use this module.
  • The default model is textembedding-gecko.

Overview

The text2vec-palm module enables you to use PaLM embeddings in Weaviate to represent data objects and run semantic (nearText) queries.

Inference API key

Important: Provide a PaLM API key to Weaviate

As text2vec-palm uses a PaLM API endpoint, you must provide a valid PaLM API key to Weaviate.

For Google Cloud users

This is called an access token in Google Cloud.

If you have the Google Cloud CLI tool installed and set up, you can view your token by running the following command:

gcloud auth print-access-token

Providing the key to Weaviate

You can provide your PaLM API key by passing it in the request header as "X-Palm-Api-Key". If you use the Weaviate client, you can do so like this:

import weaviate

client = weaviate.Client(
    url="https://some-endpoint.weaviate.network/",
    additional_headers={
        "X-Palm-Api-Key": "YOUR-PALM-API-KEY",  # Replace with your API key
    }
)

Optionally (not recommended), you can provide the PaLM API key as an environment variable.

How to provide the PaLM API key as an environment variable

When configuring your Docker instance, add PALM_APIKEY under environment in your docker-compose file, like this:

environment:
  PALM_APIKEY: 'your-key-goes-here'  # Setting this parameter is optional; you can also provide the key at runtime.
  ...

Token expiry for Google Cloud users

Important

Google Cloud's OAuth 2.0 access tokens are configured to have a standard lifetime of 1 hour.

Therefore, you must periodically replace the token with a valid one and supply it to Weaviate by re-instantiating the client with the new key.

You can do this manually.

Automating this is a complex, advanced process that is outside the scope of this documentation. However, here are a couple of possible options for doing so:

With Google Cloud CLI

If you are using the Google Cloud CLI, you could run the gcloud command from your preferred programming language and extract the resulting token.


For example, you could periodically run:

client = re_instantiate_weaviate()

Where re_instantiate_weaviate is something like:

import subprocess
import weaviate

def refresh_token() -> str:
    result = subprocess.run(["gcloud", "auth", "print-access-token"], capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Error refreshing token: {result.stderr}")
        return None
    return result.stdout.strip()

def re_instantiate_weaviate() -> weaviate.Client:
    token = refresh_token()

    client = weaviate.Client(
        url="https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers={
            "X-Palm-Api-Key": token,
        }
    )
    return client

# Run this every ~60 minutes
client = re_instantiate_weaviate()

With google-auth

Another way is through Google's own authentication library google-auth.


See the google-auth documentation for the Python and Node.js libraries.


You can then periodically call the refresh function (see the Python docs) to obtain a renewed token, and re-instantiate the Weaviate client.

For example, you could periodically run:

client = re_instantiate_weaviate()

Where re_instantiate_weaviate is something like:

from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import weaviate

def get_credentials() -> Credentials:
    credentials = Credentials.from_service_account_file('path/to/your/service-account.json', scopes=['openid'])
    request = Request()
    credentials.refresh(request)
    return credentials

def re_instantiate_weaviate() -> weaviate.Client:
    credentials = get_credentials()
    token = credentials.token

    client = weaviate.Client(
        url="https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers={
            "X-Palm-Api-Key": token,
        }
    )
    return client

# Run this every ~60 minutes
client = re_instantiate_weaviate()

The service account key shown above can be generated by following this guide.
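Either approach can be automated with a background timer that replaces the client shortly before the token expires. A minimal sketch using Python's standard library; the `re_instantiate_weaviate` stub below is a hypothetical stand-in for either of the functions above:

```python
import threading

def re_instantiate_weaviate():
    # Hypothetical stand-in; in practice, return a weaviate.Client
    # built with a freshly refreshed token, as shown above.
    return object()

client = None

def refresh_client(interval_seconds: float = 55 * 60) -> threading.Timer:
    """Re-instantiate the client, then schedule the next refresh.

    Refreshing slightly before the 60-minute token expiry leaves a safety margin.
    """
    global client
    client = re_instantiate_weaviate()
    timer = threading.Timer(interval_seconds, refresh_client, args=(interval_seconds,))
    timer.daemon = True  # do not block interpreter exit
    timer.start()
    return timer

timer = refresh_client()
```

Note that any code holding a reference to the old client keeps using the stale token, so downstream code should always read the shared `client` variable rather than caching it.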

Module configuration

Not applicable to WCS

This module is enabled and pre-configured on Weaviate Cloud Services.

Configuration file (Weaviate open source only)

Through the configuration file (e.g. docker-compose.yaml), you can:

  • enable the text2vec-palm module,
  • set it as the default vectorizer, and
  • provide the API key for it.

Using the following variables:

ENABLE_MODULES: 'text2vec-palm,generative-palm'
DEFAULT_VECTORIZER_MODULE: text2vec-palm
PALM_APIKEY: sk-foobar
See a full example of a Docker configuration with text2vec-palm

---
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.19.6
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      DEFAULT_VECTORIZER_MODULE: text2vec-palm
      ENABLE_MODULES: text2vec-palm
      PALM_APIKEY: sk-foobar  # For use with PaLM. Setting this parameter is optional; you can also provide the key at runtime.
      CLUSTER_HOSTNAME: 'node1'
...

Schema configuration

You can provide additional module configurations through the schema. You can learn about schemas here.

For text2vec-palm, you can set the vectorizer model and vectorizer behavior using parameters in the moduleConfig section of your schema:

Note that the projectId parameter is required.

Example schema

For example, the following schema configuration will set the PaLM API information.

  • The "projectId" is REQUIRED, and may be something like "cloud-large-language-models"
  • The "apiEndpoint" is optional, and may be something like: "us-central1-aiplatform.googleapis.com", and
  • The "modelId" is optional, and may be something like "textembedding-gecko".
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",  // Required. Replace with your value: (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT",           // Optional. Defaults to "us-central1-aiplatform.googleapis.com".
          "modelId": "YOUR-GOOGLE-CLOUD-MODEL-ID"       // Optional. Defaults to "textembedding-gecko".
        }
      }
    }
  ]
}
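If you create the schema programmatically, the same configuration can be built as a plain Python dict. A minimal sketch; the helper name and its signature are illustrative, not part of the Weaviate API:

```python
def make_palm_class(class_name: str, project_id: str,
                    api_endpoint: str = "us-central1-aiplatform.googleapis.com",
                    model_id: str = "textembedding-gecko") -> dict:
    """Build a class definition for text2vec-palm; projectId is required."""
    if not project_id:
        raise ValueError("projectId is required for text2vec-palm")
    return {
        "class": class_name,
        "vectorizer": "text2vec-palm",
        "moduleConfig": {
            "text2vec-palm": {
                "projectId": project_id,
                "apiEndpoint": api_endpoint,
                "modelId": model_id,
            }
        },
    }

document_class = make_palm_class("Document", "cloud-large-language-models")
# With a connected client, you could then call:
# client.schema.create_class(document_class)
```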

Vectorizer behavior

Set property-level vectorizer behavior using the moduleConfig section under each property:

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          // See above for module parameters
        }
      },
      "properties": [
        {
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-palm": {
              "skip": false,
              "vectorizePropertyName": false
            }
          },
          "name": "content"
        }
      ]
    }
  ]
}

Usage

Enabling this module will make GraphQL vector search operators available.

Example

{
  Get {
    Publication(
      nearText: {
        concepts: ["fashion"],
        distance: 0.6  # prior to v1.14 use "certainty" instead of "distance"
        moveAwayFrom: {
          concepts: ["finance"],
          force: 0.45
        },
        moveTo: {
          concepts: ["haute couture"],
          force: 0.85
        }
      }
    ) {
      name
      _additional {
        certainty  # only supported if distance==cosine
        distance   # always supported
      }
    }
  }
}

Additional information

Available model

You can specify the model as a part of the schema as shown earlier.

Currently, the only available model is textembedding-gecko.

The textembedding-gecko model accepts a maximum of 3,072 input tokens, and outputs 768-dimensional vector embeddings.
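Long documents therefore need to be split before vectorization. A minimal sketch of a word-based splitter; the word budget is an illustrative assumption, since a word count only roughly approximates the model's actual tokenization:

```python
def chunk_words(text: str, max_words: int = 2000) -> list[str]:
    """Split text into chunks of at most `max_words` words.

    A rough guard against the 3,072-token input limit; because words and
    tokens do not map one-to-one, the budget is kept conservative.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

chunks = chunk_words("word " * 5000)  # 5,000 words -> 3 chunks
```

Each chunk can then be imported as its own object so that every input stays under the model's limit.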

Rate limits

Since you will obtain embeddings using your own API key, any rate limits on your account will also apply to your use with Weaviate.

If you exceed your rate limit, Weaviate will output the error message generated by the PaLM API. If this persists, we suggest requesting a rate limit increase by contacting Vertex AI support and describing your use case with Weaviate.

Throttle the import inside your application

One way of dealing with rate limits is to throttle the import within your application. For example, when using the Weaviate client:

from weaviate import Client
import time

def configure_batch(client: Client, batch_size: int, batch_target_rate: int):
    """
    Configure the weaviate client's batch so it creates objects at `batch_target_rate`.

    Parameters
    ----------
    client : Client
        The Weaviate client instance.
    batch_size : int
        The batch size.
    batch_target_rate : int
        The batch target rate as # of objects per second.
    """

    def callback(batch_results: dict) -> None:
        # you could print batch errors here
        time_took_to_create_batch = batch_size * (client.batch.creation_time / client.batch.recommended_num_objects)
        time.sleep(
            max(batch_size / batch_target_rate - time_took_to_create_batch + 1, 0)
        )

    client.batch.configure(
        batch_size=batch_size,
        timeout_retries=5,
        callback=callback,
    )
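The sleep duration computed in the callback above boils down to simple arithmetic. A pure-function sketch of the same calculation (the function name and parameters are illustrative; `creation_time` and `recommended_num_objects` mirror the client-reported batch statistics):

```python
def throttle_sleep_seconds(batch_size: int, batch_target_rate: int,
                           creation_time: float, recommended_num_objects: int) -> float:
    """Seconds to sleep after a batch so imports average `batch_target_rate` objects/sec."""
    # Estimated wall-clock time the batch itself took to create.
    time_took_to_create_batch = batch_size * (creation_time / recommended_num_objects)
    # Sleep for the remainder of the time budget, never a negative duration.
    return max(batch_size / batch_target_rate - time_took_to_create_batch + 1, 0)

# A 100-object batch at a 10 objects/sec target, where the batch took 2 s:
# sleep for 100/10 - 2 + 1 = 9 seconds.
sleep_s = throttle_sleep_seconds(100, 10, creation_time=2.0, recommended_num_objects=100)
```

If a batch already takes longer than its time budget, the function returns 0 and the import proceeds at full speed.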

More resources

If you can't find the answer to your question here, please look at the:

  1. Frequently Asked Questions. Or,
  2. Knowledge base of old issues. Or,
  3. For questions: Stackoverflow. Or,
  4. For more involved discussion: Weaviate Community Forum. Or,
  5. We also have a Slack channel.