text2vec-palm
In short
- This module uses a third-party API and may incur costs.
- Check the vendor pricing (e.g. check Google Vertex AI pricing) before vectorizing large amounts of data.
- Weaviate automatically parallelizes requests to the API when using the batch endpoint.
- Added in Weaviate v1.19.1.
- You need an API key for a PaLM API to use this module.
- The default model is textembedding-gecko.
Overview
The text2vec-palm module enables you to use PaLM embeddings in Weaviate to represent data objects and run semantic (nearText) queries.
Inference API key
Because the text2vec-palm module uses a PaLM API endpoint, you must provide a valid PaLM API key to Weaviate.
For Google Cloud users
This is called an access token in Google Cloud.
If you have the Google Cloud CLI tool installed and set up, you can view your token by running the following command:
gcloud auth print-access-token
Providing the key to Weaviate
You can provide your PaLM API key through the "X-Palm-Api-Key" request header. If you use the Weaviate client, you can do so like this:
- Python
- JavaScript
- Go
- Java
import weaviate
client = weaviate.Client(
url = "https://some-endpoint.weaviate.network/",
additional_headers = {
"X-Palm-Api-Key": "YOUR-PALM-API-KEY", # Replace with your API key
}
)
const weaviate = require("weaviate-ts-client");
const client = weaviate.client({
scheme: 'https',
host: 'some-endpoint.weaviate.network',
// Replace with your API key
headers: {
'X-Palm-Api-Key': 'YOUR-PALM-API-KEY',
},
});
package main

import (
	"github.com/weaviate/weaviate-go-client/v4/weaviate"
)

func main() {
	cfg := weaviate.Config{
		Host:   "some-endpoint.weaviate.network", // Replace with your endpoint
		Scheme: "https",
		// Replace with your API key
		Headers: map[string]string{
			"X-Palm-Api-Key": "YOUR-PALM-API-KEY",
		},
	}
	client, err := weaviate.NewClient(cfg)
	if err != nil {
		panic(err)
	}
	_ = client // use the client for your requests
}
package io.weaviate;

import java.util.HashMap;
import java.util.Map;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;

public class App {
  public static void main(String[] args) {
    Map<String, String> headers = new HashMap<String, String>() { {
      // Replace with your API key
      put("X-Palm-Api-Key", "YOUR-PALM-API-KEY");
    } };
    Config config = new Config("https", "some-endpoint.weaviate.network", headers);
    WeaviateClient client = new WeaviateClient(config);
  }
}
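The client snippets above hard-code the key as a placeholder. As a minimal sketch, the header could instead be built from a variable in the client's own environment, so the key stays out of source code. The helper name `build_palm_headers` and the variable name `MY_PALM_TOKEN` are hypothetical and are not part of Weaviate or its clients.

```python
import os


def build_palm_headers(env_var: str = "MY_PALM_TOKEN") -> dict:
    """Return request headers carrying the PaLM API key read from `env_var`.

    `MY_PALM_TOKEN` is a hypothetical client-side variable name.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {"X-Palm-Api-Key": key}
```

The returned dictionary could then be passed as `additional_headers` when creating the Python client, as in the Python example above.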
Optionally (not recommended), you can provide the PaLM API key as an environment variable.
How to provide the PaLM API key as an environment variable
While configuring your Docker instance, add PALM_APIKEY under environment in your docker-compose file, like this:
environment:
  PALM_APIKEY: 'your-key-goes-here'  # Setting this parameter is optional; you can also provide the key at runtime.
...
Token expiry for Google Cloud users
Google Cloud's OAuth 2.0 access tokens are configured to have a standard lifetime of 1 hour.
Therefore, you must periodically replace the token with a valid one and supply it to Weaviate by re-instantiating the client with the new key.
You can do this manually.
Automating this is a complex, advanced process that is outside the scope of our control. However, here are a couple of possible options for doing so:
With Google Cloud CLI
If you are using the Google Cloud CLI, you could run this through your preferred programming language, and extract the results.
For example, you could periodically run:
client = re_instantiate_weaviate()
Where re_instantiate_weaviate is something like:
import subprocess
import weaviate

def refresh_token() -> str:
    result = subprocess.run(["gcloud", "auth", "print-access-token"], capture_output=True, text=True)
    if result.returncode != 0:
        # Fail loudly rather than passing an empty token on to Weaviate
        raise RuntimeError(f"Error refreshing token: {result.stderr}")
    return result.stdout.strip()

def re_instantiate_weaviate() -> weaviate.Client:
    token = refresh_token()
    client = weaviate.Client(
        url = "https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers = {
            "X-Palm-Api-Key": token,
        }
    )
    return client

# Run this every ~60 minutes
client = re_instantiate_weaviate()
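One possible way to decide when to refresh is to cache the token together with the time it was fetched, and fetch a new one once it is older than a chosen age. The sketch below is illustrative only: `TokenCache` is a hypothetical helper, and the 50-minute default is an assumed safety margin below the 1-hour lifetime mentioned above.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches a token and fetches a new one once it is older than `max_age_s`."""

    def __init__(self, fetch: Callable[[], str], max_age_s: float = 50 * 60,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # e.g. a function that shells out to gcloud
        self._max_age_s = max_age_s  # assumed margin below the 1-hour lifetime
        self._clock = clock          # injectable so the logic can be tested
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached token, refreshing it first if it is too old."""
        now = self._clock()
        if self._token is None or now - self._fetched_at >= self._max_age_s:
            self._token = self._fetch()
            self._fetched_at = now
        return self._token
```

A function such as `refresh_token` from the example above could be passed as `fetch`. Note that the Weaviate client still has to be re-instantiated whenever `get()` returns a new token, because the header is set when the client is created.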
With google-auth
Another way is through Google's own authentication library, google-auth.
See the links to google-auth in Python and Node.js libraries.
You can then periodically call the refresh function (see Python docs) to obtain a renewed token, and re-instantiate the Weaviate client.
For example, you could periodically run:
client = re_instantiate_weaviate()
Where re_instantiate_weaviate is something like:
from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import weaviate

def get_credentials() -> Credentials:
    credentials = Credentials.from_service_account_file('path/to/your/service-account.json', scopes=['openid'])
    request = Request()
    credentials.refresh(request)
    return credentials

def re_instantiate_weaviate() -> weaviate.Client:
    credentials = get_credentials()
    token = credentials.token
    client = weaviate.Client(
        url = "https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers = {
            "X-Palm-Api-Key": token,
        }
    )
    return client

# Run this every ~60 minutes
client = re_instantiate_weaviate()
The service account key shown above can be generated by following this guide.
Module configuration
This module is enabled and pre-configured on Weaviate Cloud Services.
Configuration file (Weaviate open source only)
Through the configuration file (e.g. docker-compose.yaml), you can:
- enable the text2vec-palm module,
- set it as the default vectorizer, and
- provide the API key for it.
Using the following variables:
ENABLE_MODULES: 'text2vec-palm,generative-palm'
DEFAULT_VECTORIZER_MODULE: text2vec-palm
PALM_APIKEY: sk-foobar
See a full example of a Docker configuration with text2vec-palm
---
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.19.6
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      DEFAULT_VECTORIZER_MODULE: text2vec-palm
      ENABLE_MODULES: text2vec-palm
      PALM_APIKEY: sk-foobar  # For use with PaLM. Setting this parameter is optional; you can also provide the key at runtime.
      CLUSTER_HOSTNAME: 'node1'
...
- You can also use the Weaviate configuration tool to create a Weaviate setup with this module.
- The PALM_APIKEY environment variable is optional and you can instead provide the key at insert/query time as an HTTP header (see the 'usage' section for instructions).
Schema configuration
You can provide additional module configurations through the schema. You can learn about schemas here.
For text2vec-palm, you can set the vectorizer model and vectorizer behavior using parameters in the moduleConfig section of your schema:
Note that the projectId parameter is required.
Example schema
For example, the following schema configuration will set the PaLM API information.
- The "projectId" is REQUIRED, and may be something like "cloud-large-language-models",
- The "apiEndpoint" is optional, and may be something like "us-central1-aiplatform.googleapis.com", and
- The "modelId" is optional, and may be something like "textembedding-gecko".
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",    // Required. Replace with your value: (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT",             // Optional. Defaults to "us-central1-aiplatform.googleapis.com".
          "modelId": "YOUR-GOOGLE-CLOUD-MODEL-ID"         // Optional. Defaults to "textembedding-gecko".
        }
      }
    }
  ]
}
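The same class definition can also be assembled in code. The sketch below builds the dictionary for a single class, includes the optional fields only when they are given, and rejects a missing projectId. The helper name `palm_class_definition` is hypothetical.

```python
from typing import Optional


def palm_class_definition(class_name: str, project_id: str,
                          api_endpoint: Optional[str] = None,
                          model_id: Optional[str] = None) -> dict:
    """Build a class definition that uses text2vec-palm as its vectorizer."""
    if not project_id:
        raise ValueError("projectId is required for text2vec-palm")
    module_config = {"projectId": project_id}
    # Optional settings are left out so that the module defaults apply
    if api_endpoint is not None:
        module_config["apiEndpoint"] = api_endpoint
    if model_id is not None:
        module_config["modelId"] = model_id
    return {
        "class": class_name,
        "vectorizer": "text2vec-palm",
        "moduleConfig": {"text2vec-palm": module_config},
    }
```

With the Python client used elsewhere on this page, the resulting dictionary could be passed to `client.schema.create_class(...)`.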
Vectorizer behavior
Set property-level vectorizer behavior using the moduleConfig section under each property:
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          // See above for module parameters
        }
      },
      "properties": [
        {
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-palm": {
              "skip": false,
              "vectorizePropertyName": false
            }
          },
          "name": "content"
        }
      ]
    }
  ]
}
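As a small illustration, a property entry like the one above could be generated by a helper that exposes the two behavior flags as arguments. The function name `palm_text_property` is hypothetical, and the defaults simply mirror the values in the example.

```python
def palm_text_property(name: str, description: str = "",
                       skip: bool = False,
                       vectorize_property_name: bool = False) -> dict:
    """Build a text property with property-level text2vec-palm settings."""
    return {
        "name": name,
        "dataType": ["text"],
        "description": description,
        "moduleConfig": {
            "text2vec-palm": {
                "skip": skip,
                "vectorizePropertyName": vectorize_property_name,
            }
        },
    }
```

The returned dictionaries would go into the "properties" list of a class definition.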
Usage
Enabling this module will make GraphQL vector search operators available.
Example
- GraphQL
- Python
- JavaScript
- Go
- Java
- Curl
{
Get{
Publication(
nearText: {
concepts: ["fashion"],
distance: 0.6 # prior to v1.14 use "certainty" instead of "distance"
moveAwayFrom: {
concepts: ["finance"],
force: 0.45
},
moveTo: {
concepts: ["haute couture"],
force: 0.85
}
}
){
name
_additional {
certainty # only supported if distance==cosine.
distance # always supported
}
}
}
}
import weaviate
client = weaviate.Client(
url="http://localhost:8080",
additional_headers={
"X-Palm-Api-Key": "YOUR-PALM-API-KEY"
}
)
nearText = {
"concepts": ["fashion"],
"distance": 0.6, # prior to v1.14 use "certainty" instead of "distance"
"moveAwayFrom": {
"concepts": ["finance"],
"force": 0.45
},
"moveTo": {
"concepts": ["haute couture"],
"force": 0.85
}
}
result = (
client.query
.get("Publication", "name")
.with_additional(["certainty", "distance"])  # note that certainty is only supported if distance==cosine
.with_near_text(nearText)
.do()
)
print(result)
const weaviate = require('weaviate-client');
const client = weaviate.client({
scheme: 'http',
host: 'localhost:8080',
headers: {'X-Palm-Api-Key': 'YOUR-PALM-API-KEY'},
});
client.graphql
.get()
.withClassName('Publication')
.withFields('name _additional{certainty distance}') // note that certainty is only supported if distance==cosine
.withNearText({
concepts: ['fashion'],
distance: 0.6, // prior to v1.14 use certainty instead of distance
moveAwayFrom: {
concepts: ['finance'],
force: 0.45
},
moveTo: {
concepts: ['haute couture'],
force: 0.85
}
})
.do()
.then(console.log)
.catch(console.error);
package main
import (
"context"
"fmt"
"github.com/weaviate/weaviate-go-client/v4/weaviate"
"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
Headers: map[string]string{"X-Palm-Api-Key": "YOUR-PALM-API-KEY"},
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
className := "Publication"
name := graphql.Field{Name: "name"}
_additional := graphql.Field{
Name: "_additional", Fields: []graphql.Field{
{Name: "certainty"}, // only supported if distance==cosine
{Name: "distance"}, // always supported
},
}
concepts := []string{"fashion"}
distance := float32(0.6)
moveAwayFrom := &graphql.MoveParameters{
Concepts: []string{"finance"},
Force: 0.45,
}
moveTo := &graphql.MoveParameters{
Concepts: []string{"haute couture"},
Force: 0.85,
}
nearText := client.GraphQL().NearTextArgBuilder().
WithConcepts(concepts).
WithDistance(distance). // use WithCertainty(certainty) prior to v1.14
WithMoveTo(moveTo).
WithMoveAwayFrom(moveAwayFrom)
ctx := context.Background()
result, err := client.GraphQL().Get().
WithClassName(className).
WithFields(name, _additional).
WithNearText(nearText).
Do(ctx)
if err != nil {
panic(err)
}
fmt.Printf("%v", result)
}
package io.weaviate;
import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.argument.NearTextMoveParameters;
import io.weaviate.client.v1.graphql.query.fields.Field;
import java.util.HashMap;
import java.util.Map;
public class App {
public static void main(String[] args) {
Map<String, String> headers = new HashMap<String, String>() { {
put("X-Palm-Api-Key", "YOUR-PALM-API-KEY");
} };
Config config = new Config("http", "localhost:8080", headers);
WeaviateClient client = new WeaviateClient(config);
NearTextMoveParameters moveTo = NearTextMoveParameters.builder()
.concepts(new String[]{ "haute couture" }).force(0.85f).build();
NearTextMoveParameters moveAway = NearTextMoveParameters.builder()
.concepts(new String[]{ "finance" }).force(0.45f)
.build();
NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
.concepts(new String[]{ "fashion" })
.distance(0.6f) // use .certainty(0.7f) prior to v1.14
.moveTo(moveTo)
.moveAwayFrom(moveAway)
.build();
Field name = Field.builder().name("name").build();
Field _additional = Field.builder()
.name("_additional")
.fields(new Field[]{
Field.builder().name("certainty").build(), // only supported if distance==cosine
Field.builder().name("distance").build(), // always supported
}).build();
Result<GraphQLResponse> result = client.graphQL().get()
.withClassName("Publication")
.withFields(name, _additional)
.withNearText(nearText)
.run();
if (result.hasErrors()) {
System.out.println(result.getError());
return;
}
System.out.println(result.getResult());
}
}
$ echo '{
"query": "{
Get{
Publication(
nearText: {
concepts: [\"fashion\"],
distance: 0.6, # use certainty instead of distance prior to v1.14
moveAwayFrom: {
concepts: [\"finance\"],
force: 0.45
},
moveTo: {
concepts: [\"haute couture\"],
force: 0.85
}
}
){
name
_additional {
certainty # only supported if distance==cosine
distance # always supported
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-H "X-Palm-Api-Key: YOUR-PALM-API-KEY" \
-d @- \
http://localhost:8080/v1/graphql
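When the request body is assembled by hand, quoting and escaping the GraphQL query inside JSON is easy to get wrong. A minimal sketch of building the body with the standard `json` module is shown below. The function name `near_text_payload` is hypothetical, and the query is a reduced version of the example (no moveTo or moveAwayFrom). It assumes the JSON encoding of a list of strings is acceptable as a GraphQL list literal, which holds for ordinary text.

```python
import json


def near_text_payload(class_name, concepts, distance, fields=("name",)):
    """Return the JSON body for a POST to /v1/graphql with a nearText query."""
    # json.dumps on the concept list yields e.g. ["fashion"] with proper quoting
    concept_list = json.dumps(list(concepts))
    query = (
        "{ Get { %s(nearText: {concepts: %s, distance: %s}) "
        "{ %s _additional { distance } } } }"
        % (class_name, concept_list, distance, " ".join(fields))
    )
    # The outer json.dumps escapes the quotes inside the query string
    return json.dumps({"query": query})


print(near_text_payload("Publication", ["fashion"], 0.6))
```

The printed string could be sent as the request body with the same headers as in the curl example.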
Additional information
Available model
You can specify the model as a part of the schema as shown earlier.
Currently, the only available model is textembedding-gecko.
The textembedding-gecko model accepts a maximum of 3,072 input tokens, and outputs 768-dimensional vector embeddings.
Rate limits
Since you will obtain embeddings using your own API key, any corresponding rate limits related to your account will apply to your use with Weaviate also.
If you exceed your rate limit, Weaviate will output the error message generated by the PaLM API. If this persists, we suggest requesting to increase your rate limit by contacting Vertex AI support describing your use case with Weaviate.
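For occasional rate-limit errors, one common approach is to retry the failing call with an increasing delay. The sketch below is generic and not specific to Weaviate: `with_backoff` is a hypothetical helper, and it retries on any exception because how a rate-limit error surfaces in your client is not specified here. In practice you would likely narrow the `except` clause.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(call: Callable[[], T], max_attempts: int = 5,
                 base_delay_s: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying with exponentially growing pauses on failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            # Give up after the last attempt and surface the original error
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)  # 1s, 2s, 4s, ... by default
    raise RuntimeError("max_attempts must be at least 1")
```

A query or import step could be wrapped as `with_backoff(lambda: run_my_query())`, where `run_my_query` stands for your own function.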
Throttle the import inside your application
One way of dealing with rate limits is to throttle the import within your application. For example, when using the Weaviate client:
- Python
- Go
from weaviate import Client
import time

def configure_batch(client: Client, batch_size: int, batch_target_rate: int):
    """
    Configure the weaviate client's batch so it creates objects at `batch_target_rate`.

    Parameters
    ----------
    client : Client
        The Weaviate client instance.
    batch_size : int
        The batch size.
    batch_target_rate : int
        The batch target rate as # of objects per second.
    """

    def callback(batch_results: dict) -> None:
        # you could print batch errors here
        time_took_to_create_batch = batch_size * (client.batch.creation_time/client.batch.recommended_num_objects)
        time.sleep(
            max(batch_size/batch_target_rate - time_took_to_create_batch + 1, 0)
        )

    client.batch.configure(
        batch_size=batch_size,
        timeout_retries=5,
        callback=callback,
    )
package main

import (
	"context"
	"time"

	"github.com/weaviate/weaviate-go-client/v4/weaviate"
	"github.com/weaviate/weaviate/entities/models"
)

var (
	// adjust to your liking
	targetRatePerMin = 600
	batchSize        = 50
)

func main() {
	cfg := weaviate.Config{
		Host:   "localhost:8080",
		Scheme: "http",
	}
	client, err := weaviate.NewClient(cfg)
	if err != nil {
		panic(err)
	}

	// replace those 10000 empty objects with your actual data
	objects := make([]*models.Object, 10000)

	// we aim to send one batch every tickInterval. Convert to float64 first:
	// integer division (50/600) would give 0, and NewTicker panics on 0.
	tickInterval := time.Duration(float64(batchSize) / float64(targetRatePerMin) * float64(time.Minute))
	t := time.NewTicker(tickInterval)
	defer t.Stop()

	for i := 0; i < len(objects); i += batchSize {
		// create a fresh batch
		batch := client.Batch().ObjectsBatcher()

		// add up to batchSize objects to the batch
		for j := i; j < i+batchSize && j < len(objects); j++ {
			batch = batch.WithObject(objects[j])
		}

		// send off batch
		res, err := batch.Do(context.Background())
		// TODO: inspect result for individual errors
		_ = res
		// TODO: check request error
		_ = err

		// we wait for the next tick. If the previous batch took longer than
		// tickInterval, we won't need to wait, effectively making this an
		// unthrottled import.
		<-t.C
	}
}
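To make the pause in the Python callback above concrete, the arithmetic can be isolated into a small function. This is only a sketch of the same formula: the name `throttle_pause` and the `margin_s` parameter (the `+ 1` second in the callback) are labels introduced here for illustration.

```python
def throttle_pause(batch_size: int, target_rate: float,
                   time_took_s: float, margin_s: float = 1.0) -> float:
    """Seconds to sleep after a batch so imports stay near `target_rate`.

    batch_size / target_rate is the time one batch is allowed to take;
    the time already spent creating the batch is subtracted from it.
    """
    return max(batch_size / target_rate - time_took_s + margin_s, 0)


# 50 objects at 10 objects/s allows 5 s per batch; 2 s were already spent
print(throttle_pause(batch_size=50, target_rate=10, time_took_s=2.0))  # 4.0
```

If creating the batch already took longer than the allowed time plus the margin, the function returns 0 and the import proceeds without a pause.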
More resources
If you can't find the answer to your question here, please look at the:
- Frequently Asked Questions. Or,
- Knowledge base of old issues. Or,
- For questions: Stack Overflow. Or,
- For more involved discussion: Weaviate Community Forum. Or,
- We also have a Slack channel.