text2vec-palm
Overview
The text2vec-palm module enables Weaviate to obtain vectors using PaLM embeddings.
v1.19.1
Key notes:
- As it uses a third-party API, you will need an API key. The module uses the Google Cloud access token.
- Its usage may incur costs.
- Please check the vendor pricing (e.g. check Google Vertex AI pricing), especially before vectorizing large amounts of data.
- This module is available on Weaviate Cloud Services (WCS).
- Enabling this module will enable the nearText search operator.
- The default model is textembedding-gecko@001.
As of the time of writing (September 2023), you must manually enable the Vertex AI API on your Google Cloud project. You can do so by following the instructions here.
Weaviate instance configuration
This module is enabled and pre-configured on Weaviate Cloud Services.
Docker Compose file
To use text2vec-palm, you must enable it in your Docker Compose file (docker-compose.yml). You can do so manually, or create one using the Weaviate configuration tool.
Parameters
- ENABLE_MODULES (Required): The modules to enable. Include text2vec-palm to enable the module.
- DEFAULT_VECTORIZER_MODULE (Optional): The default vectorizer module. You can set this to text2vec-palm to make it the default for all classes.
- PALM_APIKEY (Optional): Your PaLM API key. You can also provide the key at query time.
---
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.21.3
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      ENABLE_MODULES: text2vec-palm
      DEFAULT_VECTORIZER_MODULE: text2vec-palm
      PALM_APIKEY: sk-foobar  # Optional; you can also provide the key at query time.
      CLUSTER_HOSTNAME: 'node1'
...
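Once the instance is running, it can be worth confirming that the module was actually loaded. The following is a minimal sketch, assuming that the meta information returned by the instance (for example through the v3 Python client's client.get_meta()) is a dictionary with a "modules" entry keyed by module name. The helper name module_enabled is hypothetical.

```python
def module_enabled(meta: dict, module_name: str = "text2vec-palm") -> bool:
    """Return True if `module_name` is listed in the instance's meta information."""
    modules = meta.get("modules") or {}
    return module_name in modules


# Usage against a running instance (assumes the v3 Python client):
#   import weaviate
#   client = weaviate.Client("http://localhost:8080")
#   print(module_enabled(client.get_meta()))
```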
Class configuration
You can configure how the module will behave in each class through the Weaviate schema.
API settings
Parameters
- projectId (Required): e.g. cloud-large-language-models
- apiEndpoint (Optional): e.g. us-central1-aiplatform.googleapis.com
- modelId (Optional): e.g. textembedding-gecko@001 or textembedding-gecko-multilingual@latest
Example
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",  // Required. Replace with your value: (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT",  // Optional. Defaults to "us-central1-aiplatform.googleapis.com".
          "modelId": "YOUR-GOOGLE-CLOUD-MODEL-ID"  // Optional. Defaults to "textembedding-gecko@001".
        }
      }
    }
  ]
}
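The same class definition can be assembled in code. This is a sketch only: the helper name palm_class is hypothetical, and it leaves the optional settings out unless they are given, so that the module defaults described above apply.

```python
from typing import Optional


def palm_class(name: str, project_id: str,
               api_endpoint: Optional[str] = None,
               model_id: Optional[str] = None) -> dict:
    """Build a class definition that uses text2vec-palm as its vectorizer."""
    palm_config = {"projectId": project_id}  # the only required setting
    # Leave optional settings out so that the module defaults apply
    if api_endpoint is not None:
        palm_config["apiEndpoint"] = api_endpoint
    if model_id is not None:
        palm_config["modelId"] = model_id
    return {
        "class": name,
        "vectorizer": "text2vec-palm",
        "moduleConfig": {"text2vec-palm": palm_config},
    }


document_class = palm_class("Document", "YOUR-GOOGLE-CLOUD-PROJECT-ID")
# With the v3 Python client, this could then be passed to:
#   client.schema.create_class(document_class)
```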
Vectorization settings
You can set vectorizer behavior using the moduleConfig section under each class and property:
Class-level
- vectorizer – what module to use to vectorize the data.
- vectorizeClassName – whether to vectorize the class name. Default: true.
Property-level
- skip – whether to skip vectorizing the property altogether. Default: false
- vectorizePropertyName – whether to vectorize the property name. Default: true
Example
{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      "vectorizer": "text2vec-palm",
      "moduleConfig": {
        "text2vec-palm": {
          "projectId": "YOUR-GOOGLE-CLOUD-PROJECT-ID",  // Required. Replace with your value: (e.g. "cloud-large-language-models")
          "apiEndpoint": "YOUR-API-ENDPOINT",  // Optional. Defaults to "us-central1-aiplatform.googleapis.com".
          "modelId": "YOUR-GOOGLE-CLOUD-MODEL-ID",  // Optional. Defaults to "textembedding-gecko@001".
          "vectorizeClassName": false
        }
      },
      "properties": [
        {
          "name": "content",
          "dataType": ["text"],
          "description": "Content that will be vectorized",
          "moduleConfig": {
            "text2vec-palm": {
              "skip": false,
              "vectorizePropertyName": false
            }
          }
        }
      ]
    }
  ]
}
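When a class has many properties, the property-level settings can be generated rather than written by hand. A minimal sketch follows; the helper name palm_property and the "internal_id" property are hypothetical, and the defaults mirror the ones listed above.

```python
def palm_property(name: str, data_type: str = "text",
                  skip: bool = False,
                  vectorize_property_name: bool = True) -> dict:
    """Build a property definition with text2vec-palm vectorization settings."""
    return {
        "name": name,
        "dataType": [data_type],
        "moduleConfig": {
            "text2vec-palm": {
                "skip": skip,
                "vectorizePropertyName": vectorize_property_name,
            }
        },
    }


properties = [
    palm_property("content", vectorize_property_name=False),
    palm_property("internal_id", skip=True),  # excluded from the vector
]
```

Note that the flags are real booleans, not the strings "true" or "false".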
Query-time parameters
API key
You can supply the API key at query time by adding it to the HTTP header:
"X-Palm-Api-Key": "YOUR-PALM-API-KEY"
API key on Google Cloud
This is called an access token in Google Cloud.
If you have the Google Cloud CLI tool installed and set up, you can view your token by running the following command:
gcloud auth print-access-token
Token expiry for Google Cloud users
Google Cloud's OAuth 2.0 access tokens are configured to have a standard lifetime of 1 hour.
Therefore, you must periodically replace the token with a valid one and supply it to Weaviate by re-instantiating the client with the new key.
You can do this manually.
Automating this is a complex, advanced process that is outside the scope of our control. However, here are a couple of possible options for doing so:
With Google Cloud CLI
If you are using the Google Cloud CLI, you could run this through your preferred programming language, and extract the results.
For example, you could periodically run:
client = re_instantiate_weaviate()
Where re_instantiate_weaviate is something like:
import subprocess
import weaviate


def refresh_token() -> str:
    # Ask the gcloud CLI for a fresh access token
    result = subprocess.run(["gcloud", "auth", "print-access-token"], capture_output=True, text=True)
    if result.returncode != 0:
        # Fail loudly instead of passing an empty token on to Weaviate
        raise RuntimeError(f"Error refreshing token: {result.stderr}")
    return result.stdout.strip()


def re_instantiate_weaviate() -> weaviate.Client:
    token = refresh_token()
    client = weaviate.Client(
        url="https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers={
            "X-Palm-Api-Key": token,
        }
    )
    return client


# Run this every ~60 minutes
client = re_instantiate_weaviate()
With google-auth
Another way is through Google's own authentication library google-auth.
See the links to google-auth in Python and Node.js libraries.
You can then periodically call the refresh function (see Python docs) to obtain a renewed token, and re-instantiate the Weaviate client.
For example, you could periodically run:
client = re_instantiate_weaviate()
Where re_instantiate_weaviate is something like:
from google.auth.transport.requests import Request
from google.oauth2.service_account import Credentials
import weaviate


def get_credentials() -> Credentials:
    credentials = Credentials.from_service_account_file('path/to/your/service-account.json', scopes=['openid'])
    request = Request()
    credentials.refresh(request)
    return credentials


def re_instantiate_weaviate() -> weaviate.Client:
    credentials = get_credentials()
    token = credentials.token
    client = weaviate.Client(
        url="https://some-endpoint.weaviate.network",  # Replace with your Weaviate URL
        additional_headers={
            "X-Palm-Api-Key": token,
        }
    )
    return client


# Run this every ~60 minutes
client = re_instantiate_weaviate()
The service account key shown above can be generated by following this guide.
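Either option still needs something that decides when a new token is due. One possible shape is a small cache that fetches a new token shortly before the one-hour lifetime runs out. This is a sketch, not part of Weaviate or Google's libraries: the TokenCache class, the five-minute safety margin, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Cache an access token and fetch a new one shortly before it expires."""

    def __init__(self, fetch: Callable[[], str],
                 lifetime_s: float = 3600.0,  # standard 1-hour token lifetime
                 margin_s: float = 300.0,     # assumed safety margin
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._lifetime_s = lifetime_s
        self._margin_s = margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one if it is close to expiry."""
        age = self._clock() - self._fetched_at
        if self._token is None or age >= self._lifetime_s - self._margin_s:
            self._token = self._fetch()
            self._fetched_at = self._clock()
        return self._token
```

The fetch argument can be any function that returns a token string, such as a wrapper around gcloud auth print-access-token. Before each group of requests, the application could call get() and re-instantiate the Weaviate client only when the returned token differs from the one currently in use.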
Additional information
Available models
You can specify the model as a part of the schema as shown earlier.
The available models are:
- textembedding-gecko@001 (stable)
- textembedding-gecko@latest (public preview: an embeddings model with enhanced AI quality)
- textembedding-gecko-multilingual@latest (public preview: an embeddings model designed to use a wide range of non-English languages.)
At the time of writing, the textembedding-gecko models accept a maximum of 3,072 input tokens, and output 768-dimensional vector embeddings. For more information, please see the official documentation.
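The dimensionality quoted above gives a cheap sanity check after an import. The sketch below only compares a vector's length with the expected 768; the helper name has_expected_dimensions is hypothetical, and how the vector is retrieved is left to the caller.

```python
EXPECTED_DIMENSIONS = 768  # taken from the figures quoted above


def has_expected_dimensions(vector: list, expected: int = EXPECTED_DIMENSIONS) -> bool:
    """Return True if the vector has the dimensionality the model should produce."""
    return len(vector) == expected


# One way to obtain a stored vector (assumes the v3 Python client):
#   obj = client.data_object.get_by_id(uuid, class_name="Document", with_vector=True)
#   print(has_expected_dimensions(obj["vector"]))
```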
API rate limits
Since this module uses your API key, your account's corresponding rate limits will also apply to the module.
If you exceed your rate limit, Weaviate will output the error message generated by the PaLM API. If this persists, we suggest requesting a rate limit increase by contacting Vertex AI support and describing your use case with Weaviate.
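For occasional rate-limit errors, retrying with increasing pauses is a common client-side approach. The following is a generic sketch rather than anything provided by Weaviate: call_with_backoff is a hypothetical helper, and it retries on any exception, so in practice you would probably narrow this to the errors your client raises for rate limits.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(fn: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay_s: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn`, retrying with exponentially growing pauses on failure."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2 ** attempt)  # 1 s, 2 s, 4 s, ...
    raise ValueError("max_attempts must be at least 1")
```

A query or batch call can be wrapped in a lambda and passed as fn.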
Import throttling
One potential solution to rate limiting would be to throttle the import within your application. We include an example below.
See code example
- Python
- Go
from weaviate import Client
import time


def configure_batch(client: Client, batch_size: int, batch_target_rate: int):
    """
    Configure the weaviate client's batch so it creates objects at `batch_target_rate`.

    Parameters
    ----------
    client : Client
        The Weaviate client instance.
    batch_size : int
        The batch size.
    batch_target_rate : int
        The batch target rate as # of objects per second.
    """

    def callback(batch_results: dict) -> None:
        # you could print batch errors here
        time_took_to_create_batch = batch_size * (client.batch.creation_time/client.batch.recommended_num_objects)
        time.sleep(
            max(batch_size/batch_target_rate - time_took_to_create_batch + 1, 0)
        )

    client.batch.configure(
        batch_size=batch_size,
        timeout_retries=5,
        callback=callback,
    )
package main

import (
	"context"
	"time"

	"github.com/weaviate/weaviate-go-client/v4/weaviate"
	"github.com/weaviate/weaviate/entities/models"
)

var (
	// adjust to your liking
	targetRatePerMin = 600
	batchSize        = 50
)

func main() {
	cfg := weaviate.Config{
		Host:   "localhost:8080",
		Scheme: "http",
	}
	client, err := weaviate.NewClient(cfg)
	if err != nil {
		panic(err)
	}

	// replace those 10000 empty objects with your actual data
	objects := make([]*models.Object, 10000)

	// we aim to send one batch every tickInterval. Use floating-point math:
	// integer division (50/600) would truncate the interval to zero.
	tickInterval := time.Duration(float64(batchSize) / float64(targetRatePerMin) * float64(time.Minute))
	t := time.NewTicker(tickInterval)
	defer t.Stop()

	for i := 0; i < len(objects); i += batchSize {
		// create a fresh batch
		batch := client.Batch().ObjectsBatcher()

		// add up to batchSize objects to the batch
		for j := i; j < i+batchSize && j < len(objects); j++ {
			batch = batch.WithObject(objects[j])
		}

		// send off batch
		res, err := batch.Do(context.Background())
		// TODO: inspect result for individual errors
		_ = res
		// TODO: check request error
		_ = err

		// we wait for the next tick. If the previous batch took longer than
		// tickInterval, we won't need to wait, effectively making this an
		// unthrottled import.
		<-t.C
	}
}
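The pause used in the Python callback above can be worked through in isolation. This sketch restates the same arithmetic as a pure function, including the one-second margin from the original; the name throttle_pause and the example numbers are illustrative only.

```python
def throttle_pause(batch_size: int, target_rate: float,
                   batch_duration_s: float, margin_s: float = 1.0) -> float:
    """Seconds to sleep after a batch so the import does not exceed `target_rate` objects/s."""
    time_budget_s = batch_size / target_rate  # time one batch is allowed to take
    return max(time_budget_s - batch_duration_s + margin_s, 0.0)


# 50 objects at 10 objects/s gives a 5 s budget per batch.
print(throttle_pause(50, 10, batch_duration_s=2.0))  # 4.0
print(throttle_pause(50, 10, batch_duration_s=8.0))  # 0.0
```

When a batch already takes longer than its budget plus the margin, the pause drops to zero and the import runs unthrottled, as in the Go example.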
Usage example
The following code examples show how to use a nearText query with text2vec-palm.
- Python
- JavaScript/TypeScript
- Go
- Java
- Curl
- GraphQL
import weaviate

client = weaviate.Client(
    url="http://localhost:8080",
    additional_headers={
        "X-Palm-Api-Key": "YOUR-PALM-API-KEY"
    }
)

nearText = {
    "concepts": ["fashion"],
    "distance": 0.6,  # prior to v1.14 use "certainty" instead of "distance"
    "moveAwayFrom": {
        "concepts": ["finance"],
        "force": 0.45
    },
    "moveTo": {
        "concepts": ["haute couture"],
        "force": 0.85
    }
}

result = (
    client.query
    .get("Publication", "name")
    .with_additional(["distance"])  # "certainty" can be requested instead, but is only supported if distance==cosine
    .with_near_text(nearText)
    .do()
)

print(result)
import weaviate from 'weaviate-ts-client';

const client = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
  headers: { 'X-Palm-Api-Key': 'YOUR-PALM-API-KEY' },
});

const response = await client.graphql
  .get()
  .withClassName('Publication')
  .withFields('name _additional { certainty distance }')  // note that certainty is only supported if distance==cosine
  .withNearText({
    concepts: ['fashion'],
    distance: 0.6,  // prior to v1.14 use certainty instead of distance
    moveAwayFrom: {
      concepts: ['finance'],
      force: 0.45,
    },
    moveTo: {
      concepts: ['haute couture'],
      force: 0.85,
    },
  })
  .do();

console.log(response);
package main

import (
	"context"
	"fmt"

	"github.com/weaviate/weaviate-go-client/v4/weaviate"
	"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
)

func main() {
	cfg := weaviate.Config{
		Host:    "localhost:8080",
		Scheme:  "http",
		Headers: map[string]string{"X-Palm-Api-Key": "YOUR-PALM-API-KEY"},
	}
	client, err := weaviate.NewClient(cfg)
	if err != nil {
		panic(err)
	}

	className := "Publication"

	name := graphql.Field{Name: "name"}
	_additional := graphql.Field{
		Name: "_additional", Fields: []graphql.Field{
			{Name: "certainty"}, // only supported if distance==cosine
			{Name: "distance"},  // always supported
		},
	}

	concepts := []string{"fashion"}
	distance := float32(0.6)
	moveAwayFrom := &graphql.MoveParameters{
		Concepts: []string{"finance"},
		Force:    0.45,
	}
	moveTo := &graphql.MoveParameters{
		Concepts: []string{"haute couture"},
		Force:    0.85,
	}
	nearText := client.GraphQL().NearTextArgBuilder().
		WithConcepts(concepts).
		WithDistance(distance). // use WithCertainty(certainty) prior to v1.14
		WithMoveTo(moveTo).
		WithMoveAwayFrom(moveAwayFrom)

	ctx := context.Background()

	result, err := client.GraphQL().Get().
		WithClassName(className).
		WithFields(name, _additional).
		WithNearText(nearText).
		Do(ctx)

	if err != nil {
		panic(err)
	}
	fmt.Printf("%v", result)
}
package io.weaviate;

import io.weaviate.client.Config;
import io.weaviate.client.WeaviateClient;
import io.weaviate.client.base.Result;
import io.weaviate.client.v1.graphql.model.GraphQLResponse;
import io.weaviate.client.v1.graphql.query.argument.NearTextArgument;
import io.weaviate.client.v1.graphql.query.argument.NearTextMoveParameters;
import io.weaviate.client.v1.graphql.query.fields.Field;

import java.util.HashMap;
import java.util.Map;

public class App {
  public static void main(String[] args) {
    Map<String, String> headers = new HashMap<String, String>() { {
      put("X-Palm-Api-Key", "YOUR-PALM-API-KEY");
    } };
    Config config = new Config("http", "localhost:8080", headers);
    WeaviateClient client = new WeaviateClient(config);

    NearTextMoveParameters moveTo = NearTextMoveParameters.builder()
      .concepts(new String[]{ "haute couture" }).force(0.85f).build();

    NearTextMoveParameters moveAway = NearTextMoveParameters.builder()
      .concepts(new String[]{ "finance" }).force(0.45f)
      .build();

    NearTextArgument nearText = client.graphQL().arguments().nearTextArgBuilder()
      .concepts(new String[]{ "fashion" })
      .distance(0.6f) // use .certainty(0.7f) prior to v1.14
      .moveTo(moveTo)
      .moveAwayFrom(moveAway)
      .build();

    Field name = Field.builder().name("name").build();
    Field _additional = Field.builder()
      .name("_additional")
      .fields(new Field[]{
        Field.builder().name("certainty").build(), // only supported if distance==cosine
        Field.builder().name("distance").build(),  // always supported
      }).build();

    Result<GraphQLResponse> result = client.graphQL().get()
      .withClassName("Publication")
      .withFields(name, _additional)
      .withNearText(nearText)
      .run();

    if (result.hasErrors()) {
      System.out.println(result.getError());
      return;
    }
    System.out.println(result.getResult());
  }
}
# Note: Under nearText, use `certainty` instead of distance prior to v1.14
# Under _additional, `certainty` is only supported if distance==cosine, but `distance` is always supported
echo '{
  "query": "{
    Get{
      Publication(
        nearText: {
          concepts: [\"fashion\"],
          distance: 0.6,
          moveAwayFrom: {
            concepts: [\"finance\"],
            force: 0.45
          },
          moveTo: {
            concepts: [\"haute couture\"],
            force: 0.85
          }
        }
      ){
        name
        _additional {
          certainty
          distance
        }
      }
    }
  }"
}' | curl \
    -X POST \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer learn-weaviate' \
    -H "X-Palm-Api-Key: $PALM_API_KEY" \
    -d @- \
    https://edu-demo.weaviate.network/v1/graphql
{
  Get{
    Publication(
      nearText: {
        concepts: ["fashion"],
        distance: 0.6 # prior to v1.14 use "certainty" instead of "distance"
        moveAwayFrom: {
          concepts: ["finance"],
          force: 0.45
        },
        moveTo: {
          concepts: ["haute couture"],
          force: 0.85
        }
      }
    ){
      name
      _additional {
        certainty # only supported if distance==cosine.
        distance # always supported
      }
    }
  }
}
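All of the queries above return the same GraphQL response shape, with the matches nested under data, Get, and the class name. A short sketch of reading it in Python follows; the helper name extract_hits is hypothetical, and the sample response is made up purely to show the structure.

```python
def extract_hits(response: dict, class_name: str) -> list:
    """Return (name, distance) pairs from a Get query response."""
    if "errors" in response:
        raise RuntimeError(f"Query failed: {response['errors']}")
    objects = response["data"]["Get"][class_name]
    return [(obj["name"], obj["_additional"]["distance"]) for obj in objects]


# Illustrative response only; the names and distances are invented.
sample_response = {
    "data": {
        "Get": {
            "Publication": [
                {"name": "Example Magazine", "_additional": {"distance": 0.21}},
                {"name": "Another Journal", "_additional": {"distance": 0.35}},
            ]
        }
    }
}

for name, distance in extract_hits(sample_response, "Publication"):
    print(f"{name}: {distance}")
```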
More resources
For additional information, try these sources.