Weaviate Transformation Agent
To be notified when this agent is released, sign up here for updates.
The Weaviate Transformation Agent is an agentic service designed to augment and transform data using foundation models.
The Transformation Agent can be used to append new properties and/or update existing properties of data, for new or existing objects in Weaviate.
This can improve the quality of the objects in your Weaviate collections and prepare them for further use in your applications.
Architecture
The Transformation Agent is provided as a service on Weaviate Cloud.
The Transformation Agent can be called upon to perform one or more transformation operations at a time. Each operation is performed:
- On new data being imported into Weaviate, or on existing data in Weaviate.
- To create a new property, or update an existing property.
- Based on the context of one or more existing properties and a specified set of instructions.
The Transformation Agent can thus be used to enhance the data at import time, or to update properties on existing objects.
Transformation Agent: visualized workflow
Let's dive into a little more detail about the Transformation Agent, using a few example workflows:
Add new properties to data at import time
In this example, the Transformation Agent is used to add new properties to data at import time. The Transformation Agent is provided with a set of instructions for creating new properties, and the set of new objects to be added to Weaviate.
The figure below shows the workflow:
The Transformation Agent works as follows at a high level:
- The Transformation Agent works with a foundation model to create the new property, based on the instructions provided and the context of the specified existing properties (steps 1-2).
- The Transformation Agent inserts the transformed objects into Weaviate. Weaviate vectorizes the data as needed using the specified vectorizer integration (steps 3-5).
- The Transformation Agent receives the job status from Weaviate and returns it to the user (step 6).
As a result, Weaviate is populated with transformed versions of the input data provided by the user.
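The import-time flow above can be pictured with a small, self-contained sketch. Everything in it is a stand-in rather than the Agent's API: `fake_model` replaces the foundation model with a fixed rule, `append_property` is a hypothetical helper, and a plain Python list plays the role of the Weaviate collection.

```python
from typing import Any, Callable

Obj = dict[str, Any]


def fake_model(instruction: str, context: Obj) -> bool:
    """Stand-in for a foundation model: applies a fixed rule and ignores the instruction text."""
    return context["rating"] >= 4 and context["price"] > 60


def append_property(
    objects: list[Obj],
    property_name: str,
    view_properties: list[str],
    instruction: str,
    model: Callable[[str, Obj], Any],
) -> list[Obj]:
    """Return copies of the objects with one new property added (hypothetical helper)."""
    transformed = []
    for obj in objects:
        # Only the listed properties are passed to the model as context.
        context = {name: obj[name] for name in view_properties}
        transformed.append({**obj, property_name: model(instruction, context)})
    return transformed


new_objects = [
    {"name": "Foo", "price": 25, "rating": 3},
    {"name": "Bar", "price": 80, "rating": 5},
]

# Steps 1-2: derive the new property. Steps 3-5: "insert" into the stand-in collection.
collection: list[Obj] = []
collection.extend(
    append_property(
        new_objects,
        property_name="is_premium_product",
        view_properties=["price", "rating"],
        instruction="Determine if the product is a premium product",
        model=fake_model,
    )
)

for obj in collection:
    print(obj["name"], obj["is_premium_product"])
```

The input objects are left untouched in this sketch; only the copies that reach the stand-in collection carry the new property.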
Update properties on existing objects
In this example, the Transformation Agent is used to update existing properties on objects that already exist in Weaviate. The Transformation Agent is provided with a set of instructions for how to update the existing properties, and which of the existing objects to update.
The figure below shows the workflow:
The Transformation Agent works as follows at a high level:
- The Transformation Agent retrieves the existing objects from Weaviate, based on the specified criteria (steps 1-2).
- The Transformation Agent works with a foundation model to create new versions of the property, based on the instructions provided and the context of the specified existing properties (steps 3-4).
- The Transformation Agent updates the transformed objects in Weaviate. Weaviate vectorizes the data as needed using the specified vectorizer integration (steps 5-7).
- The Transformation Agent receives the job status from Weaviate and returns it to the user (step 8).
As a result, the specified objects in Weaviate are updated with the new versions of the specified properties. This does not change the number of objects in Weaviate; only the properties of the specified objects are updated.
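As a rough illustration of the update flow, the sketch below selects a subset of objects, rewrites one property from its context, and confirms that the object count is unchanged. The names `update_property` and `fake_rewrite` are hypothetical, the rewrite rule stands in for a foundation model, and a dictionary keyed by object ID stands in for the collection.

```python
from typing import Any, Callable

# A dict keyed by object ID stands in for an existing Weaviate collection.
collection: dict[str, dict[str, Any]] = {
    "id-1": {"name": "Jacket", "brand": "Acme", "colour": "red"},
    "id-2": {"name": "Boots", "brand": "Globex", "colour": "black"},
    "id-3": {"name": "Scarf", "brand": "Acme", "colour": "blue"},
}


def fake_rewrite(context: dict[str, Any]) -> str:
    """Stand-in for the foundation model: builds a more detailed name."""
    return f"{context['colour'].capitalize()} {context['brand']} {context['name']}"


def update_property(
    store: dict[str, dict[str, Any]],
    selector: Callable[[dict[str, Any]], bool],
    property_name: str,
    view_properties: list[str],
    model: Callable[[dict[str, Any]], Any],
) -> list[str]:
    """Overwrite one property on every selected object and return the IDs touched."""
    updated_ids = []
    for object_id, obj in store.items():
        if not selector(obj):
            continue
        # Build the context from the existing values before overwriting anything.
        context = {name: obj[name] for name in view_properties}
        obj[property_name] = model(context)
        updated_ids.append(object_id)
    return updated_ids


count_before = len(collection)
updated = update_property(
    collection,
    selector=lambda obj: obj["brand"] == "Acme",
    property_name="name",
    view_properties=["name", "brand", "colour"],
    model=fake_rewrite,
)

print(updated)
print(collection["id-1"]["name"])
print(len(collection) == count_before)
```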
Usage
Transformation operations are asynchronous, and the Transformation Agent will return a job ID to the user. The user can then use this job ID to check the status of the job, and retrieve the results when the job is complete.
To use the Transformation Agent, you must provide the following:
- The Weaviate Cloud instance details (e.g. the WeaviateClient object in Python) to the Transformation Agent.
- Either new objects to be added to Weaviate, or existing objects to be updated.
- A list of the transformation operations to be performed.
Prerequisites
The Transformation Agent is tightly integrated with Weaviate Cloud. As a result, it is available exclusively for use with a Weaviate Cloud instance, and it requires a supported version of the client library.
Connect to Weaviate
To use the Transformation Agent, you must first connect to your Weaviate Cloud instance using the Weaviate client library.
- Python[agents]
# [🚧 UNDER CONSTRUCTION 🚧] This Weaviate Agent is not available just yet.
# These snippets are indicative. The syntax may change when this Agent is released.
import os
import weaviate
from weaviate.classes.init import Auth
headers = {
    # Provide your required API key(s), e.g. Cohere, OpenAI, etc. for the configured vectorizer(s)
    "X-INFERENCE-PROVIDER-API-KEY": os.environ.get("YOUR_INFERENCE_PROVIDER_KEY", ""),
}

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ.get("WCD_URL"),
    auth_credentials=Auth.api_key(os.environ.get("WCD_API_KEY")),
    headers=headers,
)
Define transformation operations
A transformation operation requires:
- Type
- Targets (e.g. objects to be updated, or new objects to be added)
- Instructions
- Context (e.g. existing properties to be used as context)
Here are a few examples of transformation operations:
Append new properties to data
Properties of various types can be added to the data, based on one or more existing properties. See the following example operations:
- Python[agents]
# [🚧 UNDER CONSTRUCTION 🚧] This Weaviate Agent is not available just yet.
# These snippets are indicative. The syntax may change when this Agent is released.
from weaviate.agents.transformation import Operations
from weaviate.classes.config import DataType
is_premium_product_op = Operations.append_property(
    property_name="is_premium_product",
    data_type=DataType.BOOL,
    view_properties=["reviews", "price", "rating", "description"],
    instruction="""Determine if the product is a premium product,
    a product is considered premium if it has a high rating (4 or above),
    a high price (above 60), and positive reviews""",
)

product_descriptors_op = Operations.append_property(
    property_name="product_descriptors",
    data_type=DataType.TEXT_ARRAY,
    view_properties=["description"],
    instruction="""Extract the product descriptors from the description.
    Descriptors are a list of words that describe the product succinctly for SEO optimization""",
)
Update existing properties
Existing properties can be updated based on the context of one or more existing properties. See the following example operations:
- Python[agents]
# [🚧 UNDER CONSTRUCTION 🚧] This Weaviate Agent is not available just yet.
# These snippets are indicative. The syntax may change when this Agent is released.
from weaviate.agents.transformation import Operations
from weaviate.classes.config import DataType
name_update_op = Operations.update_property(
    property_name="name",
    view_properties=["name", "description", "category", "brand", "colour"],
    instruction="""Update the name to ensure it contains more details about the product's colour, brand, and category.
    The name should be a single sentence that describes the product""",
)
Transform at insert
Once the transformation operations are defined, you can create a Transformation Agent, and use it to transform new data at insert.
The vectorization will only occur after the transformation operations are completed and the data is inserted into Weaviate.
The Transformation Agent will return a job ID when the operations are started.
- Python[agents]
# [🚧 UNDER CONSTRUCTION 🚧] This Weaviate Agent is not available just yet.
# These snippets are indicative. The syntax may change when this Agent is released.
from weaviate.agents.transformation import TransformationAgent
ta = TransformationAgent(
    client=client,
    collection="ecommerce",
    operations=[
        is_premium_product_op,
        product_descriptors_op,
        name_update_op,
    ],
)

ta.update_and_insert([
    {"name": "Foo", "description": "...", "reviews": ["...", "..."], "price": 25, "rating": 3},
    {"name": "Bar", "description": "...", "reviews": ["...", "..."], "price": 50, "rating": 4},
])

# Note this is an async function
operation_workflow_ids = await ta.update_all()
print(operation_workflow_ids)  # Use this to track the status of the operations
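Because `update_all()` is awaited, the call has to run inside a coroutine (or an environment such as a notebook that allows top-level `await`). The sketch below shows one common way to drive such a call from a regular script with `asyncio.run()`. `StandInAgent` and its return value are hypothetical placeholders for the real agent, used only so the example runs on its own.

```python
import asyncio


class StandInAgent:
    """Hypothetical stand-in that exposes an async update_all() method."""

    async def update_all(self) -> list[str]:
        await asyncio.sleep(0)  # Pretend to submit the operations.
        return ["workflow-1", "workflow-2"]  # Assumed shape of the returned IDs.


async def main() -> list[str]:
    ta = StandInAgent()
    # `await` is only valid inside a coroutine, so the call lives in main().
    return await ta.update_all()


# asyncio.run() starts an event loop, runs main() to completion, and closes the loop.
operation_workflow_ids = asyncio.run(main())
print(operation_workflow_ids)
```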
Transform collection data
You can also use the Transformation Agent to transform data in an existing collection. The Transformation Agent will update the specified objects in the collection with the new properties. The objects will be re-vectorized as needed.
The Transformation Agent will return a job ID when the operations are started.
- Python[agents]
# [🚧 UNDER CONSTRUCTION 🚧] This Weaviate Agent is not available just yet.
# These snippets are indicative. The syntax may change when this Agent is released.
from weaviate.agents.transformation import TransformationAgent
ta = TransformationAgent(
    client=client,
    collection="ecommerce",
    operations=[
        is_premium_product_op,
        product_descriptors_op,
        name_update_op,
    ],
)

# Note this is an async function
operation_workflow_ids = await ta.update_all()
print(operation_workflow_ids)  # Use this to track the status of the operations
Monitor job status
You can use the job ID to monitor the status of the job, and retrieve a response when the job is complete.
- Python[agents]
# [🚧 UNDER CONSTRUCTION 🚧] This Weaviate Agent is not available just yet.
# These snippets are indicative. The syntax may change when this Agent is released.
print(ta.fetch_operation_status(operation_workflow_ids))
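Since jobs are asynchronous, a caller will often want to check the status repeatedly until the job finishes. The following is a minimal polling sketch under stated assumptions: `wait_for_job` is a hypothetical helper, and the status strings (`"running"`, `"completed"`, `"failed"`) are invented for illustration, as the shape of the value returned by `fetch_operation_status` is not specified here. In practice, `fetch_status` might wrap a call to `ta.fetch_operation_status(...)` and extract whatever status field it returns.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[], str],
    done_states: frozenset[str] = frozenset({"completed", "failed"}),
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it returns a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in done_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        sleep(interval_s)


# Stand-in status source: reports "running" twice, then "completed".
fake_statuses = iter(["running", "running", "completed"])
final_status = wait_for_job(lambda: next(fake_statuses), interval_s=0.0)
print(final_status)
```

Passing the status source in as a callable keeps the loop independent of any particular client call, so the same helper can be exercised with a fake source as shown.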
Questions and feedback
If you have any questions or feedback, let us know in the user forum.