Python
Overview
This page broadly covers the Weaviate Python client (v4 release). For usage information not specific to the Python client, such as code examples, see the relevant pages in the Weaviate documentation. Some frequently used sections are listed here for convenience.
Installation
v3 to v4
If you are migrating from the v3 client to the v4 client, please see this dedicated guide.
The Python client library is developed and tested using Python 3.8+. It is available on PyPI.org, and can be installed with:
pip install -U weaviate-client # For beta versions: `pip install --pre -U "weaviate-client==4.*"`
Requirements
gRPC
The v4 client uses remote procedure calls (RPCs) under the hood. Accordingly, a port for gRPC must be open to your Weaviate server.
docker-compose.yml example
If you are running Weaviate with Docker, you can map the default port (50051) by adding the following to your docker-compose.yml file:
ports:
- 8080:8080
- 50051:50051
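Before connecting, it can help to confirm that both mapped ports are reachable from the machine running the client. The following is a minimal sketch using only the Python standard library. `is_port_open` is a hypothetical helper, not part of the Weaviate client, and a successful TCP connection only shows that something is listening on the port, not that it speaks gRPC.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts) when the port is unreachable
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_port_open("localhost", 50051)` can be called after starting the container to check the gRPC port, and `is_port_open("localhost", 8080)` to check the REST port.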
WCS compatibility
The free (sandbox) tier of WCS is compatible with the v4 client as of 31 January 2024. Sandboxes created before this date are not compatible with the v4 client.
Weaviate server version
The v4 client requires Weaviate 1.23.7 or higher. Generally, we encourage you to use the latest version of the Python client and the Weaviate server.
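Version strings need to be compared numerically rather than as text, since "1.23.10" sorts before "1.23.7" alphabetically. The sketch below shows one way to do this. The helper names are hypothetical, and the client performs its own version check on connection, so this is only an illustration of the comparison.

```python
from typing import Tuple

MIN_SERVER_VERSION = "1.23.7"  # Minimum stated on this page for the v4 client


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.23.7' or 'v1.24.0-rc.1' into a comparable tuple of ints."""
    # Drop a leading 'v' and any pre-release suffix such as '-rc.1'
    core = version.lstrip("v").split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def server_is_supported(server_version: str, minimum: str = MIN_SERVER_VERSION) -> bool:
    """Compare versions part by part, so that 1.23.10 counts as newer than 1.23.7."""
    return parse_version(server_version) >= parse_version(minimum)
```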
High-level ideas
Helper classes
The client library provides numerous additional Python classes to provide IDE assistance and typing help. You can import them individually, like so:
from weaviate.classes.config import Property, ConfigFactory
from weaviate.classes.data import DataObject
from weaviate.classes.query import Filter
But it may be convenient to import the whole set of classes like this. You will see both usage styles in our documentation.
import weaviate.classes as wvc
For discoverability, the classes are arranged into submodules.
See the list of submodules
Module | Description |
---|---|
weaviate.classes.config | Collection creation / modification |
weaviate.classes.data | CUD operations |
weaviate.classes.query | query/search operations |
weaviate.classes.aggregate | aggregate operations |
weaviate.classes.generic | generics |
weaviate.classes.init | initialization |
weaviate.classes.tenants | tenants |
weaviate.classes.batch | batch operations |
Connection termination
You must ensure your client connections are closed. You can use client.close(), or use a context manager to close client connections for you.
client.close() with try / finally
This will close the client connection when the try block is complete (or if an exception is raised).
import weaviate
client = weaviate.connect_to_local()  # Connect with default parameters
try:
    pass  # Do something with the client
finally:
    client.close()  # Ensure the connection is closed
Context manager
This will close the client connection when you leave the with block.
import weaviate
with weaviate.connect_to_local() as client:
    # Do something with the client
    pass
# The connection is closed automatically when the context manager exits
Instantiate a client
There are multiple ways to connect to your Weaviate instance. To instantiate a client, use one of these styles:
- Python client v4 helper methods
- Python client v4 explicit connection
- Python client v3 style connection
Python client v4 helper functions
weaviate.connect_to_wcs()
weaviate.connect_to_local()
weaviate.connect_to_embedded()
weaviate.connect_to_custom()
WCS
import os
import weaviate
client = weaviate.connect_to_wcs(
    cluster_url=os.getenv("WCS_DEMO_URL"),  # Replace with your WCS URL
    auth_credentials=weaviate.auth.AuthApiKey(os.getenv("WCS_DEMO_RO_KEY")),  # Replace with your WCS key
    headers={"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")}  # Replace with your OpenAI API key
)
Local
import weaviate
client = weaviate.connect_to_local()  # Connect with default parameters
Embedded
import weaviate
client = weaviate.connect_to_embedded()  # Connect with default parameters
Custom
import os
import weaviate
client = weaviate.connect_to_custom(
    http_host="localhost",
    http_port=8080,  # Ports are integers
    http_secure=False,
    grpc_host="localhost",
    grpc_port=50051,
    grpc_secure=False,
    headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")  # Or any other inference API keys
    }
)
The v4
client helper functions provide some optional parameters to customize your client.
External API keys
To add API keys for services such as Cohere or OpenAI, use the headers parameter.
import weaviate
import os
client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")}
)
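Since `os.getenv` returns `None` for an unset variable, it may be worth building the headers dictionary so that missing keys are left out rather than passed as `None`. The sketch below is one way to do that. `build_headers` and the environment variable names are assumptions for illustration, and the result would be passed as the headers parameter.

```python
import os
from typing import Dict, Mapping, Optional

# Header name -> environment variable (the variable names are assumptions)
HEADER_ENV_VARS = {
    "X-OpenAI-Api-Key": "OPENAI_APIKEY",
    "X-Cohere-Api-Key": "COHERE_APIKEY",
}


def build_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect API-key headers for the variables that are set, skipping unset or empty ones."""
    env = os.environ if env is None else env
    return {
        header: env[var]
        for header, var in HEADER_ENV_VARS.items()
        if env.get(var)
    }
```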
Timeout values
You can set timeout values, in seconds, for the client. Use the Timeout class to configure the timeout values for initialization checks as well as query and insert operations.
import weaviate
from weaviate.classes.init import AdditionalConfig, Timeout
client = weaviate.connect_to_local(
port=8080,
grpc_port=50051,
additional_config=AdditionalConfig(
timeout=Timeout(init=2, query=45, insert=120) # Values in seconds
)
)
generate (RAG) queries
If you see errors while using the generate submodule, try increasing the query timeout values (Timeout(query=60)).
The generate submodule uses a large language model to generate text. The submodule is dependent on the speed of the language model and any API that serves the language model.
Increase the timeout values to allow the client to wait longer for the language model to respond.
Authentication
Some of the connect helper functions take authentication credentials. For example, connect_to_wcs accepts a WCS API key or OIDC authentication credentials.
API Key
import os
import weaviate
client = weaviate.connect_to_wcs(
    cluster_url=os.getenv("WCS_DEMO_URL"),  # Replace with your WCS URL
    auth_credentials=weaviate.auth.AuthApiKey(os.getenv("WCS_DEMO_RO_KEY")),  # Replace with your WCS key
    headers={"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")}  # Replace with your OpenAI API key
)
OIDC Credentials
import os
import weaviate
client = weaviate.connect_to_wcs(
    cluster_url=os.getenv("WCS_DEMO_URL"),  # Replace with your WCS URL
    auth_credentials=weaviate.auth.AuthClientPassword(
        username=os.getenv("WCS_USERNAME"),  # Your WCS username
        password=os.getenv("WCS_PASSWORD")  # Your WCS password
    )
)
For OIDC authentication with the Client Credentials flow, use the AuthClientCredentials class.
For OIDC authentication with the Refresh Token flow, use the AuthBearerToken class.
If the helper functions do not provide the customization you need, use the WeaviateClient class to instantiate the client.
Python client v4 explicit connection
If you need to pass custom parameters, use the weaviate.WeaviateClient class to instantiate a client. This is the most flexible way to instantiate the client object.
Please note that when directly instantiating a connection, you must connect to the server manually by calling the .connect() method.
import os
import weaviate
from weaviate.connect import ConnectionParams
from weaviate.classes.init import AdditionalConfig, Timeout
client = weaviate.WeaviateClient(
    connection_params=ConnectionParams.from_params(
        http_host="localhost",
        http_port=8099,  # Ports are integers
        http_secure=False,
        grpc_host="localhost",
        grpc_port=50052,
        grpc_secure=False,
    ),
    auth_client_secret=weaviate.auth.AuthApiKey("secr3tk3y"),
    additional_headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")
    },
    additional_config=AdditionalConfig(
        timeout=Timeout(init=2, query=45, insert=120),  # Values in seconds
    ),
)
client.connect()  # When directly instantiating, you need to connect manually
Python client v3 API
To create an older, v3-style Client object, use the weaviate.Client class. This method is available for backwards compatibility. Where possible, use a client v4 connection.
To create a v3-style client, refer to the v3 client documentation.
Initial connection checks
When establishing a connection to the Weaviate server, the client performs a series of checks. These include checks for the server version, and checks that the REST and gRPC ports are available.
You can set skip_init_checks to True to skip these checks.
import weaviate
client = weaviate.connect_to_local(
skip_init_checks=True
)
In most cases, you should use the default False setting for skip_init_checks. However, setting skip_init_checks=True may be a useful temporary measure if you have connection issues.
For additional connection configuration, see Timeout values.
Batching
The v4 client offers two ways to perform batch imports: from the client object directly, or from the collection object.
We recommend using the collection object to perform batch imports of single collections or tenants. If you are importing objects across many collections, such as in a multi-tenancy configuration, using client.batch may be more convenient.
Batch sizing
There are three methods to configure the batching behavior: dynamic, fixed_size and rate_limit.
Method | Description | When to use |
---|---|---|
dynamic | The batch size and the number of concurrent requests are dynamically adjusted on-the-fly during import, depending on the server load. | Recommended starting point. |
fixed_size | The batch size and number of concurrent requests are fixed to sizes specified by the user. | When you want to specify fixed parameters. |
rate_limit | The number of objects sent to Weaviate is rate limited (specified as n_objects per minute). | When you want to avoid hitting third-party vectorization API rate limits. |
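To make the rate-limiting idea concrete, the sketch below spaces requests evenly so that a per-minute limit is not exceeded. It assumes the limit is expressed in requests per minute, as in the `requests_per_minute` parameter used on this page. `send_rate_limited` and its arguments are hypothetical, and the client's own algorithm may work differently.

```python
import time
from typing import Callable, Iterable, Sequence


def send_rate_limited(
    batches: Iterable[Sequence[dict]],
    send: Callable[[Sequence[dict]], None],
    requests_per_minute: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send each batch via `send`, waiting between requests to respect the limit."""
    interval = 60.0 / requests_per_minute  # Seconds between consecutive requests
    sent = 0
    for batch in batches:
        if sent:  # No need to wait before the first request
            sleep(interval)
        send(batch)
        sent += 1
    return sent
```

With `requests_per_minute=600`, the interval works out to 0.1 seconds between requests. The `sleep` argument is injectable so that the pacing can be tested without actually waiting.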
Usage
We recommend using a context manager as shown below.
These methods return completely localized context managers. Accordingly, attributes of one batch such as failed_objects and failed_references will not be included in any subsequent calls.
Dynamic
import weaviate
from weaviate.util import generate_uuid5
client = weaviate.connect_to_local()
src_uuid = generate_uuid5("Multitenancy")  # Example UUIDs for the two objects
tgt_uuid = generate_uuid5("Database schema")
try:
    with client.batch.dynamic() as batch:  # or <collection>.batch.dynamic()
        # Batch import objects/references - e.g.:
        batch.add_object(properties={"title": "Multitenancy"}, collection="WikiArticle", uuid=src_uuid)
        batch.add_object(properties={"title": "Database schema"}, collection="WikiArticle", uuid=tgt_uuid)
        batch.add_reference(from_collection="WikiArticle", from_uuid=src_uuid, from_property="linkedArticle", to=tgt_uuid)
finally:
    client.close()
Fixed Size
import weaviate
from weaviate.util import generate_uuid5
client = weaviate.connect_to_local()
src_uuid = generate_uuid5("Multitenancy")  # Example UUIDs for the two objects
tgt_uuid = generate_uuid5("Database schema")
try:
    with client.batch.fixed_size(batch_size=100, concurrent_requests=4) as batch:  # or <collection>.batch.fixed_size()
        # Batch import objects/references - e.g.:
        batch.add_object(properties={"title": "Multitenancy"}, collection="WikiArticle", uuid=src_uuid)
        batch.add_object(properties={"title": "Database schema"}, collection="WikiArticle", uuid=tgt_uuid)
        batch.add_reference(from_collection="WikiArticle", from_uuid=src_uuid, from_property="linkedArticle", to=tgt_uuid)
finally:
    client.close()
Rate limited
import weaviate
from weaviate.util import generate_uuid5
client = weaviate.connect_to_local()
src_uuid = generate_uuid5("Multitenancy")  # Example UUIDs for the two objects
tgt_uuid = generate_uuid5("Database schema")
try:
    with client.batch.rate_limit(requests_per_minute=600) as batch:  # or <collection>.batch.rate_limit()
        # Batch import objects/references - e.g.:
        batch.add_object(properties={"title": "Multitenancy"}, collection="WikiArticle", uuid=src_uuid)
        batch.add_object(properties={"title": "Database schema"}, collection="WikiArticle", uuid=tgt_uuid)
        batch.add_reference(from_collection="WikiArticle", from_uuid=src_uuid, from_property="linkedArticle", to=tgt_uuid)
finally:
    client.close()
In the batching process, if the background thread responsible for sending the batches raises an exception, it is re-raised in the main thread.
Error handling
During a batch import, any failed objects or references will be stored for retrieval. Additionally, a running count of failed objects and references is maintained.
The counter can be accessed through batch.number_errors within the context manager.
A list of failed objects can be obtained through batch.failed_objects and a list of failed references can be obtained through batch.failed_references.
Note that these lists are reset when a batching process is initialized, so make sure to retrieve them before starting a new batch import block.
import weaviate
import weaviate.classes as wvc
client = weaviate.connect_to_local()
try:
    # ===== First batch import block =====
    with client.batch.rate_limit(requests_per_minute=600) as batch:  # or <collection>.batch.rate_limit()
        # Batch import objects/references
        for i in source_iterable:  # Some insertion loop
            if batch.number_errors > 10:  # Monitor errors during insertion
                # Break or raise an exception
                pass
    # Note these are outside the `with` block - they are populated after the context manager exits
    failed_objs_a = client.batch.failed_objects  # Get failed objects from the first batch import
    failed_refs_a = client.batch.failed_references  # Get failed references from the first batch import
    # ===== Second batch import block =====
    # This will clear the failed objects/references
    with client.batch.rate_limit(requests_per_minute=600) as batch:  # or <collection>.batch.rate_limit()
        # Batch import objects/references
        for i in source_iterable:  # Some insertion loop
            if batch.number_errors > 10:  # Monitor errors during insertion
                # Break or raise an exception
                pass
    # Note these are outside the `with` block - they are populated after the context manager exits
    failed_objs_b = client.batch.failed_objects  # Get failed objects from the second batch import
    failed_refs_b = client.batch.failed_references  # Get failed references from the second batch import
finally:
    client.close()
Working with collections
Instantiate a collection
You can instantiate a collection object by creating a collection, or by retrieving an existing collection.
Create a collection
import weaviate
import weaviate.classes.config as wvcc
client = weaviate.connect_to_local()
try:
    # Note that you can use `client.collections.create_from_dict()` to create a collection from a v3-client-style JSON object
    collection = client.collections.create(
        name="TestArticle",
        vectorizer_config=wvcc.Configure.Vectorizer.text2vec_cohere(),
        generative_config=wvcc.Configure.Generative.cohere(),
        properties=[
            wvcc.Property(
                name="title",
                data_type=wvcc.DataType.TEXT
            )
        ]
    )
finally:
    client.close()
With cross-references
import weaviate
import weaviate.classes.config as wvcc
client = weaviate.connect_to_local()
try:
    articles = client.collections.create(
        name="TestArticle",
        vectorizer_config=wvcc.Configure.Vectorizer.text2vec_cohere(),
        generative_config=wvcc.Configure.Generative.cohere(),
        properties=[
            wvcc.Property(
                name="title",
                data_type=wvcc.DataType.TEXT
            )
        ]
    )
    authors = client.collections.create(
        name="TestAuthor",
        vectorizer_config=wvcc.Configure.Vectorizer.text2vec_cohere(),
        generative_config=wvcc.Configure.Generative.cohere(),
        properties=[
            wvcc.Property(
                name="name",
                data_type=wvcc.DataType.TEXT
            )
        ],
        references=[
            wvcc.ReferenceProperty(
                name="wroteArticle",
                target_collection="TestArticle"
            )
        ]
    )
finally:
    client.close()
Get a collection
import weaviate
client = weaviate.connect_to_local()
try:
    collection = client.collections.get("TestArticle")
finally:
    client.close()
Collection submodules
Operations in the v4 client are grouped into submodules. The key submodules for interacting with objects are:
- data: CUD operations (read operations are in query)
- batch: Batch import operations
- query: Search operations
- generate: Retrieval augmented generation operations
  - Build on top of query operations
- aggregate: Aggregation operations
data
The data submodule contains all object-level CUD operations, including:
- insert for creating objects.
  - This function takes the object properties as a dictionary.
- insert_many for batch creating multiple objects.
  - This function takes the object properties as a dictionary or as a DataObject instance.
- update for updating objects (for PATCH operations).
- replace for replacing objects (for PUT operations).
- delete_by_id for deleting objects by ID.
- delete_many for batch deletion.
- reference_xxx for reference operations, including reference_add, reference_add_many, reference_replace and reference_delete.
See some examples below. Note that the return type varies by function.
insert_many sends one request
As of 4.4b1, insert_many sends one request for the entire function call. A future release may send multiple requests as batches.
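Because a single call is sent as a single request, you may prefer to split a very large list into smaller pieces yourself. The following is a minimal, standard-library sketch of such a splitter. `chunked` is a hypothetical helper, and the chunk size in the usage comment is an arbitrary assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:  # The source is exhausted
            return
        yield chunk


# Hypothetical usage with a collection object named `questions`:
# for chunk in chunked(all_properties, 200):
#     questions.data.insert_many(chunk)
```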
- Insert
- Insert many
- Delete by id
- Delete many
questions = client.collections.get("JeopardyQuestion")
new_uuid = questions.data.insert(
properties={
"question": "This is the capital of Australia."
},
references={ # For adding cross-references
"hasCategory": target_uuid
}
)
questions = client.collections.get("JeopardyQuestion")
properties = [{"question": f"Test Question {i+1}"} for i in range(5)]
response = questions.data.insert_many(properties)
questions = client.collections.get("JeopardyQuestion")
deleted = questions.data.delete_by_id(uuid=new_uuid)
from weaviate.classes.query import Filter
questions = client.collections.get("JeopardyQuestion")
response = questions.data.delete_many(
where=Filter.by_property(name="question").equal("Test Question")
)
insert_many with DataObjects
The insert_many function takes a list of DataObject instances or a list of dictionaries. Using DataObject instances is useful if you want to specify additional information beyond the properties, such as cross-references, an object UUID, or a custom vector.
import weaviate.classes as wvc
from weaviate.util import generate_uuid5
questions = client.collections.get("JeopardyQuestion")
data_objects = list()
for i in range(5):
    properties = {"question": f"Test Question {i+1}"}
    data_object = wvc.data.DataObject(
        properties=properties,
        uuid=generate_uuid5(properties)
    )
    data_objects.append(data_object)
response = questions.data.insert_many(data_objects)
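Deriving the UUID from the object's content means that re-importing the same data yields the same ID. The sketch below illustrates that idea with the standard library's name-based UUIDs. `deterministic_uuid` is a hypothetical stand-in and is not the implementation of `generate_uuid5`, so the IDs it produces should not be expected to match the library's.

```python
import json
import uuid


def deterministic_uuid(properties: dict) -> uuid.UUID:
    """Derive a name-based (version 5) UUID from an object's properties."""
    # Serialize with sorted keys so that key order does not change the result
    name = json.dumps(properties, sort_keys=True)
    return uuid.uuid5(uuid.NAMESPACE_DNS, name)


first = deterministic_uuid({"question": "Test Question 1", "points": 100})
second = deterministic_uuid({"points": 100, "question": "Test Question 1"})
print(first == second)  # Same content, same UUID
```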
Cross-reference creation
Cross-references should be added under a references parameter in the relevant function/method, with a structure like:
{
    "<REFERENCE_PROPERTY_NAME>": "<TARGET_UUID>"
}
For example:
import weaviate.classes as wvc
from weaviate.util import generate_uuid5
questions = client.collections.get("JeopardyQuestion")
data_objects = list()
for i in range(5):
    properties = {"question": f"Test Question {i+1}"}
    data_object = wvc.data.DataObject(
        properties=properties,
        references={
            "hasCategory": target_uuid
        },
        uuid=generate_uuid5(properties)
    )
    data_objects.append(data_object)
response = questions.data.insert_many(data_objects)
Using the properties parameter to add references is deprecated and will be removed in the future.
query
The query submodule contains all object-level query operations, including fetch_objects for retrieving objects without additional search parameters, bm25 for keyword search, near_<xxx> for vector search operators, hybrid for hybrid search and so on.
These queries return a _QueryReturn object, which contains a list of _Object objects.
- BM25
- Near text
questions = client.collections.get("JeopardyQuestion")
response = questions.query.bm25(
query="animal",
limit=2
)
for o in response.objects:
    print(o.properties)  # Object properties
questions = client.collections.get("JeopardyQuestion")
response = questions.query.near_text(
query="animal",
limit=2
)
for o in response.objects:
    print(o.properties)  # Object properties
Queries with custom returns
You can further specify:
- Whether to include the object vector (via include_vector)
  - Default is False
- Which properties to include (via return_properties)
  - All properties are returned by default
- Which references to include (via return_references)
- Which metadata to include (via return_metadata)
  - No metadata is returned by default
Each object includes its UUID as well as all properties by default.
For example:
- Default
- Customized returns
questions = client.collections.get("JeopardyQuestion")
response = questions.query.bm25(
query="animal",
limit=2
)
for o in response.objects:
    print(o.properties)  # All properties by default
    print(o.references)  # References not returned by default
    print(o.uuid)  # UUID included by default
    print(o.vector)  # No vector
    print(o.metadata)  # No metadata
questions = client.collections.get("JeopardyQuestion")
response = questions.query.bm25(
query="animal",
include_vector=True,
return_properties=["question"],
return_metadata=wvc.query.MetadataQuery(score=True),  # BM25 results carry a score rather than a vector distance
return_references=wvc.query.QueryReference(
link_on="hasCategory",
return_properties=["title"],
return_metadata=wvc.query.MetadataQuery(creation_time=True)
),
limit=2
)
for o in response.objects:
    print(o.properties)  # Selected properties only
    print(o.references)  # Selected references
    print(o.uuid)  # UUID included by default
    print(o.vector)  # With vector
    print(o.metadata)  # With selected metadata
query + group by
Results of a query can be grouped by a property as shown here.
The results are organized both by individual object and by group.
- The objects attribute is a list of objects, each containing a belongs_to_group property to indicate which group it belongs to.
- The groups attribute is a dictionary with each key indicating the value of the group, and the value being a list of objects belonging to that group.
questions = client.collections.get("JeopardyQuestion")
response = questions.query.near_text(
query="animal",
distance=0.2,
group_by=wvc.query.GroupBy(
prop="points",
number_of_groups=3,
objects_per_group=5
)
)
for k, v in response.groups.items():  # View by group
    print(k, v)
for o in response.objects:  # View by object
    print(o)
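The two views carry the same information: the per-group mapping can be rebuilt from the flat list by reading each object's group label. The sketch below shows that relationship with plain Python. `GroupedObject` is a hypothetical stand-in for a returned object, and the actual return types in the client may be richer than plain lists.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class GroupedObject:  # Hypothetical stand-in for a returned object
    properties: dict
    belongs_to_group: str


def objects_by_group(objects: List[GroupedObject]) -> Dict[str, List[GroupedObject]]:
    """Rebuild a group -> objects mapping from the flat, per-object view."""
    groups: Dict[str, List[GroupedObject]] = defaultdict(list)
    for obj in objects:
        groups[obj.belongs_to_group].append(obj)
    return dict(groups)
```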
generate
The RAG / generative search functionality is a two-step process involving a search followed by prompting a large language model. Therefore, function names are shared across the query and generate submodules, with additional parameters available in the generate submodule.
- Generate
- Query
questions = client.collections.get("JeopardyQuestion")
response = questions.generate.bm25(
query="animal",
limit=2,
grouped_task="What do these animals have in common?",
single_prompt="Translate the following into French: {answer}"
)
print(response.generated) # Generated text from grouped task
for o in response.objects:
    print(o.generated)  # Generated text from single prompt
    print(o.properties)  # Object properties
questions = client.collections.get("JeopardyQuestion")
response = questions.query.bm25(
query="animal",
limit=2
)
for o in response.objects:
    print(o.properties)  # Object properties
Outputs of the generate submodule queries include a generated attribute at the top level for the grouped_task result, while the generated attribute attached to each object contains the result of the single_prompt task.
aggregate
To use the aggregate submodule, supply one or more ways to aggregate the data. For example, you could aggregate by a count of objects matching the criteria, or by a metric computed over the objects' properties.
- Count
- Metric
from weaviate.classes.query import Filter
questions = client.collections.get("JeopardyQuestion")
response = questions.aggregate.over_all(
filters=Filter.by_property(name="question").like("*animal*"),
total_count=True
)
print(response.total_count)
questions = client.collections.get("JeopardyQuestion")
response = questions.aggregate.near_text(
query="animal",
object_limit=5,
return_metrics=wvc.query.Metrics("points").integer(mean=True)
)
print(response.properties)
aggregate + group by
Results of a query can be grouped and aggregated as shown here.
The results are organized by group, returning a list of groups.
from weaviate.classes.aggregate import GroupByAggregate
questions = client.collections.get("JeopardyQuestion")
response = questions.aggregate.near_text(
query="animal",
distance=0.2,
group_by=GroupByAggregate(prop="points"),
return_metrics=wvc.query.Metrics("points").integer(mean=True)
)
for o in response.groups:
    print(o)
Collection iterator (cursor API)
The v4 client adds a Pythonic iterator method for each collection. This wraps the cursor API and allows you to iterate over all objects in a collection.
This example fetches all the objects, and their properties, from the questions collection.
all_objects = [question for question in questions.iterator()]
You can specify which properties to retrieve. This example fetches the answer property.
all_object_answers = [question for question in questions.iterator(return_properties=["answer"])]
You can also specify which metadata to retrieve. This example fetches the creation_time metadata.
all_object_ids = [question for question in questions.iterator(return_metadata=wvc.query.MetadataQuery(creation_time=True))]  # Get selected metadata
Since the cursor API requires the object UUID for indexing, the uuid metadata is always retrieved.
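The reason the UUID is always needed becomes clearer with a sketch of cursor-style pagination: each page is requested "after" the UUID of the last object seen. The code below is an illustration against an in-memory list, not the client's implementation. `iterate_with_cursor`, `fake_fetch_page` and the page size are all hypothetical.

```python
from typing import Callable, Dict, Iterator, List, Optional

Page = List[Dict[str, str]]


def iterate_with_cursor(
    fetch_page: Callable[[Optional[str], int], Page],
    page_size: int = 100,
) -> Iterator[Dict[str, str]]:
    """Yield every object by repeatedly fetching the page after the last seen UUID."""
    cursor: Optional[str] = None  # None means "start from the beginning"
    while True:
        page = fetch_page(cursor, page_size)
        if not page:  # An empty page means there is nothing left
            return
        yield from page
        cursor = page[-1]["uuid"]  # The last UUID becomes the next cursor


# In-memory stand-in for a server-side collection, ordered by UUID
_DATA = [{"uuid": f"{i:04d}", "answer": f"Answer {i}"} for i in range(7)]


def fake_fetch_page(after: Optional[str], limit: int) -> Page:
    """Return up to `limit` objects that come after the object with UUID `after`."""
    if after is None:
        start = 0
    else:
        start = next(i for i, o in enumerate(_DATA) if o["uuid"] == after) + 1
    return _DATA[start:start + limit]
```

Calling `list(iterate_with_cursor(fake_fetch_page, page_size=3))` walks the seven stand-in objects in three pages, followed by one empty page that ends the iteration.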
You can also get the size of the collection by using the built-in len function.
articles = client.collections.get("Article")
print(len(articles))
Data model and generics
You can choose to provide a generic type to a query or data operation. This can be beneficial as the generic class is used to extract the return properties and statically type the response.
from typing import TypedDict
questions = client.collections.get("JeopardyQuestion")
class Question(TypedDict):
    question: str
    answer: str
    points: int
response = questions.query.fetch_objects(
    limit=2,
    return_properties=Question,  # Your generic class is used to extract the return properties and statically type the response
    return_metadata=wvc.query.MetadataQuery(creation_time=True)  # MetadataQuery object is used to specify the metadata to be returned in the response
)
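As an illustration of how a TypedDict class can stand in for a list of property names, the sketch below reads the declared fields with the standard library. `property_names` is a hypothetical helper. This is one way such extraction could work, and it is not necessarily how the client does it internally.

```python
from typing import List, TypedDict, get_type_hints


class Question(TypedDict):
    question: str
    answer: str
    points: int


def property_names(model: type) -> List[str]:
    """Return the field names declared on a TypedDict class, in declaration order."""
    return list(get_type_hints(model))


print(property_names(Question))
```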
Migration guides
v3 to v4
If you are migrating from the v3 client to the v4 client, please see this dedicated guide.
Beta releases
Migration guides - beta releases
Changes in v4.4b9
weaviate.connect_to_x methods
The timeout argument is now a part of the additional_config argument. It takes the class weaviate.config.AdditionalConfig as input.
Queries
All optional arguments to methods in the query namespace are now enforced as keyword arguments.
There is now runtime logic for parsing query arguments and enforcing the correct type.
Batch processing
Three distinct algorithms are introduced, each using a different batching style under the hood:
- client.batch.dynamic()
- client.batch.fixed_size()
- client.batch.rate_limit()
client.batch.dynamic() as batch is a drop-in replacement for the previous client.batch as batch, which is now deprecated and will be removed on release.
with client.batch.dynamic() as batch:
    ...
is equivalent to:
with client.batch as batch:
    ...
client.batch.fixed_size() as batch is a way to configure your batching algorithm to only use a fixed size.
with client.batch.fixed_size() as batch:
    ...
is equivalent to:
client.batch.configure_fixed_size()
with client.batch as batch:
    ...
client.batch.rate_limit() as batch is a new way to help avoid hitting third-party vectorization API rate limits. By specifying requests_per_minute in the rate_limit() method, you can force the batching algorithm to send objects to Weaviate at the speed your third-party API is capable of processing objects.
These methods now return completely localized context managers. This means that failed_objects and failed_references of one batch won't be included in any subsequent calls.
Finally, if the background thread responsible for sending the batches raises an exception, this is now re-raised in the main thread rather than silently erroring.
Filters
- The argument prop in Filter.by_property has been renamed to name.
- Ref counting is now achievable using Filter.by_ref_count(ref) rather than Filter([ref]).
Changes in v4.4b8
Reference filters
Reference filters have a simplified syntax. The new syntax looks like this:
Filter.by_ref("ref").by_property("target_property")
Changes in v4.4b7
Library imports
Importing directly from weaviate is deprecated. Use import weaviate.classes as wvc instead.
Close client connections
Starting in v4.4b7, you have to explicitly close your client connections. There are two ways to close client connections.
Use client.close() to explicitly close your client connections.
import weaviate
client = weaviate.connect_to_local()
print(client.is_ready())
client.close()
Use a context manager to close client connections for you.
import weaviate
with weaviate.connect_to_local() as client:
    print(client.is_ready())
# Python closes the client when you leave the 'with' block
Batch processing
The v4.4b7 client introduces changes to client.batch.
- client.batch requires a context manager.
- Manual mode is removed, you cannot send batches with .create_objects.
- Batch size and the number of concurrent requests are dynamically assigned. Use batch.configure_fixed_size to specify values.
- The add_reference method is updated.
- The to_object_collection method is removed.
Updated client.batch parameters
Old value | Value in v4.4b7 |
---|---|
from_object_uuid: UUID | from_uuid: UUID |
from_object_collection: str | from_collection: str |
from_property_name: str | from_property: str |
to_object_uuid: UUID | to: Union[WeaviateReference, List[UUID]] |
to_object_collection: Optional[str] = None | |
tenant: Optional[str] = None | tenant: Optional[str] = None |
Filter syntax
Filter syntax is updated in v4.4b7.
NOTE: The filter reference syntax is simplified in 4.4b8.
Old syntax | New syntax in v4.4b7 |
---|---|
Filter(path=property) | Filter.by_property(property) |
Filter(path=["ref", "target_class", "target_property"]) | Filter.by_ref().link_on("ref").by_property("target_property") |
FilterMetadata.ByXX | Filter.by_id() Filter.by_creation_time() Filter.by_update_time() |
The pre-4.4b7 filter syntax is deprecated. The new, v4.4b7 syntax looks like this.
import weaviate
import datetime
import weaviate.classes as wvc
client = weaviate.connect_to_local()
jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
filters=wvc.query.Filter.by_property("round").equal("Double Jeopardy!") &
wvc.query.Filter.by_creation_time().greater_or_equal(datetime.datetime(2005, 1, 1)) |
wvc.query.Filter.by_creation_time().greater_or_equal(datetime.datetime(2000, 12, 31)),
limit=3
)
client.close()
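When `&` and `|` are mixed without parentheses, Python's operator precedence decides the grouping: `&` binds more tightly than `|`. The sketch below demonstrates this with a tiny stand-in class. `Cond` is hypothetical and only mimics how filter objects can be combined with these operators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cond:  # Hypothetical stand-in for a filter object
    expr: str

    def __and__(self, other: "Cond") -> "Cond":
        return Cond(f"({self.expr} AND {other.expr})")

    def __or__(self, other: "Cond") -> "Cond":
        return Cond(f"({self.expr} OR {other.expr})")


a, b, c = Cond("A"), Cond("B"), Cond("C")

implicit = a & b | c    # Parsed as (a & b) | c
explicit = a & (b | c)  # Parentheses change the grouping

print(implicit.expr)  # ((A AND B) OR C)
print(explicit.expr)  # (A AND (B OR C))
```

Adding explicit parentheses around each sub-expression makes the intended logic visible to readers and avoids relying on precedence rules.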
reference_add_many updated
The reference_add_many syntax is updated; DataReferenceOneToMany is now DataReference.
collection.data.reference_add_many(
    [
        DataReference(
            from_property="ref",
            from_uuid=uuid_from,
            to_uuid=uuid_to,  # One UUID or a list of UUIDs
        )
    ]
)
References
Multi-target references are updated. These are the new functions:
- ReferenceProperty.MultiTarget
- DataReference.MultiTarget
- QueryReference.MultiTarget
Use ReferenceToMulti for multi-target references.
Older client changes
References
- References are now added through a references parameter during collection creation, object insertion and queries.
- The FromReference class is now called QueryReference.
Reorganization of classes/parameters
- weaviate.classes submodule further split into:
  - weaviate.classes.config
  - weaviate.classes.data
  - weaviate.classes.query
  - weaviate.classes.generic
- vector_index_config parameter factory functions for wvc.config.Configure and wvc.config.Reconfigure have changed to, e.g.:
  client.collections.create(
      name="YourCollection",
      vector_index_config=wvc.config.Configure.VectorIndex.hnsw(
          distance_metric=wvc.config.VectorDistances.COSINE,
          vector_cache_max_objects=1000000,
          quantizer=wvc.config.Configure.VectorIndex.Quantizer.pq()
      ),
  )
  - vector_index_type parameter has been removed.
- vectorize_class_name parameter in the Property constructor method is now vectorize_collection_name.
- [collection].data.update() / .replace(): the *args order changed, aiming to accommodate not providing properties when updating.
- [collection].data.reference_add / .reference_delete / .reference_replace: the ref keyword was renamed to to.
- collections.create() / get(): the data_model keyword argument for providing generics was renamed to data_model_properties.
- [object].metadata.uuid is now [object].uuid.
- [object].metadata.creation_time_unix is now [object].metadata.creation_time.
- [object].metadata.last_update_time_unix is now [object].metadata.last_update_time.
- quantitizer is renamed to quantizer.
- To request the vector in the returned data, use the include_vector parameter.
Data types
- Time metadata (for creation and last updated time) now returns a datetime object, and the parameters are renamed to creation_time and last_update_time under MetadataQuery.
  - metadata.creation_time.timestamp() * 1000 will return the same value as before.
- query.fetch_object_by_id() now uses gRPC under the hood (rather than REST), and returns objects in the same format as other queries.
- UUID and DATE properties are returned as typed objects.
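The timestamp conversion mentioned in the list above can be checked with the standard library alone. The following is a minimal sketch: the datetime value is copied from the sample response output shown later on this page, and the helper name to_unix_ms is hypothetical, not part of the client library.

```python
from datetime import datetime, timezone


def to_unix_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to whole Unix milliseconds."""
    # timestamp() returns float seconds; round() avoids truncating
    # a value that floating-point arithmetic left just below an integer.
    return round(dt.timestamp() * 1000)


# A value shaped like metadata.creation_time (an aware datetime in UTC).
creation_time = datetime(2024, 1, 2, 18, 3, 7, 475000, tzinfo=timezone.utc)
creation_time_unix = to_unix_ms(creation_time)
print(creation_time_unix)
```

Rounding is a design choice for this sketch; if you only need a float, metadata.creation_time.timestamp() * 1000 is sufficient.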
Best practices and notes
Exception handling
The client library raises exceptions for various error conditions. These include, for example:
- weaviate.exceptions.WeaviateConnectionError for failed connections.
- weaviate.exceptions.WeaviateQueryError for failed queries.
- weaviate.exceptions.WeaviateBatchError for failed batch operations.
- weaviate.exceptions.WeaviateClosedClientError for operations on a closed client.
Each of these exceptions inherits from weaviate.exceptions.WeaviateBaseError, and can be caught using this base class, as shown below.
import weaviate

# `client` is assumed to be an already connected Weaviate client
try:
    collection = client.collections.get("NonExistentCollection")
    collection.query.fetch_objects(limit=2)
except weaviate.exceptions.WeaviateBaseError as e:
    print(f"Caught a Weaviate error: {e.message}")
You can review this module which defines the exceptions that can be raised by the client library.
The client library doc strings also provide information on the exceptions that can be raised by each method. You can view these by using the help
function in Python, by using the ?
operator in Jupyter notebooks, or by using an IDE, such as hover-over tooltips in VSCode.
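When you want to react differently to one kind of failure, a common Python pattern is to catch the specific exception first and fall back to the base class. The sketch below illustrates only the handler ordering. It deliberately uses stand-in classes (StandInBaseError, StandInQueryError) so that it runs without the client installed; these names are hypothetical, and in real code you would use the weaviate.exceptions classes listed above.

```python
class StandInBaseError(Exception):
    """Stand-in for weaviate.exceptions.WeaviateBaseError (not the real class)."""

    def __init__(self, message: str = "") -> None:
        super().__init__(message)
        self.message = message


class StandInQueryError(StandInBaseError):
    """Stand-in for a more specific error such as WeaviateQueryError."""


def run_safely(operation) -> str:
    """Run `operation` and report which handler, if any, caught an error."""
    try:
        operation()
    except StandInQueryError as e:
        # Most specific handler first: react to failed queries only.
        return f"query failed: {e.message}"
    except StandInBaseError as e:
        # Fallback for any other error derived from the base class.
        return f"other client error: {e.message}"
    return "ok"


def failing_query() -> None:
    raise StandInQueryError("collection not found")


def failing_connection() -> None:
    raise StandInBaseError("connection refused")


print(run_safely(failing_query))
print(run_safely(failing_connection))
print(run_safely(lambda: None))
```

Python checks except clauses in order, so placing the base class first would hide the more specific handler.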
Thread-safety
The Python client is designed to be thread-safe, but because it depends on the requests library, complete thread safety is not guaranteed.
This is an area that we are looking to improve in the future.
Note in particular that the batching algorithm within the client is not thread-safe. Keep this in mind when you use the Python client in multi-threaded environments.
If you are performing batching in a multi-threaded scenario, ensure that only one of the threads is performing the batching workflow at any given time. No two threads can use the same client.batch object at one time.
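One way to make sure that only one thread runs the batching workflow at a time is to guard it with a lock. The sketch below shows the idea using only the standard library. StandInBatch and its add_object method are hypothetical placeholders for a non-thread-safe batch object, not the client's actual API, so adapt the body of import_chunk to your own batching code.

```python
import threading


class StandInBatch:
    """Hypothetical stand-in for a non-thread-safe batch object."""

    def __init__(self) -> None:
        self.objects = []

    def add_object(self, properties: dict) -> None:
        self.objects.append(properties)


batch = StandInBatch()
batch_lock = threading.Lock()  # guards every use of `batch`


def import_chunk(worker_id: int, count: int) -> None:
    """Add `count` objects, holding the lock for the whole batching workflow."""
    with batch_lock:
        for i in range(count):
            batch.add_object({"worker": worker_id, "index": i})


threads = [threading.Thread(target=import_chunk, args=(w, 100)) for w in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(batch.objects))
```

Holding the lock for the entire workflow, rather than per object, means each worker's objects are added as one uninterrupted run, at the cost of the workers batching one after another.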
Response object structure
Each query response object typically includes multiple attributes. Consider this query.
import weaviate.classes as wvc

# `client` is assumed to be an already connected Weaviate client
questions = client.collections.get("JeopardyQuestion")
response = questions.generate.near_text(
    query="history",
    limit=2,
    single_prompt="Translate this into French {question}",
    grouped_task="Summarize this into a sentence",
    return_metadata=wvc.query.MetadataQuery(
        distance=True,
        creation_time=True
    )
)

# Output generated for the whole result set (grouped task)
print("Grouped Task generated outputs:")
print(response.generated)

# Output generated for each object (single prompt)
for o in response.objects:
    print(f"Outputs for object {o.uuid}")
    print("Generated text:")
    print(o.generated)
    print("Properties:")
    print(o.properties)
    print("Metadata")
    print(o.metadata)
Each response includes attributes such as objects and generated. Each object in objects, in turn, includes attributes such as uuid, vector, properties, references, metadata and generated.
_GenerativeReturn(objects=[_GenerativeObject(uuid=UUID('61e29275-8f53-5e28-a355-347d45a847b3'), metadata=_MetadataReturn(creation_time=datetime.datetime(2024, 1, 2, 18, 3, 7, 475000, tzinfo=datetime.timezone.utc), last_update_time=None, distance=0.19253945350646973, certainty=None, score=None, explain_score=None, is_consistent=None, rerank_score=None), properties={'points': 1000.0, 'answer': 'Daniel Boorstein', 'air_date': datetime.datetime(1990, 3, 26, 0, 0, tzinfo=datetime.timezone.utc), 'round': 'Double Jeopardy!', 'question': 'This historian & former Librarian of Congress was teaching history at Harvard while studying law at Yale'}, references=None, vector=None, generated="Cet historien et ancien bibliothécaire du Congrès enseignait l'histoire à Harvard tout en étudiant le droit à Yale."), _GenerativeObject(uuid=UUID('e987d1a1-2599-5dd8-bd22-4f3b0338539a'), metadata=_MetadataReturn(creation_time=datetime.datetime(2024, 1, 2, 18, 3, 8, 185000, tzinfo=datetime.timezone.utc), last_update_time=None, distance=0.193121075630188, certainty=None, score=None, explain_score=None, is_consistent=None, rerank_score=None), properties={'points': 400.0, 'air_date': datetime.datetime(2007, 5, 11, 0, 0, tzinfo=datetime.timezone.utc), 'answer': 'an opinion', 'round': 'Jeopardy!', 'question': 'This, a personal view or belief, comes from the Old French for "to think"'}, references=None, vector=None, generated='Ceci, une opinion personnelle ou une croyance, provient du vieux français signifiant "penser".')], generated='Daniel Boorstein, a historian and former Librarian of Congress, taught history at Harvard while studying law at Yale, and an opinion is a personal view or belief derived from the Old French word for "to think".')
To limit the response payload, you can specify which properties and metadata to return.
Input argument validation
The client library performs input argument validation by default to make sure that the input types match the expected types.
You can disable this validation to improve performance by setting the skip_argument_validation parameter to True when you instantiate a collection object, for example with collections.get or collections.create.
# Configure the `performant_articles` to skip argument validation on its methods
performant_articles = client.collections.get("Article", skip_argument_validation=True)
This may be useful in cases where you are using the client library in a production environment, where you can be confident that the input arguments are typed correctly.
Tab completion in Jupyter notebooks
If you use a browser to run the Python client with a Jupyter notebook, press Tab
for code completion while you edit. If you use VSCode to run your Jupyter notebook, press control
+ space
for code completion.
Raw GraphQL queries
To provide raw GraphQL queries, you can use the client.graphql_raw_query
method (previously client.query.raw
in the v3
client). This method takes a string as input.
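As a rough illustration, the sketch below builds a GraphQL query string and passes it to client.graphql_raw_query. The helper functions build_get_query and run_raw_query and the StubClient class are hypothetical, introduced only so the example runs without a Weaviate server. The structure of the value returned by the real method is not described here, so the sketch simply returns it unchanged.

```python
from typing import List


def build_get_query(collection: str, properties: List[str], limit: int) -> str:
    """Build a minimal GraphQL Get query string."""
    fields = " ".join(properties)
    return f"{{ Get {{ {collection}(limit: {limit}) {{ {fields} }} }} }}"


def run_raw_query(client, query: str):
    """Pass the query string to the client and return whatever it returns."""
    return client.graphql_raw_query(query)


class StubClient:
    """Hypothetical stub so the sketch runs without a Weaviate server."""

    def __init__(self) -> None:
        self.received = None

    def graphql_raw_query(self, query: str):
        # Record the query instead of sending it anywhere.
        self.received = query
        return {"stub": True}


query = build_get_query("JeopardyQuestion", ["question", "answer"], limit=2)
stub = StubClient()
result = run_raw_query(stub, query)
print(query)
```

With a real, connected client you would pass that client to run_raw_query instead of the stub.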
Code examples & resources
Usage information for various operations and features can be found throughout the Weaviate documentation.
Some frequently used sections are the how-to guides for Managing data and Queries. The how-to guides include concise examples for common operations.
In particular, check out the pages for:
- Client instantiation
- Manage collections
- Batch import
- Cross-reference
- Basic search
- Similarity search
- Filters
The Weaviate API reference pages for search and REST may also be useful starting points.
Client releases
These charts show the Weaviate client releases associated with Weaviate core releases.
Current minor releases
Weaviate Version | Release Date | Python | TypeScript | Go | Java |
---|---|---|---|---|---|
1.24.10 | 2024-04-19 | 4.5.6 | 2.1.1 | 4.13.1 | 4.6.0 |
1.24.9 | 2024-04-17 | '' | '' | '' | '' |
1.24.8 | 2024-04-08 | '' | '' | '' | '' |
1.24.7 | 2024-04-05 | '' | '' | '' | '' |
1.24.6 | 2024-03-26 | 4.5.4 | '' | '' | '' |
1.24.5 | 2024-03-21 | '' | '' | '' | '' |
1.24.4 | 2024-03-15 | 4.5.1 | '' | '' | '' |
1.24.3 | 2024-03-14 | '' | '' | '' | '' |
1.24.2 | 2024-03-13 | '' | '' | '' | '' |
1.24.1 | 2024-03-01 | 4.5.0 | 2.1.0 | 4.12.0 | 4.5.1 |
Major releases
Weaviate Version | Release Date | Python | TypeScript | Go | Java |
---|---|---|---|---|---|
1.24.0 | 2024-02-27 | 4.5.0 | 2.1.0 | 4.12.0 | 4.5.1 |
1.23.0 | 2023-12-18 | 3.26.0 | 2.0.0 | '' | 4.4.2 |
1.22.0 | 2023-10-27 | 3.25.0 | '' | 4.10.0 | 4.3.0 |
1.21.0 | 2023-08-17 | 3.22.1 | 1.4.0 | 4.9.0 | 4.2.1 |
1.20.0 | 2023-07-06 | 3.22.0 | '' | '' | 4.2.0 |
1.19.0 | 2023-05-04 | 3.17.0 | 1.1.0 [1] | 4.7.1 | 4.0.1 |
1.18.0 | 2023-03-07 | 3.13.0 | 2.14.5 | 4.6.2 | 3.6.4 |
1.17.0 | 2022-12-20 | 3.9.0 | 2.14.0 | 4.5.0 | 3.5.0 |
1.16.0 | 2022-10-31 | 3.8.0 | 2.13.0 | 4.4.0 | 3.4.0 |
1.15.0 | 2022-09-07 | '' | 2.12.0 | 4.3.0 | 3.3.0 |
1.14.0 | 2022-07-07 | 3.6.0 | 2.11.0 | 4.2.0 | 3.2.0 |
1.13.0 | 2022-05-03 | 3.4.2 | 2.9.0 | 4.0.0 | 2.4.0 |
1.12.0 | 2022-04-05 | 3.4.0 | 2.8.0 | 3.0.0 | '' |
1.11.0 | 2022-03-14 | 3.2.5 | 2.7.0 | 2.6.0 | 2.3.0 |
1.10.0 | 2022-01-27 | '' | 2.5.0 | 2.4.1 | 2.1.1 |
1.9.0 | 2021-12-10 | '' | '' | 2.4.0 | 2.1.0 |
1.8.0 | 2021-11-30 | '' | '' | '' | '' |
1.7.0 | 2021-09-01 | 3.1.1 | 2.4.0 | 2.3.0 | 1.1.0 |
1.6.0 | 2021-08-11 | 2.4.0 | 2.3.0 | 2.2.0 | '' |
1.5.0 | 2021-07-13 | '' | '' | '' | '' |
1.4.0 | 2021-06-09 | '' | '' | '' | '' |
1.3.0 | 2021-04-23 | '' | 2.1.0 | 2.1.0 | 1.0.0 |
1.2.0 | 2021-03-15 | 2.2.0 | 2.0.0 | 1.1.0 | - |
1.1.0 | 2021-02-10 | 2.1.0 | '' | '' | - |
1.0.0 | 2021-01-14 | 2.0.0 | '' | '' | - |
[1] The TypeScript client replaced the JavaScript client on 2023-03-17.
Change logs
For more detailed information on client updates, check the change logs. The logs are hosted here:
Questions and feedback
If you have any questions or feedback, please let us know on our forum. For example, you can: