Python (v4)

Beta version

The Python client is currently in beta, and we want to hear from you. You can test the new client locally, or on paid instances of Weaviate Cloud Services (WCS). It is not yet available on the free (sandbox) tier of WCS. If you notice any bugs or have any feedback, please let us know on this forum thread.


This page describes the v4 Python client for Weaviate.

The full set of features is covered in the client documentation pages. This page covers key ideas and aspects of the new Python client.

Key changes from v3​

This client is also called the collections client, because it adds new collection-level (previously called "class") interactions.

This client also includes numerous additional Python classes to provide IDE assistance and typing help. You can import them individually, like so:

from weaviate.classes import Property, ConfigFactory, DataObject

Alternatively, it may be more convenient to import the whole set of classes under an alias:

import weaviate.classes as wvc


The Python library is available on PyPI and can be installed using pip. The client is developed and tested for Python 3.8 to 3.12.

Install the client with the following command:

pip install --pre -U "weaviate-client==4.*"


Weaviate version​

Beta version

The API may change on the client-side and the server-side, especially during the beta period. Accordingly, we encourage you to use the latest version of the Python client and the Weaviate server.

The v4 client is designed for use with Weaviate 1.22 and higher to take advantage of the gRPC API. If you are using an older version of Weaviate, or are otherwise unable to use gRPC, please use the v3 client or the legacy instantiation method through the weaviate.Client class, which is still available.

Please refer to the v3 client documentation if you are using this instantiation method.

gRPC port​

A port for gRPC must be open on your Weaviate server. If you are running Weaviate locally with Docker, you can open the default port (50051) by adding the following port mappings to the Weaviate service in your docker-compose.yml file:

- "8080:8080"
- "50051:50051"

WCS availability​

You can test the new client locally, or on paid instances of Weaviate Cloud Services (WCS). It is not yet available on the free (sandbox) tier of WCS.


You can instantiate the client using one of multiple methods. For example, you can use one of the following helper connect functions:

  • weaviate.connect_to_wcs()
  • weaviate.connect_to_local()
  • weaviate.connect_to_embedded()
  • weaviate.connect_to_custom()

See the examples below:

Note: As of December 2023, WCS sandboxes are not compatible with the v4 client.

import weaviate

client = weaviate.connect_to_wcs(

Or, you can instantiate a weaviate.WeaviateClient object directly.

API keys for external API use​

You can pass on API keys for services such as Cohere, OpenAI and so on through additional headers. For example:

import weaviate
import os

client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_APIKEY"]}  # Key is read from the environment
)
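When several inference services are in use, the headers can be assembled from whichever keys are present in the environment. The following is a minimal sketch using only the standard library; the build_headers helper and the environment variable names are assumptions made for illustration and are not part of the client.

```python
import os

# Header names follow the "X-<Provider>-Api-Key" pattern; the environment
# variable names on the right are assumptions for this sketch.
HEADER_ENV_VARS = {
    "X-OpenAI-Api-Key": "OPENAI_APIKEY",
    "X-Cohere-Api-Key": "COHERE_APIKEY",
}


def build_headers(env=None):
    """Return headers for every provider whose key is set in the environment."""
    env = os.environ if env is None else env
    return {
        header: env[var]
        for header, var in HEADER_ENV_VARS.items()
        if env.get(var)  # skip unset or empty variables
    }


# Example with an explicit mapping instead of the real environment.
headers = build_headers({"OPENAI_APIKEY": "sk-example"})
print(headers)
```

The resulting dictionary could then be passed as the headers argument of a connect helper such as connect_to_local.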

Timeout values​

You can set timeout values for the client as a tuple of (connection timeout, read timeout), in seconds.

import weaviate

client = weaviate.connect_to_local(port=8080, grpc_port=50051, timeout=(5, 15))


Some helper connect functions allow you to pass on authentication credentials.

For example, the connect_to_wcs method allows a WCS API key or OIDC authentication credentials to be passed in.

import weaviate

client = weaviate.connect_to_wcs(

The client also supports OIDC authentication with Client Credentials flow and Refresh Token flow. They are available through the AuthClientCredentials and AuthBearerToken classes respectively.

If a particular helper function does not support the desired workflow, directly instantiate the WeaviateClient object.

Advanced: Direct instantiation​

You can also instantiate a client (WeaviateClient) object directly and pass on custom parameters. This is the most flexible way to instantiate the client.

import weaviate

client = weaviate.WeaviateClient(
"X-OpenAI-Api-Key": os.environ["OPENAI_APIKEY"]
timeout=(5, 15)

V3 Client instantiation​

You can instantiate a v3 style Client object using the weaviate.Client class. This is the legacy instantiation method, and is still available for backwards compatibility.

Please refer to the v3 client documentation if you are using this instantiation method.

Working with collections​

Instantiate a collection​

You can instantiate a collection object by creating a collection, or by retrieving an existing collection.

import weaviate
import weaviate.classes as wvc

client = weaviate.connect_to_local()

collection = client.collections.create(

Collection submodules​

Operations in the v4 client are grouped into submodules. The key submodules for interacting with objects are:

  • data: Create, update and delete (CUD) operations (read operations are in query)
  • query: Search operations
  • generate: Retrieval augmented generation operations
    • Build on top of query operations
  • aggregate: Aggregation operations
  • query_group_by: Object-level group by operations
  • aggregate_group_by: Aggregation-level group by operations


data

The data submodule contains all object-level CUD operations, including:

  • insert for creating objects.
    • This function takes the object properties as a dictionary.
  • insert_many for batch creating multiple objects.
    • This function takes the object properties as a dictionary or as a DataObject instance.
  • update for updating objects (for PATCH operations).
  • replace for replacing objects (for PUT operations).
  • delete_by_id for deleting objects by ID.
  • delete_many for batch deletion.
  • reference_xxx for reference operations, including reference_add, reference_add_many, reference_update and reference_delete.

See some examples below. Note that the return type varies from function to function.

insert_many sends one request

As of 4.4b1, insert_many will send one request for the entire function call. We are evaluating modifying this to send multiple requests in batches in the future.
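If a single request would be too large, one option is to split the data on the caller's side before inserting. The sketch below shows only the splitting step with the standard library; the chunked helper is hypothetical and not part of the client, and the chunk size is an arbitrary choice.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Stand-in data; in practice each dict would hold one object's properties.
rows = [{"question": f"Question {i}"} for i in range(5)]
batches = list(chunked(rows, 2))
print([len(b) for b in batches])  # [2, 2, 1]
```

Each chunk could then be passed to its own insert_many call, so that every request carries a bounded number of objects.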

questions = client.collections.get("JeopardyQuestion")

tmp_uuid =
"question": "This is the capital of Australia."


query

The query submodule contains all object-level query operations, including fetch_objects for retrieving objects without additional search parameters, bm25 for keyword search, near_<xxx> for vector search operators, hybrid for hybrid search and so on.

These queries return a _QueryReturn object, which contains a list of _Object objects.

questions = client.collections.get("JeopardyQuestion")
response = questions.query.bm25(

for o in response.objects:
    print(o.properties)  # Object properties

You can further specify:

  • Whether to include the object vector (via include_vector)
    • Default is False
  • Which properties to include (via return_properties)
    • All properties are returned by default
  • Which metadata to include
    • No metadata is returned by default

Each object includes its UUID as well as all properties by default.

For example:

questions = client.collections.get("JeopardyQuestion")
response = questions.query.bm25(

for o in response.objects:
    print(o.properties)  # All properties by default
    print(o.uuid)  # UUID included by default
    print(o.vector)  # No vector
    print(o.metadata)  # No metadata


generate

The RAG / generative search functionality is a two-step process involving a search followed by prompting a large language model. Therefore, function names are shared across the query and generate submodules, with additional parameters available in the generate submodule.

questions = client.collections.get("JeopardyQuestion")
response = questions.generate.bm25(
grouped_task="What do these animals have in common?",
single_prompt="Translate the following into French: {answer}"

print(response.generated)  # Generated text from grouped task
for o in response.objects:
    print(o.generated)  # Generated text from single prompt
    print(o.properties)  # Object properties

Outputs of generate submodule queries include a generated attribute at the top level, which holds the result of the grouped_task, while the generated attribute attached to each object holds the result of the single_prompt task for that object.


aggregate

To use the aggregate submodule, supply one or more ways to aggregate the data. For example, you could count the objects matching the criteria, or compute a metric over the objects' properties.

questions = client.collections.get("JeopardyQuestion")
response = questions.aggregate.over_all(



query_group_by

Results of a query can be grouped by a property as shown here.

The results are organized both by individual object and by group.

  • The objects attribute is a list of objects, each containing a belongs_to_group property to indicate which group it belongs to.
  • The groups attribute is a dictionary with each key indicating the value of the group, and the value being a list of objects belonging to that group.

questions = client.collections.get("JeopardyQuestion")
response = questions.query_group_by.near_text(

for k, v in response.groups.items():  # View by group
    print(k, v)

for o in response.objects:  # View by object
    print(o.belongs_to_group)
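The relationship between the object-level view and the group-level view can be illustrated without a running server. The sketch below uses plain Python stand-ins; FakeObject is a hypothetical class invented for this example and is not a client type, and the data is made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FakeObject:
    """Hypothetical stand-in for a returned object; not a client class."""
    properties: dict
    belongs_to_group: str


# Flat, object-level view: every object records the group it belongs to.
objects = [
    FakeObject({"answer": "Greyhound"}, belongs_to_group="Jeopardy!"),
    FakeObject({"answer": "Harvard"}, belongs_to_group="Double Jeopardy!"),
    FakeObject({"answer": "Jonah"}, belongs_to_group="Jeopardy!"),
]

# Group-level view: group value -> list of objects in that group.
groups = defaultdict(list)
for obj in objects:
    groups[obj.belongs_to_group].append(obj)

for name, members in groups.items():
    print(name, [m.properties["answer"] for m in members])
```

Both views describe the same objects; which one is more convenient depends on whether you want to process results one by one or per group.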


aggregate_group_by

Results of a query can be grouped and aggregated as shown here.

The results are organized by group, returning a list of groups.

questions = client.collections.get("JeopardyQuestion")
response = questions.aggregate_group_by.near_text(

for o in response:

Collection iterator (cursor API)​

The v4 client adds a Pythonic iterator method for each collection. This wraps the cursor API and allows you to iterate over all objects in a collection.

This will fetch all objects in the questions collection, including their properties.

all_objects = [question for question in questions.iterator()]

You can specify what properties to retrieve. This will only fetch the answer property.

all_object_answer_ids = [question for question in questions.iterator(return_properties=["answer"])]

You can also specify what metadata to retrieve. This will only fetch the creation_time_unix metadata.

all_object_ids = [question for question in questions.iterator(return_metadata=wvc.MetadataQuery(creation_time_unix=True))]  # Only the creation_time_unix metadata is requested

Note that as the cursor API inherently requires the object UUID for indexing, the uuid metadata is always retrieved.
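To show why the UUID is always needed, here is a minimal sketch of cursor-style iteration written with the standard library only. It is not the client's implementation: fetch_page, iterate_all and the in-memory STORE are hypothetical stand-ins for a server that returns the objects following a given UUID.

```python
from typing import Callable, Iterator, List, Optional

# Hypothetical stand-in data: (uuid, properties) pairs sorted by uuid.
STORE = [(f"id-{i:02d}", {"answer": f"Answer {i}"}) for i in range(7)]


def fetch_page(after: Optional[str], limit: int) -> List[tuple]:
    """Hypothetical page fetcher: return up to `limit` items after cursor `after`."""
    start = 0
    if after is not None:
        start = next(i for i, (uuid, _) in enumerate(STORE) if uuid == after) + 1
    return STORE[start:start + limit]


def iterate_all(fetch: Callable[[Optional[str], int], List[tuple]],
                page_size: int = 3) -> Iterator[tuple]:
    """Yield every item by repeatedly requesting the page after the last seen UUID."""
    cursor = None
    while True:
        page = fetch(cursor, page_size)
        if not page:
            return
        yield from page
        cursor = page[-1][0]  # the last UUID becomes the next cursor


all_items = list(iterate_all(fetch_page))
print(len(all_items))  # 7
```

Because each request starts from the last UUID seen, the loop needs no offset bookkeeping and stops as soon as an empty page comes back.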

Data model / generics​

You can choose to provide a generic type to a query or data operation. This can be beneficial as the generic class is used to extract the return properties and statically type the response.

from typing import TypedDict

import weaviate.classes as wvc

questions = client.collections.get("JeopardyQuestion")

class Question(TypedDict):
    question: str
    answer: str
    points: int

response = questions.query.fetch_objects(
    return_properties=Question,  # Your generic class is used to extract the return properties and statically type the response
    return_metadata=wvc.MetadataQuery(creation_time_unix=True),  # MetadataQuery object is used to specify the metadata to be returned in the response
)
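As a rough illustration of how a TypedDict can serve both purposes, the sketch below reads the property names from the class annotations and uses them to narrow a raw dictionary. This is only an assumption about the general idea, not the client's actual mechanism; as_question is a hypothetical helper and the sample data is made up.

```python
from typing import TypedDict, get_type_hints


class Question(TypedDict):
    question: str
    answer: str
    points: int


# The keys of the TypedDict double as the list of property names to request.
property_names = list(get_type_hints(Question))
print(property_names)  # ['question', 'answer', 'points']


def as_question(properties: dict) -> Question:
    """Keep only the declared keys so the result matches the Question type."""
    return Question(**{name: properties[name] for name in property_names})


row = as_question({"question": "Q?", "answer": "A", "points": 100, "round": "Jeopardy!"})
print(row["points"])  # 100
```

A static type checker can then flag a misspelled key such as row["pionts"] before the code runs.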

Best practices and notes​


While the Python client is fundamentally designed to be thread-safe, it's important to note that due to its dependency on the requests library, complete thread safety isn't guaranteed.

This is an area that we are looking to improve in the future.

In particular, note that the batching algorithm within the client is not thread-safe. Keeping this in mind will help ensure smoother, more predictable operations when using the Python client in multi-threaded environments.

If you are performing batching in a multi-threaded scenario, ensure that only one of the threads is performing the batching workflow at any given time. No two threads can use the same client.batch object at one time.
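One common way to enforce this is to guard the shared batch object with a lock. The sketch below demonstrates the pattern with the standard library only; FakeBatch is a hypothetical stand-in for a non-thread-safe batch object and is not the client's batch API.

```python
import threading


class FakeBatch:
    """Hypothetical stand-in for a shared, non-thread-safe batch object."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)


batch = FakeBatch()
batch_lock = threading.Lock()  # guards every use of the shared batch object


def worker(worker_id, n_items):
    # Prepare data outside the lock; only the batching step is serialized.
    prepared = [f"worker{worker_id}-item{i}" for i in range(n_items)]
    with batch_lock:
        for item in prepared:
            batch.add(item)


threads = [threading.Thread(target=worker, args=(i, 100)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(batch.items))  # 400
```

Holding the lock for the whole batching step means that only one thread runs the batching workflow at any given time, at the cost of serializing that part of the work.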

Response object structure​

Each query response object typically includes multiple attributes. Consider this query.

questions = client.collections.get("JeopardyQuestion")
response = questions.generate.near_text(
single_prompt="Translate this into French {question}",
grouped_task="Summarize this into a sentence",


Each response includes attributes such as objects and generated. Each object in objects in turn includes multiple attributes such as uuid, vector, properties, metadata and generated.

_GenerativeReturn(objects=[_GenerativeObject(uuid=UUID('f448a778-78bb-5565-9b3b-fd4aed03dad0'), metadata=_MetadataReturn(creation_time_unix=1701373868665, last_update_time_unix=None, distance=0.19842731952667236, certainty=None, score=None, explain_score=None, is_consistent=None), properties={'points': 100.0, 'answer': 'Greyhound', 'air_date': '1996-11-15T00:00:00Z', 'round': 'Jeopardy!', 'question': 'A Hibbing, Minn. museum traces the history of this bus company founded there in 1914 using Hupmobiles'}, vector=None, generated="Un musée à Hibbing, dans le Minnesota, retrace l'histoire de cette compagnie de bus fondée en 1914 en utilisant des Hupmobiles."), _GenerativeObject(uuid=UUID('28ec4a1a-c68e-5cff-b392-f74bf26aef62'), metadata=_MetadataReturn(creation_time_unix=1701373868664, last_update_time_unix=None, distance=0.2006242275238037, certainty=None, score=None, explain_score=None, is_consistent=None), properties={'points': 200.0, 'air_date': '1996-03-26T00:00:00Z', 'answer': 'Harvard', 'round': 'Double Jeopardy!', 'question': "1995 marked the 200th anniversary of this university's Hasty Pudding Club"}, vector=None, generated='1995 a marqué le 200e anniversaire du Hasty Pudding Club de cette université.')], generated="The Greyhound bus company, founded in Hibbing, Minn. in 1914, is traced in a museum using Hupmobiles, while Harvard University's Hasty Pudding Club celebrated its 200th anniversary in 1995.")

To limit the response payload, you can specify which properties and metadata to return.

Additionally, to view the response object in a more readable format, you can use the json.dumps() function as shown below:

import json

questions = client.collections.get("JeopardyQuestion")
response = questions.query.fetch_objects(limit=1)

# Print result object properties
for o in response.objects:
    print(json.dumps(o.properties, indent=2))

This is the formatted output.

{
  "points": 100.0,
  "answer": "Jonah",
  "air_date": "2001-01-10T00:00:00Z",
  "round": "Jeopardy!",
  "question": "This prophet passed the time he spent inside a fish offering up prayers"
}

Tab completion in Jupyter notebooks​

If you use a browser to run the Python client with a Jupyter notebook, press Tab for code completion while you edit. If you use VSCode to run your Jupyter notebook, press control + space for code completion.

Client releases​

This chart matches Weaviate database releases with Weaviate client releases. It lists the most recent database version when each client version was released.

The chart includes point releases for the most recent major and minor versions of the client. Earlier client releases have less detailed version information.

Weaviate Version | Release Date | Python | TypeScript | Go | Java

  1. The TypeScript client replaced the JavaScript client on 2023-03-17.

Change logs​

For more detailed information on client updates, check the change logs. The logs are hosted here: