
Migrate from v3 to v4

Python client version

The current Python client version is v4.9.3

The v4 Weaviate Python client API is a complete rewrite, aimed at an improved overall user experience. It therefore differs substantially from the v3 API, and you will need to relearn the patterns you use to interact with Weaviate.

While this may introduce some overhead, we believe the v4 API is a significant improvement to your developer experience. For instance, using the v4 client allows you to take full advantage of faster speeds through the gRPC API, and of additional static analysis for IDE assistance through strong typing.

Due to the extensive API surface changes, this guide does not cover every change. Instead, this guide is designed to help you understand the major changes and how to migrate your code at a high level.

For code examples, refer to the documentation throughout the site, starting with these suggested sections.

Installation

To go from v3 to v4, you must:

  1. Upgrade the client library:

    pip install -U weaviate-client
  2. Upgrade Weaviate to a compatible version

    • Weaviate 1.23.7 is required for v4.4.1. Generally, we recommend you use the latest versions of Weaviate and the client.
  3. Make sure a port for gRPC is open to Weaviate.

    • The default port is 50051.
    docker-compose.yml example

    If you are running Weaviate with Docker, you can map the default port (50051) by adding the following to your docker-compose.yml file:

        ports:
          - 8080:8080
          - 50051:50051
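Before switching clients, it can help to confirm that both ports are reachable from the machine that runs your code. The following is a minimal sketch using only the Python standard library. The `is_port_open` helper and the `localhost` host name are assumptions for illustration and are not part of the Weaviate client. Note that a successful TCP connection only shows that the port is open, not that a gRPC service is answering on it.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


def main() -> None:
    # "localhost" is an assumption; use the host where Weaviate is running.
    for name, port in (("REST", 8080), ("gRPC", 50051)):
        state = "open" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} port {port}: {state}")
```

Call `main()` once your Weaviate instance is running to check both default ports.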

Instantiate a client

The v4 client is instantiated by the WeaviateClient object. The WeaviateClient object is the main entry point for all API operations.

You can instantiate the WeaviateClient object directly. However, in most cases it is easier to use a connection helper function such as connect_to_local or connect_to_weaviate_cloud.

import weaviate
from weaviate.classes.init import Auth
import os

# Best practice: store your credentials in environment variables
wcd_url = os.environ["WCD_DEMO_URL"]
wcd_api_key = os.environ["WCD_DEMO_RO_KEY"]
openai_api_key = os.environ["OPENAI_APIKEY"]

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=wcd_url,  # Replace with your Weaviate Cloud URL
    auth_credentials=Auth.api_key(wcd_api_key),  # Replace with your Weaviate Cloud key
    headers={'X-OpenAI-Api-key': openai_api_key}  # Replace with your OpenAI API key
)

To configure connection timeout values, see Timeout values.

The v3 API style Client object is still available, and will be deprecated in the future.

Major changes

The v4 client API is very different from the v3 API. Major user-facing changes in the v4 client include:

  • Extensive use of helper classes
  • Interaction with collections
  • Removal of builder patterns

Helper classes

The v4 client makes extensive use of helper classes. These classes provide strong typing and thus static type checking. They also make coding easier through your IDE's auto-completion feature.

When you are coding, check the auto-complete frequently. It provides useful guidance for API changes and client options.

import weaviate
import weaviate.classes.config as wvcc

client = weaviate.connect_to_local()

try:
    # Note that you can use `client.collections.create_from_dict()` to create a collection from a v3-client-style JSON object
    collection = client.collections.create(
        name="TestArticle",
        vectorizer_config=wvcc.Configure.Vectorizer.text2vec_cohere(),
        generative_config=wvcc.Configure.Generative.cohere(),
        properties=[
            wvcc.Property(
                name="title",
                data_type=wvcc.DataType.TEXT
            )
        ]
    )

finally:
    client.close()

The wvc namespace exposes commonly used classes in the v4 API. The namespace is divided further into submodules based on their primary purpose.

import weaviate.classes as wvc

Interact with collections

When you connect to a Weaviate database, the v4 API returns a WeaviateClient object, while the v3 API returns a Client object.

The v3 API's interactions were built around the client object (an instance of Client). This includes server interactions for CRUD and search operations.

In the v4 API, the main starting points for your interaction with Weaviate follow a different paradigm.

Server-level interactions such as checking readiness (client.is_ready()) or getting node statuses (client.cluster.nodes()) still remain with client (now an instance of WeaviateClient).
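As a rough sketch of these server-level calls, the hypothetical helper below gathers readiness and node count from a connected client. The `server_summary` name is an illustration only, and the sketch assumes that `client.cluster.nodes()` returns a list-like result whose length is the number of nodes.

```python
def server_summary(client) -> dict:
    """Collect server-level status from a connected v4 client.

    `client` is expected to be a weaviate.WeaviateClient instance.
    """
    return {
        "ready": client.is_ready(),
        # Assumption: nodes() returns one entry per node
        "node_count": len(client.cluster.nodes()),
    }


def main() -> None:
    import weaviate  # needs weaviate-client v4 and a running local instance

    client = weaviate.connect_to_local()
    try:
        print(server_summary(client))
    finally:
        client.close()
```

Both calls are made on the client itself; no collection object is involved.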

CRUD and search operations are now performed against a Collection object to reflect that these operations target a particular collection.

The example below shows a function with a Collection type hint.

from weaviate.collections import Collection

my_collection = client.collections.get(collection_name)

def work_with_collection(collection: Collection):
    # Do something with the collection, e.g.:
    r = collection.query.near_text(query="financial report summary")
    return r

response = work_with_collection(my_collection)

The collection object includes its name as an attribute. Accordingly, operations such as a near_text query can be performed without specifying the collection name. The v4 collection object has a more focussed namespace in comparison to the breadth of operations available with the v3 client object. This simplifies your code and reduces the potential for errors.

jeopardy = client.collections.get("JeopardyQuestion")

data_object = jeopardy.query.fetch_object_by_id("00ff6900-e64f-5d94-90db-c8cfa3fc851b")

print(data_object.properties)
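Because each collection object carries its own name, the same code can run against several collections without passing names around. The sketch below is illustrative: `search_collections` is a hypothetical helper, and it assumes each collection has a vectorizer configured so that `near_text` works.

```python
def search_collections(collections, query: str, limit: int = 2) -> dict:
    """Run the same near_text query on several collections.

    Returns a dict that maps each collection's name to the properties
    of the objects found in that collection.
    """
    results = {}
    for collection in collections:
        response = collection.query.near_text(query=query, limit=limit)
        results[collection.name] = [o.properties for o in response.objects]
    return results


def main() -> None:
    import weaviate  # needs weaviate-client v4 and a running local instance

    client = weaviate.connect_to_local()
    try:
        jeopardy = client.collections.get("JeopardyQuestion")
        print(search_collections([jeopardy], "animals in movies"))
    finally:
        client.close()
```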

Terminology changes (e.g. class -> collection)

Some of the terms within the Weaviate ecosystem are changing, and the client has changed accordingly:

  • A Weaviate "Class" is now called a "Collection". A collection stores a set of data objects together with their vector embeddings.
  • A "Schema" is now called a "Collection Configuration", a set of settings that define collection name, vectorizers, index configurations, property definitions, and so on.

Due to the architectural changes as well as changes to the terminology, most of the API has been changed. Expect to find differences in the way you interact with Weaviate.

For example, client.collections.list_all() is the replacement for client.schema.get().
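A small sketch of that replacement follows. It assumes that `list_all()` returns a mapping keyed by collection name, and `collection_names` is a hypothetical helper rather than a client method.

```python
def collection_names(client) -> list:
    """Return the names of all collections, sorted alphabetically.

    v3 equivalent, for comparison:
        [c["class"] for c in client.schema.get()["classes"]]
    """
    # Assumption: list_all() returns a dict keyed by collection name,
    # so iterating over it yields the names.
    return sorted(client.collections.list_all())


def main() -> None:
    import weaviate  # needs weaviate-client v4 and a running local instance

    client = weaviate.connect_to_local()
    try:
        print(collection_names(client))
    finally:
        client.close()
```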

Manage data has more details and additional sample code for working with data, such as working with collections. See searches for further details on various queries and filters.

Collection creation from JSON

You can still create a collection from a JSON definition. This may be a useful way to migrate your existing data, for example. You could fetch an existing definition and then use it to create a new collection.

import weaviate

client = weaviate.connect_to_local()

try:
    collection_definition = {
        "class": "TestArticle",
        "properties": [
            {
                "name": "title",
                "dataType": ["text"],
            },
            {
                "name": "body",
                "dataType": ["text"],
            },
        ],
    }

    client.collections.create_from_dict(collection_definition)

finally:
    client.close()
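If you saved the output of the v3 `client.schema.get()` call, one possible approach is to split it into per-collection definitions and pass each one to `create_from_dict`. The sketch below assumes the v3 schema shape `{"classes": [...]}`. The `definitions_from_v3_schema` helper, the sample schema, and the new collection name are hypothetical. Note that this recreates definitions only and does not copy data objects.

```python
import copy
from typing import Optional


def definitions_from_v3_schema(schema: dict, renames: Optional[dict] = None) -> list:
    """Split a v3-style schema into per-collection definitions.

    `renames` optionally maps an existing class name to a new name.
    """
    renames = renames or {}
    definitions = []
    for class_definition in schema.get("classes", []):
        definition = copy.deepcopy(class_definition)  # leave the input untouched
        definition["class"] = renames.get(definition["class"], definition["class"])
        definitions.append(definition)
    return definitions


def main() -> None:
    import weaviate  # needs weaviate-client v4 and a running local instance

    # Hypothetical schema, shaped like the output of the v3 client.schema.get()
    v3_schema = {
        "classes": [
            {"class": "TestArticle", "properties": [{"name": "title", "dataType": ["text"]}]}
        ]
    }

    client = weaviate.connect_to_local()
    try:
        for definition in definitions_from_v3_schema(v3_schema, {"TestArticle": "TestArticleV4"}):
            client.collections.create_from_dict(definition)
    finally:
        client.close()
```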

Removal of builder patterns

The builder patterns for constructing queries have been removed. Builder patterns could be confusing, and led to runtime errors that could not be picked up with static analysis.

Instead, construct queries in the v4 API using specific methods and their parameters.

from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.near_text(
query="animals in movies",
limit=2,
return_metadata=MetadataQuery(distance=True)
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

Additionally, many arguments are now constructed using helper classes (e.g. MetadataQuery or Filter), which makes them easier to use and reduces errors through IDE assistance and static analysis.
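As an illustration of a helper-class argument, the sketch below passes a `Filter` object to a `near_text` query through the `filters` parameter. The `filtered_near_text` wrapper is hypothetical, and the `round` property and its value are assumptions about the contents of the example collection.

```python
def filtered_near_text(collection, query: str, filters, limit: int = 2) -> list:
    """Run a near_text query restricted by a filter and return object properties."""
    response = collection.query.near_text(query=query, limit=limit, filters=filters)
    return [o.properties for o in response.objects]


def main() -> None:
    import weaviate  # needs weaviate-client v4 and a running local instance
    from weaviate.classes.query import Filter

    client = weaviate.connect_to_local()
    try:
        jeopardy = client.collections.get("JeopardyQuestion")
        # Assumption: the collection has a "round" text property
        only_double = Filter.by_property("round").equal("Double Jeopardy!")
        for properties in filtered_near_text(jeopardy, "animals in movies", only_double):
            print(properties)
    finally:
        client.close()
```

Because the filter is an ordinary object, it can be built once and reused across queries.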

How to migrate your code

The migration will likely involve significant changes to your codebase. Review the Python client library documentation to get started, including instantiation details and various submodules.

Then, take a look at the how-to guides for Managing data and Queries.

In particular, check out the pages for:

Questions and feedback

If you have any questions or feedback, let us know in the user forum.