Migrate from v3 to v4
The current Python client version is v4.8.1
The v4 Weaviate Python client API is a complete rewrite, aimed at an improved overall user experience. It is therefore also very different from the v3 API, and will require re-learning of changed patterns in the way you interact with Weaviate.
While this may introduce some overhead, we believe the v4 API is a significant improvement to your developer experience. For instance, using the v4 client will allow you to take full advantage of faster speeds through the gRPC API, and of additional static analysis for IDE assistance through strong typing.
Due to the extensive API surface changes, this guide does not cover every change. Instead, this guide is designed to help you understand the major changes and how to migrate your code at a high level.
For code examples, refer to the documentation throughout the site, starting with these suggested sections.
Installation
To go from v3 to v4, you must:
- Upgrade the client library:
  pip install -U weaviate-client
- Upgrade Weaviate to a compatible version.
  - Weaviate 1.23.7 is required for v4.4.1. Generally, we recommend you use the latest versions of Weaviate and the client.
- Make sure a port for gRPC is open to Weaviate.
  - The default port is 50051.
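The two prerequisites above (a compatible Weaviate version and an open gRPC port) can be checked before you start migrating. The following is a minimal sketch using only the Python standard library. The helper names is_version_at_least and is_port_open are hypothetical and not part of the Weaviate client, and the version parser assumes plain numeric versions such as 1.23.7 (no pre-release suffixes).

```python
import socket


def version_tuple(version: str) -> tuple:
    """Turn a plain 'major.minor.patch' string into a tuple of ints.

    Assumption: the version has no pre-release suffix (e.g. '-rc.1').
    """
    return tuple(int(part) for part in version.lstrip("v").split("."))


def is_version_at_least(found: str, required: str) -> bool:
    """Return True if `found` is the same as or newer than `required`."""
    return version_tuple(found) >= version_tuple(required)


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Example usage (hypothetical values for a local setup):
#   is_version_at_least("1.24.1", "1.23.7")
#   is_port_open("localhost", 50051)
```

A successful TCP connection only shows that something is listening on the port. It does not confirm that the listener is the Weaviate gRPC service.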
docker-compose.yml example
If you are running Weaviate with Docker, you can map the default port (50051) by adding the following to your docker-compose.yml file:
    ports:
      - 8080:8080
      - 50051:50051
Instantiate a client
The v4 client is represented by the WeaviateClient object, which is the main entry point for all API operations.
You can instantiate the WeaviateClient object directly. However, in most cases it is easier to use a connection helper function such as connect_to_local or connect_to_weaviate_cloud.
WCD:
import weaviate
from weaviate.classes.init import Auth
import os

# Best practice: store your credentials in environment variables
wcd_url = os.environ["WCD_DEMO_URL"]
wcd_api_key = os.environ["WCD_DEMO_RO_KEY"]
openai_api_key = os.environ["OPENAI_APIKEY"]

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=wcd_url,  # Replace with your Weaviate Cloud URL
    auth_credentials=Auth.api_key(wcd_api_key),  # Replace with your Weaviate Cloud key
    headers={'X-OpenAI-Api-key': openai_api_key}  # Replace with your OpenAI API key
)
To configure connection timeout values, see Timeout values.

Local:
import weaviate

client = weaviate.connect_to_local()  # Connect with default parameters

Embedded:
import weaviate

client = weaviate.connect_to_embedded()  # Connect with default parameters

Custom:
import os
import weaviate

client = weaviate.connect_to_custom(
    http_host="localhost",
    http_port=8080,
    http_secure=False,
    grpc_host="localhost",
    grpc_port=50051,
    grpc_secure=False,
    headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")  # Or any other inference API keys
    }
)
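If your host and port settings differ between environments, it may help to assemble the connect_to_custom arguments in one place. The sketch below builds the keyword arguments from environment variables. The helper name custom_connection_kwargs and the variable names WEAVIATE_HOST, WEAVIATE_HTTP_PORT, WEAVIATE_GRPC_HOST, WEAVIATE_GRPC_PORT, and WEAVIATE_SECURE are assumptions made for illustration. Only the keyword names and OPENAI_APIKEY come from the examples above.

```python
import os
from typing import Mapping, Optional


def custom_connection_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for weaviate.connect_to_custom().

    The environment variable names are hypothetical; adjust them to your setup.
    """
    env = os.environ if env is None else env
    host = env.get("WEAVIATE_HOST", "localhost")
    secure = env.get("WEAVIATE_SECURE", "false").lower() == "true"
    kwargs = {
        "http_host": host,
        "http_port": int(env.get("WEAVIATE_HTTP_PORT", "8080")),
        "http_secure": secure,
        # Fall back to the HTTP host when no separate gRPC host is given.
        "grpc_host": env.get("WEAVIATE_GRPC_HOST", host),
        "grpc_port": int(env.get("WEAVIATE_GRPC_PORT", "50051")),
        "grpc_secure": secure,
    }
    api_key = env.get("OPENAI_APIKEY")
    if api_key:
        # Only send the inference API key header when a key is available.
        kwargs["headers"] = {"X-OpenAI-Api-Key": api_key}
    return kwargs


# Example usage (requires the weaviate-client package and a running instance):
#   import weaviate
#   client = weaviate.connect_to_custom(**custom_connection_kwargs())
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.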
The v3 API style Client object is still available, but it will be deprecated in the future.
Major changes
The v4 client API is very different from the v3 API. Major user-facing changes in the v4 client include:
- Extensive use of helper classes
- Interaction with collections
- Removal of builder patterns
Helper classes
The v4 client makes extensive use of helper classes. These classes provide strong typing and thus static type checking. They also make coding easier through your IDE's auto-completion feature.
When you are coding, check the auto-complete frequently. It provides useful guidance for API changes and client options.
Create a collection:
import weaviate
import weaviate.classes.config as wvcc

client = weaviate.connect_to_local()
try:
    # Note that you can use `client.collections.create_from_dict()` to create a collection from a v3-client-style JSON object
    collection = client.collections.create(
        name="TestArticle",
        vectorizer_config=wvcc.Configure.Vectorizer.text2vec_cohere(),
        generative_config=wvcc.Configure.Generative.cohere(),
        properties=[
            wvcc.Property(
                name="title",
                data_type=wvcc.DataType.TEXT
            )
        ]
    )
finally:
    client.close()

NearText query:
import weaviate
import weaviate.classes as wvc
from weaviate.classes.query import Move

client = weaviate.connect_to_local()
try:
    publications = client.collections.get("Publication")
    response = publications.query.near_text(
        query="fashion",
        distance=0.6,
        move_to=Move(force=0.85, concepts="haute couture"),
        move_away=Move(force=0.45, concepts="finance"),
        return_metadata=wvc.query.MetadataQuery(distance=True),
        limit=2
    )
    for o in response.objects:
        print(o.properties)
        print(o.metadata)
finally:
    client.close()
The wvc namespace exposes commonly used classes in the v4 API. The namespace is divided further into submodules based on their primary purpose.
import weaviate.classes as wvc
Interact with collections
When you connect to a Weaviate database, the v4 API returns a WeaviateClient object, while the v3 API returns a Client object.
The v3 API's interactions were built around the client object (an instance of Client). This includes server interactions for CRUD and search operations.
In the v4 API, the main starting points for your interaction with Weaviate follow a different paradigm.
Server-level interactions such as checking readiness (client.is_ready()) or getting node statuses (client.cluster.nodes()) still remain with client (now an instance of WeaviateClient).
CRUD and search operations are now performed against a Collection object to reflect that these operations target a particular collection.
The example below shows a function with a Collection type hint.
from weaviate.collections import Collection

# Assumes `client` is a connected WeaviateClient and `collection_name` is a string
my_collection = client.collections.get(collection_name)

def work_with_collection(collection: Collection):
    # Do something with the collection, e.g.:
    r = collection.query.near_text(query="financial report summary")
    return r

response = work_with_collection(my_collection)
The collection object includes its name as an attribute. Accordingly, operations such as a near_text query can be performed without specifying the collection name. The v4 collection object has a more focussed namespace in comparison to the breadth of operations available with the v3 client object. This simplifies your code and reduces the potential for errors.
Python Client v4:
jeopardy = client.collections.get("JeopardyQuestion")
data_object = jeopardy.query.fetch_object_by_id("00ff6900-e64f-5d94-90db-c8cfa3fc851b")
print(data_object.properties)

Python Client v3:
import json

data_object = client.data_object.get_by_id(
    "00ff6900-e64f-5d94-90db-c8cfa3fc851b",
    class_name="JeopardyQuestion",
)
print(json.dumps(data_object, indent=2))
Terminology changes (e.g. class -> collection)
Some of the terms within the Weaviate ecosystem are changing, and the client has changed accordingly:
- A Weaviate "Class" is now called a "Collection". A collection stores a set of data objects together with their vector embeddings.
- A "Schema" is now called a "Collection Configuration", a set of settings that define collection name, vectorizers, index configurations, property definitions, and so on.
Due to the architectural changes as well as changes to the terminology, most of the API has been changed. Expect to find differences in the way you interact with Weaviate.
For example, client.collections.list_all() is the replacement for client.schema.get().
Manage data has more details and additional sample code for working with data, such as working with collections. See searches for further details on various queries and filters.
Collection creation from JSON
You can still create a collection from a JSON definition. This may be a useful way to migrate your existing data, for example. You could fetch an existing definition and then use it to create a new collection.
import weaviate

client = weaviate.connect_to_local()
try:
    collection_definition = {
        "class": "TestArticle",
        "properties": [
            {
                "name": "title",
                "dataType": ["text"],
            },
            {
                "name": "body",
                "dataType": ["text"],
            },
        ],
    }
    client.collections.create_from_dict(collection_definition)
finally:
    client.close()
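To illustrate the idea of fetching an existing definition and reusing it, here is a minimal sketch that walks a v3-style schema dictionary and hands each class definition to a creation callable. It assumes the schema has the shape returned by the v3 client.schema.get() call, that is, a dictionary with a "classes" list. The helper name recreate_collections is hypothetical, and the creation function is passed in as a parameter so the logic can be tried without a running Weaviate instance.

```python
from typing import Any, Callable, Dict, List


def recreate_collections(
    schema: Dict[str, Any],
    create_from_dict: Callable[[Dict[str, Any]], Any],
) -> List[str]:
    """Create one collection per class definition in a v3-style schema.

    Returns the names of the definitions that were passed on, in order.
    """
    created = []
    for definition in schema.get("classes", []):
        # Each entry is a JSON-style definition with a "class" key.
        create_from_dict(definition)
        created.append(definition["class"])
    return created


# Example usage (assumes a saved v3 schema and a connected v4 client):
#   old_schema = json.load(open("schema.json"))
#   recreate_collections(old_schema, client.collections.create_from_dict)
```

This sketch copies only the definitions. The data objects themselves still need to be exported and imported separately.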
Removal of builder patterns
The builder patterns for constructing queries have been removed. Builder patterns could be confusing, and led to runtime errors that could not be picked up with static analysis.
Instead, construct queries in the v4 API using specific methods and their parameters.
Python Client v4:
from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.near_text(
    query="animals in movies",
    limit=2,
    return_metadata=MetadataQuery(distance=True)
)
for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

Python Client v3:
import json

response = (
    client.query
    .get("JeopardyQuestion", ["question", "answer"])
    .with_near_text({
        "concepts": ["animals in movies"]
    })
    .with_limit(2)
    .with_additional(["distance"])
    .do()
)
print(json.dumps(response, indent=2))
Additionally, many arguments are now constructed using helper classes (e.g. MetadataQuery or Filter), which makes them easier to use and reduces errors through IDE assistance and static analysis.
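When porting many queries, the mapping from a v3 with_near_text dictionary to v4 keyword arguments can be written down once. The sketch below is an illustration only. The helper name near_text_kwargs is hypothetical, it handles just the concepts, distance, and certainty keys plus a limit, and it assumes that the v4 query argument accepts either a single string or a list of strings. Options such as moveTo are not translated and would need Move objects.

```python
from typing import Any, Dict, Optional


def near_text_kwargs(
    v3_near_text: Dict[str, Any],
    limit: Optional[int] = None,
) -> Dict[str, Any]:
    """Translate a v3 with_near_text() dict into v4 near_text() keyword arguments."""
    concepts = v3_near_text["concepts"]
    # A single concept becomes a plain string; several stay a list.
    kwargs: Dict[str, Any] = {
        "query": concepts[0] if len(concepts) == 1 else list(concepts)
    }
    for key in ("distance", "certainty"):
        if key in v3_near_text:
            kwargs[key] = v3_near_text[key]
    if limit is not None:
        kwargs["limit"] = limit
    return kwargs


# Example usage (assumes `jeopardy` is a v4 collection object):
#   jeopardy.query.near_text(**near_text_kwargs({"concepts": ["animals in movies"]}, limit=2))
```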
How to migrate your code
The migration will likely involve significant changes to your codebase. Review the Python client library documentation to get started, including instantiation details and various submodules.
Then, take a look at the how-to guides for Managing data and Queries.
In particular, check out the pages for:
- Client instantiation
- Manage collections
- Batch import
- Cross-reference
- Basic search
- Similarity search
- Filters
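As a starting point for the review, a small script can flag lines in your codebase that still use v3 patterns mentioned in this guide. This is a rough sketch, not an official migration tool. The names V3_HINTS and find_v3_usages are hypothetical, the pattern list covers only calls shown on this page plus the v3 weaviate.Client constructor, and matching is done line by line with regular expressions, so it will miss calls split across lines in unusual ways.

```python
import re
from typing import List, Tuple

# (regular expression for a v3 pattern, suggested v4 direction)
V3_HINTS: List[Tuple[str, str]] = [
    (r"weaviate\.Client\(", "use a connection helper such as weaviate.connect_to_local()"),
    (r"\.schema\.get\(", "client.collections.list_all()"),
    (r"\.data_object\.get_by_id\(", "collection.query.fetch_object_by_id(...)"),
    (r"\.query\.get\(", "a query method on the collection, e.g. collection.query.near_text(...)"),
    (r"\.with_near_text\(", "collection.query.near_text(query=...)"),
    (r"\.with_limit\(", "the limit= argument of the query method"),
    (r"\.with_additional\(", "return_metadata=MetadataQuery(...)"),
]


def find_v3_usages(source: str) -> List[Tuple[int, str]]:
    """Return (line number, hint) pairs for lines that match a v3 pattern."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for pattern, hint in V3_HINTS:
            if re.search(pattern, line):
                findings.append((line_number, hint))
    return findings


# Example usage:
#   with open("my_script.py") as f:
#       for line_number, hint in find_v3_usages(f.read()):
#           print(f"line {line_number}: consider {hint}")
```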
Questions and feedback
If you have any questions or feedback, let us know in the user forum.