Quickstart

Expected time: 30 minutes

What you will learn

This quickstart shows you how to combine Weaviate Cloud and the Weaviate Embeddings service to:

  1. Set up a Weaviate Cloud instance. (10 minutes)
  2. Add and vectorize your data using Weaviate Embeddings. (10 minutes)
  3. Perform a semantic (vector) search and hybrid search. (10 minutes)

Notes:

  • The code examples here are self-contained. You can copy and paste them into your own environment to try them out.

Requirements

To use Weaviate Embeddings, you will need:

  • A Weaviate Cloud Sandbox running at least Weaviate 1.28.5
  • A Weaviate client library that supports Weaviate Embeddings:
    • Python client version 4.9.5 or higher
    • JavaScript/TypeScript client version 3.2.5 or higher
    • Go and Java clients: not yet officially supported. With these clients, pass the X-Weaviate-Api-Key and X-Weaviate-Cluster-Url headers manually when you instantiate the client.
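The minimum Python client version listed above can be checked before running the rest of the quickstart. The following is a minimal sketch using only the standard library; the helper names (parse_version, meets_minimum, installed_client_version) are hypothetical and not part of the Weaviate client, and pre-release suffixes such as "rc1" are simply ignored.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum Python client version from the requirements list above.
MIN_PYTHON_CLIENT = (4, 9, 5)


def parse_version(text):
    """Turn '4.9.5' into (4, 9, 5). Non-digit suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '4.10' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_PYTHON_CLIENT):
    """Return True if the version string is at least the minimum tuple."""
    return parse_version(installed) >= minimum


def installed_client_version():
    """Return the installed weaviate-client version string, or None if it is absent."""
    try:
        return version("weaviate-client")
    except PackageNotFoundError:
        return None
```

Calling meets_minimum(installed_client_version()) after confirming the result is not None gives a quick yes/no answer; if it is False, the pip command in step 1.2 upgrades the client.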

Step 1: Set up Weaviate

1.1 Create a new cluster

To create a free Sandbox cluster in Weaviate Cloud, follow these instructions.

TIP: Use the latest Weaviate version!

When possible, try to use the latest Weaviate version. New releases include cutting-edge features, performance enhancements, and critical security updates to keep your application safe and up-to-date.

1.2 Install a client library

We recommend using a client library to work with Weaviate. Follow the instructions below to install one of the official client libraries, available in Python, JavaScript/TypeScript, Go, and Java.

Install the latest Python client (v4) by adding weaviate-client to your Python environment with pip:

pip install -U weaviate-client

1.3 Connect to Weaviate Cloud

Weaviate Embeddings is integrated with Weaviate Cloud. Your Weaviate Cloud credentials authorize your instance's access to Weaviate Embeddings, so no separate API key is needed.

import weaviate
from weaviate.classes.init import Auth
import os

# Best practice: store your credentials in environment variables
wcd_url = os.getenv("WEAVIATE_URL")
wcd_key = os.getenv("WEAVIATE_API_KEY")

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=wcd_url,  # Weaviate URL: "REST Endpoint" in Weaviate Cloud console
    auth_credentials=Auth.api_key(wcd_key),  # Weaviate API key: "ADMIN" API key in Weaviate Cloud console
)

print(client.is_ready()) # Should print: `True`

# Work with Weaviate

client.close()
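Because os.getenv returns None for an unset variable, a missing credential only surfaces later as a connection error. The sketch below fails early with a clearer message. It assumes the same WEAVIATE_URL and WEAVIATE_API_KEY variable names used above; the function name load_weaviate_credentials is hypothetical.

```python
import os


def load_weaviate_credentials(env=None):
    """Return (cluster_url, api_key), or raise if either variable is unset or empty."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    missing = [
        name for name in ("WEAVIATE_URL", "WEAVIATE_API_KEY") if not env.get(name)
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["WEAVIATE_URL"], env["WEAVIATE_API_KEY"]
```

The returned pair can be passed to cluster_url and Auth.api_key(...) in the connection call above.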

Step 2: Populate the database

2.1 Define a collection

Now we can define a collection that will store our data. When creating a collection, you need to specify one of the available models for the vectorizer to use. This model will be used to create vector embeddings from your data.

from weaviate.classes.config import Configure

client.collections.create(
    "DemoCollection",
    vectorizer_config=[
        Configure.NamedVectors.text2vec_weaviate(
            name="title_vector",
            source_properties=["title"],
            model="Snowflake/snowflake-arctic-embed-l-v2.0",
            # Further options
            # dimensions=256,
            # base_url="<custom_weaviate_embeddings_url>",
        )
    ],
    # Additional parameters not shown
)

For more information about the available model options visit the Choose a model page.
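If you run the quickstart more than once, the create call is attempted against a collection that already exists. One way to make the step repeatable is to check first. This is a sketch, not the documented workflow: it relies on the client's collections.exists and collections.get methods, and both get_or_create_collection and its create argument are hypothetical names introduced for illustration.

```python
def get_or_create_collection(client, name, create):
    """Return the existing collection, or call create(client) to make it.

    `create` is a hypothetical callable that wraps the
    client.collections.create(...) call shown above and returns its result.
    """
    if client.collections.exists(name):
        # Reuse the collection instead of trying to create it again.
        return client.collections.get(name)
    return create(client)
```

Note that an existing collection keeps its original vectorizer settings; to change the model, the collection would need to be deleted and recreated.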

2.2 Import objects

After configuring the vectorizer, import data into Weaviate. Weaviate generates embeddings for text objects using the specified model.

source_objects = [
    {"title": "The Shawshank Redemption", "description": "A wrongfully imprisoned man forms an inspiring friendship while finding hope and redemption in the darkest of places."},
    {"title": "The Godfather", "description": "A powerful mafia family struggles to balance loyalty, power, and betrayal in this iconic crime saga."},
    {"title": "The Dark Knight", "description": "Batman faces his greatest challenge as he battles the chaos unleashed by the Joker in Gotham City."},
    {"title": "Jingle All the Way", "description": "A desperate father goes to hilarious lengths to secure the season's hottest toy for his son on Christmas Eve."},
    {"title": "A Christmas Carol", "description": "A miserly old man is transformed after being visited by three ghosts on Christmas Eve in this timeless tale of redemption."},
]

collection = client.collections.get("DemoCollection")

with collection.batch.dynamic() as batch:
    for src_obj in source_objects:
        # The model provider integration will automatically vectorize the object
        batch.add_object(
            properties={
                "title": src_obj["title"],
                "description": src_obj["description"],
            },
            # vector=vector  # Optionally provide a pre-obtained vector
        )
        if batch.number_errors > 10:
            print("Batch import stopped due to excessive errors.")
            break

failed_objects = collection.batch.failed_objects
if failed_objects:
    print(f"Number of failed imports: {len(failed_objects)}")
    print(f"First failed object: {failed_objects[0]}")

Step 3: Query your data

Once the vectorizer is configured, Weaviate will perform vector search operations using the specified model.

When you perform a vector search, Weaviate converts the text query into an embedding using the specified model and returns the most similar objects from the database.
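To make "most similar" concrete, the sketch below ranks a few titles by cosine similarity against a query vector. The three-dimensional vectors are invented for illustration and are not real embeddings; cosine similarity is an assumption about the distance metric, and Weaviate uses a vector index rather than the brute-force comparison shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector, stored, limit):
    """Return the `limit` keys of `stored` whose vectors are most similar to the query."""
    ranked = sorted(
        stored,
        key=lambda title: cosine_similarity(query_vector, stored[title]),
        reverse=True,
    )
    return ranked[:limit]


# Made-up vectors standing in for the embeddings of three titles.
toy_vectors = {
    "The Godfather": [0.9, 0.1, 0.0],
    "Jingle All the Way": [0.1, 0.9, 0.2],
    "A Christmas Carol": [0.0, 0.8, 0.5],
}
# Made-up vector standing in for the embedding of "A holiday film".
toy_query = [0.05, 0.9, 0.3]

print(nearest(toy_query, toy_vectors, limit=2))
```

With these toy values the two holiday titles rank ahead of "The Godfather", which mirrors what the real query is expected to do with model-generated embeddings.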

The query below returns the n most similar objects from the database, where n is set by limit.

collection = client.collections.get("DemoCollection")

response = collection.query.near_text(
    query="A holiday film",  # The model provider integration will automatically vectorize the query
    limit=2
)

for obj in response.objects:
    print(obj.properties["title"])
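The introduction also mentions hybrid search, which combines a keyword search with the vector search. A minimal sketch follows, assuming the collection object obtained above and the Python client's query.hybrid method. The helper name hybrid_titles and the alpha value of 0.5 are illustrative choices, and the sketch assumes that, as with the near_text call, no target vector needs to be named when the collection has a single named vector.

```python
def hybrid_titles(collection, query, limit=2, alpha=0.5):
    """Run a hybrid query and return the titles of the matching objects.

    `collection` is assumed to be the object returned by
    client.collections.get("DemoCollection") in the steps above.
    """
    response = collection.query.hybrid(
        query=query,  # used for both the keyword and the vector part of the search
        alpha=alpha,  # 0 = keyword search only, 1 = vector search only
        limit=limit,
    )
    return [obj.properties["title"] for obj in response.objects]
```

For example, hybrid_titles(collection, "A holiday film") would return up to two titles, ranked by a blend of keyword and vector scores.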

Support

For help with Serverless Cloud, Enterprise Cloud, and Bring Your Own Cloud accounts, contact Weaviate support directly to open a support ticket.

For questions and support from the Weaviate community, try these resources:

To add a support plan, contact Weaviate sales.