
Embedded Weaviate

Experimental

Embedded Weaviate is experimental software. APIs and parameters may change.

Embedded Weaviate is a new deployment model that runs a Weaviate instance from your application code rather than from a stand-alone Weaviate server installation.

When Embedded Weaviate starts for the first time, it creates a permanent datastore in the location set in your persistence_data_path. When your client exits, the Embedded Weaviate instance also exits, but the data persists. The next time the client runs, it starts a new instance of Embedded Weaviate, and that instance uses the data saved in the datastore.

Start an Embedded Weaviate instance

There are two ways to instantiate a Python client:

  • connect_to_embedded(): Use this method if you don't need to pass custom connection information.
  • WeaviateClient(): Use this method if you need to pass custom connection information.

Connect with connect_to_embedded().

import weaviate
import os

client = weaviate.connect_to_embedded(
    version="1.23.10",  # Replace with the Weaviate server version you want to run
    headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")  # Replace with your API key
    },
)

# Add your client code here.

Connect with WeaviateClient().

import weaviate
from weaviate.embedded import EmbeddedOptions
import os

client = weaviate.WeaviateClient(
    embedded_options=EmbeddedOptions(
        additional_env_vars={
            "ENABLE_MODULES": "backup-filesystem,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai",
            "BACKUP_FILESYSTEM_PATH": "/tmp/backups"
        }
    )
    # Add additional options here (see Python client docs for syntax)
)

client.connect()  # If instantiating `WeaviateClient` directly, you must call `connect()` to connect to the server.

# Add your client code here.

# Uncomment the next line to exit the Embedded Weaviate server.
# client.close()

To avoid connection errors with WeaviateClient(), use client.close() or a context manager to close your client connections before you exit your script or application.

When you exit the client, the Embedded Weaviate instance also exits.
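
As an illustration of the context-manager approach, here is a minimal sketch. It assumes the Python client v4 (weaviate-client) is installed; the function name check_embedded_ready is hypothetical and used only for this example.

```python
def check_embedded_ready() -> bool:
    """Start Embedded Weaviate, check readiness, and shut down cleanly."""
    # Imported inside the function so the module still loads on machines
    # where the weaviate-client package is not installed.
    import weaviate

    # Leaving the `with` block closes the client connection, even if an
    # exception is raised inside it. Closing the client also stops the
    # Embedded Weaviate instance.
    with weaviate.connect_to_embedded() as client:
        return client.is_ready()
```

Calling check_embedded_ready() starts the embedded server, so the first call may take a while if the binary has to be downloaded.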

Configuration options

To configure Embedded Weaviate, set these variables in your instantiation code or pass them as parameters when you invoke your client. You can also pass them as system environment variables. All parameters are optional.

Parameter | Type | Default | Description
additional_env_vars | string | None | Pass additional environment variables, such as API keys, to the server.
binary_path | string | varies | Binary download directory. If the binary is not present, the client downloads the binary. If XDG_CACHE_HOME is set, the default is XDG_CACHE_HOME/weaviate-embedded/. If XDG_CACHE_HOME is not set, the default is ~/.cache/weaviate-embedded.
hostname | string | 127.0.0.1 | Hostname or IP address.
persistence_data_path | string | varies | Data storage directory. If XDG_DATA_HOME is set, the default is XDG_DATA_HOME/weaviate/. If XDG_DATA_HOME is not set, the default is ~/.local/share/weaviate.
port | integer | 8079 | The Weaviate server request port.
version | string | Latest stable | Specify the version with one of the following: "latest"; the version number as a string, such as "1.19.6"; or the URL of a Weaviate binary (see below).

Do not modify XDG_CACHE_HOME or XDG_DATA_HOME

The XDG_DATA_HOME and XDG_CACHE_HOME environment variables are widely used system variables. If you modify them, you may break other applications.
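
The default-path rules above can be sketched in a few lines. This is an illustration of the documented rules, not the client's actual implementation. The function names are hypothetical. The last function shows one way to choose custom locations without touching the XDG variables, assuming connect_to_embedded() in the Python client v4 accepts persistence_data_path and binary_path as keyword arguments.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def default_binary_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the documented default binary directory for a given environment."""
    env = os.environ if env is None else env
    cache_home = env.get("XDG_CACHE_HOME")
    base = Path(cache_home) if cache_home else Path.home() / ".cache"
    return base / "weaviate-embedded"


def default_data_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the documented default data directory for a given environment."""
    env = os.environ if env is None else env
    data_home = env.get("XDG_DATA_HOME")
    base = Path(data_home) if data_home else Path.home() / ".local" / "share"
    return base / "weaviate"


def connect_with_project_paths(project_dir: str):
    """Connect with explicit paths instead of changing XDG variables."""
    import weaviate  # assumption: weaviate-client v4 is installed

    root = Path(project_dir)
    return weaviate.connect_to_embedded(
        persistence_data_path=str(root / "data"),  # hypothetical layout
        binary_path=str(root / "bin"),  # hypothetical layout
    )
```

Passing explicit paths keeps the change local to your application, so other programs that read the XDG variables are unaffected.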

Default modules

The following modules are enabled by default:

  • generative-openai
  • qna-openai
  • ref2vec-centroid
  • text2vec-cohere
  • text2vec-huggingface
  • text2vec-openai

To enable additional modules, add them to your instantiation code.

For example, to add the backup-filesystem module, instantiate your client like this:

import weaviate
from weaviate.embedded import EmbeddedOptions
import os

client = weaviate.WeaviateClient(
    embedded_options=EmbeddedOptions(
        additional_env_vars={
            "ENABLE_MODULES": "backup-filesystem,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai",
            "BACKUP_FILESYSTEM_PATH": "/tmp/backups"
        }
    )
    # Add additional options here. For syntax, see the Python client documentation.
)

client.connect()  # If instantiating `WeaviateClient` directly, you must call `connect()` to connect to the server.

# Add your client code here.

# Remember to run your client code in a context manager or call client.close()
# before exiting the client to avoid connection errors.
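
The ENABLE_MODULES value in the example repeats the default modules alongside the new one, which suggests that the variable replaces the default list rather than extending it. Under that assumption, a small helper can build the value so that no default is dropped by accident. The helper name enable_modules_value is hypothetical.

```python
# The modules listed as enabled by default in this document.
DEFAULT_MODULES = [
    "generative-openai",
    "qna-openai",
    "ref2vec-centroid",
    "text2vec-cohere",
    "text2vec-huggingface",
    "text2vec-openai",
]


def enable_modules_value(extra_modules):
    """Join the default modules and any extra modules into one comma-separated string."""
    modules = list(DEFAULT_MODULES)
    for name in extra_modules:
        if name not in modules:  # skip duplicates
            modules.append(name)
    return ",".join(modules)


# A dictionary suitable for EmbeddedOptions(additional_env_vars=...).
env_vars = {
    "ENABLE_MODULES": enable_modules_value(["backup-filesystem"]),
    "BACKUP_FILESYSTEM_PATH": "/tmp/backups",
}
```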

Binary sources

Weaviate core releases include executable Linux binaries. When you instantiate an Embedded Weaviate client, the client checks for local copies of the binary packages. If the client finds the binary files, it runs them to create a temporary Weaviate instance. If not, the client downloads the binaries and saves them in your binary_path directory.

The Embedded Weaviate instance goes away when your client exits. However, the client does not delete the binary files. The next time your client runs, it checks for the binaries and uses the saved binaries if they exist.
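
The check-then-download behavior described above follows a common caching pattern. The sketch below illustrates that pattern only; it is not the client's code. The function name, the file name argument, and the download callback are all hypothetical.

```python
from pathlib import Path
from typing import Callable


def ensure_binary(
    binary_dir: Path,
    file_name: str,
    download: Callable[[Path], None],
) -> Path:
    """Return the path to a cached binary, downloading it only if it is missing."""
    target = binary_dir / file_name
    if target.exists():
        # A previous run already saved the binary, so reuse it.
        return target
    binary_dir.mkdir(parents=True, exist_ok=True)
    download(target)  # the caller supplies the function that writes the file
    return target
```

With this pattern, the second and later runs skip the download because the file is already in the binary directory.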

File list

For a list of the files that are included in a release, see the Assets section of the Release Notes page for that release on GitHub.

File URL

To get the URL for a particular binary archive file, follow these steps:

  1. Find the Weaviate core release you want on the Release Notes page.
  2. Open the release notes for that version. The Assets section includes linux-amd64 and linux-arm64 binaries in tar.gz format.
  3. Copy the full URL of the tar.gz file for your platform.

For example, the URL for the Weaviate 1.19.6 AMD64 binary is:

https://github.com/weaviate/weaviate/releases/download/v1.19.6/weaviate-v1.19.6-linux-amd64.tar.gz
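
If you need the URL in code, the pattern in the example above can be generalized. This is a sketch that assumes other releases follow the same naming pattern as the 1.19.6 example; check the Assets section of the release you want before relying on it. The function name release_binary_url is hypothetical.

```python
def release_binary_url(version: str, arch: str = "amd64") -> str:
    """Build a binary archive URL, assuming the naming pattern of the 1.19.6 example."""
    if arch not in ("amd64", "arm64"):
        raise ValueError(f"unsupported architecture: {arch}")
    # Accept either "1.19.6" or "v1.19.6".
    tag = version if version.startswith("v") else f"v{version}"
    return (
        "https://github.com/weaviate/weaviate/releases/download/"
        f"{tag}/weaviate-{tag}-linux-{arch}.tar.gz"
    )
```

The resulting string can be passed as the version parameter described in the configuration options.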

Functional overview

Weaviate core usually runs as a stand-alone server that clients connect to in order to access data. An Embedded Weaviate instance is a process that runs in conjunction with a client script or application. Embedded Weaviate instances can access a persistent datastore, but the instances exit when the client exits.

When your client runs, it checks for a stored Weaviate binary. If it finds one, the client uses that binary to create an Embedded Weaviate instance. If not, the client downloads the binary.

The instance also checks for an existing datastore. Because clients reuse the same datastore, updates persist between client invocations.

When you exit the client script or application, the Embedded Weaviate instance also exits:

  • Scripts: The Embedded Weaviate instance exits when the script exits.
  • Applications: The Embedded Weaviate instance exits when the application exits.
  • Jupyter Notebooks: The Embedded Weaviate instance exits when the Jupyter notebook is no longer active.

Embedded server output

The embedded server pipes STDOUT and STDERR to the client. To redirect STDERR in a command terminal, run your script like this:

python3 your_embedded_client_script.py 2>/dev/null
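
If you launch the client script from another Python program instead of a terminal, you can get a similar effect with the standard library. This is a generic subprocess sketch, not a Weaviate API. The function name run_without_stderr is hypothetical. Unlike the shell command above, it also captures STDOUT so that the caller can inspect it.

```python
import subprocess
import sys


def run_without_stderr(args):
    """Run the Python interpreter with the given arguments and discard STDERR."""
    completed = subprocess.run(
        [sys.executable, *args],
        stdout=subprocess.PIPE,  # capture STDOUT and return it to the caller
        stderr=subprocess.DEVNULL,  # same effect as 2>/dev/null
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

For example, run_without_stderr(["your_embedded_client_script.py"]) runs the script and returns only what it wrote to STDOUT.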

Supported Environments

Embedded Weaviate is supported on Linux and macOS.
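
An application that runs on several platforms may want to check the operating system before it starts an embedded instance. The sketch below uses only the standard library; the names SUPPORTED_SYSTEMS and embedded_supported are hypothetical.

```python
import platform
from typing import Optional

# platform.system() reports "Linux" on Linux and "Darwin" on macOS.
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}


def embedded_supported(system_name: Optional[str] = None) -> bool:
    """Return True if the given (or current) operating system is Linux or macOS."""
    name = platform.system() if system_name is None else system_name
    return name in SUPPORTED_SYSTEMS
```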

Client languages

Embedded Weaviate is supported for Python and TypeScript clients.

Python clients

The Python client v3 supports Embedded Weaviate as of v3.15.4 on Linux and v3.21.0 on macOS. The Python client v4 requires server version v1.23.7 or higher.

TypeScript clients

The embedded TypeScript client is no longer a part of the standard TypeScript client.

The embedded client has additional dependencies that are not included in the standard client. However, the embedded client extends the original TypeScript client, so after you instantiate an Embedded Weaviate instance, the embedded TypeScript client works the same way as the standard client.

To install the embedded TypeScript client, run this command:

npm install weaviate-ts-embedded

The TypeScript clients are in these GitHub repositories:

Questions and feedback

If you have any questions or feedback, please let us know on our forum.