Embedded Weaviate
Embedded Weaviate is experimental software. APIs and parameters may change.
Embedded Weaviate is a deployment model that runs a Weaviate instance from your application code rather than from a stand-alone Weaviate server installation.
When Embedded Weaviate starts for the first time, it creates a permanent datastore in the location set in your persistence_data_path. When your client exits, the Embedded Weaviate instance also exits, but the data persists.
The next time the client runs, the client starts a new instance of Embedded Weaviate. New Embedded Weaviate instances use the data that is saved in the datastore.
Start an Embedded Weaviate instance
Python Client v4:

import weaviate
import os

client = weaviate.connect_to_embedded(
    version="1.23.10",  # Replace with the Weaviate version you want to run
    headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")  # Replace with your API key
    },
)

# Add your client code here.

Python Client v3:

import weaviate
from weaviate.embedded import EmbeddedOptions

client = weaviate.Client(
    embedded_options=EmbeddedOptions()
)

data_obj = {
    "name": "Chardonnay",
    "description": "Goes with fish"
}

client.data_object.create(data_obj, "Wine")

JS/TS Client v2:

import weaviate, { EmbeddedOptions } from 'weaviate-ts-embedded';

const client = weaviate.client(new EmbeddedOptions());

await client.embedded.start();

const response = await client.data
  .creator()
  .withClassName('Wine')
  .withProperties({
    name: 'Chardonnay',
    description: 'Goes with fish',
  })
  .do();

console.log(JSON.stringify(response, null, 2));
When you exit the client, the Embedded Weaviate instance also exits.
Custom connection configuration
To pass additional configuration details to your embedded instance, use a custom connection:
Python Client v4:

import weaviate
from weaviate.embedded import EmbeddedOptions
import os

client = weaviate.WeaviateClient(
    embedded_options=EmbeddedOptions(
        additional_env_vars={
            "ENABLE_MODULES": "backup-filesystem,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai",
            "BACKUP_FILESYSTEM_PATH": "/tmp/backups"
        }
    )
    # Add additional options here (see Python client docs for syntax)
)

client.connect()  # Call `connect()` to connect to the server when you use `WeaviateClient`

# Add your client code here.

# Uncomment the next line to exit the Embedded Weaviate server.
# client.close()

Python Client v3:

import weaviate
from weaviate.embedded import EmbeddedOptions

client = weaviate.Client(
    embedded_options=EmbeddedOptions(
        additional_env_vars={
            "ENABLE_MODULES":
            "backup-s3,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai"}
    )
)

JS/TS Client v2:

import weaviate, { EmbeddedOptions } from 'weaviate-ts-embedded';

const client = weaviate.client(
  new EmbeddedOptions({
    env: {
      ENABLE_MODULES: "backup-s3,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai",
    },
  })
);
Configuration options
To configure Embedded Weaviate, set these variables in your instantiation code or pass them as parameters when you invoke your client. You can also pass them as system environment variables. All parameters are optional.
Parameter | Type | Default | Description |
---|---|---|---|
additional_env_vars | dictionary | None. | Pass additional environment variables, such as API keys, to the server as key-value pairs. |
binary_path | string | varies | Binary download directory. If the binary is not present, the client downloads the binary. If XDG_CACHE_HOME is set, the default is XDG_CACHE_HOME/weaviate-embedded/. If XDG_CACHE_HOME is not set, the default is ~/.cache/weaviate-embedded. |
hostname | string | 127.0.0.1 | Hostname or IP address |
persistence_data_path | string | varies | Data storage directory. If XDG_DATA_HOME is set, the default is XDG_DATA_HOME/weaviate/. If XDG_DATA_HOME is not set, the default is ~/.local/share/weaviate. |
port | integer | 8079 | The Weaviate server request port. |
version | string | Latest stable | Specify the version with one of the following: "latest"; the version number as a string, such as "1.19.6"; or the URL of a Weaviate binary (see File URL below). |
XDG_CACHE_HOME or XDG_DATA_HOME
The XDG_DATA_HOME and XDG_CACHE_HOME environment variables are widely used system variables. If you modify them, you may break other applications.
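The default-directory rules in the table can be written out as a small helper. This is a minimal sketch of the documented behavior, not the client's own code: the function name `default_embedded_paths` is hypothetical, and it reads the variables from a mapping you pass in, so nothing in your real environment is modified.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def default_embedded_paths(env: Optional[Mapping[str, str]] = None) -> Dict[str, Path]:
    """Return the documented default directories for Embedded Weaviate.

    Hypothetical helper that mirrors the table above. It only reads the
    mapping it is given and never changes any environment variable.
    """
    env = os.environ if env is None else env
    home = Path(env.get("HOME", str(Path.home())))

    cache_home = env.get("XDG_CACHE_HOME")
    data_home = env.get("XDG_DATA_HOME")

    # binary_path: XDG_CACHE_HOME/weaviate-embedded, else ~/.cache/weaviate-embedded
    if cache_home:
        binary_path = Path(cache_home) / "weaviate-embedded"
    else:
        binary_path = home / ".cache" / "weaviate-embedded"

    # persistence_data_path: XDG_DATA_HOME/weaviate, else ~/.local/share/weaviate
    if data_home:
        persistence_data_path = Path(data_home) / "weaviate"
    else:
        persistence_data_path = home / ".local" / "share" / "weaviate"

    return {
        "binary_path": binary_path,
        "persistence_data_path": persistence_data_path,
    }


# Show the defaults that apply to the current environment.
for name, path in default_embedded_paths().items():
    print(f"{name}: {path}")
```

To use a different location, it is usually safer to set binary_path or persistence_data_path directly than to change the XDG variables.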
Default modules
The following modules are enabled by default:
generative-openai
qna-openai
ref2vec-centroid
text2vec-cohere
text2vec-huggingface
text2vec-openai
To enable additional modules, add them to your instantiation code.
For example, to add the backup-s3 module, instantiate your client like this (the Python client v4 example enables backup-filesystem instead):
Python Client v4:

import weaviate
from weaviate.embedded import EmbeddedOptions
import os

client = weaviate.WeaviateClient(
    embedded_options=EmbeddedOptions(
        additional_env_vars={
            "ENABLE_MODULES": "backup-filesystem,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai",
            "BACKUP_FILESYSTEM_PATH": "/tmp/backups"
        }
    )
    # Add additional options here. For syntax, see the Python client documentation.
)

# Run your client code in a context manager or call client.close()
# before exiting the client to avoid connection errors.
client.connect()  # Call `connect()` to connect to the server when you use `WeaviateClient`

# Add your client code here.

Python Client v3:

import weaviate
from weaviate.embedded import EmbeddedOptions

client = weaviate.Client(
    embedded_options=EmbeddedOptions(
        additional_env_vars={
            "ENABLE_MODULES":
            "backup-s3,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai"}
    )
)

JS/TS Client v2:

import weaviate, { EmbeddedOptions } from 'weaviate-ts-embedded';

const client = weaviate.client(
  new EmbeddedOptions({
    env: {
      ENABLE_MODULES: "backup-s3,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai",
    },
  })
);
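The ENABLE_MODULES value in the examples above is a comma-separated string that repeats the default modules alongside the new one. That suggests, though this page does not say so outright, that the value replaces the default list rather than extending it. The sketch below builds such a string without typing the defaults by hand. The names `DEFAULT_MODULES` and `enable_modules_value` are hypothetical helpers, not part of any Weaviate client.

```python
from typing import Iterable, List

# Modules this page lists as enabled by default, in the order the
# examples above use. Hypothetical constant, not a client API.
DEFAULT_MODULES = (
    "text2vec-openai",
    "text2vec-cohere",
    "text2vec-huggingface",
    "ref2vec-centroid",
    "generative-openai",
    "qna-openai",
)


def enable_modules_value(extra: Iterable[str] = ()) -> str:
    """Build a comma-separated ENABLE_MODULES string without duplicates.

    Extra modules come first, followed by the documented defaults.
    """
    ordered: List[str] = []
    for name in (*extra, *DEFAULT_MODULES):
        if name not in ordered:
            ordered.append(name)
    return ",".join(ordered)


# Dictionary in the shape the Python examples pass as additional_env_vars.
additional_env_vars = {"ENABLE_MODULES": enable_modules_value(["backup-s3"])}
print(additional_env_vars["ENABLE_MODULES"])
```

The resulting dictionary has the same shape as the `additional_env_vars` argument in the Python examples above.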
Binary sources
Weaviate core releases include executable Linux binaries. When you instantiate an Embedded Weaviate client, the client checks for local copies of the binary packages. If the client finds the binary files, it runs them to create a temporary Weaviate instance. If not, the client downloads the binaries and saves them in your binary_path directory.
The Embedded Weaviate instance goes away when your client exits. However, the client does not delete the binary files. The next time your client runs, it checks for the binaries and uses the saved binaries if they exist.
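The check-then-download behavior described above follows a common caching pattern. The sketch below illustrates that pattern only; it is not the client's implementation. `ensure_binary` and `fake_download` are hypothetical names, and the download step is a stand-in that writes a placeholder file instead of fetching a real release.

```python
import tempfile
from pathlib import Path
from typing import Callable, Tuple


def ensure_binary(
    binary_dir: Path, name: str, download: Callable[[Path], None]
) -> Tuple[Path, bool]:
    """Return (path, downloaded), fetching the binary only when it is absent.

    `download` is a hypothetical callable that writes the binary to the
    given path. The real client's download code is not shown here.
    """
    target = binary_dir / name
    if target.exists():
        return target, False  # Reuse the saved binary.
    binary_dir.mkdir(parents=True, exist_ok=True)
    download(target)
    return target, True


def fake_download(path: Path) -> None:
    """Stand-in for a real download: writes a placeholder file."""
    path.write_bytes(b"placeholder")


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "weaviate-embedded"
    # First run: nothing is cached, so the download step runs.
    print(ensure_binary(cache, "weaviate-v1.19.6", fake_download)[1])  # True
    # Second run: the saved file is found and reused.
    print(ensure_binary(cache, "weaviate-v1.19.6", fake_download)[1])  # False
```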
File list
For a list of the files that are included in a release, see the Assets section of the Release Notes page for that release on GitHub.
File URL
To get the URL for a particular binary archive file, follow these steps:
- Find the Weaviate core release you want on the Release Notes page.
- Click through to the release notes for that version. The Assets section includes linux-amd64 and linux-arm64 binaries in tar.gz format.
- Copy the link to the full URL of the tar.gz file for your platform.
For example, the URL for the Weaviate 1.19.6 AMD64 binary is:
https://github.com/weaviate/weaviate/releases/download/v1.19.6/weaviate-v1.19.6-linux-amd64.tar.gz
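If you need URLs for several versions, the 1.19.6 example suggests a pattern you can fill in programmatically. The sketch below assumes other releases follow the same naming scheme as that example, which this page does not guarantee, so check the Assets section of the release before relying on a generated URL. The function name `release_binary_url` is hypothetical.

```python
def release_binary_url(version: str, arch: str = "amd64") -> str:
    """Build a release archive URL from the pattern of the 1.19.6 example.

    Assumption: other releases use the same file naming scheme.
    """
    if arch not in ("amd64", "arm64"):
        raise ValueError(f"unsupported architecture: {arch}")
    # Accept both "1.19.6" and "v1.19.6".
    if version.startswith("v"):
        version = version[1:]
    return (
        "https://github.com/weaviate/weaviate/releases/download/"
        f"v{version}/weaviate-v{version}-linux-{arch}.tar.gz"
    )


print(release_binary_url("1.19.6"))
```

You can pass the resulting string as the version parameter described in the configuration table.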
Functional overview
Weaviate core usually runs as a stand-alone server that clients connect to in order to access data. An Embedded Weaviate instance is a process that runs in conjunction with a client script or application. Embedded Weaviate instances can access a persistent datastore, but the instances exit when the client exits.
When your client runs, it checks for a stored Weaviate binary. If it finds one, the client uses that binary to create an Embedded Weaviate instance. If not, the client downloads the binary.
The instance also checks for an existing data store. Clients reuse the same data store, so updates persist between client invocations.
When you exit the client script or application, the Embedded Weaviate instance also exits:
- Scripts: The Embedded Weaviate instance exits when the script exits.
- Applications: The Embedded Weaviate instance exits when the application exits.
- Jupyter Notebooks: The Embedded Weaviate instance exits when the Jupyter notebook is no longer active.
Embedded server output
The embedded server pipes STDOUT and STDERR to the client. To redirect STDERR in a command terminal, run your script like this:
python3 your_embedded_client_script.py 2>/dev/null
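If you launch the client script from another Python program instead of a shell, the standard library `subprocess` module can discard STDERR in the same way. This is a minimal sketch: the child command is a stand-in that writes to both streams, so the example runs without Weaviate installed.

```python
import subprocess
import sys

# Stand-in child process that writes to both streams. For a real script,
# replace the command with [sys.executable, "your_embedded_client_script.py"].
child_code = "import sys; print('result'); print('server log', file=sys.stderr)"

completed = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,     # Keep STDOUT so the parent can read it.
    stderr=subprocess.DEVNULL,  # Discard STDERR, like 2>/dev/null in a shell.
    text=True,
    check=True,
)

print(completed.stdout.strip())
```

Only the STDOUT text reaches the parent process; the line written to STDERR is dropped.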
Supported environments
Embedded Weaviate is supported on Linux and macOS.
Client languages
Embedded Weaviate is supported for Python and TypeScript clients.
Python clients
Python v3 client support is new in v3.15.4 for Linux and v3.21.0 for macOS. The Python client v4 requires server version v1.23.7 or higher.
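Before you pin a server version for the Python client v4, you can compare it against the v1.23.7 minimum stated above. The sketch below is a hypothetical helper, not part of the Weaviate client. It handles plain numeric versions such as "1.23.10" and raises ValueError for anything else, including pre-release tags.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "1.23.7" or "v1.23.7" into (1, 23, 7) for numeric comparison."""
    if version.startswith("v"):
        version = version[1:]
    return tuple(int(part) for part in version.split("."))


def supports_python_v4(server_version: str, minimum: str = "1.23.7") -> bool:
    """Check a server version against the documented v4 client minimum."""
    return parse_version(server_version) >= parse_version(minimum)


print(supports_python_v4("1.23.10"))  # True: 10 is compared as a number, not text
print(supports_python_v4("1.19.6"))   # False
```

Comparing integer tuples avoids the string-comparison trap where "1.23.10" would sort before "1.23.7".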
TypeScript clients
The embedded TypeScript client is no longer a part of the standard TypeScript client.
The embedded client has additional dependencies that are not included in the standard client. However, the embedded client extends the original TypeScript client so after you instantiate an Embedded Weaviate instance, the embedded TypeScript client works the same way as the standard client.
To install the embedded TypeScript client, run this command:
npm install weaviate-ts-embedded
The TypeScript clients are in these GitHub repositories:
Questions and feedback
If you have any questions or feedback, let us know in the user forum.