Embedded Weaviate
Overview
Embedded Weaviate is a new deployment model that lets you start a Weaviate instance directly from your application code, using a Weaviate client library.
Embedded Weaviate is still in the experimental phase.
Some of the APIs and parameters may change over time as the implementation matures.
How does it work?
With every Weaviate release, we also publish executable Linux binaries (see the release assets).
The client can therefore launch the Weaviate database server from the instantiation call, which pushes the "installation" step into the background:
Python:
import weaviate
from weaviate.embedded import EmbeddedOptions

# Instantiating the client with embedded options starts the server.
client = weaviate.Client(
    embedded_options=EmbeddedOptions()
)

data_obj = {
    "name": "Chardonnay",
    "description": "Goes with fish"
}

client.data_object.create(data_obj, "Wine")
JavaScript / TypeScript:
import weaviate, { EmbeddedClient, EmbeddedOptions } from 'weaviate-ts-embedded';

const client: EmbeddedClient = weaviate.client(new EmbeddedOptions());

// Start the embedded server before sending requests.
await client.embedded.start();

const result = await client.data
  .creator()
  .withClassName('Wine')
  .withProperties({
    name: 'Chardonnay',
    description: 'Goes with fish',
  })
  .do();

client.embedded.stop();
Embedded options
The Weaviate server spawned from the client can be configured via parameters passed at instantiation time, and via environment variables. All parameters are optional.
Parameter | Type | Description | Default value |
---|---|---|---|
persistence_data_path | string | Directory where the files making up the database are stored. | When the XDG_DATA_HOME env variable is set, the default value is: XDG_DATA_HOME/weaviate/. Otherwise it is: ~/.local/share/weaviate |
binary_path | string | Directory where the binary is downloaded. If it is deleted, the client will download the binary again. | When the XDG_CACHE_HOME env variable is set, the default value is: XDG_CACHE_HOME/weaviate-embedded/. Otherwise it is: ~/.cache/weaviate-embedded |
version | string | Accepts two types of input: a version number (for example "1.18.3" or "latest"), or a full URL pointing to a Linux AMD64 or ARM64 binary. | Latest stable version |
port | integer | Port the Weaviate server will listen on. Useful when running multiple instances in parallel. | 6666 |
hostname | string | Hostname/IP to bind to. | 127.0.0.1 |
additional_env_vars | key: value | Useful to pass additional environment variables to the server, such as API keys. | |
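
As a rough illustration, the parameters in the table could be combined as shown here. The parameter names come from the table; every value (the data path, port 6789, the `OPENAI_APIKEY` variable name and its placeholder) is an assumption chosen for the example, not a recommendation. The import is deferred into the function so that the snippet can be loaded without starting a server.

```python
# Keyword arguments for EmbeddedOptions, using the parameter names from the table.
# All values below are illustrative assumptions.
EMBEDDED_KWARGS = {
    "persistence_data_path": "/tmp/weaviate-demo/data",  # hypothetical directory
    "version": "1.18.3",
    "port": 6789,  # any free port; the documented default is 6666
    "hostname": "127.0.0.1",
    "additional_env_vars": {"OPENAI_APIKEY": "<your-key>"},  # placeholder value
}


def make_client():
    """Start Embedded Weaviate with the options above and return the client."""
    # Imported lazily: this needs the Python client (v3.15.4 or newer) on Linux.
    import weaviate
    from weaviate.embedded import EmbeddedOptions

    return weaviate.Client(embedded_options=EmbeddedOptions(**EMBEDDED_KWARGS))
```

Calling `make_client()` is what actually launches the server; any parameter left out of the dictionary keeps its default.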
It is not recommended to modify the XDG_DATA_HOME and XDG_CACHE_HOME environment variables from the standard XDG Base Directory values, as that might affect many other (non-Weaviate related) applications and services running on the same server.
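The default-path rule from the table can be sketched in a few lines of Python. This mirrors the documented behavior only; it is not the client's actual implementation, and the function name `default_paths` is hypothetical.

```python
import os
from pathlib import Path


def default_paths(env=None):
    """Return (data_path, binary_path) following the defaults described in the table."""
    env = os.environ if env is None else env
    data_home = env.get("XDG_DATA_HOME")
    cache_home = env.get("XDG_CACHE_HOME")

    # Use the XDG directory when it is set, otherwise fall back to the home directory.
    if data_home:
        data_path = Path(data_home) / "weaviate"
    else:
        data_path = Path.home() / ".local" / "share" / "weaviate"

    if cache_home:
        binary_path = Path(cache_home) / "weaviate-embedded"
    else:
        binary_path = Path.home() / ".cache" / "weaviate-embedded"

    return data_path, binary_path
```

Passing an explicit `persistence_data_path` or `binary_path` at instantiation avoids touching the XDG variables at all.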
To find the full URL for version:
- head to Weaviate releases,
- find the Assets section for the required Weaviate version,
- and copy the link to the required (name).tar.gz file.
For example, here is the URL of the Weaviate 1.18.2 AMD64 binary: https://github.com/weaviate/weaviate/releases/download/v1.18.2/weaviate-v1.18.2-linux-amd64.tar.gz
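If you need to build such a URL programmatically, a small helper along these lines may do. The file-name pattern is inferred from the single 1.18.2 AMD64 example above, and the `arm64` suffix is an assumption by analogy, so check the Assets section of the release before relying on it. The name `release_url` is hypothetical.

```python
def release_url(version: str, arch: str = "amd64") -> str:
    """Build a download URL for a Weaviate Linux binary (pattern inferred, not guaranteed)."""
    if arch not in ("amd64", "arm64"):
        raise ValueError(f"unsupported architecture: {arch}")
    # Release tags carry a leading "v", e.g. "v1.18.2".
    tag = version if version.startswith("v") else f"v{version}"
    return (
        "https://github.com/weaviate/weaviate/releases/download/"
        f"{tag}/weaviate-{tag}-linux-{arch}.tar.gz"
    )
```

The resulting string can be passed as the `version` parameter in place of a version number.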
Default modules
The following modules are enabled by default:
- generative-openai
- qna-openai
- ref2vec-centroid
- text2vec-cohere
- text2vec-huggingface
- text2vec-openai
Additional modules can be enabled by setting additional environment variables as laid out above. For instance, to add a module called backup-s3 to the set, you would pass it at instantiation as follows:
Python:
import weaviate
from weaviate.embedded import EmbeddedOptions

client = weaviate.Client(
    embedded_options=EmbeddedOptions(
        additional_env_vars={
            "ENABLE_MODULES":
            "backup-s3,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai"}
    )
)
TypeScript:
import weaviate, { EmbeddedClient, EmbeddedOptions } from 'weaviate-ts-embedded';

const client: EmbeddedClient = weaviate.client(
  new EmbeddedOptions({
    env: {
      ENABLE_MODULES: "backup-s3,text2vec-openai,text2vec-cohere,text2vec-huggingface,ref2vec-centroid,generative-openai,qna-openai",
    },
  })
);
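The example above repeats the default modules next to backup-s3, which suggests that setting ENABLE_MODULES replaces the default list rather than extending it. Under that assumption, a helper like the following keeps the long string out of your code. The helper name and the assumption that module order does not matter are mine, not taken from the Weaviate documentation.

```python
# Default modules as listed in this section.
DEFAULT_MODULES = [
    "generative-openai",
    "qna-openai",
    "ref2vec-centroid",
    "text2vec-cohere",
    "text2vec-huggingface",
    "text2vec-openai",
]


def enable_modules_value(extra):
    """Return a comma-separated ENABLE_MODULES value: extras first, then defaults, no duplicates."""
    # dict.fromkeys removes duplicates while preserving insertion order.
    ordered = dict.fromkeys([*extra, *DEFAULT_MODULES])
    return ",".join(ordered)


# Ready to pass as additional_env_vars (Python) when instantiating the client.
additional_env_vars = {"ENABLE_MODULES": enable_modules_value(["backup-s3"])}
```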
Starting Embedded Weaviate under the hood
Here's what happens behind the scenes when the client uses the embedded options in the instantiation call:
- The client downloads a Weaviate release from GitHub and caches it.
- It then spawns a Weaviate process with a data directory configured to a specific location, listening on the specified port (by default 6666).
- The server's STDOUT and STDERR are piped to the client.
- The client connects to this server process (e.g. to http://127.0.0.1:6666) and runs the client code.
- After running the code (when the application terminates), the client shuts down the Weaviate process.
- The data directory is preserved, so subsequent invocations have access to the data from all previous invocations, across all clients using the embedded option.
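The spawn, pipe, and shutdown steps above can be approximated with the Python standard library. This is a minimal sketch of the general pattern, not the client's actual code: the function names are hypothetical, and a short-lived Python child process stands in for the Weaviate binary.

```python
import atexit
import subprocess
import sys


def stop_server(proc, timeout=5.0):
    """Terminate the child if it is still running; kill it if it does not exit in time."""
    if proc.poll() is not None:
        return  # already exited
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()


def spawn_server(cmd, env=None):
    """Start cmd as a child process with STDOUT and STDERR piped to the parent."""
    proc = subprocess.Popen(
        cmd,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge STDERR into the same pipe
        text=True,
    )
    # Shut the child down when the parent application exits.
    atexit.register(stop_server, proc)
    return proc


# Stand-in for the Weaviate binary: prints one line, then idles.
STAND_IN_CMD = [
    sys.executable,
    "-c",
    "import time; print('ready', flush=True); time.sleep(60)",
]
```

Registering the shutdown with `atexit` is one way to tie the child's lifetime to the parent application, which matches the behavior described in the steps above.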
Lifecycle
The embedded instance will stay alive for as long as the parent application is running.
When the application exits (e.g. due to an exception or by reaching the end of the script), Weaviate will shut down the embedded instance, but the data will persist.
In a Jupyter notebook, the embedded instance stays alive for as long as the notebook is active.
This is useful for experimenting with your Weaviate projects and examples.
Supported Environments
Operating Systems
Embedded Weaviate is currently supported on Linux only.
We are actively working on support for macOS and hope to share an update in the near future.
Language Clients
Python
The Python client v3.15.4 or newer
TypeScript
Because it relies on server-side dependencies that are not available in the browser, the embedded TypeScript client has been split out into its own project. This lets the original non-embedded TypeScript client remain isomorphic.
The embedded TypeScript client extends the original TypeScript client, so once instantiated it can be used in exactly the same way to interact with Weaviate. It can be installed with the following command:
npm install weaviate-ts-embedded