Snowpark Container Services (SPCS)
Snowflake provides a hosted solution, Snowpark Container Services (SPCS), that runs containers inside the Snowflake ecosystem. To configure a Weaviate instance that runs in SPCS, follow the steps on this page.
The code in this guide configures a sample SPCS instance. The sample instance demonstrates how to run Weaviate in Snowpark. To configure your own SPCS instance, change the database name, warehouse name, image repository name, and other example values to match your deployment.
Configure your instance
1. Log into Snowflake
Download the SnowSQL client. Use the SnowSQL client to connect to Snowflake.
snowsql -a "YOURINSTANCE" -u "YOURUSER"
2. Set up the user environment
Configure roles and services.
Configure OAuth
Set up an OAuth integration. Snowflake uses OAuth to authenticate users to your service.
USE ROLE ACCOUNTADMIN;
CREATE SECURITY INTEGRATION SNOWSERVICES_INGRESS_OAUTH
TYPE=oauth
OAUTH_CLIENT=snowservices_ingress
ENABLED=true;
Grant SYSADMIN privileges
The SYSADMIN role needs the BIND SERVICE ENDPOINT privilege to create services.
USE ROLE ACCOUNTADMIN;
GRANT BIND SERVICE ENDPOINT ON ACCOUNT TO ROLE SYSADMIN;
Create Weaviate role and user
Create a role, and a user, for the Weaviate instance. The Jupyter server uses the Weaviate user.
USE ROLE SECURITYADMIN;
CREATE ROLE WEAVIATE_ROLE;
USE ROLE USERADMIN;
CREATE USER weaviate_user
PASSWORD='weaviate123'
DEFAULT_ROLE = WEAVIATE_ROLE
DEFAULT_SECONDARY_ROLES = ('ALL')
MUST_CHANGE_PASSWORD = FALSE;
USE ROLE SECURITYADMIN;
GRANT ROLE WEAVIATE_ROLE TO USER weaviate_user;
3. Configure data storage
Create a database, an image repository, and stages. The image repository holds container images. The stages hold service specification files and files that the service creates.
Create a warehouse
Create a warehouse to use with Weaviate.
USE ROLE SYSADMIN;
CREATE OR REPLACE WAREHOUSE WEAVIATE_WAREHOUSE WITH
WAREHOUSE_SIZE='X-SMALL'
AUTO_SUSPEND = 180
AUTO_RESUME = true
INITIALLY_SUSPENDED=false;
Create a database
Create a Weaviate database and stages.
USE ROLE SYSADMIN;
CREATE DATABASE IF NOT EXISTS WEAVIATE_DEMO;
USE DATABASE WEAVIATE_DEMO;
CREATE IMAGE REPOSITORY WEAVIATE_DEMO.PUBLIC.WEAVIATE_REPO;
CREATE OR REPLACE STAGE YAML_STAGE;
CREATE OR REPLACE STAGE DATA ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');
CREATE OR REPLACE STAGE FILES ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');
Grant privileges
Grant privileges to the WEAVIATE_ROLE:
USE ROLE SECURITYADMIN;
GRANT ALL PRIVILEGES ON DATABASE WEAVIATE_DEMO TO WEAVIATE_ROLE;
GRANT ALL PRIVILEGES ON SCHEMA WEAVIATE_DEMO.PUBLIC TO WEAVIATE_ROLE;
GRANT ALL PRIVILEGES ON WAREHOUSE WEAVIATE_WAREHOUSE TO WEAVIATE_ROLE;
GRANT ALL PRIVILEGES ON STAGE WEAVIATE_DEMO.PUBLIC.FILES TO WEAVIATE_ROLE;
Allow external access
The user credentials on this page are for demonstration purposes only. Change the user credentials and password before you connect to the open internet.
USE ROLE ACCOUNTADMIN;
USE DATABASE WEAVIATE_DEMO;
USE SCHEMA PUBLIC;
CREATE NETWORK RULE allow_all_rule
TYPE = 'HOST_PORT'
MODE= 'EGRESS'
VALUE_LIST = ('0.0.0.0:443','0.0.0.0:80');
CREATE EXTERNAL ACCESS INTEGRATION allow_all_eai
ALLOWED_NETWORK_RULES=(allow_all_rule)
ENABLED=TRUE;
GRANT USAGE ON INTEGRATION allow_all_eai TO ROLE SYSADMIN;
4. Set up compute pools
Create compute pools. The compute pools run the application services.
USE ROLE SYSADMIN;
CREATE COMPUTE POOL IF NOT EXISTS WEAVIATE_COMPUTE_POOL
MIN_NODES = 1
MAX_NODES = 1
INSTANCE_FAMILY = CPU_X64_S
AUTO_RESUME = true;
CREATE COMPUTE POOL IF NOT EXISTS TEXT2VEC_COMPUTE_POOL
MIN_NODES = 1
MAX_NODES = 1
INSTANCE_FAMILY = GPU_NV_S
AUTO_RESUME = true;
CREATE COMPUTE POOL IF NOT EXISTS JUPYTER_COMPUTE_POOL
MIN_NODES = 1
MAX_NODES = 1
INSTANCE_FAMILY = CPU_X64_S
AUTO_RESUME = true;
To configure your own instance, edit the pool names and pool sizes to support your application.
To check if the compute pools are active, run DESCRIBE COMPUTE POOL <Pool Name>.
DESCRIBE COMPUTE POOL WEAVIATE_COMPUTE_POOL;
DESCRIBE COMPUTE POOL TEXT2VEC_COMPUTE_POOL;
DESCRIBE COMPUTE POOL JUPYTER_COMPUTE_POOL;
The compute pools take a few minutes to initialize. The pools are ready for use when they reach the ACTIVE or IDLE state.
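Because the pools take a few minutes to start, it can help to poll rather than re-run DESCRIBE by hand. The following is a minimal sketch of such a loop, not part of the sample repo. The `get_state` callable is hypothetical: you supply it, for example by running DESCRIBE COMPUTE POOL through whichever Snowflake client you use and returning the state value. The timeout and interval defaults are arbitrary assumptions.

```python
import time
from typing import Callable

# States that this guide treats as "ready for use".
READY_STATES = {"ACTIVE", "IDLE"}


def wait_for_pool(
    get_state: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it returns a ready state or the timeout expires.

    get_state is a hypothetical, caller-supplied function that returns the
    current state of one compute pool as a string.
    """
    waited = 0.0
    while True:
        state = get_state().strip().upper()
        if state in READY_STATES:
            return state
        if waited >= timeout_s:
            raise TimeoutError(f"pool still in state {state!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


if __name__ == "__main__":
    # Demo with canned states instead of a real Snowflake connection.
    fake_states = iter(["STARTING", "STARTING", "ACTIVE"])
    print(wait_for_pool(lambda: next(fake_states), sleep=lambda _s: None))
```

Passing `sleep` as a parameter keeps the loop testable without real delays. In practice you would call the function once per pool.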
5. Build the Docker images
Build the Docker images in your local shell. There are three images.
- The Weaviate image runs the database.
- The text2vec image lets you process data without leaving Snowpark.
- The Jupyter image serves notebooks.
The Dockerfiles are in this repo. Clone the repo, and go to the top-level directory.
You don't need to modify the Dockerfiles to run this sample instance. However, if you need to use non-standard ports, or make other changes for your deployment, edit the Dockerfiles before you create the containers.
docker build --rm --no-cache --platform linux/amd64 -t weaviate ./images/weaviate
docker build --rm --no-cache --platform linux/amd64 -t jupyter ./images/jupyter
docker build --rm --no-cache --platform linux/amd64 -t text2vec ./images/text2vec
Log in to your Docker repository. The Snowpark account name, username, and password are the same as your snowsql credentials.
docker login <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com -u YOUR_SNOWFLAKE_USERNAME
After you log in to the Docker repository, tag the images and push them to the repository.
The docker tag commands look like this:
docker tag weaviate <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/weaviate
docker tag jupyter <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/jupyter
docker tag text2vec <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/text2vec
The docker push commands look like this:
docker push <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/weaviate
docker push <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/jupyter
docker push <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/text2vec
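The tag and push commands above follow one pattern per image, so they can be generated instead of typed. This is a small sketch under stated assumptions: the organization and account values are placeholders you supply, the repository path is the one created earlier in this guide, and the helper names are hypothetical. The function only builds the argument lists. It does not run Docker.

```python
import shlex
from typing import List, Sequence

# Repository path used throughout this guide (database/schema/repository).
REPO_PATH = "weaviate_demo/public/weaviate_repo"
IMAGES = ("weaviate", "jupyter", "text2vec")


def registry_host(org: str, account: str) -> str:
    """Build the registry host name in the <org>-<account> form shown above."""
    return f"{org}-{account}.registry.snowflakecomputing.com"


def image_commands(org: str, account: str, images: Sequence[str] = IMAGES) -> List[List[str]]:
    """Return a docker tag command and a docker push command for each image."""
    host = registry_host(org, account)
    commands = []
    for image in images:
        target = f"{host}/{REPO_PATH}/{image}"
        commands.append(["docker", "tag", image, target])
        commands.append(["docker", "push", target])
    return commands


if __name__ == "__main__":
    # "myorg" and "myaccount" are placeholder values, not real identifiers.
    for cmd in image_commands("myorg", "myaccount"):
        print(shlex.join(cmd))
```

To run the commands instead of printing them, each list could be passed to `subprocess.run(cmd, check=True)` after you log in to the repository.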
6. Set up services
SPCS uses spec files to configure services. The configuration spec files are also in the repo you cloned earlier.
Edit the spec files to specify an image repository for each service. For example, update weaviate.yaml to reference the weaviate image.
image: "<SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/weaviate"
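If you prefer to script this edit, the following sketch rewrites the `image:` value in a spec file's text. It assumes each `image:` key sits on its own line, optionally as the first key of a list item. The layout of the actual spec files in the repo is not shown on this page, so check the result before you upload it. The function name and the sample spec text are hypothetical.

```python
import re

# Matches an "image:" key on its own line, optionally prefixed by "- ".
IMAGE_LINE = re.compile(r"^([ \t]*(?:-[ \t]+)?image:[ \t]*).*$", re.MULTILINE)


def set_image(spec_text: str, image_ref: str) -> str:
    """Replace the value of every image: line in spec_text with image_ref."""
    new_text, count = IMAGE_LINE.subn(
        lambda m: f'{m.group(1)}"{image_ref}"', spec_text
    )
    if count == 0:
        raise ValueError("no image: line found in spec text")
    return new_text


if __name__ == "__main__":
    # Hypothetical spec fragment, used only to demonstrate the substitution.
    sample = 'spec:\n  containers:\n  - name: weaviate\n    image: "placeholder"\n'
    print(set_image(sample, "myorg-myaccount.registry.snowflakecomputing.com/"
                            "weaviate_demo/public/weaviate_repo/weaviate"))
```

A text substitution keeps the rest of the file byte-for-byte unchanged, which a YAML round trip may not. The trade-off is that it replaces every `image:` line, so it only suits spec files that define a single container.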
When you edit the jupyter.yaml spec file, update the SNOW_ACCOUNT field with your Snowflake account name.
SNOW_ACCOUNT: <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>
After the spec files are updated, use the snowsql client to upload them.
PUT file:///path/to/jupyter.yaml @yaml_stage overwrite=true auto_compress=false;
PUT file:///path/to/text2vec.yaml @yaml_stage overwrite=true auto_compress=false;
PUT file:///path/to/weaviate.yaml @yaml_stage overwrite=true auto_compress=false;
7. Create the services
Use the snowsql client to create a service for each component.
USE ROLE SYSADMIN;
USE DATABASE WEAVIATE_DEMO;
USE SCHEMA PUBLIC;
CREATE SERVICE WEAVIATE
IN COMPUTE POOL WEAVIATE_COMPUTE_POOL
FROM @YAML_STAGE
SPEC='weaviate.yaml'
MIN_INSTANCES=1
MAX_INSTANCES=1;
CREATE SERVICE JUPYTER
IN COMPUTE POOL JUPYTER_COMPUTE_POOL
FROM @YAML_STAGE
SPEC='jupyter.yaml'
MIN_INSTANCES=1
MAX_INSTANCES=1;
CREATE SERVICE TEXT2VEC
IN COMPUTE POOL TEXT2VEC_COMPUTE_POOL
FROM @YAML_STAGE
SPEC='text2vec.yaml'
MIN_INSTANCES=1
MAX_INSTANCES=1;
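The three CREATE SERVICE statements differ only in the service name, compute pool, and spec file. If you add services for your own deployment, a sketch like the one below can generate the statements from a single mapping. The function and mapping names are hypothetical, and the sketch assumes that each spec file is named after its service in lowercase, as in this guide. It only builds SQL text for you to run in snowsql.

```python
from typing import Dict, List

# Service name -> compute pool, as used in this guide.
SERVICES: Dict[str, str] = {
    "WEAVIATE": "WEAVIATE_COMPUTE_POOL",
    "JUPYTER": "JUPYTER_COMPUTE_POOL",
    "TEXT2VEC": "TEXT2VEC_COMPUTE_POOL",
}


def create_service_sql(name: str, pool: str, stage: str = "@YAML_STAGE",
                       instances: int = 1) -> str:
    """Build one CREATE SERVICE statement in the same shape as the ones above."""
    return (
        f"CREATE SERVICE {name}\n"
        f"  IN COMPUTE POOL {pool}\n"
        f"  FROM {stage}\n"
        f"  SPEC='{name.lower()}.yaml'\n"
        f"  MIN_INSTANCES={instances}\n"
        f"  MAX_INSTANCES={instances};"
    )


def all_statements(services: Dict[str, str] = SERVICES) -> List[str]:
    """Build a CREATE SERVICE statement for every entry in the mapping."""
    return [create_service_sql(name, pool) for name, pool in services.items()]


if __name__ == "__main__":
    print("\n\n".join(all_statements()))
```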
8. Grant user permissions
Grant the WEAVIATE_ROLE permission to use the Jupyter service.
USE ROLE SECURITYADMIN;
GRANT USAGE ON SERVICE WEAVIATE_DEMO.PUBLIC.JUPYTER TO ROLE WEAVIATE_ROLE;
9. Log in to the Jupyter Notebook server
Get the ingress_url that you use to access the Jupyter notebook server.
USE ROLE SYSADMIN;
SHOW ENDPOINTS IN SERVICE WEAVIATE_DEMO.PUBLIC.JUPYTER;
Open the ingress_url in a browser. Use the weaviate_user credentials to log in.
10. Load data into your Weaviate instance
Follow these steps to create a schema, and load some sample data into your Weaviate instance.
- Download a set of sample Jeopardy questions.
- Rename the file to SampleJSON.json, and save it to your desktop.
- Upload the file. Drag the file into the Jupyter tree view in your browser or use the 'Upload' button at the upper-right corner.
- Copy the data into Weaviate. Use the TestWeaviate.ipynb notebook to copy the data from SampleJSON.json into Weaviate.
Query your data
To query your data, run these queries in a Jupyter notebook.
import weaviate
import json
import os

# Connect to the Weaviate service by its service name inside SPCS
client = weaviate.connect_to_custom(
    http_host="weaviate",
    http_port=8080,
    http_secure=False,
    grpc_host="weaviate",
    grpc_port=50051,
    grpc_secure=False
)

collection = client.collections.get("Questions")

# Simple search
response = collection.query.near_text(query="animal", limit=2, include_vector=True)

# Confirm vectors exist
for o in response.objects:
    print(o.vector)

# Hybrid search
response = collection.query.hybrid(
    query="animals",
    limit=5
)

# Close the connection when you are done
client.close()
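The hybrid query above stores its results in `response` but does not display them. One way to inspect them is a small formatting helper like this sketch. It relies only on each returned object having a `properties` dictionary. The field names `question` and `answer` are assumptions about the sample data and may differ from the schema that the notebook creates, and `summarize` is a hypothetical helper name.

```python
from types import SimpleNamespace
from typing import Iterable, List, Sequence


def summarize(objects: Iterable, fields: Sequence[str], max_width: int = 60) -> List[str]:
    """Return one 'field=value | field=value' line per object.

    Long values are truncated to max_width characters. Missing fields
    are shown as empty strings.
    """
    rows = []
    for obj in objects:
        props = obj.properties or {}
        parts = []
        for field in fields:
            value = str(props.get(field, ""))
            if len(value) > max_width:
                value = value[: max_width - 3] + "..."
            parts.append(f"{field}={value}")
        rows.append(" | ".join(parts))
    return rows


if __name__ == "__main__":
    # Stand-in objects so the sketch runs without a Weaviate connection.
    fake = [SimpleNamespace(properties={"question": "Sample question", "answer": "Sample answer"})]
    for line in summarize(fake, ["question", "answer"]):
        print(line)
```

In the notebook, you would pass `response.objects` in place of the stand-in list.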
Suspend and resume services
To suspend and resume services, run the following code in the snowsql client.
Suspend services:
ALTER SERVICE WEAVIATE SUSPEND;
ALTER SERVICE TEXT2VEC SUSPEND;
ALTER SERVICE JUPYTER SUSPEND;
Resume services:
ALTER SERVICE WEAVIATE RESUME;
ALTER SERVICE TEXT2VEC RESUME;
ALTER SERVICE JUPYTER RESUME;
Cleanup and Removal
To remove the services, run the following code in the snowsql client.
USE ROLE ACCOUNTADMIN;
DROP USER weaviate_user;
USE ROLE SYSADMIN;
DROP SERVICE WEAVIATE;
DROP SERVICE JUPYTER;
DROP SERVICE TEXT2VEC;
DROP COMPUTE POOL TEXT2VEC_COMPUTE_POOL;
DROP COMPUTE POOL WEAVIATE_COMPUTE_POOL;
DROP COMPUTE POOL JUPYTER_COMPUTE_POOL;
DROP STAGE DATA;
DROP STAGE FILES;
DROP IMAGE REPOSITORY WEAVIATE_DEMO.PUBLIC.WEAVIATE_REPO;
DROP DATABASE WEAVIATE_DEMO;
DROP WAREHOUSE WEAVIATE_WAREHOUSE;
USE ROLE ACCOUNTADMIN;
DROP ROLE WEAVIATE_ROLE;
DROP SECURITY INTEGRATION SNOWSERVICES_INGRESS_OAUTH;