Skip to main content

Snowpark Container Services (SPCS)

Snowflake provides a hosted solution, Snowpark Container Services (SPCS), that runs containers inside the Snowflake ecosystem. To configure a Weaviate instance that runs in SPCS, follow the steps on this page.

The code in this guide configures a sample SPCS instance. The sample instance demonstrates how to run Weaviate in Snowpark. To configure your own SPCS instance, change the database name, warehouse name, image repository name, and other example values to match your deployment.

Configure your instance

1. Log into Snowflake

Download the SnowSQL client. Use the SnowSQL client to connect to Snowflake.

snowsql -a "YOURINSTANCE" -u "YOURUSER"

2. Setup the user environment

Configure roles and services.

Configure OAUTH

Set up OAUTH integration. Snowflake uses OAUTH to authenticate users to your service.

USE ROLE ACCOUNTADMIN;
CREATE SECURITY INTEGRATION SNOWSERVICES_INGRESS_OAUTH
TYPE=oauth
OAUTH_CLIENT=snowservices_ingress
ENABLED=true;

Grant SYSADMIN privileges

The SYSADMIN uses the BIND SERVICE ENDPOINT to create services.

USE ROLE ACCOUNTADMIN;
GRANT BIND SERVICE ENDPOINT ON ACCOUNT TO ROLE SYSADMIN;

Create Weaviate role and user

Create a role, and a user, for the Weaviate instance. The Jupyter server uses the Weaviate user.

USE ROLE SECURITYADMIN;
CREATE ROLE WEAVIATE_ROLE;

USE ROLE USERADMIN;
CREATE USER weaviate_user
PASSWORD='weaviate123'
DEFAULT_ROLE = WEAVIATE_ROLE
DEFAULT_SECONDARY_ROLES = ('ALL')
MUST_CHANGE_PASSWORD = FALSE;

USE ROLE SECURITYADMIN;
GRANT ROLE WEAVIATE_ROLE TO USER weaviate_user;

3. Configure data storage

Create a database, an image repository, and stages. The image repository holds container images. The stages hold service specification files and files that the service creates.

Create a warehouse

Create a warehouse to use with Weaviate.

USE ROLE SYSADMIN;
CREATE OR REPLACE WAREHOUSE WEAVIATE_WAREHOUSE WITH
WAREHOUSE_SIZE='X-SMALL'
AUTO_SUSPEND = 180
AUTO_RESUME = true
INITIALLY_SUSPENDED=false;

Create a database

Create a Weaviate database and stages.

USE ROLE SYSADMIN;
CREATE DATABASE IF NOT EXISTS WEAVIATE_DEMO;
USE DATABASE WEAVIATE_DEMO;
CREATE IMAGE REPOSITORY WEAVIATE_DEMO.PUBLIC.WEAVIATE_REPO;
CREATE OR REPLACE STAGE YAML_STAGE;
CREATE OR REPLACE STAGE DATA ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');
CREATE OR REPLACE STAGE FILES ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');

Grant privileges

Grant privileges to the WEAVIATE_ROLE:

USE ROLE SECURITYADMIN;
GRANT ALL PRIVILEGES ON DATABASE WEAVIATE_DEMO TO WEAVIATE_ROLE;
GRANT ALL PRIVILEGES ON SCHEMA WEAVIATE_DEMO.PUBLIC TO WEAVIATE_ROLE;
GRANT ALL PRIVILEGES ON WAREHOUSE WEAVIATE_WAREHOUSE TO WEAVIATE_ROLE;
GRANT ALL PRIVILEGES ON STAGE WEAVIATE_DEMO.PUBLIC.FILES TO WEAVIATE_ROLE;

Allow external access

External access

The user credentials on this page are for demonstration purposes only. Change the user credentials and password before you connect to the open internet.

USE ROLE ACCOUNTADMIN;
USE DATABASE WEAVIATE_DEMO;
USE SCHEMA PUBLIC;
CREATE NETWORK RULE allow_all_rule
TYPE = 'HOST_PORT'
MODE= 'EGRESS'
VALUE_LIST = ('0.0.0.0:443','0.0.0.0:80');

CREATE EXTERNAL ACCESS INTEGRATION allow_all_eai
ALLOWED_NETWORK_RULES=(allow_all_rule)
ENABLED=TRUE;

GRANT USAGE ON INTEGRATION allow_all_eai TO ROLE SYSADMIN;

4. Setup compute pools

Create compute pools. The compute pools run the application services.

USE ROLE SYSADMIN;
CREATE COMPUTE POOL IF NOT EXISTS WEAVIATE_COMPUTE_POOL
MIN_NODES = 1
MAX_NODES = 1
INSTANCE_FAMILY = CPU_X64_S
AUTO_RESUME = true;
CREATE COMPUTE POOL IF NOT EXISTS TEXT2VEC_COMPUTE_POOL
MIN_NODES = 1
MAX_NODES = 1
INSTANCE_FAMILY = GPU_NV_S
AUTO_RESUME = true;
CREATE COMPUTE POOL IF NOT EXISTS JUPYTER_COMPUTE_POOL
MIN_NODES = 1
MAX_NODES = 1
INSTANCE_FAMILY = CPU_X64_S
AUTO_RESUME = true;

To configure your own instance, edit the pool names and pool sizes to support your application.

To check if the compute pools are active, run DESCRIBE COMPUTE POOL <Pool Name>.

DESCRIBE COMPUTE POOL WEAVIATE_COMPUTE_POOL;
DESCRIBE COMPUTE POOL TEXT2VEC_COMPUTE_POOL;
DESCRIBE COMPUTE POOL JUPYTER_COMPUTE_POOL;

The compute pools take a few minutes to initialize. The pools are ready for use when they reach the ACTIVE or IDLE state.

5. Build the Docker images

Build the Docker images in your local shell. There are three images.

  • The Weaviate image runs the database.
  • The text2vec image lets you process data without leaving Snowpark.
  • The Jupyter image serves notebooks.

The Docker files are in this repo. Clone the repo, and go to the top level directory.

You don't need to modify the Dockerfiles to run this sample instance. However, if you need to use non-standard ports, or make other changes for your deployment, edit the Dockerfiles before you create the containers.

docker build --rm --no-cache --platform linux/amd64 -t weaviate ./images/weaviate
docker build --rm --no-cache --platform linux/amd64 -t jupyter ./images/jupyter
docker build --rm --no-cache --platform linux/amd64 -t text2vec ./images/text2vec

Log in to your Docker repository. The Snowpark account name, username, and password are the same as your snowsql credentials.

docker login <YOUR_SNOWACCOUNT>-<SNOWORG>.registry.snowflakecomputing.com  -u YOUR_SNOWFLAKE_USERNAME

After you login to the Docker repository, tag the images and push them to the repository.

The docker tag commands look like this:

docker tag weaviate <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/weaviate
docker tag jupyter <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/jupyter
docker tag text2vec <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/text2vec

The docker push commands look like this:

docker push <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/weaviate
docker push <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/jupyter
docker push <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/text2vec

6. Setup services

SPCS uses spec files to configure services. The configuration spec files are also in the repo you cloned earlier.

Edit the spec files to specify an image repository for each service. For example, update weaviate.yaml to reference the weaviate image.

image: "<SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>.registry.snowflakecomputing.com/weaviate_demo/public/weaviate_repo/weaviate"

When you edit the jupyter.yaml spec file, update the SNOW_ACCOUNT field with your Snowflake account name.

SNOW_ACCOUNT: <SNOWFLAKE_ORG>-<SNOWFLAKE_ACCOUNT>

After the spec files are updated, use the snowsql client to upload them.

PUT file:///path/to/jupyter.yaml @yaml_stage overwrite=true auto_compress=false;
PUT file:///path/to/text2vec.yaml @yaml_stage overwrite=true auto_compress=false;
PUT file:///path/to/weaviate.yaml @yaml_stage overwrite=true auto_compress=false;

7. Create the services

Use the snowsql client to create a service for each component.

USE ROLE SYSADMIN;
USE DATABASE WEAVIATE_DEMO;
USE SCHEMA PUBLIC;

CREATE SERVICE WEAVIATE
IN COMPUTE POOL WEAVIATE_COMPUTE_POOL
FROM @YAML_STAGE
SPEC='weaviate.yaml'
MIN_INSTANCES=1
MAX_INSTANCES=1;

CREATE SERVICE JUPYTER
IN COMPUTE POOL JUPYTER_COMPUTE_POOL
FROM @YAML_STAGE
SPEC='jupyter.yaml'
MIN_INSTANCES=1
MAX_INSTANCES=1;

CREATE SERVICE TEXT2VEC
IN COMPUTE POOL TEXT2VEC_COMPUTE_POOL
FROM @YAML_STAGE
SPEC='text2vec.yaml'
MIN_INSTANCES=1
MAX_INSTANCES=1;

8. Grant user permissions

Grant permission for the services to the weaviate_role.

USE ROLE SECURITYADMIN;
GRANT USAGE ON SERVICE WEAVIATE_DEMO.PUBLIC.JUPYTER TO ROLE WEAVIATE_ROLE;

9. Log in to the Jupyter Notebook server

Get the ingress_url URL that you use to access the Jupyter notebook server.

USE ROLE SYSADMIN;
SHOW ENDPOINTS IN SERVICE WEAVIATE_DEMO.PUBLIC.JUPYTER;

Open the ingress_url in a browser. Use the weaviate_user credentials to log in.

10. Load data into your Weaviate instance

Follow these steps to create a schema, and load some sample data into your Weaviate instance.

  1. Download a set of sample Jeopardy Questions.
  2. Rename the file SampleJSON.json, and save it to your desktop.
  3. Upload the file. Drag the file into the Jupyter tree view in your browser or use the 'Upload' button at the upper-right corner.
  4. Copy the data into Weaviate. Use the TestWeaviate.ipynb notebook to copy the data from SampleJSON.json into Weaviate.

Query your data

To query your data, run these queries in the a Jupyter notebook.

import weaviate
import json
import os

client = weaviate.connect_to_custom(
http_host="weaviate",
http_port="8080",
http_secure=False,
grpc_host="weaviate",
grpc_port="50051",
grpc_secure=False
)

collection = client.collections.get("Questions")

# Simple search
response = collection.query.near_text(query="animal",limit=2, include_vector=True)
#confirm vectors exist
for o in response.objects:
print(o.vector)

# Hybrid search client.close()
response = collection.query.hybrid(
query="animals",
limit=5
)

client.close()

Suspend and resume services

To suspend and resume services, run the following code in to the snowsql client.

Suspend services

alter service WEAVIATE suspend;
alter service TEXT2VEC suspend;
alter service JUPYTER suspend;

Resume services:

alter service WEAVIATE resume;
alter service TEXT2VEC resume;
alter service JUPYTER resume;

Cleanup and Removal

To remove the services, run the following code in to the snowsql client.

USE ROLE ACCOUNTADMIN;
DROP USER weaviate_user;

USE ROLE SYSADMIN;
DROP SERVICE WEAVIATE;
DROP SERVICE JUPYTER;
DROP SERVICE TEXT2VEC;
DROP COMPUTE POOL TEXT2VEC_COMPUTE_POOL;
DROP COMPUTE POOL WEAVIATE_COMPUTE_POOL;
DROP COMPUTE POOL JUPYTER_COMPUTE_POOL;
DROP STAGE DATA;
DROP STAGE FILES;
DROP IMAGE REPOSITORY WEAVIATE_DEMO.PUBLIC.WEAVIATE_REPO;
DROP DATABASE WEAVIATE_DEMO;
DROP WAREHOUSE WEAVIATE_WAREHOUSE;

USE ROLE ACCOUNTADMIN;
DROP ROLE WEAVIATE_ROLE;
DROP SECURITY INTEGRATION SNOWSERVICES_INGRESS_OAUTH;