Databricks
Databricks is a data intelligence platform that unifies data, AI and governance on the lakehouse.
Databricks and Weaviate
Databricks' Foundation Model APIs can be called directly from Weaviate, allowing you to use models hosted on the Databricks platform through the text2vec-databricks and generative-databricks modules.
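As a rough illustration of how the two modules might be enabled on a collection, the sketch below builds a collection definition as a plain Python dictionary in the shape Weaviate's schema API accepts. The collection name, the `title` property, the helper name `databricks_collection_definition`, and both endpoint URLs are hypothetical placeholders, and the `endpoint` setting under each module is an assumption to check against the module documentation. Supplying a Databricks access token to Weaviate is also required and is omitted here.

```python
import json


def databricks_collection_definition(name, embedding_endpoint, generative_endpoint):
    """Build a collection definition that names both Databricks modules.

    Hypothetical helper: the moduleConfig keys below are assumptions and
    should be verified against the Weaviate module documentation.
    """
    return {
        "class": name,
        # Vectorize objects with a model served from Databricks.
        "vectorizer": "text2vec-databricks",
        "moduleConfig": {
            "text2vec-databricks": {"endpoint": embedding_endpoint},
            "generative-databricks": {"endpoint": generative_endpoint},
        },
        # A single illustrative text property.
        "properties": [{"name": "title", "dataType": ["text"]}],
    }


# Placeholder serving endpoints; replace with the endpoints of your own workspace.
definition = databricks_collection_definition(
    "Article",
    "https://example.cloud.databricks.com/serving-endpoints/my-embedding-model/invocations",
    "https://example.cloud.databricks.com/serving-endpoints/my-chat-model/invocations",
)

print(json.dumps(definition, indent=2))
```

Keeping the definition as data makes it easy to review before it is sent to a Weaviate instance with whichever client you use.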
Spark Connector and Weaviate
Apache Spark is an open-source framework for large-scale and real-time data processing; PySpark is its Python API.
You can ingest Spark data structures from Databricks into Weaviate using the Weaviate Spark connector. Learn more about the connector in the Weaviate Spark connector repository.
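A minimal sketch of what such an ingestion might look like from PySpark follows. It assumes the connector JAR is already attached to the cluster, that the data source name is `io.weaviate.spark.Weaviate`, and that the connector reads the `scheme`, `host`, `className`, `batchSize`, and `id` options; confirm these against the connector repository. The helper names, the host, and the collection name are hypothetical.

```python
def weaviate_write_options(host, class_name, scheme="http", batch_size=200, id_column=None):
    """Collect the connector options for one write (hypothetical helper)."""
    options = {
        "scheme": scheme,
        "host": host,
        "className": class_name,
        # Spark passes data source options as strings.
        "batchSize": str(batch_size),
    }
    if id_column is not None:
        # Optional: DataFrame column that holds each object's UUID.
        options["id"] = id_column
    return options


def write_to_weaviate(df, options):
    """Append the rows of a pyspark.sql.DataFrame to Weaviate.

    Assumes the Weaviate Spark connector is available on the cluster.
    """
    df.write.format("io.weaviate.spark.Weaviate").options(**options).mode("append").save()


# Placeholder host and collection name.
options = weaviate_write_options("localhost:8080", "Article", id_column="uuid")
print(options)
```

In a Databricks notebook, calling `write_to_weaviate(df, options)` on an existing DataFrame would then perform the write. Separating option construction from the write keeps the connection settings easy to inspect and reuse.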
Our Resources
The resources are broken into two categories:
Hands on Learning: Build your technical understanding with end-to-end tutorials.
Read and Listen: Develop your conceptual understanding of these technologies.
Hands on Learning
Topic | Description | Resource |
---|---|---|
Weaviate Tutorial | Learn how to ingest data into Weaviate with Spark. | Tutorial |
Using the Spark Connector for Weaviate | Learn how to take data from a Spark dataframe and feed it into Weaviate. | Notebook |
Ingest data from Spark into Weaviate | Learn how to ingest data from a Spark dataframe into Weaviate and use the text2vec-databricks and generative-databricks modules. | Notebook |
Read and Listen
Topic | Description | Resource |
---|---|---|
The Sphere Dataset in Weaviate | Learn how to import and query the Sphere dataset in Weaviate. | Blog |
The Details Behind the Sphere Dataset in Weaviate | The details on how we ingested ~1 billion article snippets into Weaviate. | Blog |