Databricks
Overview
Databricks is a data intelligence platform that unifies data, AI and governance on the lakehouse.
Databricks and Weaviate
Weaviate can call Databricks' Foundation Model APIs directly through the text2vec-databricks and generative-databricks modules, so you can use models hosted on the Databricks platform for vectorization and generation.
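As a rough illustration of how the two modules might be enabled on a collection, the sketch below builds a collection definition as a plain Python dictionary in the shape of Weaviate's REST schema format. The collection name `Article`, the helper names, and both endpoint URLs are hypothetical placeholders. The `endpoint` module setting and the `X-Databricks-Token` header are assumptions that should be checked against the module documentation before use.

```python
import json


def databricks_collection_definition(name, embedding_endpoint, generative_endpoint):
    """Build a schema-style definition that enables both Databricks modules.

    Assumption: each module reads its serving endpoint from an "endpoint"
    setting under moduleConfig.
    """
    return {
        "class": name,
        "vectorizer": "text2vec-databricks",
        "moduleConfig": {
            "text2vec-databricks": {"endpoint": embedding_endpoint},
            "generative-databricks": {"endpoint": generative_endpoint},
        },
        "properties": [
            {"name": "title", "dataType": ["text"]},
            {"name": "body", "dataType": ["text"]},
        ],
    }


def databricks_headers(token):
    """Request header carrying the Databricks token (assumed header name)."""
    return {"X-Databricks-Token": token}


if __name__ == "__main__":
    # Placeholder endpoints; replace with your own serving endpoint URLs.
    definition = databricks_collection_definition(
        "Article",
        "https://example.invalid/serving-endpoints/embeddings/invocations",
        "https://example.invalid/serving-endpoints/chat/invocations",
    )
    print(json.dumps(definition, indent=2))
```

The dictionary can be posted to a Weaviate instance or translated into the equivalent client-library configuration. Keeping it as plain data makes the module wiring easy to inspect before any request is sent.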
Spark Connector and Weaviate
Apache Spark, together with its Python API PySpark, is an open-source framework for real-time, large-scale data processing. You can ingest Spark data structures from Databricks into Weaviate using the Weaviate Spark connector. Learn more about the connector in the Weaviate Spark connector repository.
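A minimal sketch of what a connector write could look like from PySpark follows. It assumes the connector JAR is already installed on the cluster and that the data source name `io.weaviate.spark.Weaviate` and the option names (`host`, `scheme`, `className`, `batchSize`, `id`) match the connector version in use. The helper names, host, and collection name are hypothetical.

```python
def weaviate_write_options(host, class_name, scheme="http", batch_size=200, id_column=None):
    """Collect the connector options for one write into a dictionary."""
    options = {
        "host": host,             # e.g. "localhost:8080" (placeholder)
        "scheme": scheme,         # "http" or "https"
        "className": class_name,  # target Weaviate collection
        "batchSize": batch_size,  # objects sent per batch
    }
    if id_column is not None:
        # Optional: name of the DataFrame column holding object UUIDs.
        options["id"] = id_column
    return options


def write_to_weaviate(df, options):
    """Append the rows of a Spark DataFrame to Weaviate.

    `df` is expected to be a pyspark.sql.DataFrame whose columns match the
    properties of the target collection.
    """
    (
        df.write.format("io.weaviate.spark.Weaviate")
        .options(**options)
        .mode("append")
        .save()
    )


if __name__ == "__main__":
    print(weaviate_write_options("localhost:8080", "Article", id_column="uuid"))
```

In a Databricks notebook, `write_to_weaviate(df, weaviate_write_options(...))` would be called on an existing DataFrame. Building the options separately keeps connection details in one place and easy to review.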
Resources
The resources are grouped into two categories:
- Hands-on Learning: Build your technical understanding with end-to-end tutorials.
- Read & Listen: Develop your conceptual understanding of these technologies.
Hands-on Learning
Weaviate Tutorial
Learn how to ingest data into Weaviate with Spark.
Using the Spark Connector for Weaviate
Learn how to take data from a Spark dataframe and feed it into Weaviate.
Ingest data from Spark into Weaviate
Learn how to ingest data from a Spark dataframe into Weaviate and use the text2vec-databricks and generative-databricks modules.
Read & Listen
The Sphere Dataset in Weaviate
Learn how to import and query the Sphere dataset in Weaviate.
The Details Behind the Sphere Dataset in Weaviate
The details on how we ingested ~1 billion article snippets into Weaviate.
Build Scalable Gen AI Data Pipelines with Weaviate and Databricks
Learn how to build generative AI data pipelines at scale with Weaviate and Databricks.