Databricks

Databricks is a data intelligence platform that unifies data, AI and governance on the lakehouse.

Databricks and Weaviate

Databricks' Foundation Model APIs can be called directly from Weaviate, allowing you to use models hosted on the Databricks platform through the text2vec-databricks and generative-databricks modules.
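
As a rough illustration of how the two modules might be named in a collection definition, the sketch below builds a Weaviate class definition as a plain Python dictionary. The module names come from the paragraph above; the `endpoint` setting, the helper name `databricks_class_definition`, the class name `Article`, and the placeholder URLs are assumptions made for this example, so check the module documentation for the exact configuration keys.

```python
import json


def databricks_class_definition(class_name, embedding_endpoint, generative_endpoint):
    """Build a class definition dict that names both Databricks modules.

    Assumption: each module reads its serving endpoint from an "endpoint" key.
    """
    return {
        "class": class_name,
        # Vectorize objects with the Databricks embedding module.
        "vectorizer": "text2vec-databricks",
        "moduleConfig": {
            "text2vec-databricks": {"endpoint": embedding_endpoint},
            "generative-databricks": {"endpoint": generative_endpoint},
        },
    }


# Hypothetical placeholder endpoints; replace with your own serving endpoints.
definition = databricks_class_definition(
    "Article",
    "https://<your-workspace>/serving-endpoints/<embedding-endpoint>/invocations",
    "https://<your-workspace>/serving-endpoints/<generative-endpoint>/invocations",
)
print(json.dumps(definition, indent=2))
```

A dictionary like this could be sent to a Weaviate instance when creating the collection. The Databricks access token is supplied separately and is not part of the definition.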

Spark Connector and Weaviate

Apache Spark (and its Python API, PySpark) is an open-source framework for large-scale and real-time data processing.

You can ingest Spark data structures from Databricks into Weaviate using the Weaviate Spark connector. Learn more about the connector in the Weaviate Spark connector repository.
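
The ingestion step described above can be sketched as follows. This is a minimal example, assuming the connector JAR is already attached to the cluster and that the connector is addressed through the `io.weaviate.spark.Weaviate` data source with the options `scheme`, `host`, `className`, `batchSize`, `id`, and `vector`. The helper names `weaviate_write_options` and `write_to_weaviate` are hypothetical, as are the host and class name in the usage comment. Confirm the option names against the connector repository before relying on them.

```python
def weaviate_write_options(host, class_name, scheme="http", batch_size=200,
                           id_column=None, vector_column=None):
    """Collect the writer options for the Weaviate Spark connector (assumed names)."""
    options = {
        "scheme": scheme,          # "http" or "https"
        "host": host,              # e.g. "localhost:8080"
        "className": class_name,   # target Weaviate class
        "batchSize": batch_size,   # rows sent per batch
    }
    if id_column is not None:
        options["id"] = id_column          # column holding object UUIDs
    if vector_column is not None:
        options["vector"] = vector_column  # column holding precomputed vectors
    return options


def write_to_weaviate(df, options):
    """Append the rows of a Spark DataFrame to Weaviate through the connector."""
    writer = df.write.format("io.weaviate.spark.Weaviate")
    for key, value in options.items():
        writer = writer.option(key, value)
    writer.mode("append").save()


# Usage on a cluster (df is an existing Spark DataFrame):
# write_to_weaviate(df, weaviate_write_options("localhost:8080", "Article"))
```

Keeping the options in a dictionary makes it easy to reuse the same settings for several DataFrames. Leave `vector_column` unset when a vectorizer module such as text2vec-databricks should generate the vectors.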

Our Resources

The resources are broken into two categories:

  1. Hands-on Learning: Build your technical understanding with end-to-end tutorials.

  2. Read and Listen: Develop your conceptual understanding of these technologies.

Hands-on Learning

Topic | Description | Resource
Weaviate Tutorial | Learn how to ingest data into Weaviate with Spark. | Tutorial
Using the Spark Connector for Weaviate | Learn how to take data from a Spark dataframe and feed it into Weaviate. | Notebook
Ingest data from Spark into Weaviate | Learn how to ingest data from a Spark dataframe into Weaviate and use the text2vec-databricks and generative-databricks modules. | Notebook

Read and Listen

Topic | Description | Resource
The Sphere Dataset in Weaviate | Learn how to import and query the Sphere dataset in Weaviate. | Blog
The Details Behind the Sphere Dataset in Weaviate | The details on how we ingested ~1 billion article snippets into Weaviate. | Blog