DeepEval

DeepEval is an open-source LLM evaluation framework, built for engineers to unit-test LLM applications and AI Agents. It provides out-of-the-box LLM-powered metrics, including RAG, conversational, red-teaming, agentic, multimodal, and custom metrics.

DeepEval and Weaviate

You can use DeepEval to optimize search, retrieval, and RAG pipelines built on Weaviate. Its custom and RAG metrics let you compare configurations and select the best hyperparameters, such as the embedding model and top-K, for your Weaviate collection.
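The selection loop described above can be sketched as a small grid search. This is a minimal illustration under stated assumptions, not DeepEval's or Weaviate's API: `sweep`, `toy_retrieve`, `toy_score`, and `CORPUS` are hypothetical names, and neither library is called. In a real setup, the retriever would query a Weaviate collection and the scorer would likely wrap a DeepEval metric such as Contextual Relevancy.

```python
from itertools import product
from statistics import mean
from typing import Callable, Sequence

# Hypothetical callables (not part of DeepEval or Weaviate):
#   retrieve(question, embedding_model, top_k) -> retrieved text chunks
#   score(question, chunks) -> float in [0, 1], higher is better
Retriever = Callable[[str, str, int], Sequence[str]]
Scorer = Callable[[str, Sequence[str]], float]


def sweep(
    questions: Sequence[str],
    embedding_models: Sequence[str],
    top_ks: Sequence[int],
    retrieve: Retriever,
    score: Scorer,
) -> list[tuple[float, str, int]]:
    """Score every (embedding_model, top_k) pair and return them best first."""
    results = []
    for model, k in product(embedding_models, top_ks):
        # Average the metric over the whole question set for this configuration.
        scores = [score(q, retrieve(q, model, k)) for q in questions]
        results.append((mean(scores), model, k))
    return sorted(results, key=lambda r: r[0], reverse=True)


# Toy stand-ins so the sketch runs without any external service.
# Each "model" simply returns its chunks in a fixed ranked order.
CORPUS = {
    "model-a": [
        "weaviate stores vectors",
        "paris is in france",
        "deepeval scores rag output",
    ],
    "model-b": [
        "paris is in france",
        "cats sleep a lot",
        "weaviate stores vectors",
    ],
}


def toy_retrieve(question: str, model: str, top_k: int) -> Sequence[str]:
    # Stand-in for a vector search limited to top_k results.
    return CORPUS[model][:top_k]


def toy_score(question: str, chunks: Sequence[str]) -> float:
    # Stand-in for an LLM-judged metric: the fraction of retrieved
    # chunks that mention the last word of the question.
    keyword = question.split()[-1]
    return sum(keyword in chunk for chunk in chunks) / len(chunks)


questions = ["what is weaviate", "tell me about deepeval"]
ranking = sweep(questions, ["model-a", "model-b"], [1, 2, 3], toy_retrieve, toy_score)
best_score, best_model, best_top_k = ranking[0]
print(f"best: {best_model} with top_k={best_top_k} (mean score {best_score:.2f})")
```

Keeping the retriever and scorer as plain callables means the same loop works for any metric from the lists below, and the question set stays fixed across configurations so the scores are comparable.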

Custom Metrics

G-Eval
DAG

RAG Metrics

Answer Relevancy
Faithfulness
Contextual Precision
Contextual Recall
Contextual Relevancy

Hands-on Learning

Topic	Description	Resource
Optimizing RAG with DeepEval	This notebook shows how to build a RAG pipeline using Weaviate and how to optimize its performance with DeepEval.	Notebook
