Weaviate Knowledge Cards

Unlock the power of vector search. Our guides will help you conquer vector embeddings and build better AI applications.

Categories

Intro to Vector Databases

Databases designed to store and search data using vector embeddings, enabling efficient similarity search for unstructured data like text and images.

Unstructured Data Objects

Unstructured data objects are data objects without a predefined structure, making them difficult to manage conventionally. Examples include text documents, images, audio and video...

Vector

Vectors, or vector embeddings in databases, are quantities with magnitude and direction...

Vector Embedding

Numerical representations of objects, such as words or images, in a vector space...
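
As a rough illustration of why a vector space is useful, similar objects end up with vectors that point in similar directions, which can be measured with cosine similarity. The sketch below uses made-up 3-dimensional vectors (`cat`, `kitten`, and `car` are hypothetical values, not output from any real embedding model).

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings; real models typically emit hundreds of dimensions.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.2, 0.9]

# Related concepts should score higher than unrelated ones.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))
```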

Search

Hybrid search is a search method that combines vector search with traditional keyword search to improve retrieval accuracy and relevance.
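
One common way to merge a keyword result list with a vector result list is reciprocal rank fusion (RRF). The sketch below is a minimal, generic version and is not meant to describe how any particular database implements fusion; the document IDs and the constant `k=60` are illustrative assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    and documents are returned in order of descending total score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector search.
keyword_results = ["doc_a", "doc_b", "doc_c"]
vector_results = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_results, vector_results])
print(fused)
```

Documents that rank well in both lists (here `doc_a` and `doc_b`) rise above documents that appear in only one of them.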

Sparse Vectors

Sparse embeddings are generated from algorithms like BM25 and SPLADE...

Dense Vectors

In contrast to sparse vectors, dense vectors contain mostly non-zero values and are generated from machine learning models like GloVe and Transformers...

BM25/BM25F

BM25 is a ranking function used by search engines to estimate the relevance of documents to a given search query. It is part of the family of probabilistic information retrieval models...
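
To make the ranking function concrete, here is a small sketch of BM25 scoring over whitespace-tokenized text. It assumes the commonly used parameters `k1=1.2` and `b=0.75` and a smoothed IDF term; production search engines add proper tokenization, stop-word handling, and per-field weighting (as in BM25F), none of which are shown here.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in a non-empty list against the query with BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Hypothetical mini-corpus.
docs = [
    "vector databases store embeddings",
    "keyword search uses an inverted index",
    "hybrid search combines keyword and vector search",
]
scores = bm25_scores("vector search", docs)
best = max(range(len(docs)), key=lambda i: scores[i])
print(docs[best])
```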

Hierarchical Navigable Small World

An indexing algorithm used in vector databases to enable fast and efficient similarity search.

Multimodal RAG

A technique that combines retrieval of relevant multimodal data, such as images, text, audio, or video, with generative large language models to generate natural language responses or content to a query.

Cross Modal Reasoning

Cross-modal reasoning refers to the ability to make connections by integrating information from different modalities or sources including text, images, audio and video...

Multimodal Embeddings Models

Multimodal Embeddings Models produce a joint embedding space for multimodal data that understands text, images, audio and more...

Multimodal Contrastive Finetuning

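Multimodal contrastive fine-tuning further trains a multimodal embeddings model with a contrastive objective, so that matching pairs (for example, an image and its caption) move closer together in the joint embedding space while mismatched pairs move apart.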

Databases

Systems for storing, organizing, and retrieving structured or unstructured data efficiently.

Graph Database

A graph database stores data in nodes and edges, representing entities and their relationships. It's optimized for...

Inverted Indexes

An inverted index is a database indexing structure that maps keywords to their locations in documents...
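
The mapping from keywords to locations can be sketched in a few lines. This toy version assumes whitespace tokenization and stores each location as a `(document ID, word position)` pair; the sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to a list of (doc_id, position) pairs."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for position, token in enumerate(text.lower().split()):
            index[token].append((doc_id, position))
    return index


# Hypothetical documents keyed by ID.
docs = {
    1: "vector search is fast",
    2: "keyword search uses an inverted index",
}
index = build_inverted_index(docs)

# Look up every place the word "search" occurs.
print(index["search"])
```

A lookup touches only the entries for the query term, so the cost does not grow with the number of documents that do not contain it.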

Sharding

Sharding is splitting a database into smaller, faster, more easily managed parts called shards...
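
One simple way to decide which shard holds a record is to hash its key. The sketch below assumes hash-based sharding with a fixed shard count, which is only one of several strategies (range-based and directory-based sharding are others); the function names and keys are illustrative.

```python
import hashlib


def shard_for(key, num_shards):
    """Return a stable shard number in [0, num_shards) for a string key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys, num_shards):
    """Group keys into num_shards lists according to shard_for."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


# Hypothetical object IDs spread across three shards.
keys = [f"object-{i}" for i in range(10)]
shards = partition(keys, num_shards=3)
for number, members in enumerate(shards):
    print(number, members)
```

A cryptographic hash is used here instead of Python's built-in `hash()` because the latter is randomized per process for strings, so it would not give the same placement across runs.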

Large Language Models

Deep learning models trained on massive datasets to understand and generate human-like text, used in applications like chatbots and content generation.

Large Language Model (LLM)

A Large Language Model (LLM) is a machine learning model that is trained on vast text data, learning language...

Finetuning

Fine-tuning is a process where a pre-trained machine learning model is further trained on a specific dataset...

Multi-modal

Multi-modal learning involves integrating various data types like images, text, audio, and sensory inputs...

Information Retrieval/Search

Techniques for finding relevant information in a large collection of data, such as documents, images, or videos.

Reranking

Re-ranking is adjusting the scores of search results after initial retrieval. It usually uses a machine learning model...

Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) is the process of contextualising prompts for large language models...

Embedding Types

Different types of embeddings used in vector databases, such as text embeddings, image embeddings, and multimodal embeddings.

Variable Dimensions

Embeddings with flexible sizes, such as Matryoshka embeddings, encode information hierarchically, allowing adaptation...
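
With a Matryoshka-style embedding, a shorter vector is obtained by keeping only the leading dimensions and re-normalizing. The sketch below shows just that mechanical step; it assumes the vector came from a model trained for this purpose, since truncating an ordinary embedding this way generally loses much more information. The example vector is made up.

```python
import math


def truncate_embedding(vector, dims):
    """Keep the first `dims` values and rescale the result to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = list(vector[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        return head  # An all-zero prefix cannot be normalized.
    return [x / norm for x in head]


# Hypothetical 8-dimensional embedding shortened to 4 dimensions.
full = [0.5, 0.5, 0.5, 0.5, 0.1, 0.1, 0.1, 0.1]
short = truncate_embedding(full, 4)
print(len(short))
```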

Sparse Embeddings

Sparse vectors are often high-dimensional with many zero values. They are generated from algorithms like BM25 and SPLADE...

Quantized Embeddings

Compressed dense vectors using lower-precision data types (e.g., float32 to int8). Reduces memory usage and speeds up search...
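
As a minimal sketch of the float-to-int8 idea, the snippet below applies symmetric scalar quantization with one scale factor per vector. This is only one possible scheme, chosen for brevity; real systems may calibrate ranges per dimension or use other methods such as product or binary quantization. The input vector is hypothetical.

```python
def quantize_int8(vector):
    """Map a non-empty list of floats to integer codes in [-127, 127] plus a scale."""
    max_abs = max(abs(x) for x in vector)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(x / scale))) for x in vector]
    return codes, scale


def dequantize(codes, scale):
    """Approximately reconstruct the original floats from the codes."""
    return [code * scale for code in codes]


# Hypothetical embedding values.
vector = [0.12, -0.5, 0.33, 0.0]
codes, scale = quantize_int8(vector)
restored = dequantize(codes, scale)

# Each code fits in one byte instead of four, at the cost of a small error.
print(codes)
print(max(abs(a - b) for a, b in zip(vector, restored)))
```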

Chunking Techniques

Techniques for breaking down large data into smaller, more manageable chunks for processing and storage.

Semantic Chunking

In this technique, the text is divided into meaningful units, such as sentences or paragraphs, which are then vectorized...

Recursive Chunking

Text is initially split using a primary separator, like paragraphs. If the resulting chunks are too large, secondary separators...
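
The split-then-split-again idea can be sketched as a short recursive function. The separator order and the `max_chars` limit below are assumptions for illustration; this simplified version also drops the separators and does not merge small neighbouring pieces back together, which fuller implementations usually do.

```python
def recursive_chunk(text, max_chars, separators=("\n\n", "\n", ". ", " ")):
    """Split text into chunks of at most max_chars characters (max_chars > 0).

    Separators are tried from coarsest to finest; a piece that is still
    too large is split again with the remaining, finer separators.
    """
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    for i, separator in enumerate(separators):
        if separator in text:
            chunks = []
            for piece in text.split(separator):
                chunks.extend(recursive_chunk(piece, max_chars, separators[i + 1:]))
            return chunks
    # No separator applies: fall back to fixed-size slices.
    return [text[j:j + max_chars] for j in range(0, len(text), max_chars)]


# Hypothetical two-paragraph input.
text = (
    "First paragraph is short.\n\n"
    "Second paragraph is a bit longer. It has two sentences."
)
chunks = recursive_chunk(text, max_chars=40)
for chunk in chunks:
    print(repr(chunk))
```

The first paragraph fits within the limit and is kept whole, while the second is too long and gets split again at the sentence boundary.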

LLM-Based Chunking

This advanced technique uses a Large Language Model (LLM) to generate chunks. The LLM processes the text and generates semantically isolated sentences...