One of the major developments this past year was the advancement of machine learning models that can create beautiful and novel images such as the ones below. Though machine learning models capable of creating images have existed for a while, this past year saw a marked improvement in the quality and photorealism of the images they create.
You probably know that Weaviate converts a text corpus into a set of vectors - each object is given a vector that captures its 'meaning'. But you might not know exactly how it does that, or how to adjust that behavior. Here, we will pull back the curtains to examine those questions, by revealing some of the mechanics behind
Hybrid search is a technique that combines multiple search algorithms to improve the accuracy and relevance of search results. It pairs the best features of keyword-based search algorithms with those of vector search techniques. By leveraging the strengths of both approaches, it provides a more effective search experience for users.
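To make the idea of combining two result sets concrete, here is a minimal sketch of one possible fusion scheme: rescale each set of scores to a common range, then take a weighted sum. This is an illustration under stated assumptions, not a description of how Weaviate fuses results; the function names, the `alpha` weight, and the sample scores are all hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a full match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(keyword_scores, vector_scores, alpha=0.5):
    """Blend keyword and vector scores (hypothetical weighted-sum fusion).

    alpha=0 ranks by keyword score only, alpha=1 by vector score only.
    A document missing from one result set contributes 0 for that side.
    """
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    combined = {
        doc_id: (1 - alpha) * kw.get(doc_id, 0.0) + alpha * vec.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Made-up scores, where higher means a better match on both sides.
keyword_scores = {"a": 2.0, "b": 1.0, "c": 0.0}
vector_scores = {"a": 0.1, "b": 0.9, "d": 0.5}
ranking = hybrid_rank(keyword_scores, vector_scores, alpha=0.5)
```

In this toy data, document "b" scores reasonably well on both sides, so it ends up ahead of "a", which only the keyword search favors. Changing `alpha` shifts the balance between the two methods.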
Earlier in December, we wrote a blog post about importing the Sphere dataset into Weaviate. In that post, we talked about what the Sphere dataset is, announced the release of the Sphere dataset files for Weaviate, and shared how you can use Weaviate to search through large datasets like Sphere. More specifically, we provided a short guide on how to import Sphere into Weaviate using Python as well as Spark, before finishing with example queries on the entire Sphere dataset. If you haven’t checked that out, we recommend you have a quick look!
We are happy to announce the release of Weaviate 1.17, which brings a set of great features, performance improvements, and fixes.
If you like your content brief and to the point, here is the TL;DR of this release:
- Replication - configure multi-node replication to improve your database resilience and performance
- Hybrid Search - combine dense and sparse vectors to deliver the best of both search methods
- BM25 - run keyword searches over your data with BM25
- Faster Startup and Imports - enjoy a Weaviate that starts up faster than ever, with speedier data imports
- Other Improvements and Bug Fixes - enjoy a more stable Weaviate experience, courtesy of fixes and improvements delivered in nine installments since 1.16.0.
Natural Language Processing (NLP) has enabled computers to understand human language. It has shifted the way humans build and interact with computers. Large Language Models (LLMs) underpin the latest developments in NLP and have gained traction in various applications. Cohere is an AI platform that provides its users with access to its LLMs. Cohere gives developers and businesses the ability to implement NLP as part of their toolkit.
What is Sphere?
Sphere is an open-source dataset recently released by Meta. It is a collection of 134 million documents (broken up into 906 million 100-word snippets). It is one of the largest knowledge bases that can help solve knowledge-intensive natural language tasks such as question-answering, fact-checking, and much more.
In the world of vector search, we use vector embeddings, generated by machine learning models, to represent data objects (text, images, audio, etc.). The key idea is that the embeddings of semantically similar objects lie a smaller distance apart than the embeddings of unrelated objects.
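As a rough illustration of "smaller distance means more similar", the sketch below computes cosine distance, one common choice of distance metric, between a few hand-made three-dimensional vectors. The vectors and their labels are invented for the example; real embeddings come from a model and usually have hundreds of dimensions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Toy "embeddings", made up by hand and not produced by any model.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.2, 0.9]

related = cosine_distance(cat, kitten)   # vectors point in similar directions
unrelated = cosine_distance(cat, car)    # vectors point in different directions
```

Here `related` comes out much smaller than `unrelated`, which is the property a vector search relies on when it returns the nearest neighbors of a query vector.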
Weaviate 1.16 introduced the Ref2Vec module. In this article, we give you an overview of what Ref2Vec is and some examples of where it can add value, such as recommendations or representing long objects.