Maintaining data integrity is one of the key goals for database users. So it should come as no surprise that backing up the data is an important part of database best practices.
Weaviate 1.16 release
We are happy to announce the release of Weaviate 1.16, which brings a set of great features, performance and UX improvements, and fixes.
The brief
If you like your content brief and to the point, here is the TL;DR of this release:
- New Filter Operators – that allow you to filter data based on null values or array lengths
- Distributed Backups – an upgrade to the backup functionality, which allows you to back up data distributed across clusters
- Ref2Vec Centroid Module – a new module that calculates a mean vector of referenced objects
- Node Status API – to quickly check on the health of your running clusters
- Support for Azure-issued OIDC tokens – now you can authenticate with Azure, Keycloak, or Dex OIDC tokens
- Patch releases – ready sooner – starting with Weaviate 1.15, we publish new patch releases as soon as new important fixes are available, so that you get access to all updates as soon as possible
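To make the "mean vector of referenced objects" idea from the list above concrete, here is a minimal sketch in plain Python. It is not Weaviate's implementation; the function name `centroid` and the sample vectors are made up for illustration.

```python
def centroid(vectors):
    """Return the element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    # Average each dimension across all referenced vectors.
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical vectors of two referenced objects.
referenced = [[1.0, 2.0], [3.0, 4.0]]
mean_vector = centroid(referenced)
```

With these two sample vectors, `mean_vector` is `[2.0, 3.0]`.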
How we solved a race condition with the Lock Striping pattern
Lock striping in database design
Database design comes with interesting challenges, such as dealing with race conditions when importing data in parallel streams. But for every new challenge, there is a clever solution. One of those clever solutions is lock striping. It refers to an arrangement where locking occurs on multiple buckets or 'stripes'.
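As a rough illustration of the pattern, here is a minimal Python sketch of lock striping. It is not the code from Weaviate (which is not shown here); the class `StripedLock`, the helper `record`, and the key names are hypothetical. Each key hashes to one lock from a fixed pool, so updates to the same key are serialized while updates to keys on other stripes can proceed in parallel.

```python
import threading


class StripedLock:
    """A fixed pool of locks; each key maps to one lock (a 'stripe') by hash."""

    def __init__(self, num_stripes=16):
        self._locks = [threading.Lock() for _ in range(num_stripes)]

    def for_key(self, key):
        # The same key always maps to the same stripe within a process.
        return self._locks[hash(key) % len(self._locks)]


counts = {}
striped = StripedLock(num_stripes=8)


def record(key):
    # Only the stripe for this key is held, not one global lock.
    with striped.for_key(key):
        counts[key] = counts.get(key, 0) + 1


def worker():
    for i in range(1000):
        record(f"object-{i % 10}")


# Four parallel "import streams" writing to overlapping keys.
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The number of stripes is a trade-off: more stripes mean less contention between unrelated keys, at the cost of holding more lock objects in memory.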
How to build an Image Search Application with Weaviate
Recently, I was working with my colleague Marcin (an engineer from Weaviate core) on a really cool demo project. The idea was to build an image-search application for dogs, which allows a user to provide a picture of a dog, and the app would respond with the most similar breed. And if a user provides a picture of their partner (I might've tested this on my boyfriend 😄), it returns the breed most similar to them.
Vamana vs. HNSW - Exploring ANN algorithms Part 1
Vector Search engines must be able to search through a vast number of vectors at speed. This is a huge technical challenge that is only becoming more difficult over time as the vector dimensions and dataset sizes increase.
How to choose a Sentence Transformer from Hugging Face
Weaviate has recently unveiled a new module which allows users to easily integrate models from Hugging Face to vectorize their data and incoming queries. At the time of this writing, there are over 700 models that can be easily plugged into Weaviate.
Support for Hugging Face Inference API in Weaviate
Vector Search engines use Machine Learning models to offer incredible functionality to operate on your data. We are looking at anything from summarizers (that can summarize any text into a short sentence), through auto-labelers (that can classify your data tokens), to transformers and vectorizers (that can convert any data – text, image, audio, etc. – into vectors and use that for context-based queries) and many more use cases.
What are Distance Metrics in Vector Search?
Vector Search engines - like Weaviate - use Machine Learning models to analyze data and calculate vector embeddings. The vector embeddings are stored together with the data in a database, and later are used to query the data.
Weaviate 1.15.1 patch release
We usually wouldn't write a whole blog post about a patch release. But when I chatted with Sebastian (the regular author of our "big" release blog posts series) about the contents of the Weaviate v1.15.1 patch, we quickly realized that this release is too important to end up as a side note somewhere.
Why is Vector Search so fast?
Why is this so incredibly fast?
Whenever I talk about vector search, I like to demonstrate it with an example of a semantic search. To add the wow factor, I like to run my queries on a Wikipedia dataset, which is populated with over 28 million paragraphs sourced from Wikipedia.