Vectorizers and Rerankers

Overview

This section contains reference guides for retriever and vectorizer modules. As their names suggest, XXX2vec modules produce a vector for each object:

  • text2vec converts text data
  • img2vec converts image data
  • multi2vec converts image or text data (into the same embedding space)
  • ref2vec converts cross-reference data (from within Weaviate)

Vectorization with text2vec-* modules

Weaviate generates vector embeddings at the object level (rather than for individual properties). For instance, text2vec-* modules generate vectors from the text of an object. To produce the string to be vectorized from each object, Weaviate follows the schema configuration for the relevant class.
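
As a rough illustration of where these settings live, the sketch below writes a class definition as a Python dictionary. The class name, property names, and the choice of text2vec-contextionary are hypothetical, and the layout (a per-class and per-property moduleConfig keyed by module name, with a skip flag on properties) is an assumption to check against the reference guide for the module in use.

```python
# Hypothetical class definition, shown as a plain Python dict.
# Assumption: settings are nested under "moduleConfig", keyed by module name.
MODULE = "text2vec-contextionary"  # illustrative choice of text2vec-* module

article_class = {
    "class": "Article",
    "vectorizer": MODULE,
    # Class-level setting: do not prepend the class name.
    "moduleConfig": {MODULE: {"vectorizeClassName": False}},
    "properties": [
        {
            "name": "title",
            "dataType": ["text"],
            # Property-level setting: prepend "title" to the value.
            "moduleConfig": {MODULE: {"vectorizePropertyName": True}},
        },
        {
            "name": "internalNote",
            "dataType": ["text"],
            # Property-level setting: leave this property out of the vector.
            "moduleConfig": {MODULE: {"skip": True}},
        },
    ],
}
```

The dictionary only describes configuration; sending it to a Weaviate instance is outside the scope of this sketch.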

Unless specified otherwise in the schema, the default behavior is to:

  • Only vectorize properties that use the text data type (unless skipped)
  • Sort properties in alphabetical (a-z) order before concatenating values
  • If vectorizePropertyName is true (false by default), prepend the property name to each property value
  • Join the (prepended) property values with spaces
  • Prepend the class name (unless vectorizeClassName is false)
  • Convert the produced string to lowercase
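
The default steps listed above can be sketched in a few lines of Python. This is a simplified illustration, not Weaviate's actual implementation: the function name, its parameters, and the use of isinstance(value, str) as a stand-in for "the text data type" are all assumptions made for the example.

```python
def build_vectorization_text(
    class_name,
    properties,
    *,
    vectorize_class_name=True,
    vectorize_property_name=False,
    skipped=frozenset(),
):
    """Approximate the string a text2vec-* module would receive (hypothetical helper)."""
    parts = []
    # Sort property names alphabetically before concatenating values.
    for name in sorted(properties):
        value = properties[name]
        # Keep only text values that are not marked as skipped.
        if name in skipped or not isinstance(value, str):
            continue
        # Optionally prepend the property name to its value.
        parts.append(f"{name} {value}" if vectorize_property_name else value)
    # Prepend the class name unless disabled.
    if vectorize_class_name:
        parts.insert(0, class_name)
    # Join with spaces and lowercase the result.
    return " ".join(parts).lower()


example = {"title": "Hello World", "content": "Vectors are fun", "wordCount": 3}
print(build_vectorization_text("Article", example))
# article vectors are fun hello world
```

With vectorize_property_name=True, the same object would yield "article content vectors are fun title hello world" in this sketch.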

Vector inference at object update

When Weaviate is configured with a vectorizer, it obtains a new vector only if an object update changes the underlying text to be vectorized.
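
A minimal sketch of this rule, assuming the check amounts to comparing the text that would be vectorized before and after the update. Both helper names are hypothetical, the text builder is deliberately simplified (text values only, sorted by property name, lowercased), and how Weaviate performs the comparison internally is not described here.

```python
def text_to_vectorize(properties):
    """Simplified stand-in: sorted text values joined with spaces, lowercased."""
    values = [v for _, v in sorted(properties.items()) if isinstance(v, str)]
    return " ".join(values).lower()


def needs_new_vector(old_properties, new_properties):
    """Return True only if the update changes the text to be vectorized."""
    return text_to_vectorize(old_properties) != text_to_vectorize(new_properties)


before = {"title": "Hello World", "wordCount": 2}

# Changing a non-text property leaves the text untouched: no new vector.
print(needs_new_vector(before, {"title": "Hello World", "wordCount": 3}))  # False

# Changing a text property alters the text: a new vector is needed.
print(needs_new_vector(before, {"title": "Goodbye World", "wordCount": 2}))  # True
```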

Re-ranking

Weaviate includes the following modules for re-ranking the data objects in a result set: