Scalar Quantization (SQ)
Scalar quantization (SQ) is a vector compression technique that converts each 32-bit float dimension of a vector into an 8-bit integer, reducing each vector to roughly a quarter of its original size.
To use SQ, enable it in the collection definition, then add data to the collection.
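To see what SQ does conceptually, the following is a minimal, illustrative NumPy sketch of min-max scalar quantization. It is not Weaviate's internal implementation; here the bucket boundaries come from the global minimum and maximum of the sample, whereas Weaviate derives its boundaries from a training set (see the trainingLimit parameter below).

# Illustrative sketch only -- not Weaviate's internal implementation.
import numpy as np

def sq_compress(vectors: np.ndarray) -> tuple[np.ndarray, float, float]:
    """Quantize float32 vectors to uint8 using min-max bucket boundaries."""
    lo, hi = float(vectors.min()), float(vectors.max())  # bucket boundaries from a sample
    scaled = (vectors - lo) / (hi - lo)                   # normalize to [0, 1]
    codes = np.round(scaled * 255).astype(np.uint8)       # 256 buckets -> 1 byte per dimension
    return codes, lo, hi

def sq_decompress(codes: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Approximate reconstruction of the original float32 values."""
    return (codes.astype(np.float32) / 255) * (hi - lo) + lo

vectors = np.random.rand(1000, 768).astype(np.float32)
codes, lo, hi = sq_compress(vectors)
approx = sq_decompress(codes, lo, hi)
print(vectors.nbytes, "->", codes.nbytes)               # roughly 4x smaller
print("max reconstruction error:", np.abs(vectors - approx).max())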
Basic configuration
SQ must be enabled at collection creation time. You cannot enable SQ after you add data to a collection.
To enable SQ, set it in the collection's vector_index_config settings, as shown in the following examples.
- Python Client v4
- Python Client v3
import weaviate.classes.config as wc

client.collections.create(
    name="MyCollection",
    vectorizer_config=wc.Configure.Vectorizer.text2vec_openai(),
    vector_index_config=wc.Configure.VectorIndex.hnsw(
        quantizer=wc.Configure.VectorIndex.Quantizer.sq()
    ),
)
class_definition = {
    "class": "MyCollection",
    "vectorizer": "text2vec-openai",  # Can be any vectorizer
    "vectorIndexType": "hnsw",
    "vectorIndexConfig": {
        "sq": {
            "enabled": True,
        },
    },
    # Remainder not shown
}

client.schema.create_class(class_definition)
Custom configuration
To tune SQ, set these vectorIndexConfig parameters.
Parameter | Type | Default | Details |
---|---|---|---|
`sq`: `enabled` | boolean | `false` | Enables SQ when set to `true`. The Python client v4 does not use the `enabled` parameter. To enable SQ with the v4 client, set a `quantizer` in the collection definition. |
`sq`: `rescoreLimit` | integer | `-1` | The minimum number of candidates to fetch before rescoring. |
`sq`: `trainingLimit` | integer | `100000` | The size of the training set used to determine the scalar bucket boundaries. |
`sq`: `cache` | boolean | `false` | Uses the vector cache when set to `true`. |
`vectorCacheMaxObjects` | integer | `1e12` | Maximum number of objects in the memory cache. By default, this limit is set to one trillion (`1e12`) objects when a new collection is created. For sizing recommendations, see Vector cache considerations. |
- Python Client v4
- Python Client v3
import weaviate.classes.config as wc

client.collections.create(
    name="MyCollection",
    vectorizer_config=wc.Configure.Vectorizer.text2vec_openai(),
    vector_index_config=wc.Configure.VectorIndex.hnsw(
        distance_metric=wc.VectorDistances.COSINE,
        vector_cache_max_objects=100000,
        quantizer=wc.Configure.VectorIndex.Quantizer.sq(
            rescore_limit=200,     # The minimum number of candidates to fetch before rescoring
            training_limit=50000,  # Default: 100000
            cache=True,            # Default: False
        ),
    ),
)
class_definition = {
    "class": "MyCollection",
    "vectorizer": "text2vec-openai",  # Can be any vectorizer
    "vectorIndexType": "hnsw",
    "vectorIndexConfig": {
        "sq": {
            "enabled": True,
            "rescoreLimit": 200,     # The minimum number of candidates to fetch before rescoring
            "trainingLimit": 50000,  # Default: 100000
            "cache": True,           # Default: False
        },
        "vectorCacheMaxObjects": 100000,  # Cache size (used if `cache` is enabled)
    },
    # Remainder not shown
}

client.schema.create_class(class_definition)
Multiple vector embeddings (named vectors)
v1.24
Collections can have multiple named vectors. Each named vector has its own index configuration, so compression must be enabled independently for each one: a named vector can use PQ, BQ, SQ, or no compression, as shown in the sketch below.
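For example, a collection might use SQ on one named vector and BQ on another. The following is a hedged sketch using the Python client v4 NamedVectors helpers; the vector names ("title" and "body") and the vectorizer choice are placeholders, not part of any required configuration.

import weaviate.classes.config as wc

client.collections.create(
    name="MyCollection",
    vectorizer_config=[
        # SQ on the "title" named vector
        wc.Configure.NamedVectors.text2vec_openai(
            name="title",
            vector_index_config=wc.Configure.VectorIndex.hnsw(
                quantizer=wc.Configure.VectorIndex.Quantizer.sq(),
            ),
        ),
        # BQ on the "body" named vector -- each named vector is configured independently
        wc.Configure.NamedVectors.text2vec_openai(
            name="body",
            vector_index_config=wc.Configure.VectorIndex.hnsw(
                quantizer=wc.Configure.VectorIndex.Quantizer.bq(),
            ),
        ),
    ],
)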
Multi-vector embeddings (ColBERT, ColPali, etc.)
v1.30
Multi-vector embeddings (implemented through models like ColBERT, ColPali, or ColQwen) represent each object or query using multiple vectors instead of a single vector. Just like with single vectors, multi-vectors support PQ, BQ, SQ, or no compression.
During the initial search phase, compressed vectors are used for efficiency. However, when computing the MaxSim operation, uncompressed vectors are used to ensure more precise similarity calculations. This approach balances the benefits of compression for search efficiency with the accuracy of uncompressed vectors during final scoring.
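As a rough illustration of that two-phase flow (not Weaviate's internal code), the NumPy sketch below scores multi-vector candidates with MaxSim: in a real system, a cheap candidate search would run on the compressed vectors, while the final MaxSim scores use the full-precision vectors.

# Illustrative sketch of MaxSim scoring -- not Weaviate's internal implementation.
import numpy as np

def max_sim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """MaxSim: for each query token vector, take its best match among the
    document's token vectors, then sum those maxima."""
    sims = query_vecs @ doc_vecs.T           # (n_query_tokens, n_doc_tokens) dot products
    return float(sims.max(axis=1).sum())     # best doc token per query token, summed

# Phase 1 (conceptual): a coarse candidate search would use the quantized vectors.
# Phase 2: the final MaxSim scores are computed on the uncompressed float32 vectors.
query = np.random.rand(32, 128).astype(np.float32)                           # 32 query token vectors
candidates = [np.random.rand(80, 128).astype(np.float32) for _ in range(5)]  # 5 candidate documents
scores = sorted(((max_sim(query, d), i) for i, d in enumerate(candidates)), reverse=True)
print(scores[0])  # best (score, candidate index)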