Binary Quantization (BQ)

Added in v1.23

BQ is available for the flat index type from v1.23 onwards and for the hnsw index type from v1.24.

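Binary quantization (BQ) is a vector compression technique that represents each vector dimension with a single bit. This reduces the stored size of a vector, by a factor of 32 when the original dimensions are 32-bit floats.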
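As a rough illustration of the idea (not Weaviate's internal implementation), the sketch below assumes the common sign-based scheme: each dimension becomes one bit depending on whether its value is positive, and compressed vectors are compared by Hamming distance. The function names `binary_quantize` and `hamming_distance` are made up for this example.

```python
def binary_quantize(vector):
    """Pack one bit per dimension into an int: 1 if the value is positive, else 0."""
    code = 0
    for value in vector:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(code_a, code_b):
    """Count the bit positions where two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


a = binary_quantize([0.12, -0.48, 0.33, -0.05])  # bits 1010
b = binary_quantize([0.40, 0.22, 0.10, -0.91])   # bits 1110
print(hamming_distance(a, b))  # 1

# A 1536-dimensional float32 vector takes 1536 * 4 = 6144 bytes.
# Its binary code takes 1536 / 8 = 192 bytes, a 32x reduction.
```

Only the sign of each dimension survives, which is why compressed search results are typically rescored against the original vectors.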

To use BQ, enable it as shown below and add data to the collection.

Simple BQ configuration

Each collection can be configured to use BQ compression. BQ can be enabled at collection creation time, before data is added to it.

This can be done by setting the vector_index_config of the collection to enable BQ compression.

import weaviate.classes.config as wc

client.collections.create(
    name="MyCollection",
    vectorizer_config=wc.Configure.Vectorizer.text2vec_openai(),
    vector_index_config=wc.Configure.VectorIndex.flat(
        quantizer=wc.Configure.VectorIndex.Quantizer.bq()
    ),
)

# Python client v3 (deprecated)
class_definition = {
    "class": "MyCollection",
    "vectorizer": "text2vec-openai",  # Can be any vectorizer
    "vectorIndexType": "flat",
    "vectorIndexConfig": {
        "bq": {
            "enabled": True,
        },
    },
    #  Remainder not shown
}

client.schema.create_class(class_definition)

import { configure } from 'weaviate-client';

const collection = await client.collections.create({
  name: 'MyCollection',
  vectorizers: configure.vectorizer.none({
    vectorIndexConfig: configure.vectorIndex.hnsw({
      quantizer: configure.vectorIndex.quantizer.bq(),
    }),
  }),
});

async function enableBQ() {
  const classObj = {
    class: 'MyCollection',
    vectorizer: 'text2vec-openai', // Can be any vectorizer
    vectorIndexType: 'flat',
    vectorIndexConfig: {
      bq: {
        enabled: true,
      },
    },
    //  Remainder not shown
  };
  const res = await client.schema.classCreator().withClass(classObj).do();
  console.log(res);
}

await enableBQ();

simpleBQ := map[string]interface{}{
	"enabled": true,
}
class := &models.Class{
	Class:      "MyCollection",
	Vectorizer: "text2vec-openai",
	VectorIndexConfig: map[string]interface{}{
		"bq": simpleBQ,
	},
	// Remainder not shown
}

err = client.Schema().ClassCreator().
	WithClass(class).Do(context.Background())

if err != nil {
	log.Fatalf("create class: %v", err)
}

Added in v1.31

From Weaviate v1.31, BQ can also be enabled for an existing collection. To do so, update the collection configuration with a vector index configuration that enables BQ.

BQ with custom settings

The following parameters are available for BQ compression, under vectorIndexConfig:

Parameter	Type	Default	Details
`bq` : `enabled`	boolean	`false`	Enable BQ. Weaviate uses binary quantization (BQ) compression when `true`. The Python client v4 does not use the `enabled` parameter. To enable BQ with the v4 client, set a `quantizer` in the collection definition.
`bq` : `rescoreLimit`	integer	`-1`	The minimum number of candidates to fetch before rescoring.
`bq` : `cache`	boolean	`false`	Whether to use the vector cache.
`vectorCacheMaxObjects`	integer	`1e12`	Maximum number of objects in the memory cache. By default, this limit is set to one trillion (`1e12`) objects when a new collection is created. For sizing recommendations, see Vector cache considerations.
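To make the role of `rescoreLimit` more concrete, here is a minimal sketch of a two-stage search over a brute-force (flat) scan. It is an illustration under assumptions, not Weaviate's code: it assumes the number of candidates fetched is the larger of the query limit and the rescore limit, that candidates are ranked by Hamming distance on sign bits, and that rescoring uses the dot product on the original vectors. The name `search_with_rescoring` and its helpers are hypothetical.

```python
def to_bits(vector):
    """Sign-based binary code: 1 for positive values, 0 otherwise."""
    return [1 if v > 0 else 0 for v in vector]


def hamming(bits_a, bits_b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(bits_a, bits_b))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def search_with_rescoring(query, vectors, limit, rescore_limit):
    """Rank by binary codes first, then rescore candidates with the original vectors."""
    # Assumption: fetch at least `rescore_limit` candidates, even if `limit` is smaller.
    num_candidates = max(limit, rescore_limit)
    query_bits = to_bits(query)
    by_hamming = sorted(
        range(len(vectors)),
        key=lambda i: hamming(query_bits, to_bits(vectors[i])),
    )
    candidates = by_hamming[:num_candidates]
    # Exact rescoring: a higher dot product means a closer match.
    rescored = sorted(candidates, key=lambda i: dot(query, vectors[i]), reverse=True)
    return rescored[:limit]


vectors = [[0.05, 0.05], [0.1, 0.9], [0.9, 0.1], [-0.5, 0.5]]
query = [1.0, 0.2]

print(search_with_rescoring(query, vectors, limit=1, rescore_limit=-1))  # [0]
print(search_with_rescoring(query, vectors, limit=1, rescore_limit=3))   # [2]
```

In this toy data the first three vectors share the same binary code as the query, so the compressed ranking cannot tell them apart. Fetching more candidates before rescoring lets the exact comparison pick the best match, at the cost of more work per query.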

For example, the following snippets create a collection with custom BQ settings:

import weaviate.classes.config as wc

client.collections.create(
    name="MyCollection",
    vectorizer_config=wc.Configure.Vectorizer.text2vec_openai(),
    vector_index_config=wc.Configure.VectorIndex.flat(
        distance_metric=wc.VectorDistances.COSINE,
        vector_cache_max_objects=100000,
        quantizer=wc.Configure.VectorIndex.Quantizer.bq(
            rescore_limit=200,
            cache=True
        )
    ),
)

# Python client v3 (deprecated)
class_definition = {
    "class": "MyCollection",
    "vectorizer": "text2vec-openai",  # Can be any vectorizer
    "vectorIndexType": "flat",
    "vectorIndexConfig": {
        "bq": {
            "enabled": True,
            "rescoreLimit": 200,  # The minimum number of candidates to fetch before rescoring
            "cache": True,  # Default: False
        },
        "vectorCacheMaxObjects": 100000,  # Cache size (used if `cache` enabled)
    },
    # Remainder not shown
}

client.schema.create_class(class_definition)

import { configure } from 'weaviate-client';

const collection = await client.collections.create({
  name: 'MyCollection',
  vectorizers: configure.vectorizer.none({
    vectorIndexConfig: configure.vectorIndex.hnsw({
      quantizer: configure.vectorIndex.quantizer.bq({
        cache: true,       // Enable caching
        rescoreLimit: 200, // The minimum number of candidates to fetch before rescoring
      }),
      vectorCacheMaxObjects: 100000, // Cache size (used if `cache` enabled)
    }),
  }),
});

async function bqWithOptions() {
  const classObj = {
    class: 'MyCollection',
    vectorizer: 'text2vec-openai', // Can be any vectorizer
    vectorIndexType: 'flat',
    vectorIndexConfig: {
      bq: {
        enabled: true,
        rescoreLimit: 200, // The minimum number of candidates to fetch before rescoring
        cache: true, // Default: false
      },
      vectorCacheMaxObjects: 100000, // Cache size (used if `cache` enabled)
    },
    //  Remainder not shown
  };
  const res = await client.schema.classCreator().withClass(classObj).do();
  console.log(res);
}

await bqWithOptions();

customBQ := map[string]interface{}{
	"enabled":      true,
	"rescoreLimit": 200,
	"cache":        true,
}
class := &models.Class{
	Class:      "MyCollection",
	Vectorizer: "text2vec-openai",
	VectorIndexConfig: map[string]interface{}{
		"bq":                    customBQ,
		"vectorCacheMaxObjects": 100_000,
	},
	// Remainder not shown
}

err = client.Schema().ClassCreator().
	WithClass(class).Do(context.Background())

if err != nil {
	log.Fatalf("create class: %v", err)
}

Multiple vector embeddings (named vectors)

Added in v1.24

Collections can have multiple named vectors. The vectors in a collection can have their own configurations, and compression must be enabled independently for each vector. Every vector is independent and can use PQ, BQ, SQ, or no compression.
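The sketch below shows what per-vector compression could look like in a REST-style class definition, in the same dictionary form as the v3 examples on this page. It assumes named vectors are declared under a `vectorConfig` key, each with its own `vectorIndexConfig`. The helper `named_vector` and the vector names `title` and `body` are hypothetical, so check the exact keys against the API reference for your Weaviate version.

```python
def named_vector(source_property, compression=None):
    """Build one named-vector entry; `compression` is "pq", "bq", "sq", or None."""
    index_config = {}
    if compression is not None:
        # Each named vector carries its own compression settings.
        index_config[compression] = {"enabled": True}
    return {
        "vectorizer": {"text2vec-openai": {"properties": [source_property]}},
        "vectorIndexType": "hnsw",
        "vectorIndexConfig": index_config,
    }


class_definition = {
    "class": "MyCollection",
    "vectorConfig": {
        "title": named_vector("title", compression="bq"),  # BQ-compressed
        "body": named_vector("body"),                      # no compression
    },
    # Remainder not shown
}
```

Here only the `title` vector is compressed. Enabling BQ on it has no effect on `body`.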

Multi-vector embeddings (ColBERT, ColPali, etc.)

Added in v1.30

Multi-vector embeddings (implemented through models like ColBERT, ColPali, or ColQwen) represent each object or query using multiple vectors instead of a single vector. Just like with single vectors, multi-vectors support PQ, BQ, SQ, or no compression.

During the initial search phase, compressed vectors are used for efficiency. The MaxSim operation, however, is computed on the uncompressed vectors so that the final similarity scores are more precise. This combines the search efficiency of compression with the accuracy of uncompressed vectors in final scoring.
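For reference, MaxSim is commonly defined as follows: for each query vector, take its best similarity against any of the object's vectors, then sum those maxima. The sketch below is a minimal plain-Python version that assumes the dot product as the similarity measure. The function name `max_sim` and the toy vectors are for illustration only.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def max_sim(query_vectors, doc_vectors):
    """Sum, over query vectors, of the best dot product against any document vector."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.75, 0.25], [0.25, 0.5]]
doc_b = [[0.5, 0.5], [0.25, 0.25]]

print(max_sim(query, doc_a))  # 0.75 + 0.5 = 1.25
print(max_sim(query, doc_b))  # 0.5 + 0.5 = 1.0
```

Because each maximum depends on exact similarity values, small errors introduced by compression can change which document vector wins. That is one reason to compute this final step on uncompressed vectors.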

Questions and feedback

If you have any questions or feedback, let us know in the user forum.
