Choose a model
On this page, you can find a list of pre-trained models designed specifically for enterprise retrieval tasks in English and other languages. Additional models and features will be added in the future, so please check back regularly for updates.
How to choose the right model?
Here are some simple recommendations on when you should use a specific model:
- Snowflake/snowflake-arctic-embed-m-v1.5: Best for datasets that are primarily in English with text lengths typically under 512 tokens.
- Snowflake/snowflake-arctic-embed-l-v2.0: Ideal for datasets that include multiple languages or require longer context (up to 8192 tokens). This model is optimized for robust performance on both English and multilingual retrieval tasks.
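The recommendations above can be written down as a small selection helper. This is only a sketch: the function name `recommend_model` and its two inputs are hypothetical and not part of any Weaviate client, and the 512-token cutoff is taken directly from the guidance on this page.

```python
ENGLISH_MODEL = "Snowflake/snowflake-arctic-embed-m-v1.5"
MULTILINGUAL_MODEL = "Snowflake/snowflake-arctic-embed-l-v2.0"


def recommend_model(multilingual: bool, typical_tokens: int) -> str:
    """Return a model name following the recommendations on this page.

    Hypothetical helper: `multilingual` says whether the dataset contains
    non-English text, `typical_tokens` is the usual length of one text.
    """
    # Multilingual data or longer texts call for the larger v2.0 model.
    if multilingual or typical_tokens >= 512:
        return MULTILINGUAL_MODEL
    # Short, primarily English texts fit the smaller v1.5 model.
    return ENGLISH_MODEL


print(recommend_model(multilingual=False, typical_tokens=200))
print(recommend_model(multilingual=True, typical_tokens=200))
```

The returned string can then be passed as the `model` vectorizer parameter.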
Below, you can find a complete list of all available models.
Available models
Snowflake/snowflake-arctic-embed-l-v2.0 (default)
- A 568M parameter, 1024-dimensional model for multilingual enterprise retrieval tasks.
- Trained with Matryoshka Representation Learning to allow vector truncation with minimal loss.
- Quantization-friendly: Using scalar quantization and 256 dimensions provides 99% of unquantized, full-precision performance.
- Read more at the Snowflake blog and the Hugging Face model card.
- Allowable dimensions: 1024 (default), 256
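To illustrate what vector truncation means for a Matryoshka-trained model, the sketch below keeps the leading components of a vector and rescales the result to unit length. It is conceptual only: `truncate_embedding` is a hypothetical helper, and the renormalization step is an assumption rather than a documented detail of the service. In practice, the shorter vectors are requested by setting the `dimensions` parameter.

```python
import math


def truncate_embedding(vector: list[float], dimensions: int) -> list[float]:
    """Keep the first `dimensions` components and rescale to unit length."""
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be rescaled
    return [x / norm for x in head]


# A made-up 1024-dimensional vector standing in for a real embedding.
full = [float(i % 7 + 1) for i in range(1024)]
short = truncate_embedding(full, 256)
print(len(short))
```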
Snowflake/snowflake-arctic-embed-m-v1.5
- A 109M parameter, 768-dimensional model for enterprise retrieval tasks in English.
- Trained with Matryoshka Representation Learning to allow vector truncation with minimal loss.
- Quantization-friendly: Using scalar quantization and 256 dimensions provides 99% of unquantized, full-precision performance.
- Read more at the Snowflake blog and the Hugging Face model card.
- Allowable dimensions: 768 (default), 256
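As background for the quantization note in the model descriptions, here is a minimal sketch of scalar quantization: each float is mapped to one of 256 integer codes over a fixed value range. The 8-bit width, the explicit `lo`/`hi` range, and both function names are assumptions made for illustration; the quantization used by Weaviate may differ in its details.

```python
def scalar_quantize(vector: list[float], lo: float, hi: float) -> list[int]:
    """Map each value in [lo, hi] to an integer code in 0..255."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = (hi - lo) / 255
    # Values outside the range are clamped to the nearest end code.
    return [min(255, max(0, round((x - lo) / scale))) for x in vector]


def dequantize(codes: list[int], lo: float, hi: float) -> list[float]:
    """Approximately invert scalar_quantize."""
    scale = (hi - lo) / 255
    return [lo + c * scale for c in codes]


vector = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes = scalar_quantize(vector, -1.0, 1.0)
restored = dequantize(codes, -1.0, 1.0)
print(codes)
print(restored)
```

Each restored value differs from the original by at most half a quantization step, which is why retrieval quality can stay close to full precision while each component needs only one byte.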
Currently, input that exceeds the model's context window is truncated from the right (i.e., the end of the input is dropped).
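The effect of right truncation can be sketched as follows. Whitespace splitting is used here as a stand-in for the model's real tokenizer, and `truncate_right` is a hypothetical helper, so the token counts are illustrative only.

```python
def truncate_right(tokens: list[str], context_window: int) -> list[str]:
    """Keep the first `context_window` tokens and drop the rest."""
    if context_window < 0:
        raise ValueError("context_window must be non-negative")
    return tokens[:context_window]


# Stand-in tokenization: real models use subword tokenizers, not split().
tokens = "the quick brown fox jumps over the lazy dog".split()
kept = truncate_right(tokens, 5)
print(kept)
print(len(tokens) - len(kept), "tokens dropped from the end")
```

Because the tail is what gets lost, it may help to place the most important content near the start of long inputs, or to split long documents into chunks before importing them.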
Vectorizer parameters
- model (optional): The name of the model to use for embedding generation.
- dimensions (optional): The number of dimensions to use for the generated embeddings.
- base_url (optional): The base URL for the Weaviate Embeddings service. (Not required in most cases.)
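Because only certain `dimensions` values are valid for each model, it can be useful to check a combination before creating a collection. The sketch below is a hypothetical, client-independent helper: the name `build_vectorizer_options` is made up, and the allowed values are copied from the model list on this page.

```python
from typing import Optional

# Allowed dimensions per model, default first (taken from this page).
ALLOWED_DIMENSIONS = {
    "Snowflake/snowflake-arctic-embed-l-v2.0": (1024, 256),
    "Snowflake/snowflake-arctic-embed-m-v1.5": (768, 256),
}
DEFAULT_MODEL = "Snowflake/snowflake-arctic-embed-l-v2.0"


def build_vectorizer_options(
    model: Optional[str] = None,
    dimensions: Optional[int] = None,
    base_url: Optional[str] = None,
) -> dict:
    """Validate the three optional parameters and fill in defaults."""
    model = model or DEFAULT_MODEL
    if model not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown model: {model}")
    allowed = ALLOWED_DIMENSIONS[model]
    if dimensions is None:
        dimensions = allowed[0]  # the model's default size
    elif dimensions not in allowed:
        raise ValueError(f"{model} supports dimensions {allowed}")
    options = {"model": model, "dimensions": dimensions}
    if base_url is not None:
        options["base_url"] = base_url
    return options


print(build_vectorizer_options())
print(build_vectorizer_options("Snowflake/snowflake-arctic-embed-m-v1.5", 256))
```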
The following examples show how to configure Weaviate Embeddings-specific options.
Python API v4:
from weaviate.classes.config import Configure

# Client instantiation not shown
client.collections.create(
    "DemoCollection",
    vectorizer_config=[
        Configure.NamedVectors.text2vec_weaviate(
            name="title_vector",
            source_properties=["title"],
            model="Snowflake/snowflake-arctic-embed-m-v1.5",
            # Further options
            # dimensions=256,
            # base_url="<custom_weaviate_embeddings_url>",
        )
    ],
    # Additional parameters not shown
)
JS/TS API v3:
// Imports and client instantiation not shown
await client.collections.create({
  name: 'DemoCollection',
  properties: [
    {
      name: 'title',
      dataType: 'text' as const,
    },
  ],
  vectorizers: [
    weaviate.configure.vectorizer.text2VecWeaviate({
      name: 'title_vector',
      sourceProperties: ['title'],
      model: 'Snowflake/snowflake-arctic-embed-m-v1.5',
      // Further options
      // dimensions: 256,
      // baseUrl: '<custom_weaviate_embeddings_url>',
    }),
  ],
  // Additional parameters not shown
});
Go:
// package, imports not shown
func main() {
	// Instantiation not shown (assumed to declare client and err)
	ctx := context.Background()

	// Define the collection
	weaviateVectorizerArcticEmbedMV15 := &models.Class{
		Class: "DemoCollection",
		VectorConfig: map[string]models.VectorConfig{
			"title_vector": {
				Vectorizer: map[string]interface{}{
					"text2vec-weaviate": map[string]interface{}{
						"model":      "Snowflake/snowflake-arctic-embed-m-v1.5",
						"dimensions": 256, // Or 768
						"base_url":   "<custom_weaviate_url>",
					},
				},
			},
		},
	}

	// Add the collection
	err = client.Schema().ClassCreator().WithClass(weaviateVectorizerArcticEmbedMV15).Do(ctx)
	if err != nil {
		panic(err)
	}
}
Java:
// Imports and client instantiation not shown
// Module settings: "properties" is a list; the other options are single
// values, matching the Go example above
Map<String, Object> text2vecWeaviate = new HashMap<>();
Map<String, Object> text2vecWeaviateSettings = new HashMap<>();
text2vecWeaviateSettings.put("properties", new String[]{"title"});
text2vecWeaviateSettings.put("model", "Snowflake/snowflake-arctic-embed-m-v1.5");
text2vecWeaviateSettings.put("dimensions", 768); // 768 (default) or 256
text2vecWeaviateSettings.put("base_url", "<custom_weaviate_url>");
text2vecWeaviate.put("text2vec-weaviate", text2vecWeaviateSettings);

// Define the vector configurations
Map<String, WeaviateClass.VectorConfig> vectorConfig = new HashMap<>();
vectorConfig.put("title_vector", WeaviateClass.VectorConfig.builder()
    .vectorIndexType("hnsw")
    .vectorizer(text2vecWeaviate)
    .build());

// Create the collection "DemoCollection"
WeaviateClass clazz = WeaviateClass.builder()
    .className("DemoCollection")
    .vectorConfig(vectorConfig)
    .build();

Result<Boolean> result = client.schema().classCreator().withClass(clazz).run();
Additional resources
- Weaviate Embeddings: Overview
- Weaviate Embeddings: Quickstart
- Weaviate Embeddings: Administration
- Model provider integrations: Weaviate Embeddings
Support
For help with Serverless Cloud, Enterprise Cloud, and Bring Your Own Cloud accounts, contact Weaviate support directly to open a support ticket.
For questions and support from the Weaviate community, try these resources:
To add a support plan, contact Weaviate sales.