Schema

How to configure a schema

Overview

This page includes information on how to configure your schema in Weaviate. For other schema-related information, see related pages below.

Auto-schema

When a class definition is missing or inadequate for data import, the auto-schema feature infers it based on the object properties and defaults (learn more).

However, you might find it preferable to define the schema manually to ensure that the schema aligns with your specific requirements.

Create a class

Capitalization

Class and property names are treated the same regardless of the case of the first letter, e.g. "Article" == "article".

Generally, however, Weaviate follows GraphQL conventions where classes start with a capital letter and properties start with a lowercase letter.
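To keep names consistent with this convention before sending a definition to Weaviate, a small normalization step can help. The sketch below is illustrative only: `normalize_class_name` and `normalize_property_name` are hypothetical helpers, not part of the Weaviate client, and they change only the first character of a name.

```python
def normalize_class_name(name: str) -> str:
    """Upper-case only the first letter, leaving the rest untouched."""
    return name[:1].upper() + name[1:]


def normalize_property_name(name: str) -> str:
    """Lower-case only the first letter, leaving the rest untouched."""
    return name[:1].lower() + name[1:]


print(normalize_class_name("article"))       # Article
print(normalize_property_name("WordCount"))  # wordCount
```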

A class describes a collection of data objects. Classes are defined as part of the schema, as shown in the examples below.

Minimal example

At a minimum, you must specify the class parameter, which sets the class name.

class_obj = {"class": "Article"}

client.schema.create_class(class_obj)

Property definition

Use the properties parameter to define the properties of a class. A class definition can include any number of properties.

class_obj = {
    "class": "Article",
    "properties": [
        {
            "dataType": ["text"],
            "name": "title",
        },
        {
            "dataType": ["text"],
            "name": "body",
        },
    ],
}

client.schema.create_class(class_obj)

In addition to the property name, you can configure parameters such as the data type, inverted index tokenization and more.

Specify a vectorizer

You can set an optional vectorizer for each class, which will override any default values present in the configuration (e.g. in an environment variable). The following sets the text2vec-openai module as the vectorizer for the Article class.

class_obj = {
    "class": "Article",
    "properties": [
        {
            "dataType": ["text"],
            "name": "title",
        },
    ],
    "vectorizer": "text2vec-openai"  # This could be any vectorizer
}

client.schema.create_class(class_obj)

Class-level module settings

You can set the moduleConfig parameter at the class-level to set class-wide settings for module behavior. For example, the vectorizer could be configured to set the model used (model), or whether to vectorize the class name (vectorizeClassName).

class_obj = {
    "class": "Article",
    "properties": [
        {
            "dataType": ["text"],
            "name": "title",
        },
    ],
    "vectorizer": "text2vec-cohere",  # This could be any vectorizer
    "moduleConfig": {
        "text2vec-cohere": {  # This must match the vectorizer used
            "vectorizeClassName": True,
            "model": "multilingual-22-12",
        }
    }
}

client.schema.create_class(class_obj)

The available parameters vary according to the module (learn more).
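Since the key inside moduleConfig must match the vectorizer name, a mismatch is an easy mistake to make when switching modules. The following is a minimal sketch of a local check you could run before creating the class. `mismatched_vectorizer_keys` is a hypothetical helper, not a Weaviate API, and it assumes that vectorizer modules can be recognized by "2vec-" appearing in their name.

```python
def mismatched_vectorizer_keys(class_obj: dict) -> list:
    """Return moduleConfig keys that look like vectorizers but differ from "vectorizer"."""
    vectorizer = class_obj.get("vectorizer")
    module_config = class_obj.get("moduleConfig", {})
    # Assumption: vectorizer module names contain "2vec-" (e.g. text2vec-cohere).
    return [key for key in module_config if "2vec-" in key and key != vectorizer]


class_obj = {
    "class": "Article",
    "vectorizer": "text2vec-cohere",
    "moduleConfig": {
        "text2vec-openai": {"vectorizeClassName": True},  # wrong module name
    },
}

print(mismatched_vectorizer_keys(class_obj))  # ['text2vec-openai']
```

An empty list means every vectorizer-style key in moduleConfig agrees with the vectorizer setting.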

Property-level module settings

You can also set the moduleConfig parameter at the property level to control module behavior for an individual property. For example, you could set whether to vectorize the property name (vectorizePropertyName), or whether to exclude the property from vectorization altogether (skip).

class_obj = {
    "class": "Article",
    "properties": [
        {
            "dataType": ["text"],
            "name": "title",
            "moduleConfig": {
                "text2vec-huggingface": {  # This must match the vectorizer used
                    "skip": False,
                    "vectorizePropertyName": False
                }
            }
        },
    ],
    "vectorizer": "text2vec-huggingface"  # This could be any vectorizer
}

client.schema.create_class(class_obj)

The available parameters vary according to the module (learn more).
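To see at a glance which properties a definition leaves eligible for vectorization, you can inspect the skip flags locally. This is a rough sketch under stated assumptions: `vectorized_property_names` is a hypothetical helper, it treats a missing skip setting as False, and it ignores the data type, which may also affect whether a module vectorizes a property.

```python
def vectorized_property_names(class_obj: dict) -> list:
    """List properties whose settings for the class vectorizer do not set skip=True."""
    vectorizer = class_obj.get("vectorizer")
    names = []
    for prop in class_obj.get("properties", []):
        settings = prop.get("moduleConfig", {}).get(vectorizer, {})
        # Assumption: a property without an explicit "skip" is not skipped.
        if not settings.get("skip", False):
            names.append(prop["name"])
    return names


class_obj = {
    "class": "Article",
    "vectorizer": "text2vec-huggingface",
    "properties": [
        {"dataType": ["text"], "name": "title"},
        {
            "dataType": ["text"],
            "name": "internalNotes",
            "moduleConfig": {"text2vec-huggingface": {"skip": True}},
        },
    ],
}

print(vectorized_property_names(class_obj))  # ['title']
```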

Indexing, sharding and replication settings

You can also set indexing, sharding and replication settings through the schema. For example, a vector index distance metric and a replication factor can be set for a class, as shown below.

class_obj = {
    "class": "Article",
    "vectorIndexConfig": {
        "distance": "cosine",
    },
    "replicationConfig": {
        "factor": 3,
    },
}

client.schema.create_class(class_obj)

You can read more about various parameters here.

Delete a class

If your Weaviate instance contains data you want removed, you can manually delete the unwanted class(es).

Deleting a class == Deleting its objects

Know that deleting a class will also delete all associated objects!

Do not do this to a production database, or anywhere where you do not wish to delete your data.

Run the code below to delete the relevant class and its objects.

# delete class "YourClassName" - THIS WILL DELETE ALL DATA IN THIS CLASS
client.schema.delete_class("YourClassName") # Replace with your class name - e.g. "Question"
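
Because deletion is irreversible, you may want a guard around the call. The sketch below is one possible approach, not an official pattern: `delete_class_if_confirmed` is a hypothetical wrapper that uses only the client.schema.get() and client.schema.delete_class() calls shown on this page, and it compares class names exactly as they appear in the schema.

```python
def delete_class_if_confirmed(client, class_name: str, confirm: str) -> bool:
    """Delete class_name only if `confirm` repeats the name and the class exists.

    Returns True if the class was deleted, False if nothing was done.
    """
    if confirm != class_name:
        return False  # caller did not retype the exact class name
    existing = [c["class"] for c in client.schema.get().get("classes", [])]
    if class_name not in existing:
        return False  # nothing to delete
    client.schema.delete_class(class_name)  # THIS DELETES ALL DATA IN THE CLASS
    return True
```

For example, `delete_class_if_confirmed(client, "Article", confirm="Article")` deletes the class, while a mistyped confirmation leaves the data untouched.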

Update a class definition

Some parts of a class definition may be updated, while others are immutable.

The following sections describe how to add a property to a class and how to modify its parameters.

Add a property

A new property can be added to an existing class.

add_prop = {
    "dataType": ["text"],
    "name": "body"
}

client.schema.property.create("Article", add_prop)

Property removal/change currently not possible

Currently, a property cannot be removed from a class definition or renamed once it has been added. This is due to the high compute cost associated with reindexing the data in such scenarios.
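
Since an added property cannot be removed or renamed later, it can be worth checking the current schema before adding one. The following is a minimal sketch assuming the schema shape returned by client.schema.get() as shown in the sample schema on this page. `add_property_if_missing` is a hypothetical helper built only from calls that appear on this page.

```python
def add_property_if_missing(client, class_name: str, prop: dict) -> bool:
    """Add `prop` to `class_name` unless a property with that name already exists.

    Returns True if the property was created, False if it was already present.
    Raises ValueError if the class is not in the schema.
    """
    for cls in client.schema.get().get("classes", []):
        if cls["class"] == class_name:
            existing = [p["name"] for p in cls.get("properties", [])]
            if prop["name"] in existing:
                return False
            client.schema.property.create(class_name, prop)
            return True
    raise ValueError(f"class {class_name!r} not found in schema")
```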

Modify a parameter

You can modify some parameters of a schema as shown below. However, many parameters are immutable and cannot be changed once set.

class_obj = {
    "invertedIndexConfig": {
        "stopwords": {
            "preset": "en",
            "removals": ["a", "the"]
        },
    },
}

client.schema.update_config("Article", class_obj)

Review schema

If you want to review the schema, you can retrieve it as shown below.

client.schema.get()

The response will be a JSON object, such as the example shown below.

Sample schema
{
  "classes": [
    {
      "class": "Article",
      "invertedIndexConfig": {
        "bm25": {
          "b": 0.75,
          "k1": 1.2
        },
        "cleanupIntervalSeconds": 60,
        "stopwords": {
          "additions": null,
          "preset": "en",
          "removals": null
        }
      },
      "moduleConfig": {
        "text2vec-openai": {
          "model": "ada",
          "modelVersion": "002",
          "type": "text",
          "vectorizeClassName": true
        }
      },
      "properties": [
        {
          "dataType": ["text"],
          "moduleConfig": {
            "text2vec-openai": {
              "skip": false,
              "vectorizePropertyName": false
            }
          },
          "name": "title",
          "tokenization": "word"
        },
        {
          "dataType": ["text"],
          "moduleConfig": {
            "text2vec-openai": {
              "skip": false,
              "vectorizePropertyName": false
            }
          },
          "name": "body",
          "tokenization": "word"
        }
      ],
      "replicationConfig": {
        "factor": 1
      },
      "shardingConfig": {
        "virtualPerPhysical": 128,
        "desiredCount": 1,
        "actualCount": 1,
        "desiredVirtualCount": 128,
        "actualVirtualCount": 128,
        "key": "_id",
        "strategy": "hash",
        "function": "murmur3"
      },
      "vectorIndexConfig": {
        "skip": false,
        "cleanupIntervalSeconds": 300,
        "maxConnections": 64,
        "efConstruction": 128,
        "ef": -1,
        "dynamicEfMin": 100,
        "dynamicEfMax": 500,
        "dynamicEfFactor": 8,
        "vectorCacheMaxObjects": 1000000000000,
        "flatSearchCutoff": 40000,
        "distance": "cosine",
        "pq": {
          "enabled": false,
          "bitCompression": false,
          "segments": 0,
          "centroids": 256,
          "encoder": {
            "type": "kmeans",
            "distribution": "log-normal"
          }
        }
      },
      "vectorIndexType": "hnsw",
      "vectorizer": "text2vec-openai"
    }
  ]
}
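
The full schema is verbose, so a condensed view is often easier to scan. The sketch below assumes the response has the shape of the sample above (a top-level "classes" list whose entries carry "class" and "properties"). `summarize_schema` is a hypothetical helper, not part of the Weaviate client.

```python
def summarize_schema(schema: dict) -> dict:
    """Map each class name to the list of its property names."""
    return {
        cls["class"]: [prop["name"] for prop in cls.get("properties", [])]
        for cls in schema.get("classes", [])
    }


# A trimmed-down stand-in for the response of client.schema.get()
sample = {
    "classes": [
        {
            "class": "Article",
            "properties": [{"name": "title"}, {"name": "body"}],
        }
    ]
}

print(summarize_schema(sample))  # {'Article': ['title', 'body']}
```

With a live client, the same function could be called as `summarize_schema(client.schema.get())`.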

More Resources

If you can't find the answer to your question here, please look at the:

  1. Frequently Asked Questions. Or,
  2. Knowledge base of old issues. Or,
  3. For questions: Stackoverflow. Or,
  4. For more involved discussion: Weaviate Community Forum. Or,
  5. We also have a Slack channel.