Schema
How to configure a schema
Overview
This page includes information on how to configure your schema in Weaviate. For other schema-related information, see related pages below.
Auto-schema
When a class definition is missing or inadequate for data import, the auto-schema feature infers it based on the object properties and defaults (learn more).
However, you might find it preferable to define the schema manually to ensure that the schema aligns with your specific requirements.
Create a class
Class and property names are treated equally no matter how the first letter is cased, e.g. "Article" == "article".
Generally, however, Weaviate follows GraphQL conventions where classes start with a capital letter and properties start with a lowercase letter.
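The casing rule above can be illustrated locally. This is a minimal sketch, not Weaviate's own code: the helper names (same_schema_name, to_class_name, to_property_name) are hypothetical, and it assumes that only the case of the first letter is ignored, as the text states.

```python
def same_schema_name(a: str, b: str) -> bool:
    """Compare two names, ignoring only the case of the first letter."""
    if not a or not b:
        return a == b
    return a[0].lower() == b[0].lower() and a[1:] == b[1:]


def to_class_name(name: str) -> str:
    """Capitalize the first letter (GraphQL convention for classes)."""
    return name[:1].upper() + name[1:]


def to_property_name(name: str) -> str:
    """Lowercase the first letter (GraphQL convention for properties)."""
    return name[:1].lower() + name[1:]


print(same_schema_name("Article", "article"))
print(to_class_name("article"), to_property_name("Title"))
```

Normalizing names this way before building a schema keeps your code consistent with the convention, even though both spellings refer to the same class.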
A class describes a collection of data objects. Classes are defined as part of the schema, as shown in the examples below.
Minimal example
At a minimum, you must specify the "class" parameter for the class name.
- Python
- JavaScript
- TypeScript
class_obj = {"class": "Article"}
client.schema.create_class(class_obj)
let classObj = {'class': 'Article'}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then(res => {
console.log(res)
})
.catch(err => {
console.error(err)
});
let classObj = {'class': 'Article'}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then((res: any) => {
console.log(res)
})
.catch((err: Error) => {
console.error(err)
});
Property definition
You can use the "properties" parameter to specify properties. A class definition can include any number of properties.
- Python
- JavaScript
- TypeScript
class_obj = {
"class": "Article",
"properties": [
{
"dataType": ["text"],
"name": "title",
},
{
"dataType": ["text"],
"name": "body",
},
],
}
client.schema.create_class(class_obj)
let classObj = {
'class': 'Article',
'properties': [
{
'dataType': ['text'],
'name': 'title',
},
{
'dataType': ['text'],
'name': 'body',
},
],
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then(res => {
console.log(res)
})
.catch(err => {
console.error(err)
});
let classObj = {
'class': 'Article',
'properties': [
{
'dataType': ['text'],
'name': 'title',
},
{
'dataType': ['text'],
'name': 'body',
},
],
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then((res: any) => {
console.log(res)
})
.catch((err: Error) => {
console.error(err)
});
In addition to the property name, you can configure parameters such as the data type, inverted index tokenization and more.
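When a class has many properties, it may be convenient to generate the definition from a simple mapping. The sketch below assumes a hypothetical helper, build_class, that turns a {property name: data type} dictionary into the same structure used in the examples above. It only builds the dictionary and does not contact Weaviate.

```python
from typing import Dict, List


def build_class(name: str, properties: Dict[str, str]) -> dict:
    """Build a class definition from a {property name: data type} mapping."""
    if not name:
        raise ValueError("A class definition needs a non-empty 'class' name")
    props: List[dict] = [
        {"name": prop_name, "dataType": [data_type]}
        for prop_name, data_type in properties.items()
    ]
    return {"class": name, "properties": props}


class_obj = build_class("Article", {"title": "text", "body": "text"})
print(class_obj)
# With a connected client, the result could be passed to
# client.schema.create_class(class_obj), as in the examples above.
```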
Specify a vectorizer
You can set an optional "vectorizer" for each class, which will override any default values present in the configuration (e.g. in an environment variable). The following sets the text2vec-openai module as the vectorizer for the Article class.
- Python
- JavaScript
- TypeScript
class_obj = {
"class": "Article",
"properties": [
{
"dataType": ["text"],
"name": "title",
},
],
"vectorizer": "text2vec-openai" # This could be any vectorizer
}
client.schema.create_class(class_obj)
let classObj = {
'class': 'Article',
'properties': [
{
'dataType': ['text'],
'name': 'title',
},
],
'vectorizer': 'text2vec-openai' // This could be any vectorizer
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then(res => {
console.log(res)
})
.catch(err => {
console.error(err)
});
let classObj = {
'class': 'Article',
'properties': [
{
'dataType': ['text'],
'name': 'title',
},
],
'vectorizer': 'text2vec-openai' // This could be any vectorizer
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then((res: any) => {
console.log(res)
})
.catch((err: Error) => {
console.error(err)
});
Class-level module settings
You can set the "moduleConfig" parameter at the class level to set class-wide settings for module behavior. For example, the vectorizer could be configured to set the model used ("model"), or whether to vectorize the class name ("vectorizeClassName").
- Python
- JavaScript
- TypeScript
class_obj = {
"class": "Article",
"properties": [
{
"dataType": ["text"],
"name": "title",
},
],
"vectorizer": "text2vec-cohere", # This could be any vectorizer
"moduleConfig": {
"text2vec-cohere": { # This must match the vectorizer used
"vectorizeClassName": True,
"model": "multilingual-22-12",
}
}
}
client.schema.create_class(class_obj)
let classObj = {
'class': 'Article',
'properties': [
{
'dataType': ['text'],
'name': 'title',
},
],
'vectorizer': 'text2vec-cohere', // This could be any vectorizer
'moduleConfig': {
'text2vec-cohere': { // This must match the vectorizer used
'vectorizeClassName': true,
'model': 'multilingual-22-12',
}
}
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then(res => {
console.log(res)
})
.catch(err => {
console.error(err)
});
let classObj = {
'class': 'Article',
'properties': [
{
'dataType': ['text'],
'name': 'title',
},
],
'vectorizer': 'text2vec-cohere', // This could be any vectorizer
'moduleConfig': {
'text2vec-cohere': { // This must match the vectorizer used
'vectorizeClassName': true,
'model': 'multilingual-22-12',
}
}
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then((res: any) => {
console.log(res)
})
.catch((err: Error) => {
console.error(err)
});
The available parameters vary according to the module (learn more).
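Because the key inside "moduleConfig" must match the vectorizer name, one option is to set both in a single step. The following is a sketch under that assumption; with_vectorizer is a hypothetical helper, not part of the Weaviate client, and it only manipulates the dictionary.

```python
import copy


def with_vectorizer(class_obj: dict, vectorizer: str, **settings) -> dict:
    """Return a copy of class_obj with the vectorizer and its settings set together."""
    result = copy.deepcopy(class_obj)
    result["vectorizer"] = vectorizer
    if settings:
        # The moduleConfig key is taken from the same argument, so it cannot drift
        # away from the vectorizer name.
        result.setdefault("moduleConfig", {})[vectorizer] = dict(settings)
    return result


base = {"class": "Article", "properties": [{"dataType": ["text"], "name": "title"}]}
class_obj = with_vectorizer(
    base,
    "text2vec-cohere",
    vectorizeClassName=True,
    model="multilingual-22-12",
)
print(class_obj)
```

The input dictionary is copied rather than modified, so the same base definition can be reused with a different vectorizer.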
Property-level module settings
You can also set the "moduleConfig" parameter at the property level to set property-level settings for module behavior. For example, you could set whether to vectorize the property name ("vectorizePropertyName"), or whether to skip the property from vectorization altogether ("skip").
- Python
- JavaScript
- TypeScript
class_obj = {
"class": "Article",
"properties": [
{
"dataType": ["text"],
"name": "title",
"moduleConfig": {
"text2vec-huggingface": { # This must match the vectorizer used
"skip": False,
"vectorizePropertyName": False
}
}
},
],
"vectorizer": "text2vec-huggingface" # This could be any vectorizer
}
client.schema.create_class(class_obj)
let classObj = {
'class': 'Article',
'properties': [
{
'dataType': ['text'],
'name': 'title',
'moduleConfig': {
'text2vec-huggingface': { // This must match the vectorizer used
'skip': false,
'vectorizePropertyName': false
}
}
},
],
'vectorizer': 'text2vec-huggingface' // This could be any vectorizer
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then(res => {
console.log(res)
})
.catch(err => {
console.error(err)
});
let classObj = {
'class': 'Article',
'properties': [
{
'dataType': ['text'],
'name': 'title',
'moduleConfig': {
'text2vec-huggingface': { // This must match the vectorizer used
'skip': false,
'vectorizePropertyName': false
}
}
},
],
'vectorizer': 'text2vec-huggingface' // This could be any vectorizer
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then((res: any) => {
console.log(res)
})
.catch((err: Error) => {
console.error(err)
});
The available parameters vary according to the module (learn more).
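To exclude several properties from vectorization, the "skip" setting can be applied programmatically. This is a minimal sketch with a hypothetical helper, skip_properties. It assumes the class definition already names its vectorizer, and it uses that name as the "moduleConfig" key for each affected property.

```python
import copy
from typing import Iterable


def skip_properties(class_obj: dict, names: Iterable[str]) -> dict:
    """Return a copy of class_obj where the named properties are marked skip=True."""
    vectorizer = class_obj["vectorizer"]  # KeyError if no vectorizer is set
    to_skip = set(names)
    result = copy.deepcopy(class_obj)
    for prop in result.get("properties") or []:
        if prop["name"] in to_skip:
            module_cfg = prop.setdefault("moduleConfig", {})
            module_cfg.setdefault(vectorizer, {})["skip"] = True
    return result


base = {
    "class": "Article",
    "properties": [
        {"dataType": ["text"], "name": "title"},
        {"dataType": ["text"], "name": "body"},
    ],
    "vectorizer": "text2vec-huggingface",
}
class_obj = skip_properties(base, ["body"])
print(class_obj["properties"][1])
```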
Indexing, sharding and replication settings
You can also set indexing, sharding and replication settings through the schema. For example, a vector index distance metric and a replication factor can be set for a class, as shown below.
- Python
- JavaScript
- TypeScript
class_obj = {
"class": "Article",
"vectorIndexConfig": {
"distance": "cosine",
},
"replicationConfig": {
"factor": 3,
},
}
client.schema.create_class(class_obj)
let classObj = {
'class': 'Article',
'vectorIndexConfig': {
'distance': 'cosine',
},
'replicationConfig': {
'factor': 3,
},
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then(res => {
console.log(res)
})
.catch(err => {
console.error(err)
});
let classObj = {
'class': 'Article',
'vectorIndexConfig': {
'distance': 'cosine',
},
'replicationConfig': {
'factor': 3,
},
}
// add the schema
client
.schema
.classCreator()
.withClass(classObj)
.do()
.then((res: any) => {
console.log(res)
})
.catch((err: Error) => {
console.error(err)
});
You can read more about these parameters in the following references.
- Vector index configuration references
- Inverted index configuration references
- Sharding configuration references
- Replication configuration references
Delete a class
If your Weaviate instance contains data you want removed, you can manually delete the unwanted class(es).
Note that deleting a class also deletes all of its objects!
Do not do this on a production database, or anywhere else you do not want to lose your data.
Run the code below to delete the relevant class and its objects.
- Python
- TypeScript
- Go
- Curl
# delete class "YourClassName" - THIS WILL DELETE ALL DATA IN THIS CLASS
client.schema.delete_class("YourClassName") # Replace with your class name - e.g. "Question"
var className: string = 'YourClassName'; // Replace with your class name
client.schema
.classDeleter()
.withClassName(className)
.do()
.then((res: any) => {
console.log(res);
})
.catch((err: Error) => {
console.error(err)
});
className := "YourClassName"
// delete the class
if err := client.Schema().ClassDeleter().WithClassName(className).Do(context.Background()); err != nil {
// Weaviate will return a 400 if the class does not exist, so this is allowed, only return an error if it's not a 400
if status, ok := err.(*fault.WeaviateClientError); ok && status.StatusCode != http.StatusBadRequest {
panic(err)
}
}
curl \
-X DELETE \
https://some-endpoint.weaviate.network/v1/schema/YourClassName
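Since deletion is irreversible, a script may want a safeguard in front of the delete call. The sketch below is one possible approach, not a Weaviate feature: delete_class_guarded is a hypothetical wrapper that only calls client.schema.delete_class (shown above) when the caller repeats the exact class name. FakeClient is a stand-in used here so the example runs without a server.

```python
class FakeSchema:
    """Stand-in for client.schema that records delete calls instead of performing them."""

    def __init__(self) -> None:
        self.deleted = []

    def delete_class(self, class_name: str) -> None:
        self.deleted.append(class_name)


class FakeClient:
    """Stand-in for a Weaviate client, for demonstration only."""

    def __init__(self) -> None:
        self.schema = FakeSchema()


def delete_class_guarded(client, class_name: str, confirm: str) -> bool:
    """Delete class_name only if `confirm` repeats the exact class name."""
    if confirm != class_name:
        return False
    client.schema.delete_class(class_name)  # deletes the class and all its objects
    return True


client = FakeClient()
print(delete_class_guarded(client, "Article", confirm="article"))
print(delete_class_guarded(client, "Article", confirm="Article"))
```

With a real client in place of FakeClient, the first call would leave the data untouched and the second would delete the class.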
Update a class definition
Some parts of a class definition may be updated, while others are immutable.
The following sections describe how to add a property to a class, or to modify its parameters.
Add a property
A new property can be added to an existing class.
- Python
- JavaScript
- TypeScript
add_prop = {
"dataType": ["text"],
"name": "body"
}
client.schema.property.create("Article", add_prop)
const className = 'Article';
const prop = {
'dataType': ['text'],
'name': 'body'
};
client
.schema
.propertyCreator()
.withClassName(className)
.withProperty(prop)
.do()
.then(res => {
console.log(res);
})
.catch(err => {
console.error(err)
});
const className = 'Article';
const prop = {
'dataType': ['text'],
'name': 'body'
};
// add the property
client
.schema
.propertyCreator()
.withClassName(className)
.withProperty(prop)
.do()
.then((res: any) => {
console.log(res)
})
.catch((err: Error) => {
console.error(err)
});
Currently, a property cannot be removed from a class definition or renamed once it has been added. This is due to the high compute cost associated with reindexing the data in such scenarios.
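Because properties cannot be removed or renamed afterwards, it can help to check which properties are actually missing before adding any. The following sketch assumes the schema dictionary has the shape returned by client.schema.get() (a "classes" list, as in the sample schema on this page). missing_properties is a hypothetical helper, and it compares property names exactly.

```python
from typing import List


def missing_properties(schema: dict, class_name: str, wanted: List[dict]) -> List[dict]:
    """Return the entries of `wanted` whose names are not yet defined on class_name."""
    for cls in schema.get("classes") or []:
        if cls["class"] == class_name:
            existing = {prop["name"] for prop in cls.get("properties") or []}
            return [prop for prop in wanted if prop["name"] not in existing]
    raise KeyError(f"Class {class_name!r} is not in the schema")


schema = {
    "classes": [
        {"class": "Article", "properties": [{"dataType": ["text"], "name": "title"}]}
    ]
}
wanted = [
    {"dataType": ["text"], "name": "title"},
    {"dataType": ["text"], "name": "body"},
]
to_add = missing_properties(schema, "Article", wanted)
print(to_add)
# With a connected client, each entry could then be added with
# client.schema.property.create("Article", prop), as shown above.
```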
Modify a parameter
You can modify some parameters of a schema as shown below. However, many parameters are immutable and cannot be changed once set.
- Python
- JavaScript
- TypeScript
class_obj = {
"invertedIndexConfig": {
"stopwords": {
"preset": "en",
"removals": ["a", "the"]
},
},
}
client.schema.update_config("Article", class_obj)
Coming soon
Coming soon
Review schema
If you want to review the schema, you can retrieve it as shown below.
- Python
- JavaScript
- TypeScript
client.schema.get()
client
.schema
.getter()
.do()
.then(res => {
console.log(res);
})
.catch(err => {
console.error(err)
});
client
.schema
.getter()
.do()
.then((res: any) => {
console.log(res)
})
.catch((err: Error) => {
console.error(err)
});
The response will be a JSON object, such as the example shown below.
Sample schema
{
"classes": [
{
"class": "Article",
"invertedIndexConfig": {
"bm25": {
"b": 0.75,
"k1": 1.2
},
"cleanupIntervalSeconds": 60,
"stopwords": {
"additions": null,
"preset": "en",
"removals": null
}
},
"moduleConfig": {
"text2vec-openai": {
"model": "ada",
"modelVersion": "002",
"type": "text",
"vectorizeClassName": true
}
},
"properties": [
{
"dataType": [
"text"
],
"moduleConfig": {
"text2vec-openai": {
"skip": false,
"vectorizePropertyName": false
}
},
"name": "title",
"tokenization": "word"
},
{
"dataType": [
"text"
],
"moduleConfig": {
"text2vec-openai": {
"skip": false,
"vectorizePropertyName": false
}
},
"name": "body",
"tokenization": "word"
}
],
"replicationConfig": {
"factor": 1
},
"shardingConfig": {
"virtualPerPhysical": 128,
"desiredCount": 1,
"actualCount": 1,
"desiredVirtualCount": 128,
"actualVirtualCount": 128,
"key": "_id",
"strategy": "hash",
"function": "murmur3"
},
"vectorIndexConfig": {
"skip": false,
"cleanupIntervalSeconds": 300,
"maxConnections": 64,
"efConstruction": 128,
"ef": -1,
"dynamicEfMin": 100,
"dynamicEfMax": 500,
"dynamicEfFactor": 8,
"vectorCacheMaxObjects": 1000000000000,
"flatSearchCutoff": 40000,
"distance": "cosine",
"pq": {
"enabled": false,
"bitCompression": false,
"segments": 0,
"centroids": 256,
"encoder": {
"type": "kmeans",
"distribution": "log-normal"
}
}
},
"vectorIndexType": "hnsw",
"vectorizer": "text2vec-openai"
}
]
}
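The full schema response is verbose, so a condensed view can be easier to scan. This is a minimal sketch that assumes the response has the shape of the sample above. summarize_schema is a hypothetical helper that keeps only the vectorizer and property names for each class.

```python
def summarize_schema(schema: dict) -> dict:
    """Map each class name to its vectorizer and list of property names."""
    summary = {}
    for cls in schema.get("classes") or []:
        summary[cls["class"]] = {
            "vectorizer": cls.get("vectorizer"),
            "properties": [prop["name"] for prop in cls.get("properties") or []],
        }
    return summary


# A trimmed-down version of the sample schema shown above
schema = {
    "classes": [
        {
            "class": "Article",
            "properties": [
                {"dataType": ["text"], "name": "title", "tokenization": "word"},
                {"dataType": ["text"], "name": "body", "tokenization": "word"},
            ],
            "vectorizer": "text2vec-openai",
        }
    ]
}
print(summarize_schema(schema))
```

With a connected client, the input could be the result of client.schema.get().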
More Resources
If you can't find the answer to your question here, please look at the:
- Frequently Asked Questions
- Knowledge base of old issues
- Stack Overflow, for questions
- Weaviate Community Forum, for more involved discussion
- Slack channel