MongoDB Schema Design
When MongoDB was introduced a few years ago, one of its most touted features was that it is “schemaless”. What does this mean for your documents?
MongoDB does not enforce any schema on the documents stored in a collection. It essentially stores JSON-like documents (BSON internally), and each document can have any structure you want. Consider some examples from our “contacts” collection below. Here is one document that you can store:
{ 'name': 'user1', 'address': '1 mountain view', 'phone': '123-324-3308', 'SSN': '123-45-7891' }
Now the second document stored in the collection can be of this format:
{ 'name': 'user2', 'employeeid': 546789 }
It’s pretty cool that you can store both of these documents in the same collection. The problem, however, starts when you need to retrieve them. How do you tell whether a retrieved document is in format 1 or format 2? You can check whether it contains the ‘SSN’ field (field names are case-sensitive) and decide from there. Another option is to store the type of the document in the document itself:
{ 'type': xxx, 'name': .... ... }
In both cases, you have moved schema enforcement from the database to the application.
There is always a schema; it is just a question of where it is implemented.
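To make the application-side check concrete, here is a minimal sketch in plain JavaScript. The helper name contactFormat and the labels 'person' and 'employee' are our own illustrative assumptions; they are not part of MongoDB or of the collection above.

```javascript
// Hypothetical helper: decide which "contacts" format a retrieved document uses.
function contactFormat(doc) {
  const has = (field) => Object.prototype.hasOwnProperty.call(doc, field);

  // Option 2 from the text: an explicit discriminator stored in the document.
  if (typeof doc.type === 'string') {
    return doc.type;
  }
  // Option 1 from the text: infer the format from the fields that are present.
  // Field names are case-sensitive, so 'SSN' and 'ssn' are different fields.
  if (has('SSN')) {
    return 'person';    // assumed label for format 1
  }
  if (has('employeeid')) {
    return 'employee';  // assumed label for format 2
  }
  return 'unknown';
}

const first = contactFormat({ name: 'user1', SSN: '123-45-7891' });
const second = contactFormat({ name: 'user2', employeeid: 546789 });
```

Either way, every code path that reads from the collection has to go through a check like this, which is exactly the schema logic the database is not doing for you.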
Having the right indexes alleviates the problem to a certain extent. If the majority of your queries are by ‘employeeid’, you know that the documents they return are always in the second format. However, the rest of your code that does not use this index still has the problem mentioned above. Also, if you are using an ODM like Mongoose, it already enforces a schema for you on top of MongoDB.
Several kinds of applications benefit from this flexibility. One scenario that comes to mind is a schema with a large number of optional fields. In MongoDB, there is no penalty for missing fields: each document can contain only the fields that it needs.
Document Validation
Starting with version 3.2, MongoDB supports schema validation through the “validator” option. It provides several levels of validation, so you can choose the level that works for you. If you don’t specify a validator, you get the previous schemaless behaviour by default. Typically, you create the validator at collection creation time:
db.createCollection( "contacts", { validator: { $or: [ { employeeid: { $exists: true } }, { SSN: { $exists: true } } ] } } )
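To see which documents such a validator would accept, the rule can be mirrored in plain JavaScript. This is only an illustration of the $or / $exists logic; the function name passesContactsValidator and the sample documents are assumptions, and the real check is performed by the MongoDB server.

```javascript
// Plain-JavaScript mirror of the validator rule, for illustration only.
function passesContactsValidator(doc) {
  // $exists: true only asks whether the field is present, whatever its value.
  const has = (field) => Object.prototype.hasOwnProperty.call(doc, field);
  // $or: at least one of the two clauses must hold.
  return has('employeeid') || has('SSN');
}

const candidates = [
  { name: 'user1', SSN: '123-45-7891' },    // SSN is present
  { name: 'user2', employeeid: 546789 },    // employeeid is present
  { name: 'user3', phone: '123-324-3308' }  // neither field is present
];

const accepted = candidates.map(passesContactsValidator);
```

With this rule in place, the first two documents would be accepted and the third would be rejected on insert.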
Existing Collections
Existing collections can be updated using the ‘collMod’ command:
db.runCommand( { collMod: "contacts", validator: { $or: [ { employeeid: { $exists: true } }, { SSN: { $exists: true } } ] } } )
Validation Level
MongoDB supports the concept of a ‘validationLevel’. The default validation level is ‘strict’, which means that validation applies to all inserts and updates, and they fail if the document does not meet the validation criteria. If the validation level is ‘moderate’, validation applies to inserts and to updates of existing documents that already meet the validation criteria. Updates to existing documents that don’t meet the criteria are not validated. While convenient, the ‘moderate’ validation level can get you into trouble down the line, so use it with care.
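The difference between the two levels is easier to see as a decision table. The sketch below models, in plain JavaScript, when a write is checked against the validator according to our reading of the documented behaviour; the function isWriteValidated and its parameters are hypothetical and exist only for this illustration.

```javascript
// Sketch: does the server check this write against the validator?
// `existingDocIsValid` only matters for updates (hypothetical parameter).
function isWriteValidated(level, operation, existingDocIsValid) {
  if (level === 'strict') {
    // Every insert and every update is checked.
    return true;
  }
  if (level === 'moderate') {
    // Inserts are checked; updates are checked only when the stored
    // document already satisfies the validator.
    return operation === 'insert' || existingDocIsValid === true;
  }
  throw new RangeError('level not covered by this sketch: ' + level);
}

// The level itself is set with collMod, for example:
const setModerate = { collMod: 'contacts', validationLevel: 'moderate' };
// In the mongo shell this would be run as: db.runCommand(setModerate)
```

The case to watch is an update to a legacy document under ‘moderate’: it is not checked, so invalid documents can stay invalid indefinitely.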
Validation Action
By default, the validation action is ‘error’: if your document fails validation, the insert or update fails. However, you can also set the validation action to ‘warn’, which logs the schema violation in the MongoDB log but does not fail the write.
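Both settings are passed as options alongside the validator. As a rough sketch, the command document for collMod can be assembled like this; the helper buildCollMod is hypothetical, and it deliberately accepts only the option values mentioned in this article.

```javascript
// Hypothetical helper: build the command document for collMod.
const LEVELS = ['strict', 'moderate'];
const ACTIONS = ['error', 'warn'];

function buildCollMod(collection, validator, level = 'strict', action = 'error') {
  // Option values are lowercase strings; reject anything else early.
  if (!LEVELS.includes(level)) {
    throw new RangeError('unknown validationLevel: ' + level);
  }
  if (!ACTIONS.includes(action)) {
    throw new RangeError('unknown validationAction: ' + action);
  }
  return {
    collMod: collection,
    validator: validator,
    validationLevel: level,
    validationAction: action
  };
}

const command = buildCollMod(
  'contacts',
  { $or: [ { employeeid: { $exists: true } }, { SSN: { $exists: true } } ] },
  'strict',
  'warn'
);
// In the mongo shell this would be run as: db.runCommand(command)
```

Starting with ‘warn’ on an existing collection is one way to find out how many writes would violate the rule before you switch to ‘error’.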
What schema design examples would help you on your next project? Let us know!