MongoDB® Schema Design: There Is Always A Schema




When MongoDB was introduced a few years ago, one of the important features touted was its ability to be “schemaless”. What does this mean for your documents?

By default, MongoDB does not enforce any schema on the documents stored in a collection. It essentially stores JSON documents, and each document can have any structure that you want. Consider some examples from our “contacts” collection below. Here is one document that you can store:

  {
    'name': 'user1',
    'address': '1 mountain view',
    'phone': '123-324-3308',
    'SSN': '123-45-7891'
  }

Now the second document stored in the collection can be of this format:

  {
    'name': 'user2',
    'employeeid': 546789
  }

It’s pretty cool that you can store both of these documents in the same collection. The problem, however, starts when you need to retrieve these documents from the collection. How do you tell whether a retrieved document has format 1 or format 2? You can check whether the retrieved document contains the ‘SSN’ field (field names are case-sensitive) and then make a decision. Another option is to store the type of the document in the document itself:

  {
    'type': xxx,
    'name': ....
  }

In both of these cases, what you have achieved is moving schema enforcement from the database to the application.

There is always a schema; it is just a question of where it is implemented.
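
As a minimal sketch of what application-side enforcement can look like, the function below inspects a retrieved document and decides which of the two formats it has. The function name `contactFormat` and the labels it returns are hypothetical, chosen for illustration; only the field names come from the examples above.

```javascript
// Hypothetical helper: decide which contact format a retrieved document uses.
// Field names are case-sensitive, so "SSN" must match the stored key exactly.
function contactFormat(doc) {
  if ("SSN" in doc) {
    return "personal";  // format 1: name, address, phone, SSN
  }
  if ("employeeid" in doc) {
    return "employee";  // format 2: name, employeeid
  }
  return "unknown";     // neither format: the application must decide what to do
}

const user1 = { name: "user1", address: "1 mountain view", phone: "123-324-3308", SSN: "123-45-7891" };
const user2 = { name: "user2", employeeid: 546789 };

console.log(contactFormat(user1)); // personal
console.log(contactFormat(user2)); // employee
```

Every code path that reads from the collection needs a check like this, which is exactly the schema logic the database is no longer doing for you.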

If you have the right indexes, the problem is alleviated to a certain extent. If a majority of your queries are by ‘employeeid’, you know that the retrieved document is always of the second format. However, the rest of your code that does not use this index will still have the problem mentioned above. Also, if you are using an ODM like Mongoose, it already enforces a schema for you on top of MongoDB.

There are several applications that benefit from this flexibility. One scenario that comes to mind is a schema with a number of optional fields. In MongoDB, there is no penalty for missing fields. Each document can contain only the fields that it needs.

Document Validation

Starting with version 3.2, MongoDB supports schema validation through the “validator” option. Validation can be configured at several levels, so you can choose the one that works for you. If you don’t define a validator, the default is the previous schemaless behaviour. Typically, you create the validator at the time of collection creation:

   db.createCollection("contacts",
      { validator: { $or: [
            { employeeid: { $exists: true } },
            { SSN: { $exists: true } }
      ] } } )

Existing Collections

Existing collections can be updated using the ‘collMod’ command:

  db.runCommand( {
    collMod: "contacts",
    validator: { $or: [ { employeeid: { $exists: true } }, { SSN: { $exists: true } } ] }
  } )

Validation Level

MongoDB supports the concept of a validation level, set through the ‘validationLevel’ option. The default validation level is ‘strict’, which means that inserts and updates fail if the document does not meet the validation criteria. If the validation level is ‘moderate’, validation applies to inserts and to updates of existing documents that already meet the validation criteria. Updates to existing documents that don’t meet the criteria are not validated. While convenient, the ‘moderate’ validation level can get you into trouble down the line, because invalid documents can remain in the collection, so it needs to be used with care.
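
The difference between the two levels can be illustrated with a simplified model in plain JavaScript. This is not how MongoDB implements validation. `meetsCriteria` and `isWriteAccepted` are hypothetical helpers that mirror the example validator used in this post (a document needs either `employeeid` or `SSN`) and assume that a failed validation rejects the write.

```javascript
// Mirrors the example validator:
// { $or: [ { employeeid: { $exists: true } }, { SSN: { $exists: true } } ] }
function meetsCriteria(doc) {
  return "employeeid" in doc || "SSN" in doc;
}

// Simplified model of whether a write is accepted.
// oldDoc is null for an insert, or the existing document for an update.
function isWriteAccepted(level, oldDoc, newDoc) {
  const isUpdate = oldDoc !== null;
  if (level === "moderate" && isUpdate && !meetsCriteria(oldDoc)) {
    // The existing document was already invalid, so moderate skips the check.
    return true;
  }
  // strict: every insert and update is checked.
  // moderate: inserts and updates to already-valid documents are checked.
  return meetsCriteria(newDoc);
}

const legacy = { name: "user3" };                 // predates the validator
const edited = { name: "user3", phone: "555-0100" };

console.log(isWriteAccepted("strict", legacy, edited));              // false
console.log(isWriteAccepted("moderate", legacy, edited));            // true
console.log(isWriteAccepted("moderate", null, { name: "user4" }));   // false
```

The second result is the case to watch for: under ‘moderate’, a document that was invalid before the validator existed can keep being updated without ever becoming valid.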

Validation Action

By default, the validation action is ‘error’: if your document fails validation, the insert or update is rejected. However, you can also set the validation action to ‘warn’, which logs the schema violation in the MongoDB log but does not fail the insert or update.
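
Both settings can be supplied alongside the validator through the `validationLevel` and `validationAction` options. The sketch below only builds the command document as a plain JavaScript object; in the mongo shell you would pass it to `db.runCommand()` against a running server. The choice of ‘moderate’ and ‘warn’ here is an arbitrary illustration, not a recommendation.

```javascript
// Command document for changing validation settings on the "contacts" collection.
const collModCommand = {
  collMod: "contacts",
  validator: { $or: [ { employeeid: { $exists: true } }, { SSN: { $exists: true } } ] },
  validationLevel: "moderate", // the default is "strict"
  validationAction: "warn"     // the default is "error"
};

// In the mongo shell: db.runCommand(collModCommand)
console.log(JSON.stringify(collModCommand, null, 2));
```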

What schema design examples would help you on your next project? Let us know!
