MongoDB® Schema Design: There Is Always A Schema

When MongoDB was introduced a few years ago, one of its most touted features was that it is “schemaless”. What does this mean for your documents?

By default, MongoDB does not enforce any schema on the documents stored in a collection. MongoDB essentially stores JSON documents, and each document can have any structure that you want. Consider some examples from our “contacts” collection below. Here is one document that you can store:

{
  'name': 'user1',
  'address': '1 mountain view',
  'phone': '123-324-3308',
  'SSN': '123-45-7891'
}

Now the second document stored in the collection can be of this format:

{
  'name': 'user2',
  'employeeid': 546789
}

It’s pretty cool that you can store both these documents in the same collection. The problem, however, starts when you need to retrieve these documents from the collection. How do you tell whether a retrieved document has format 1 or format 2? You can check whether the retrieved document contains the ‘SSN’ field and then make a decision. Another option is to store the type of the document in the document itself:

{
  'type': xxx,
  'name': ....
  ...
}

In both cases, what you have done is move schema enforcement from the database to the application:

There is always a schema; it is just a question of where it is implemented.
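
As a minimal sketch of what this application-side schema looks like, the function below decides which format a contacts document has. The function name `contactFormat` and the labels `person`, `employee` and `unknown` are hypothetical and chosen only for illustration; the field names come from the two example documents above.

```javascript
// Hypothetical helper: infer which "format" a contacts document uses
// by looking at the fields it carries. The labels are illustrative only.
function contactFormat(doc) {
  // An explicit 'type' field, if present, wins over inspecting the shape.
  if (typeof doc.type === "string") {
    return doc.type;
  }
  if ("SSN" in doc) {
    return "person";   // format 1: name, address, phone, SSN
  }
  if ("employeeid" in doc) {
    return "employee"; // format 2: name, employeeid
  }
  return "unknown";    // neither marker field is present
}

const user1 = {
  name: "user1",
  address: "1 mountain view",
  phone: "123-324-3308",
  SSN: "123-45-7891"
};
const user2 = { name: "user2", employeeid: 546789 };

console.log(contactFormat(user1)); // person
console.log(contactFormat(user2)); // employee
```

Every piece of code that reads from the collection has to agree on checks like these, which is exactly the schema the database is no longer enforcing for you.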

If you have the right indexes, it alleviates the problem to a certain extent. If the majority of your queries are by ‘employeeid’, you know that the retrieved document is always of the second format. However, the rest of your code that does not use this index will still have the problem mentioned above. Also, if you are using an ODM like Mongoose, it already enforces a schema for you on top of MongoDB.

Several kinds of applications benefit from this flexibility. One scenario that comes to mind is a schema with a number of optional fields. In MongoDB, there is no penalty for missing fields: each document contains only the fields that it needs.

Document Validation

Starting with version 3.2, MongoDB supports schema validation using the “validator” option. It provides several levels of validation, so you can choose the level that works for you. If you don’t specify a validator, the default is the previous schemaless behaviour. Typically you will create the validator at the time of collection creation:

db.createCollection("contacts",
   { validator: { $or:
      [
         { employeeid: { $exists: true }},
         { SSN: { $exists: true } }
      ]
   }
})

Existing Collections

Existing collections can be updated using the ‘collMod’ command:

db.runCommand({
  collMod: "contacts",
  validator: { $or: [ { employeeid: { $exists: true }}, { SSN: { $exists: true} } ] }
})

Validation Level

MongoDB supports the concept of a ‘validationLevel’. The default validation level is ‘strict’, which means that validation applies to all inserts and updates, and a write fails if the document does not meet the validation criteria. If the validation level is ‘moderate’, validation applies to inserts and to updates of existing documents that already meet the validation criteria. Updates to existing documents that don’t meet the criteria are not validated. While convenient, the ‘moderate’ validation level can get you into trouble down the line, so it needs to be used with care.
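
As a sketch, the validation level can be set with the same ‘collMod’ command used for existing collections, through its `validationLevel` option. The snippet only builds the command document as a plain JavaScript object; the variable names are illustrative, and sending it requires a mongo shell session as noted in the comment.

```javascript
// Validator reused from the examples above.
const contactsValidator = {
  $or: [
    { employeeid: { $exists: true } },
    { SSN: { $exists: true } }
  ]
};

// Command document that switches the "contacts" collection to
// the moderate validation level while keeping the same validator.
const setModerateLevel = {
  collMod: "contacts",
  validator: contactsValidator,
  validationLevel: "moderate"
};

// In the mongo shell this would be sent with:
//   db.runCommand(setModerateLevel)
console.log(JSON.stringify(setModerateLevel, null, 2));
```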

Validation Action

By default, the validation action is ‘error’: if your document fails validation, the insert or update fails. However, you can also set the validation action to ‘warn’, which logs the schema violation in the MongoDB log but does not fail the write.

What schema design examples would help you on your next project? Let us know!
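
The validation action is likewise an option of ‘collMod’, named `validationAction`. The following is a rough sketch that builds such a command document. The helper `validationActionCommand` is hypothetical, not part of MongoDB, and it only accepts the two actions described above.

```javascript
// Hypothetical helper (not part of MongoDB): builds a collMod command
// document that changes only the validation action of a collection.
function validationActionCommand(collection, action) {
  // Only the two actions discussed in this article are accepted here.
  const allowed = ["error", "warn"];
  if (!allowed.includes(action)) {
    throw new Error(`validationAction must be one of: ${allowed.join(", ")}`);
  }
  return { collMod: collection, validationAction: action };
}

const warnCommand = validationActionCommand("contacts", "warn");

// In the mongo shell this would be sent with:
//   db.runCommand(warnCommand)
console.log(warnCommand);
```

Switching to ‘warn’ first can be a cautious way to see how many writes would violate a new validator before you start rejecting them.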

For more information, please visit www.scalegrid.io. Connect with ScaleGrid on LinkedIn, X, Facebook, and YouTube.