MongoEngine is an Object Document Mapper (ODM) for working with MongoDB from Python. The ODM layer maps an object model to a document database in much the same way that an ORM maps an object model to a relational database. ODMs like MongoEngine offer relational database-like features, such as schema enforcement, foreign keys, and field-level constraints, at the application level.
Many good resources are available for learning MongoEngine, including a tutorial here.
In this MongoDB Python tutorial, we will discuss a MongoEngine programming construct for creating indexes and the performance overhead associated with it.
Automatic Index Creation in MongoEngine
By default, MongoEngine stores documents in a collection named after the class, using a lowercased, snake_case form of the class name. For example, the User class shown below will be stored in a collection named user. A model must inherit from the MongoEngine class Document to become a mapped object.
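from mongoengine import Document, StringField

class User(Document):
    # Fields used by the save() example later in this post
    name = StringField()
    email = StringField()
    address = StringField()

    meta = {
        'indexes': [
            {'fields': ['+name']},   # ascending index on name
            {'fields': ['#email']},  # hashed index on email
        ]
    }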
The User class defined above declares two indexes: an ascending index on name and a hashed index on email. MongoEngine creates each declared index at the first upsert operation, through a createIndex/ensureIndex call on the collection. It then attempts to create these indexes again every time a document is inserted into the collection.
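The prefixes in the 'indexes' list are shorthand for the index type. As a rough illustration of that mapping, here is a standalone sketch. It is not MongoEngine's internal code, and the helper names to_key_spec and default_index_name are made up for this example. It shows how the two declarations translate into a key specification and a default MongoDB index name:

```python
# Hypothetical helpers for illustration only; MongoEngine does this internally.
PREFIX_TO_DIRECTION = {
    "+": 1,         # ascending
    "-": -1,        # descending
    "#": "hashed",  # hashed index
    "$": "text",    # text index
}


def to_key_spec(fields):
    """Translate prefixed field names into (field, direction) pairs."""
    spec = []
    for field in fields:
        if field[0] in PREFIX_TO_DIRECTION:
            spec.append((field[1:], PREFIX_TO_DIRECTION[field[0]]))
        else:
            spec.append((field, 1))  # no prefix is treated as ascending
    return spec


def default_index_name(spec):
    """Join field and direction with underscores, e.g. name_1."""
    return "_".join(f"{field}_{direction}" for field, direction in spec)


for fields in (["+name"], ["#email"]):
    spec = to_key_spec(fields)
    print(spec, default_index_name(spec))
```

The resulting specs and names (name_1 and email_hashed) correspond to the key and name values in the createIndexes commands that the server logs.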
For example:
User(name="Ross", email="ross@example.com", address="127, Baker Street").save()
This call results in three command requests to the database server: two commands to ensure that the name and email indexes exist on the user collection, and one to do the actual upsert.
COMMAND [conn8640] command admin.$cmd command: createIndexes { createIndexes: "user", indexes: [ { background: false, name: "name_1", key: { name: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:0 reslen:149 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { W: 1 } } } protocol:op_query 0ms
COMMAND [conn8640] command admin.$cmd command: createIndexes { createIndexes: "user", indexes: [ { background: false, name: "email_hashed", key: { email: "hashed" } } ] } keyUpdates:0 writeConflicts:0 numYields:0 reslen:149 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { W: 1 } } } protocol:op_query 0ms
COMMAND [conn8640] command admin.user command: insert { insert: "user", ordered: true, documents: [ { name: "Ross", email: "ross@example.com", address: "127, Baker Street", _id: ObjectId('584419df01f38269dd9d63c1') } ], writeConcern: { w: 1 } } ninserted:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:40 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { W: 1 } }, Collection: { acquireCount: { w: 1 } } } protocol:op_query 0ms
This is acceptable for applications with a low to moderate write load. However, if your application is write-intensive, the extra index commands sent with every write can seriously degrade write performance.
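To get a feel for how the extra requests add up, here is a back-of-the-envelope estimate. It assumes the behavior described above (one createIndexes command per declared index, plus one insert, on every save). The function name and the figure of 100,000 saves are made up for illustration, and the sketch counts commands only, not latency:

```python
def commands_for_saves(num_saves, num_indexes, ensure_on_every_save=True):
    """Estimate server commands issued for a number of save() calls.

    Assumes one insert per save, plus one createIndexes command per
    declared index when indexes are ensured on every save.
    """
    per_save = 1 + (num_indexes if ensure_on_every_save else 0)
    return num_saves * per_save


with_ensure = commands_for_saves(100_000, 2, ensure_on_every_save=True)
inserts_only = commands_for_saves(100_000, 2, ensure_on_every_save=False)

print(with_ensure)   # 300000
print(inserts_only)  # 100000
```

Under these assumptions, two declared indexes triple the number of commands sent to the server, even though each individual createIndexes call on an existing index is cheap.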
Avoiding Auto Index Creation
If 'auto_create_index' is set to False in the meta dictionary, MongoEngine skips the automatic creation of indexes, and no extra createIndex requests are sent during write operations. Turning off automatic index creation is also useful in production systems, where indexes are typically applied during database deployment.
For example:
meta = { 'auto_create_index': False, 'indexes': [ ..... ] }
If you are designing a write-intensive application, it makes sense to decide on your indexes during the schema design phase and deploy them before the application itself is deployed. If you are planning to add indexes to existing collections, it is better to follow the MongoDB documentation on building indexes on a replica set. With this approach, we take the servers down one at a time and build the indexes on them.
Use the MongoEngine create_index method to create indexes from within the application:
User.create_index(keys, background=False, **kwargs)
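As a minimal sketch of how this call could be wrapped in a one-off deployment step, the helper below invokes create_index once per declared key. The names build_indexes and DryRunModel are hypothetical. DryRunModel is only a stand-in that records calls so the sketch runs without a database; in a real deployment you would pass the User class after connecting to MongoDB.

```python
def build_indexes(model, index_keys, background=False):
    """Call model.create_index once per key and collect the results."""
    return [model.create_index(keys, background=background) for keys in index_keys]


class DryRunModel:
    """Hypothetical stand-in that records calls instead of contacting a server."""

    calls = []

    @classmethod
    def create_index(cls, keys, background=False, **kwargs):
        cls.calls.append((keys, background))
        return f"would create index on {keys}"


# Same keys as declared in the User model's meta dictionary.
results = build_indexes(DryRunModel, ["+name", "#email"])
for line in results:
    print(line)
```

Running this step once at deployment time, with 'auto_create_index' turned off in the model, keeps index creation out of the write path.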
You can also use the ScaleGrid UI to help you build indexes in a 'rolling fashion' with no downtime. For more details, refer to our MongoDB index building blog post.