How to Stop a Runaway Index Build in MongoDB®


Index builds in MongoDB can have an adverse impact on the availability of your MongoDB cluster.  If you trigger a foreground index build on a large collection on your production server, you may find that your cluster is unresponsive until the index build is complete.  On a large collection, this could take several hours or days, as described in the perils of index building in MongoDB.

The recommended best practice is to trigger index builds in the background. However, on large collections, we’ve seen multiple problems with this approach. In a three-node cluster, both secondaries start building the index and stop responding to requests. Consequently, the primary loses quorum and steps down to the secondary state, taking your cluster down. Also, index builds triggered from the command line are foreground builds by default, which makes this a widespread problem. We’re hopeful that future releases will make background builds the default.

Once you’ve triggered an index build, simply restarting the server does not solve the problem; MongoDB will pick up the index build again on startup. If you were previously running a background index build, it becomes a foreground index build after the restart, so in this case the restart could make the problem worse.

If you’ve already triggered an index build, how do you stop it? Luckily, it’s relatively easy to stop an index build.

Option 1: Kill the index build process

Locate the index build process using db.currentOp() and then kill the operation using db.killOp(<opid>). The index operation will look something like this:

{
    "opid" : 820659355,
    "active" : true,
    "lockType" : "write",
    ....
    "op" : "insert",
    "ns" : "xxxx",
    "query" : {
    },
    "client" : "xxxx",
    "desc" : "conn",
    "msg" : "index: (2/3) btree bottom up 292168587/398486401 64%"
}
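
To make the lookup step concrete, here is a minimal sketch of filtering the list of in-progress operations for index builds. It assumes, based on the sample above, that an index build reports a "msg" field beginning with "index:"; the exact shape of this output can differ between MongoDB versions. The helper name findIndexBuildOps and the second sample operation are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper (not a built-in shell function): return the in-progress
// operations that look like index builds.
// Assumption: as in the sample output above, an index build reports a "msg"
// field beginning with "index:". Other MongoDB versions may report it differently.
function findIndexBuildOps(inprog) {
  return inprog.filter(function (op) {
    return typeof op.msg === "string" && op.msg.indexOf("index:") === 0;
  });
}

// Sample data shaped like the output above, plus an unrelated, made-up query.
var sampleOps = [
  { opid: 820659355, active: true, op: "insert",
    msg: "index: (2/3) btree bottom up 292168587/398486401 64%" },
  { opid: 12345, active: true, op: "query" }
];

var indexBuildOpids = findIndexBuildOps(sampleOps).map(function (op) {
  return op.opid;
});
console.log(indexBuildOpids);

// In the mongo shell, the same filter could be applied to live data
// (db.currentOp() returns a document with an "inprog" array):
//   findIndexBuildOps(db.currentOp().inprog).forEach(function (op) {
//     db.killOp(op.opid);
//   });
```

The sample runs under Node.js as written. In the mongo shell, only the helper and the commented-out loop are needed. Review the matched operations before killing them, since the "msg" prefix is a heuristic rather than a guaranteed marker.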

If the node where the index is building does not respond to new connections, or the killOp does not work, use Option 2 below:

Option 2: Configure “--noIndexBuildRetry” and restart

MongoDB provides a “--noIndexBuildRetry” option which instructs MongoDB to stop building incomplete indexes on restart.

This parameter doesn’t appear to be supported from the config file, only as a command-line parameter for the mongod process. We prefer not to run mongod manually with this option because if you accidentally run the mongod process as an elevated user (e.g., root), it ends up changing the permissions of all the files. Also, once run as “root”, we’ve had intermittent problems running the process as mongod again.

A simpler option is to edit the /etc/init.d/mongod file. Look for this line:

OPTIONS=" -f $CONFIGFILE"

Replace with this line:

OPTIONS=" -f $CONFIGFILE --noIndexBuildRetry"

Detailed steps

For the purposes of this discussion, we’re providing instructions for CentOS/RedHat/Amazon Linux.

  1. Configure “--noIndexBuildRetry”

    Add the “--noIndexBuildRetry” option to all your data nodes as explained above.

  2. Restart all the nodes building the index

    Look at the mongod log file for each data server and determine if it’s building the index. If it is, restart the server with “service mongod restart”.

  3. Drop the incomplete index

    Once all of the relevant nodes are restarted, look at the list of indexes and drop the incomplete index if you see it on the list.

  4. Remove “--noIndexBuildRetry”

    Edit the /etc/init.d/mongod file to remove the --noIndexBuildRetry option that you added in step 1, which reverts to the default behavior of resuming the index build.
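
Step 3 above can be sketched as a small guard around the shell’s getIndexes() and dropIndex() collection methods. The helper name dropIndexIfPresent, the stub collection, and the index name "myfield_1" are all hypothetical, used only so the sketch runs on its own. In a real shell session you would pass an actual collection object instead of the stub.

```javascript
// Hypothetical helper: drop the named index only if it is still listed.
// `coll` is expected to expose getIndexes() and dropIndex(name), as mongo
// shell collection objects do.
function dropIndexIfPresent(coll, indexName) {
  var present = coll.getIndexes().some(function (idx) {
    return idx.name === indexName;
  });
  if (present) {
    coll.dropIndex(indexName);
  }
  return present;
}

// Stand-in for a shell collection so the sketch runs without a server.
function makeStubCollection(indexNames) {
  var names = indexNames.slice();
  return {
    getIndexes: function () {
      return names.map(function (n) { return { name: n }; });
    },
    dropIndex: function (name) {
      names = names.filter(function (n) { return n !== name; });
    }
  };
}

// "myfield_1" is a made-up name standing in for the incomplete index.
var stub = makeStubCollection(["_id_", "myfield_1"]);
var dropped = dropIndexIfPresent(stub, "myfield_1");
var remaining = stub.getIndexes().map(function (idx) { return idx.name; });
console.log(dropped, remaining);
```

Checking the list first means the helper is safe to run on a node where the incomplete index never appeared.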

For any further questions, reach out to us at support@scalegrid.io.

Happy indexing!
