MongoDB® Regex, Index & Performance

2 min read
MongoDB® Regex, Index & Performance

SHARE THIS ARTICLE

MongoDB® supports regular expressions using the $regex operator. However these MongoDB regex queries have a downside,  all but one type of regex makes poor use of indexes and results in performance problems. For a production server with large amounts of data, a bad regex query can bring your server to its knees.

MongoDB regex based queries are a fairly common query in most applications using MongoDB. This is similar to the ‘LIKE’ operation supported on most relational databases. The syntax of the command is as follows

{ $regex: /pattern/, $options: '<options>' }
E.g. { name: { $regex: /^acme.*test/}}

For more detailed information on the regex operation and additional options refer to the MongoDB documentation

For the rest of this discussion, we will assume that the field you are matching against has an index. If you don’t index it will result in a collection scan and very poor performance. However, even if the field is indexed it can result in poor performance. The reason is that MongoDB can make good use of indexes only if your regular expression is a “prefix expression” – these are expressions starting with the “^” character.

E.g.  { name: { $regex: /^acme/}}

This allows MongoDB to identify a range of the index entries that are relevant to this query and results in efficient queries. Any other query results in an index scan since MongoDB is not able to narrow the scan to a range of index entries. An index scan is particularly bad since all the indexes need to be paged into memory and this affects the working set of your server (In fact the index scan could lead to worse performance than a collection scan – it results in twice the number of page faults).

Let’s look at some examples and the resulting query plans. For our testing purposes I have setup a collection with 100k documents. Each document has a firstName field which is a 16 character string.

Example 1: { name: { $regex: /^acme/}}
Result : Efficient index usage
Query plan:

"executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 0,
    "executionTimeMillis" : 0,
    "totalKeysExamined" : 1,
    "totalDocsExamined" : 0,
}

Example 2: { name: { $regex: /^acme/i}}
Result : Inefficient index scan due to case insensitive requirement. So basically /i option negates the “prefix expression”
Query plan:

"executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 0,
    "executionTimeMillis" : 137,
    "totalKeysExamined" : 100000,
    "totalDocsExamined" : 0
}

Example 3: { name: { $regex: /acme.*corp/}}
Result : Inefficient index scan
Query plan:

    "executionSuccess" : true,
    "nReturned" : 0,
    "executionTimeMillis" : 167,
    "totalKeysExamined" : 100000,
    "totalDocsExamined" : 0

Example 4: { name: { $regex: /acme/}}
Result : Inefficient index scan

"executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 0,
    "executionTimeMillis" : 130,
    "totalKeysExamined" : 100000,
    "totalDocsExamined" : 0,
}
For more information, please visit www.scalegrid.io. Connect with ScaleGrid on LinkedIn, X, Facebook, and YouTube.
Table of Contents

Stay Ahead with ScaleGrid Insights

Dive into the world of database management with our monthly newsletter. Get expert tips, in-depth articles, and the latest news, directly to your inbox.

Related Posts

blog-feature-img_whats-new-at-scalegrid

What’s New at ScaleGrid – September 2024

At ScaleGrid, we’re always pushing the boundaries to offer more flexibility and scalability to our customers. Over the past few...

Managing PostgreSQL® High Availability – Part I: PostgreSQL Automatic Failover

Managing High Availability (HA) in your PostgreSQL hosting is very important to ensuring your database deployment clusters maintain exceptional uptime...

RabbitMQ Security Compliance - ScaleGrid

RabbitMQ Security and Compliance

Follow fundamental procedures in authentication, encryption, and commitment to RabbitMQ security protocols to protect your RabbitMQ system and secure messages....

NEWS

Add Headline Here