Index Prefix Compression in MongoDB 3.0 WiredTiger


MongoDB 3.0 with WiredTiger introduces a feature called 'index prefix compression', which can greatly reduce the memory consumed by indexes. Less memory used by indexes leaves more memory for documents or other indexes, which generally means better performance.

For best performance in MongoDB, keep your indexes in memory. A page miss on an index is a double whammy: one page fault to bring the index page into memory, and another page fault later to bring the data page into memory.

Technology

Index prefix compression does not use block compression (such as zlib or snappy); it is a different technique for storing indexes in memory. It reduces memory usage by storing identical key prefixes only once. "Key prefix compression" is a domain-specific way of compressing data and refers to the key storage format in WiredTiger. For more details, refer to the WiredTiger documentation on file formats.
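To make the idea of storing identical prefixes only once concrete, here is a minimal sketch in plain JavaScript. It is an illustration of the general technique only, not WiredTiger's actual storage format; the function names and the sample keys (including the "|" separator between last and first name) are assumptions made up for this example.

```javascript
// Illustrative prefix compression over sorted keys.
// NOTE: this is NOT WiredTiger's real format, just the general idea.

// Count how many leading characters two strings have in common.
function commonPrefixLength(a, b) {
  const max = Math.min(a.length, b.length);
  let i = 0;
  while (i < max && a[i] === b[i]) i++;
  return i;
}

// Encode each key as [characters shared with the previous key, remaining suffix].
function compressKeys(sortedKeys) {
  const out = [];
  let prev = "";
  for (const key of sortedKeys) {
    const shared = commonPrefixLength(prev, key);
    out.push([shared, key.slice(shared)]);
    prev = key;
  }
  return out;
}

// Rebuild the original keys from the compressed entries.
function decompressKeys(entries) {
  const keys = [];
  let prev = "";
  for (const [shared, suffix] of entries) {
    const key = prev.slice(0, shared) + suffix;
    keys.push(key);
    prev = key;
  }
  return keys;
}

// Hypothetical sorted keys for a {lastName: 1, firstName: 1} index.
const keys = ["Smith|Aaron", "Smith|Abby", "Smith|Adam", "Smithson|Ann"];
const compressed = compressKeys(keys);
console.log(compressed);
```

Because index keys are kept in sorted order, neighbouring keys tend to share long prefixes, so only a short suffix has to be stored for most entries. This is also why indexes on repetitive string fields, such as names, are likely to benefit the most.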

Performance Tests

For our performance tests, we use a document structure as detailed below:

{
   "employeeID": <long>,
   "firstName": <string>,
   "lastName": <string>,
   "income": <long>,
   "supervisor": {
       "ID": <long>, 
       "firstName": <string>, 
       "lastName": <string>
   }
}

We added the following indexes on this setup:

Index 1: db.collection.ensureIndex({'employeeID':1});
Index 2: db.collection.ensureIndex({'lastName':1, 'firstName':1});
Index 3: db.collection.ensureIndex({'income':1});
Index 4: db.collection.ensureIndex({'supervisor.lastName':1, 'supervisor.firstName':1});
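If you want to compare index sizes on your own deployment, one option is to read the `indexSizes` field of the collection stats, which maps each index name to its size in bytes. The helper below is a small sketch for converting those values to megabytes; the function name `indexSizesInMB` and the mock values are assumptions for illustration, and in the shell you would pass it the result of `stats()` on your own collection instead.

```javascript
// Convert the per-index byte counts from collection stats into megabytes.
// `stats` is expected to look like the result of db.<collection>.stats(),
// which includes an `indexSizes` object (index name -> size in bytes).
function indexSizesInMB(stats) {
  var result = {};
  Object.keys(stats.indexSizes).forEach(function (name) {
    var mb = stats.indexSizes[name] / (1024 * 1024);
    result[name] = Math.round(mb * 10) / 10; // keep one decimal place
  });
  return result;
}

// Mock stats object with made-up values, standing in for real shell output.
var mockStats = {
  indexSizes: { _id_: 104857600, employeeID_1: 98566144 }
};
console.log(indexSizesInMB(mockStats));
```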

Results

In our test run, we inserted identical data (about 10 million records) into two clusters: a MongoDB 2.6.x replica set (MMAP storage engine) and a MongoDB 3.0 cluster with WiredTiger. We then added the indexes above on both clusters. The results are quite staggering: in some cases there is an order of magnitude difference in index size!

Index name                                            MMAP index size (MB)   WT index size (MB)   % reduction in size
{employeeID:1}                                        230.7                  94                   59%
{lastName:1, firstName:1}                             1530                   36                   97%
{income:1}                                            230                    94                   59%
{'supervisor.lastName':1, 'supervisor.firstName':1}   1530                   35                   97%
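The reduction percentages can be reproduced from the two size columns. The sketch below assumes the published figures are truncated to a whole percent (an assumption on our part, since rounding would give 98% for the name indexes); the helper name `percentReduction` is made up for this example.

```javascript
// Percentage reduction from `before` to `after`, truncated to a whole number.
function percentReduction(before, after) {
  return Math.floor(((before - after) / before) * 100);
}

// Index sizes in MB, taken from the results table.
const rows = [
  { index: "{employeeID:1}", mmap: 230.7, wt: 94 },
  { index: "{lastName:1, firstName:1}", mmap: 1530, wt: 36 },
  { index: "{income:1}", mmap: 230, wt: 94 },
  { index: "{'supervisor.lastName':1, 'supervisor.firstName':1}", mmap: 1530, wt: 35 }
];

for (const row of rows) {
  console.log(row.index, percentReduction(row.mmap, row.wt) + "%");
}
```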

 

[Figure: MongoDB index sizes on 2.6.x]

[Figure: MongoDB index sizes on WiredTiger]

All the memory saved on indexes can be used for caching data, other indexes, and so on. Your mileage may vary, so be sure to test your particular index structure. The reduction in index size is a much-undersold improvement in MongoDB 3.0 and can make a tremendous difference to performance!
