Cassandra vs. MongoDB

Are you considering Cassandra or MongoDB as the data store for your next project? Would you like to compare the two databases? Cassandra and MongoDB are both “NoSQL” databases, but the reality is that they are very different. They have very different strengths and value propositions, so any comparison has to be a nuanced one.

Let’s start with initial requirements. Neither of these databases replaces RDBMS, nor are they “ACID” databases. So if you have a transactional workload where normalization and consistency are the primary requirements, neither of these databases will work for you. You are better off sticking with traditional relational databases like MySQL, PostgreSQL, Oracle, etc.

Now that we have relational databases out of the way, let’s consider the major differences between Cassandra and MongoDB that will help you make the decision. In this post, I am not going to discuss specific features but will point out some high-level strategic differences to help you make your choice.

1.  Expressive Object Model

MongoDB supports a rich and expressive object model. Objects can have properties and objects can be nested in one another (for multiple levels). This model is very “object-oriented” and can easily represent any object structure in your domain. You can also index the property of any object at any level of the hierarchy – this is strikingly powerful! Cassandra, on the other hand, offers a fairly traditional table structure with rows and columns. Data is more structured and each column has a specific type which can be specified during creation.
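To make the nested model concrete, here is a minimal sketch in plain Python. MongoDB addresses nested properties with dot notation (for example `address.city`); the `customer` document, its field names, and the `get_path` helper below are all hypothetical and only imitate that addressing scheme. They are not part of any MongoDB driver.

```python
from typing import Any

# A hypothetical customer document; the field names are illustrative only.
customer = {
    "name": "Ada",
    "address": {"city": "London", "geo": {"lat": 51.5, "lon": -0.12}},
    "orders": [{"sku": "A-1", "qty": 2}, {"sku": "B-7", "qty": 1}],
}


def get_path(doc: dict, path: str) -> Any:
    """Resolve a dotted path such as 'address.geo.lat' against nested dicts.

    Returns None if any step along the path is missing.
    """
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


print(get_path(customer, "address.city"))     # London
print(get_path(customer, "address.geo.lat"))  # 51.5
print(get_path(customer, "address.zip"))      # None
```

In a table-oriented model, the same data would typically be spread across typed columns declared up front rather than nested freely.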

Verdict: If your problem domain needs a rich data model, then MongoDB is a better fit for you.


2.  Secondary Indexes

Secondary indexes are a first-class construct in MongoDB. You can index any property of an object stored in MongoDB, even a nested one, which makes it really easy to query on those properties. Cassandra has only cursory support for secondary indexes, and they are limited to single columns and equality comparisons. If you are mostly going to be querying by the primary key, then Cassandra will work well for you.
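As a rough illustration of what a single-column, equality-only secondary index does, here is a minimal in-memory sketch. The `users` rows and the `build_index` helper are hypothetical; neither database implements its indexes this way, but the lookup pattern (value to primary keys, then primary keys to rows) is the same idea.

```python
from collections import defaultdict

# Hypothetical rows keyed by primary key.
users = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Linus", "city": "Helsinki"},
    3: {"name": "Grace", "city": "London"},
}


def build_index(rows: dict, column: str) -> dict:
    """Map each value of `column` to the set of primary keys that hold it."""
    index = defaultdict(set)
    for pk, row in rows.items():
        if column in row:
            index[row[column]].add(pk)
    return index


city_index = build_index(users, "city")

# Equality lookup through the index, then fetch the rows by primary key.
londoners = sorted(users[pk]["name"] for pk in city_index["London"])
print(londoners)  # ['Ada', 'Grace']
```

An index like this answers "city equals X" directly, but range or multi-column queries would need a different structure, which is where a richer index model helps.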

Verdict:  If your application needs secondary indexes and needs flexibility in the query model then MongoDB is a better fit for you.


3.  High Availability

MongoDB supports a “single master” model. This means you have a master node and a number of slave nodes. In case the master goes down, one of the slaves is elected as master. This process happens automatically but it takes time, usually 10-40 seconds. During this time of new leader election, your replica set is down and cannot take writes. This works for most applications but ultimately depends on your needs. Cassandra supports a “multiple master” model. The loss of a single node does not affect the ability of the cluster to take writes – so you can achieve 100% uptime for writes.
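To judge whether an election window matters for your application, it helps to estimate how many writes arrive while no primary is available. The sketch below is plain arithmetic; the 1,000 writes per second figure is a made-up load, and the 10 and 40 second bounds come from the range quoted above.

```python
def writes_during_failover(writes_per_second: float, failover_seconds: float) -> float:
    """Writes that arrive while no primary is available to accept them."""
    return writes_per_second * failover_seconds


# Hypothetical load of 1,000 writes/s against a 10-40 second election window.
rate = 1_000
low = writes_during_failover(rate, 10)
high = writes_during_failover(rate, 40)
print(low, high)  # 10000 40000
```

Those writes are not necessarily lost: the application has to retry, queue, or reject them, so the number is best read as the backlog your failover handling must absorb.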

Verdict: If you need 100% uptime, Cassandra is a better fit for you.


4.  Write Scalability

MongoDB with its “single master” model can take writes only on the primary. The secondary servers can only be used for reads. So if you have a three-node replica set, only the master is taking writes and the other two nodes are only used for reads. This greatly limits write scalability. You can deploy multiple shards, but still only 1/3 of your data nodes can take writes. Cassandra with its “multiple master” model can take writes on any server. Your write scalability is limited only by the number of servers you have in the cluster: the more servers you have in the cluster, the better it will scale.
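The 1/3 figure follows directly from the topology. Here is a small sketch of the arithmetic, assuming three-node replica sets with one primary per shard on the single-master side and every node accepting writes on the multi-master side; both function names are hypothetical.

```python
from fractions import Fraction


def writable_fraction_single_master(shards: int, replica_set_size: int = 3) -> Fraction:
    """One primary per shard accepts writes; the remaining members do not."""
    total_nodes = shards * replica_set_size
    return Fraction(shards, total_nodes)


def writable_fraction_multi_master(nodes: int) -> Fraction:
    """Every node in the cluster can accept writes."""
    return Fraction(nodes, nodes)


print(writable_fraction_single_master(shards=4))  # 1/3
print(writable_fraction_multi_master(nodes=12))   # 1
```

Adding shards raises the absolute number of writable nodes, but the fraction stays at one over the replica set size.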

Verdict: If write scalability is your thing, Cassandra is a better fit for you.


5.  Query Language Support

Cassandra supports the CQL query language, which is very similar to SQL. If you already have a team of data analysts, they will be able to port over a majority of their SQL skills, which is very important to large organizations. However, CQL is not full-blown ANSI SQL; it has several limitations (no join support, no OR clauses, etc.). MongoDB at this point has no support for a SQL-like query language; queries are structured as JSON fragments.
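To show the difference in style, the sketch below puts a CQL statement next to the equivalent MongoDB-style filter and projection, written as Python dictionaries. The `users` table, its columns, and the `matches` helper are hypothetical; the helper only handles plain equality and is a stand-in for the database, not a driver API.

```python
import json

# CQL reads like SQL (hypothetical table and columns).
cql_query = "SELECT name, city FROM users WHERE user_id = 42;"

# The same lookup as a MongoDB-style filter and projection (JSON fragments).
mongo_filter = {"user_id": 42}
mongo_projection = {"name": 1, "city": 1, "_id": 0}


def matches(doc: dict, flt: dict) -> bool:
    """Equality-only matcher: every key in the filter must equal the doc's value."""
    return all(doc.get(key) == value for key, value in flt.items())


docs = [
    {"user_id": 7, "name": "Linus", "city": "Helsinki"},
    {"user_id": 42, "name": "Ada", "city": "London"},
]

# Apply the filter, then keep only the fields the projection includes.
wanted = [field for field, keep in mongo_projection.items() if keep]
selected = [{f: d[f] for f in wanted} for d in docs if matches(d, mongo_filter)]

print(json.dumps(mongo_filter))  # {"user_id": 42}
print(selected)                  # [{'name': 'Ada', 'city': 'London'}]
```

The JSON form is easy to build programmatically, while the CQL form is easier to read for anyone who already knows SQL.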

Verdict: If you need query language support, Cassandra is the better fit for you.


6.  Performance Benchmarks

Let’s talk performance.  At this point, you are probably expecting a performance benchmark comparison of the databases.  I have deliberately not included performance benchmarks in the comparison. In any comparison, we have to make sure we are making an apples-to-apples comparison.

1.  Database model - The database model/schema of the application being tested makes a big difference. Some schemas are well suited for MongoDB and some are well suited for Cassandra. So when comparing databases it is important to use a model that works reasonably well for both databases.
2.  Load characteristics - The characteristics of the benchmark load are very important. For example, in write-heavy benchmarks, I would expect Cassandra to smoke MongoDB. However, in read-heavy benchmarks, MongoDB and Cassandra should be similar in performance.
3.  Consistency requirements - This is a tricky one. You need to make sure that the read/write consistency requirements specified are identical in both databases and not biased towards one participant. Very often in “marketing” benchmarks, the knobs are tuned to disadvantage the other side. So, pay close attention to the consistency settings.
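When checking the consistency settings of a benchmark, one useful test is the replica overlap rule used with Cassandra's tunable consistency: a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The helper names below are hypothetical; the rule itself is standard quorum arithmetic.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas, e.g. 2 out of 3."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when the read set and the write set must overlap in one replica."""
    return read_replicas + write_replicas > replication_factor


rf = 3
print(quorum(rf))                                          # 2
print(reads_see_latest_write(rf, quorum(rf), quorum(rf)))  # True
print(reads_see_latest_write(rf, 1, 1))                    # False
```

A benchmark that writes and reads a single replica on one side while requiring majority acknowledgement on the other is not comparing like with like.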

One last thing to keep in mind is that the benchmark load may or may not reflect the performance of your application. So in order for benchmarks to be useful, it is very important to find a benchmark load that reflects the performance characteristics of your application. Here are some benchmarks you might want to look at:
NoSQL Performance Benchmarks
Cassandra vs. MongoDB vs. Couchbase vs. HBase

7.  Ease of Use

If you had asked this question a couple of years ago, MongoDB would have been the hands-down winner. It’s a fairly simple task to get MongoDB up and running. In the last couple of years, however, Cassandra has made great strides in this aspect of the product. The adoption of CQL as the primary interface has taken this a step further and made it very simple for legions of SQL programmers to use Cassandra.

Verdict: Both are fairly easy to use and ramp up.


8.  Native Aggregation

MongoDB has a built-in aggregation framework that runs an ETL-style pipeline to transform the data stored in the database. This is great for small to medium jobs, but as your data processing needs become more complicated, the aggregation framework becomes difficult to debug. Cassandra does not have a built-in aggregation framework; external tools such as Hadoop and Spark are used for this.
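For a sense of what a small pipeline looks like, here is a sketch with two stages, `$match` and `$group`, written as Python data, followed by a plain-Python function that computes the same result in memory. The `orders` documents and field names are hypothetical, and `shipped_totals` is only an illustration of what the pipeline expresses, not how MongoDB executes it.

```python
from collections import defaultdict

# A two-stage pipeline in MongoDB's aggregation syntax:
# keep shipped orders, then sum the amount per customer.
pipeline = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
]

# Hypothetical order documents.
orders = [
    {"customer": "ada", "status": "shipped", "amount": 30},
    {"customer": "ada", "status": "pending", "amount": 5},
    {"customer": "linus", "status": "shipped", "amount": 12},
    {"customer": "ada", "status": "shipped", "amount": 8},
]


def shipped_totals(docs: list) -> dict:
    """In-memory equivalent of the pipeline above."""
    totals = defaultdict(int)
    for doc in docs:
        if doc["status"] == "shipped":              # the $match stage
            totals[doc["customer"]] += doc["amount"]  # the $group / $sum stage
    return dict(totals)


print(shipped_totals(orders))  # {'ada': 38, 'linus': 12}
```

Each extra stage is easy to add, which is also why long pipelines can become hard to debug: an unexpected result may come from any stage in the chain.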

9.  Schema-less Models

In MongoDB, you can choose not to enforce any schema on your documents. While this was the default in prior versions, in newer versions you have the option to enforce a schema for your documents. Each document in MongoDB can have a different structure, and it is up to your application to interpret the data. While this is not relevant to most applications, in some cases the extra flexibility is important. Cassandra in the newer versions (with CQL as the default language) provides static typing. You need to define the type of every column upfront.
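The sketch below shows what "up to your application to interpret the data" can mean in practice: documents of different shapes sit side by side, and the application checks for the fields it needs. The `events` documents, the `REQUIRED` mapping, and `missing_fields` are all hypothetical; this is application-side checking, not MongoDB's own schema validation feature.

```python
# Hypothetical documents stored together despite having different shapes.
events = [
    {"type": "click", "x": 10, "y": 20},
    {"type": "purchase", "sku": "A-1", "amount": 9.99},
    {"type": "click", "x": 3},  # "y" is missing
]

# Fields the application expects for each event type (an assumption).
REQUIRED = {
    "click": {"x", "y"},
    "purchase": {"sku", "amount"},
}


def missing_fields(doc: dict) -> set:
    """Return the required fields that are absent from this document."""
    required = REQUIRED.get(doc.get("type"), set())
    return required - set(doc)


for event in events:
    print(event["type"], missing_fields(event))
```

With statically typed columns, the third document would be caught or filled with a null at write time instead of being discovered by the reader.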

To summarize, here are the important differences in table form:
[Image: Cassandra vs. MongoDB comparison table]
If you wish to view the full infographic, you can visit our Cassandra vs MongoDB comparison page.

Dharshan is the founder of (formerly He is an experienced MongoDB developer and administrator. He can be reached for further comment at @dharshanrg

  • Alex Tyutchev

    Hi. I was evaluating both DBs recently as well, and came to more or less the same conclusion as you did. The only point I disagree on is 5. Even if Cassandra has CQL, it is fairly limited and cannot really be compared with SQL. For example, there isn’t even an OR operator.

    • John A. De Goes

      In addition, the Quasar open source project brings powerful SQL to MongoDB, and it’s leveraged by SlamData (among other applications).

    • Dharshan

      Fair point Alex. Post has been updated to note the limitations of CQL.

    • Nisar Adappadathil

      Cassandra restricts its queries to a partition. So using an OR operator would mean querying different partitions, which is not recommended in Cassandra. While doing data modeling, you have to partition your data so that querying is more efficient.

  • Kelly Stirman

    Disclosure – I work for MongoDB. Here are some observations:

    #2. If you are mostly querying on primary key, then *any* database will work for you.

    #3. In MongoDB 3.2 and later, failures are detected and a new leader elected in under 2 seconds. The trade off for multi-master is that reads are slower and scale less effectively because the client must read from multiple nodes to ensure consistency.

    #4. In MongoDB , you can stripe primaries and secondaries across all nodes so that all nodes are capable of serving reads and writes. The trade off is capacity in the event of a failure. The same trade off exists for Cassandra.

    #5. MongoDB has a query language – MongoDB Query Language. I think your point is specific to SQL. MongoDB provides a Connector for BI that supports ANSI SQL, whereas Cassandra’s CQL is a variant of SQL so existing tools are not compatible.

    #7. I still think MongoDB has a massive advantage in terms of ease of use. The Cassandra data model – while based on tables – is very different from an RDBMS. For example, there are no joins, and no secondary indexes. Both products require new skills in terms of modeling data. For MongoDB documents mostly look like the objects in your code, which is pretty natural and easy to understand. There are also far more drivers and frameworks compatible with MongoDB, as well as a wider range of tools that support the database.

    #8. Both MongoDB and Cassandra work with Spark and Hadoop. These are heavy-weight tools with their own resources, skills, dependencies, security concerns, and other factors to consider. You can go very, very far with MongoDB’s aggregation framework while staying within the MongoDB ecosystem. There are no options to do this in Cassandra.

    #9. Your chart is cut off.

    • Dharshan

      Thanks Kelly. In your answer section #3 is 2 seconds the average/best/worst case time? It will be good to share more information on this.

      • SR-71

        2 seconds is a killer. That would be over 8,000 lost writes in systems I have built in the past.

        • Mathieu Poussin

          Shouldn’t your system be built to buffer and allow for a failover time in this case?

          • SR-71

            Buffer? Nah, dead letter queue. Buffers create all sorts of headaches like duplicates and single points of failure.

          • Sieu Nhan Gao

            Can you share your solution @disqus_oMu537SQ2V:disqus

          • SR-71

            Message brokers are written from the ground up to answer this very issue. Both Apache ActiveMQ and Kafka will provide out of the box dead letter queue handling. Then you write a process to recover from the failure, not write code trying to capture the failure and relevant data. It is easier to write the failure recovery process than trying to capture the data during failure. It is also a lot of code to detect failure accurately and know when to give up and persist the data somewhere so as to not lose it. Pick your battles. The message broker keeps you safe and then you can spend your time implementing interfaces to recover.

    • Nikhil Nanjappa

      I would love to use Mongo if that’s the case, but do we have any fact sheet on the 2 seconds of downtime?

    • disqus_Aa7Jjb9CXv

      salty <3

  • Lucas

    A little more basic comparison of MongoDB vs. Cassandra with Python.
    Definitely, the items you have mentioned would need to be considered after an initial prototype.

  • raul

    Could you please touch upon maintenance and cost effectiveness aspect for different sizes and on-premise v/s cloud aws hosted solutions?

    • Danielle

      Thanks for the feedback Raul, I will pass this request on to our devs to see how we can work this idea into the content!

  • ldmtwo

    Do you have a link to where I can find the rest of the cut-off chart? Thanks. I would really like to understand the differences more clearly. Cassandra has very high throughput (confirmed by my team), but seems to be relatively weak on ease of use. The O’Reilly Data Science Salary Survey 2016 shows 10% usage (of survey respondents) for MDB, but only 4% for Cassandra.

    • Dharshan

      The infographic should have the full table – You can find a link to it at the bottom of the article.

  • Sarath reddy

    Hi Dharshan, I am working on MongoDB and Cassandra performance and cost metrics for specific data. Could you please let me know how to proceed further regarding performance and costs?

    • Carlos El Sueco

      You’re asking too broadly I’m afraid. Try setting up a concrete goal or context for your question and someone might help you.

      • Sarath reddy

        Thanks for your reply. I am looking for a cost metrics analysis for both NoSQL databases (MongoDB and Cassandra) based on sample data. How can I recommend which database is best based on the performance and cost analysis when the databases are deployed on premises? Right now we are not going for cloud. I have seen the cost metrics in AWS, but I am just trying to figure out how I can design the cost analysis. Considerations: throughput efficiency, nodes, clusters, CPU performance, and everything from a hardware perspective as well. Thanks for reading my comment.

  • Carlos El Sueco

    Very easy to understand and objective comparison. Great thanks to you for this valuable intro.

  • Aakash Sharma

    what about multitenancy support. Also, if one needs to keep master data information such as customer data, affiliates etc. along with analytics information such as hits, pageviews etc. would it be advisable to keep Cassandra as the only database. Having two database will create additional layer to replicate information. But, keeping master data in cassandra will not allow for complex queries filters for search by name, joining dates etc. Any thoughts?

  • Vinod Jayakumar

    This was pretty helpful. thanks for putting this out neatly.

  • Anshul katta

    shut the fuck up , i still use mysql , works better than these kiddy dbs

  • peridotventures

    Please stop saying Cassandra doesn’t have secondary indices or aggregations. It has both; it didn’t use to but does now. Cassandra is opinionated: it requires that you think about your query cases, typing, and partitioning up front. You can decide whether that is a good design principle or not; I for one think it is. My understanding of Mongo is that it lets you defer thinking hard about these things until you have to, because stuff isn’t scaling anymore, at which point the problem is harder, but probably you are so successful that you can afford to re-engineer a bit. Mongo is schemaless to a degree, Cassandra is more strongly typed. The expressive object model can be done in either; I think that point is outdated or just wrong. With collections, maps, and user-defined types you can make very rich object models in Cassandra, with type safety. What I’ve seen with many Mongo designs is that people push type checking and validation into the API layer, which is fine, but I kinda laugh about it, because are you really schemaless now? It’s very similar to how people hate compilers and then end up writing a bazillion tests to replace what the compiler did for free. Feels like people just move the problem around and reinvent things a new way, but I digress. They are both useful tools, depending on your use cases.

    • David Griffiths

      The article is from 2016. The information was correct at the time the article was written. To ask them to ‘stop saying cassandra doesn’t have secondary indices’ doesn’t really make sense.

  • faris rayhan

    Two points that help me very much is High Availability and Write Scalability. It is very contrast with another database feature. Cassandra is good for transaction :)

  • Ravish Patel

    We use MongoDB and Cassandra for multiple clients. Below is what I would say:
    - MongoDB is easy to start and work with for throughput of up to 100k ops/s. If you want to scale beyond that, the benefit slope of performance gain vs. infrastructure is around 20°, while in the case of Cassandra this slope is around 35-50°. So if your need is something below 100k Cassandra works better with the risk of 2-5 seconds of node failure downtime.

    The cost of sharding for MongoDB is extremely high, with a typical sharded cluster consisting of 9-11 servers at minimum. In order to achieve a similar benefit from Cassandra you need only 3 servers!!!

    • FINDarkside

      > In order to achieve a similar benefit

      Are you talking about read performance, because in that case MongoDB scales horribly, even though it’s supposed to be “scalable”.

  • EduardoCupertino

    Great Article!

  • Devender Singh

    I was just going through the Cassandra tutorial. The statement in the blog that “Neither of these databases replaces RDBMS, nor are they “ACID” databases.” is a bit conflicting with what is present in the blog “”, which says Cassandra supports properties like Atomicity, Consistency, Isolation, and Durability (ACID).


    I ended up going with MariaDB :-)

  • Layne Sadler