Thursday, January 15, 2026

How does it scale? A simple OLTP benchmark on MongoDB


Choosing a database requires making sure that performance stays fast as your data grows. For example, if a query takes 10 milliseconds on a small dataset, it should still be fast as the data volume increases and should never approach the 100 ms threshold that users perceive as waiting. Here's a simple benchmark: we insert batches of 1,000 operations into random accounts, then query for the account with the most recent operation in a given category—an OLTP scenario using filtering and pagination. As the collection grows, a full collection scan would slow down, so secondary indexes are essential.

We create an accounts collection, where each account belongs to a category and holds a number of operations—a typical one-to-many relationship—with an index for our query on operations per category:

db.accounts.createIndex({
  "category": 1,
  "operations.date": 1,
  "operations.amount": 1,
});

The index follows the MongoDB Equality, Sort, Range guideline.
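The Equality, Sort, Range (ESR) rule puts the equality field (category) first and the sort field (operations.date) next, so that all index entries for a category sit in one contiguous, date-ordered range. Here is a standalone Node.js sketch (made-up data, no MongoDB required) of why that ordering lets the latest date be found with a bounded search instead of a scan:

```javascript
// Simulate a composite index as an array of [category, date] keys,
// sorted by category first, then date (the ESR ordering).
const entries = [
  [0, "2025-01-01"], [0, "2025-03-01"],
  [1, "2025-02-01"], [1, "2025-04-01"],
  [2, "2025-01-15"],
].sort((a, b) => a[0] - b[0] || a[1].localeCompare(b[1]));

// Binary search: index of the first entry whose category exceeds `category`.
function upperBound(category) {
  let lo = 0, hi = entries.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (entries[mid][0] <= category) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// The latest date for a category is the entry just before the upper
// bound of its range — one lookup, like a backward index scan.
function latestDate(category) {
  const i = upperBound(category) - 1;
  return i >= 0 && entries[i][0] === category ? entries[i][1] : null;
}

console.log(latestDate(1)); // -> "2025-04-01"
```

With the fields in the other order (date first), a category's entries would be scattered across the whole structure and no such bounded search would exist.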

To increase the data volume, this function inserts operations into accounts (randomly distributed over 10 million accounts and 3 categories):

function insert(num) {
  const ops = [];
  for (let i = 0; i < num; i++) {
    const account  = Math.floor(Math.random() * 10_000_000) + 1;
    const category = Math.floor(Math.random() * 3);
    const operation = {
      date: new Date(),
      amount: Math.floor(Math.random() * 1000) + 1,
    };
    ops.push({
      updateOne: {
        filter: { _id: account },
        update: {
          $set: { category: category },
          $push: { operations: operation },
        },
        upsert: true,
      }
    });
  }
  db.accounts.bulkWrite(ops);
}
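Because account ids and categories are drawn uniformly, each batch spreads roughly evenly over the 3 categories, and upserts gradually give way to `$push` updates as the 10 million account ids fill up. A standalone Node.js sketch (no MongoDB; it mirrors only the batch generator's randomness) showing the category spread of one batch:

```javascript
// Draw `num` category values the same way the insert() batch does
// (uniform over {0, 1, 2}) and tally how many land in each category.
function categorySpread(num) {
  const perCategory = [0, 0, 0];
  for (let i = 0; i < num; i++) {
    perCategory[Math.floor(Math.random() * 3)]++;
  }
  return perCategory;
}

const counts = categorySpread(1000);
// Each category gets roughly a third of the batch.
console.log(counts, "total:", counts.reduce((a, b) => a + b, 0));
```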

This adds 1,000 operations and should take less than one second:

let time = Date.now();
insert(1000);
console.log(`Elapsed ${Date.now() - time} ms`);

A typical query fetches the account, in a category, that had the most recent operation:

function query(category) {
  return db.accounts.find(
    { category: category },
    { "operations.amount": 1 , "operations.date": 1 }
  ).sort({ "operations.date": -1 })
   .limit(1);
}
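Note that sorting descending on an array field compares documents by their maximum element, so this query returns the account whose newest operation is the most recent overall. Without the index, the server would have to examine every account in the category to find it. A standalone Node.js sketch (made-up data, no MongoDB) of what the query computes:

```javascript
// Toy accounts collection: each account has a category and operations.
const accounts = [
  { _id: 1, category: 1, operations: [{ date: "2025-11-10", amount: 5 }] },
  { _id: 2, category: 1, operations: [{ date: "2025-11-12", amount: 9 }] },
  { _id: 3, category: 0, operations: [{ date: "2025-11-13", amount: 7 }] },
];

// Equivalent of find({category}).sort({"operations.date": -1}).limit(1):
// keep the account whose newest operation has the greatest date.
function latestAccount(category) {
  let best = null, bestDate = "";
  for (const acc of accounts) {
    if (acc.category !== category) continue;
    for (const op of acc.operations) {
      if (op.date > bestDate) { bestDate = op.date; best = acc; }
    }
  }
  return best;
}

console.log(latestAccount(1)._id); // -> 2
```

The index makes this a single backward scan of one category's range instead of the nested loops above.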

Such a query should take a few milliseconds:

let time = Date.now();
print(query(1).toArray());
console.log(`Elapsed ${Date.now() - time} ms`);

I repeatedly insert new operations, in batches of 1,000, in a loop, and measure the time taken by the query while the collection grows, stopping once I reach one billion operations randomly distributed across the accounts:

for (let i = 0; i < 1_000_000; i++) {
  // more data
  insert(1000);
  // same query
  const start = Date.now();
  const results = query(1).toArray();
  const elapsed = Date.now() - start;
  print(results);
  // elapsed time
  console.log(`Elapsed ${elapsed} ms`);
}
console.log(`Total accounts: ${db.accounts.countDocuments()}`);
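Watching individual timings scroll by is noisy. If you want a summary instead, here is a small standalone helper (plain Node.js, hypothetical names, not part of the benchmark script) that could collect the elapsed times and report percentiles:

```javascript
// Collect latency samples (ms) and report min / median / p99 / max.
function summarize(samples) {
  const s = [...samples].sort((a, b) => a - b);
  const pick = (p) => s[Math.min(s.length - 1, Math.floor(p * s.length))];
  return { min: s[0], p50: pick(0.5), p99: pick(0.99), max: s[s.length - 1] };
}

console.log(summarize([3, 2, 4, 3, 2, 9, 3, 2, 3, 4]));
// -> { min: 2, p50: 3, p99: 9, max: 9 }
```

Pushing each `elapsed` value into an array inside the loop and calling `summarize` at the end shows at a glance whether the tail latency drifts as the collection grows.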

In a scalable database, the response time should not increase significantly while the collection grows. I ran this on MongoDB, and response time stays in single-digit milliseconds. I also ran it on an Oracle Autonomous Database with the MongoDB API emulation, but I cannot publish the results because Oracle's terms forbid the publication of database benchmarks (the DeWitt Clause). However, you can copy/paste this test and watch the elapsed time while data is growing, on your own infrastructure. I also tested Amazon DocumentDB with the new query planner, but because the index is not used for pagination, response time increases as the collection grows.

I ran it on MongoDB for a few hours, then checked the collection size and the elapsed query times:


test> db.accounts.countDocuments();

9797064

test> db.accounts.aggregate([
  { $project: { category: 1, opsCount: { $size: "$operations" } } },
  { $group: { _id: "$category", operations: { $sum: "$opsCount" } } },
  { $sort: { _id: 1 } }
 ] );

[
  { _id: 0, operations: 8772705 },
  { _id: 1, operations: 8771114 },
  { _id: 2, operations: 8771181 }
]

test> let time = Date.now(); insert(1000); console.log(`Elapsed ${Date.now() - time} ms`);

Elapsed 292 ms

test> let time = Date.now(); print(query(1).toArray()); console.log(`Elapsed ${Date.now() - time} ms`);

[
  {
    _id: 7733303,
    operations: [
      { date: ISODate('2025-11-11T14:41:44.139Z'), amount: 68 },
      { date: ISODate('2025-11-11T14:50:58.409Z'), amount: 384 },
      { date: ISODate('2025-11-11T15:57:15.743Z'), amount: 890 },
      { date: ISODate('2025-11-11T17:09:52.410Z'), amount: 666 },
      { date: ISODate('2025-11-11T17:50:08.232Z'), amount: 998 }
    ]
  }
]

Elapsed 3 ms


The collection now holds about 26 million operations distributed across nearly 10 million accounts. Inserts still take less than a millisecond per document (292 ms for 1,000 documents), and the query still takes 3 milliseconds to get the account with the most recent operation in a category. This is the scalability you want for an OLTP application.

Here's a Docker Compose you can use as a template to run your own tests: mongo-oltp-scalability-benchmark.
