You won't always have the perfect index for every query, but you may have single-field indexes for each filter. In such cases, PostgreSQL can use a Bitmap Scan to combine these indexes. MongoDB is also able to merge multiple index bounds in a single scan or use index intersection to combine separate scans. Yet, the MongoDB query planner rarely selects index intersection. Let's look at the reasons behind this.
TL;DR: if you think you need index intersection, you probably need better compound indexes.
Test setup in a lab
I create a collection with 100,000 documents and two fields, with an index on each:
let bulk = [];
for (let i = 0; i < 100000; i++) {
bulk.push({
a: Math.floor(Math.random() * 100),
b: Math.floor(Math.random() * 100)
});
}
db.demo.insertMany(bulk);
// separate indexes
db.demo.createIndex({ a: 1 });
db.demo.createIndex({ b: 1 });
In PostgreSQL, we'll mirror the dataset as follows:
CREATE TABLE demo AS
SELECT id,
(random()*100)::int AS a,
(random()*100)::int AS b
FROM generate_series(1,100000) id;
CREATE INDEX demo_a_idx ON demo(a);
CREATE INDEX demo_b_idx ON demo(b);
In my MongoDB collection of 100,000 documents, only 9 documents have both the "a" and "b" fields set to 42:
mongo> db.demo.countDocuments()
100000
mongo> db.demo.find({ a: 42, b: 42 }).showRecordId()
[
{ _id: ObjectId('6928697ae5fd2cdba9d53f54'), a: 42, b: 42, '$recordId': Long('36499') },
{ _id: ObjectId('6928697ae5fd2cdba9d54081'), a: 42, b: 42, '$recordId': Long('36800') },
{ _id: ObjectId('6928697ae5fd2cdba9d54a7c'), a: 42, b: 42, '$recordId': Long('39355') },
{ _id: ObjectId('6928697ae5fd2cdba9d55a3e'), a: 42, b: 42, '$recordId': Long('43389') },
{ _id: ObjectId('6928697ae5fd2cdba9d5a214'), a: 42, b: 42, '$recordId': Long('61779') },
{ _id: ObjectId('6928697ae5fd2cdba9d5e52a'), a: 42, b: 42, '$recordId': Long('78953') },
{ _id: ObjectId('6928697ae5fd2cdba9d5eeea'), a: 42, b: 42, '$recordId': Long('81449') },
{ _id: ObjectId('6928697ae5fd2cdba9d61f48'), a: 42, b: 42, '$recordId': Long('93831') },
{ _id: ObjectId('6928697ae5fd2cdba9d61f97'), a: 42, b: 42, '$recordId': Long('93910') }
]
In my PostgreSQL database, there are 100,000 rows and, among them, 9 rows have the value 42 in both the "a" and "b" columns:
postgres=# select count(*) from demo;
 count
--------
 100000
(1 row)
postgres=# SELECT *, ctid FROM demo WHERE a = 42 AND b = 42;
a | b | id | ctid
----+----+-------+-----------
42 | 42 | 4734 | (25,109)
42 | 42 | 15678 | (84,138)
42 | 42 | 29464 | (159,49)
42 | 42 | 29748 | (160,148)
42 | 42 | 31139 | (168,59)
42 | 42 | 37785 | (204,45)
42 | 42 | 55112 | (297,167)
42 | 42 | 85823 | (463,168)
42 | 42 | 88707 | (479,92)
I displayed the ctid for PostgreSQL and the RecordId for MongoDB, to see the distribution over the heap table (for PostgreSQL) or the WiredTiger B-Tree (for MongoDB).
MongoDB possible execution plans
I executed db.demo.find({ a: 42, b: 42 }), and multiple plans were evaluated:
mongo> db.demo.getPlanCache().list();
[
{
version: '1',
queryHash: 'BBC007A6',
planCacheShapeHash: 'BBC007A6',
planCacheKey: '51C56FDD',
isActive: true,
works: Long('968'),
worksType: 'works',
timeOfCreation: ISODate('2025-11-27T15:09:11.069Z'),
createdFromQuery: { query: { a: 42, b: 42 }, sort: {}, projection: {} },
cachedPlan: {
stage: 'FETCH',
filter: { b: { '$eq': 42 } },
inputStage: {
stage: 'IXSCAN',
keyPattern: { a: 1 },
indexName: 'a_1',
isMultiKey: false,
multiKeyPaths: { a: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { a: [ '[42, 42]' ] }
}
},
creationExecStats: [
{
nReturned: 12,
executionTimeMillisEstimate: 0,
totalKeysExamined: 967,
totalDocsExamined: 967,
executionStages: {
stage: 'FETCH',
filter: { b: { '$eq': 42 } },
nReturned: 12,
executionTimeMillisEstimate: 0,
works: 968,
advanced: 12,
needTime: 955,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 967,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 967,
executionTimeMillisEstimate: 0,
works: 968,
advanced: 967,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { a: 1 },
indexName: 'a_1',
isMultiKey: false,
multiKeyPaths: { a: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { a: [Array] },
keysExamined: 967,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
},
{
nReturned: 10,
executionTimeMillisEstimate: 0,
totalKeysExamined: 968,
totalDocsExamined: 968,
executionStages: {
stage: 'FETCH',
filter: { a: { '$eq': 42 } },
nReturned: 10,
executionTimeMillisEstimate: 0,
works: 968,
advanced: 10,
needTime: 958,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 0,
docsExamined: 968,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 968,
executionTimeMillisEstimate: 0,
works: 968,
advanced: 968,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 0,
keyPattern: { b: 1 },
indexName: 'b_1',
isMultiKey: false,
multiKeyPaths: { b: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { b: [Array] },
keysExamined: 968,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
},
{
nReturned: 7,
executionTimeMillisEstimate: 0,
totalKeysExamined: 968,
totalDocsExamined: 7,
executionStages: {
stage: 'FETCH',
filter: { '$and': [ [Object], [Object] ] },
nReturned: 7,
executionTimeMillisEstimate: 0,
works: 968,
advanced: 7,
needTime: 961,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 0,
docsExamined: 7,
alreadyHasObj: 0,
inputStage: {
stage: 'AND_SORTED',
nReturned: 7,
executionTimeMillisEstimate: 0,
works: 968,
advanced: 7,
needTime: 961,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 0,
failedAnd_0: 232,
failedAnd_1: 230,
inputStages: [ [Object], [Object] ]
}
}
}
],
candidatePlanScores: [ 2.012596694214876, 1.0105305785123966, 1.0073314049586777 ],
indexFilterSet: false,
estimatedSizeBytes: Long('4826'),
solutionHash: Long('6151768200665613849'),
host: '40ae92e83a12:27017'
}
]
The cached plan uses only one index, on "a", but there are two more possible plans in the cache: one using the index on "b" and another using a combination of both indexes with AND_SORTED. The scores (candidatePlanScores) are:
- 2.012 for the index on “a”
- 1.010 for the index on “b”
- 1.007 for the AND_SORTED intersection of the indexes on “a” and “b”
This may be surprising: given how the data was generated, we should expect similar costs for the two indexes. We can see that during the trial period on the query plans, the index on "a" finished its scan (isEOF: 1), and although the other two had similar performance and were going to finish, the trial period ended before they reached the end (isEOF: 0). MongoDB adds an EOF bonus of 1 when a trial plan finishes before the others, and that explains the higher score. So it may not be that the index on "a" is better than the other plans, but simply that all plans are good, and the first one to start also finished first, and got the bonus.
In addition to that, there's another small penalty on index intersection, so that even when none of the plans finish the scan before the trial limit, index intersection will never have a higher score. Finally, the scores are:
- Index on "a": 1 (base) + 0.012 (productivity) + 1.0 (EOF bonus) = 2.012
- Index on "b": 1 (base) + 0.010 (productivity) + 0 (no EOF) = 1.010
- AND_SORTED: 1 (base) + 0.007 (productivity) + 0 (no EOF) = 1.007
Even without penalties, the planner still wouldn't choose AND_SORTED. The underlying issue is the scoring (advanced divided by works), which only counts work units, not how 'light' each unit of work may be. For example, a plain index scan fetches the document and applies filters all in one work unit, while AND_SORTED waits for the intersection and skips unnecessary fetches and filtering. Yet, the planner doesn't consider the benefit of processing more keys to minimize fetch operations.
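The scores above can be reproduced with a small sketch of the scoring formula. This is a simplified reconstruction based on the explain output in this article; the 1e-4 epsilon cap and the bonus names are assumptions drawn from public descriptions of the plan ranker, not the actual server code.

```javascript
// Simplified trial-period plan scoring: base + productivity + EOF bonus,
// plus tiny tie-breaker bonuses that intersection plans do not all receive.
function planScore({ advanced, works, reachedEOF, usesIntersection }) {
  const productivity = advanced / works;          // useful results per work unit
  const eofBonus = reachedEOF ? 1 : 0;            // finished within the trial period
  const eps = Math.min(1 / (10 * works), 1e-4);   // assumed tie-breaker epsilon
  const noSortBonus = eps;                        // none of these plans blocks on a sort
  const noIxisectBonus = usesIntersection ? 0 : eps; // the intersection penalty
  return 1 + productivity + eofBonus + noSortBonus + noIxisectBonus;
}

// reproduces candidatePlanScores from the plan cache above
console.log(planScore({ advanced: 12, works: 968, reachedEOF: true,  usesIntersection: false })); // ≈ 2.0126
console.log(planScore({ advanced: 10, works: 968, reachedEOF: false, usesIntersection: false })); // ≈ 1.0105
console.log(planScore({ advanced: 7,  works: 968, reachedEOF: false, usesIntersection: true  })); // ≈ 1.0073
```

With these assumptions, the three values match the cached candidatePlanScores, and the EOF bonus alone dwarfs any productivity difference between the plans.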
Forcing with internal settings (internal, not supported)
To show the AND_SORTED plan, I'll force it on my lab database for the following examples in this article:
// get the current parameters (default):
mongo> x = db.adminCommand({ getParameter: "*" });
mongo> Object.keys(x).filter(k => k.toLowerCase().includes("intersection")).forEach(k => print(k + " : " + x[k]));
internalQueryForceIntersectionPlans : false
internalQueryPlannerEnableHashIntersection : false
internalQueryPlannerEnableIndexIntersection : true
// set all three to true:
db.adminCommand({
setParameter: 1,
internalQueryPlannerEnableIndexIntersection: true,
internalQueryPlannerEnableHashIntersection: true,
internalQueryForceIntersectionPlans: true
});
I set internalQueryForceIntersectionPlans to force index intersection (it still uses the query planner, but with a 3-point boost to the score). Index intersection with AND_SORTED is possible by default, but I also enabled AND_HASH for another test later that cannot use AND_SORTED.
Index Intersection in MongoDB
Now that I forced index intersection, I can observe it with execution statistics:
db.demo.find({ a: 42, b: 42 }).explain("executionStats").executionStats
The execution plan shows that both indexes were scanned with one range (seeks: 1), and combined with an AND_SORTED before fetching the documents for the result:
{
executionSuccess: true,
nReturned: 9,
executionTimeMillis: 4,
totalKeysExamined: 2009,
totalDocsExamined: 9,
executionStages: {
isCached: false,
stage: 'FETCH',
filter: {
'$and': [ { a: { '$eq': 42 } }, { b: { '$eq': 42 } } ]
},
nReturned: 9,
executionTimeMillisEstimate: 0,
works: 2010,
advanced: 9,
needTime: 2000,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 9,
alreadyHasObj: 0,
inputStage: {
stage: 'AND_SORTED',
nReturned: 9,
executionTimeMillisEstimate: 0,
works: 2010,
advanced: 9,
needTime: 2000,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
failedAnd_0: 501,
failedAnd_1: 495,
inputStages: [
{
stage: 'IXSCAN',
nReturned: 964,
executionTimeMillisEstimate: 0,
works: 965,
advanced: 964,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { a: 1 },
indexName: 'a_1',
isMultiKey: false,
multiKeyPaths: { a: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { a: [ '[42, 42]' ] },
keysExamined: 964,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
},
{
stage: 'IXSCAN',
nReturned: 1045,
executionTimeMillisEstimate: 0,
works: 1045,
advanced: 1045,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 0,
keyPattern: { b: 1 },
indexName: 'b_1',
isMultiKey: false,
multiKeyPaths: { b: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { b: [ '[42, 42]' ] },
keysExamined: 1045,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
]
}
}
}
Here, AND_SORTED means that it is the intersection (AND) of two streams, both sorted on the same key (the natural order on RecordId, which is the key used to fetch documents from the collection). Not only does it require no additional sorting to merge the two sources, but it can also optimize the fetches from the collection. These details are essential for understanding differences with compound indexes, where an additional key may change the ordering.
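The merge logic behind AND_SORTED can be sketched in a few lines (a hypothetical illustration, not the server implementation): both index scans emit RecordIds in ascending order, so the intersection is a single pass with no extra memory.

```javascript
// Merge-intersection of two RecordId streams that are already sorted:
// advance whichever side is behind; equal ids belong to the result.
function andSorted(scanA, scanB) {
  const out = [];
  let i = 0, j = 0;
  while (i < scanA.length && j < scanB.length) {
    if (scanA[i] === scanB[j]) { out.push(scanA[i]); i++; j++; }
    else if (scanA[i] < scanB[j]) i++;
    else j++;
  }
  return out; // RecordIds to fetch, still in on-disk order
}

console.log(andSorted([3, 7, 8, 12], [2, 7, 12, 15])); // [7, 12]
```

Because the output stays in RecordId order, the subsequent FETCH visits the collection in its physical order, which is what makes the plan cheap.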
Bitmap-OR in PostgreSQL
Here is the equivalent in PostgreSQL:
postgres=# EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS off)
SELECT *
FROM demo
WHERE a = 42 AND b = 42;
QUERY PLAN
--------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.demo (actual time=0.136..0.147 rows=9.00 loops=1)
   Output: a, b
   Recheck Cond: ((demo.a = 42) AND (demo.b = 42))
   Heap Blocks: exact=9
   Buffers: shared hit=15
   ->  BitmapAnd (actual time=0.127..0.127 rows=0.00 loops=1)
         Buffers: shared hit=6
         ->  Bitmap Index Scan on demo_a_idx (actual time=0.047..0.047 rows=964.00 loops=1)
               Index Cond: (demo.a = 42)
               Index Searches: 1
               Buffers: shared hit=3
         ->  Bitmap Index Scan on demo_b_idx (actual time=0.053..0.053 rows=960.00 loops=1)
               Index Cond: (demo.b = 42)
               Index Searches: 1
               Buffers: shared hit=3
 Planning Time: 0.074 ms
 Execution Time: 0.162 ms
Both engines combine outputs from multiple index scans. PostgreSQL implements it as a bitmap in memory (BitmapAnd), a generic operation used not only for combining multiple indexes but also for inverted indexes, or even for a single index scan to avoid too many random heap table reads. MongoDB's index intersection is more specific to the filters and available indexes and, in this case, merges sorted RecordIds without going through intermediate representations in memory.
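The BitmapAnd idea can be illustrated with a toy sketch (an assumed simplification: real PostgreSQL bitmaps work per block and offset, and can degrade to lossy per-block bits under memory pressure): each index scan sets one bit per matching tuple, and the bitmaps are ANDed before visiting the heap in physical order.

```javascript
// Toy BitmapAnd: one bit per heap tuple position.
function bitmapFrom(tids, size) {
  const bits = new Uint8Array(size);
  for (const t of tids) bits[t] = 1;   // mark each matching tuple
  return bits;
}
function bitmapAnd(b1, b2) {
  return b1.map((bit, i) => bit & b2[i]); // intersect the two bitmaps
}

const fromA = bitmapFrom([3, 7, 8, 12], 16);  // tuples matching a = 42
const fromB = bitmapFrom([2, 7, 12, 15], 16); // tuples matching b = 42
const anded = bitmapAnd(fromA, fromB);
const toFetch = [];
for (let i = 0; i < anded.length; i++) if (anded[i]) toFetch.push(i);
console.log(toFetch); // [7, 12] — heap visited in physical order
```

The bitmap is a generic intermediate structure: the same AND (or OR) works regardless of which index, or how many indexes, produced each input.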
Hash intersection when input scans aren't sorted
AND_SORTED was fast because both indexes had an equality filter on their full key, so each scan returns entries in RecordId order, easy to intersect and filter before fetching the documents. It was similar to a merge join on the two indexes. If these indexes were more complex, with additional fields in the key, the scan would be ordered on those additional fields instead. I'll test this by adding a field to each index:
db.demo.createIndex({ a: 1 , _id:1 });
db.demo.createIndex({ b: 1 , _id:1 });
db.demo.dropIndex({ a: 1 });
db.demo.dropIndex({ b: 1 });
I run the same query:
db.demo.find({ a: 42, b: 42 }).explain("executionStats").executionStats
The execution plan shows that both indexes were scanned on one range, like before, but are combined with an AND_HASH before fetching the documents for the result:
{
executionSuccess: true,
nReturned: 9,
executionTimeMillis: 6,
totalKeysExamined: 2010,
totalDocsExamined: 9,
executionStages: {
isCached: false,
stage: 'FETCH',
filter: {
'$and': [ { a: { '$eq': 42 } }, { b: { '$eq': 42 } } ]
},
nReturned: 9,
executionTimeMillisEstimate: 0,
works: 2013,
advanced: 9,
needTime: 2003,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 9,
alreadyHasObj: 0,
inputStage: {
stage: 'AND_HASH',
nReturned: 9,
executionTimeMillisEstimate: 0,
works: 2013,
advanced: 9,
needTime: 2003,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
memUsage: 59622,
memLimit: 33554432,
mapAfterChild_0: 1046,
inputStages: [
{
stage: 'IXSCAN',
nReturned: 1046,
executionTimeMillisEstimate: 0,
works: 1047,
advanced: 1046,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { b: 1, _id: 1 },
indexName: 'b_1__id_1',
isMultiKey: false,
multiKeyPaths: { b: [], _id: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { b: [ '[42, 42]' ], _id: [ '[MinKey, MaxKey]' ] },
keysExamined: 1046,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
},
{
stage: 'IXSCAN',
nReturned: 964,
executionTimeMillisEstimate: 0,
works: 965,
advanced: 964,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { a: 1, _id: 1 },
indexName: 'a_1__id_1',
isMultiKey: false,
multiKeyPaths: { a: [], _id: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { a: [ '[42, 42]' ], _id: [ '[MinKey, MaxKey]' ] },
keysExamined: 964,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
]
}
}
}
AND_HASH is similar to a hash join. It cannot walk both streams in parallel like AND_SORTED, but builds an in-memory hash table of RecordIds from the first child scan, then probes it with the second. This consumes memory proportional to the number of matches from the first index (memUsage: 59622).
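The difference can be sketched as follows (a hypothetical helper, not server code): without a shared order, the first scan's RecordIds go into a hash table and the second scan probes it, which is why memory usage tracks the first child's matches.

```javascript
// Sketch of AND_HASH: build on the first child scan, probe with the second.
function andHash(scanA, scanB) {
  const table = new Set(scanA);             // memory grows with scanA's matches
  return scanB.filter(id => table.has(id)); // emit ids present in both scans
}

console.log(andHash([8, 3, 12, 7], [15, 7, 2, 12])); // [7, 12]
```

Note that the output follows the second scan's order, not RecordId order, so the subsequent fetches are no longer sequential over the collection.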
Compound indexes as the right solution
We can create a compound index because we have equality filters on two different fields within the same document. Creating an index that starts with these fields provides direct access to the filtered entries:
db.demo.createIndex({ a: 1, b: 1 });
db.demo.discover({ a: 42, b: 42 }).clarify("executionStats").executionStats;
{
executionSuccess: true,
nReturned: 9,
executionTimeMillis: 0,
totalKeysExamined: 9,
totalDocsExamined: 9,
executionStages: {
isCached: false,
stage: 'FETCH',
nReturned: 9,
executionTimeMillisEstimate: 0,
works: 11,
advanced: 9,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 9,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 9,
executionTimeMillisEstimate: 0,
works: 10,
advanced: 9,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { a: 1, b: 1 },
indexName: 'a_1_b_1',
isMultiKey: false,
multiKeyPaths: { a: [], b: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { a: [ '[42, 42]' ], b: [ '[42, 42]' ] },
keysExamined: 9,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
}
In this case, a single compound index was chosen instead of intersecting two individual indexes. The index bounds show the intersection directly: { a: ['[42, 42]'], b: ['[42, 42]'] }. This approach is optimal because it limits the scan to only those index entries that are required (keysExamined: 9) for producing the results (nReturned: 9).
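A toy model shows why the compound index examines so few keys (illustrative only; the values and RecordIds below are made up): entries sorted by (a, b) put all matches in one contiguous range, reachable with a single seek.

```javascript
// Hypothetical compound-index entries: [a, b, recordId], sorted by (a, b).
const index = [
  [41, 99, 10], [42, 41, 11], [42, 42, 12],
  [42, 42, 13], [42, 43, 14], [43, 0, 15],
];

// one seek to the first key >= [42, 42] ...
let i = index.findIndex(([a, b]) => a > 42 || (a === 42 && b >= 42));
// ... then scan only while both equalities still hold
const recordIds = [];
for (; i < index.length && index[i][0] === 42 && index[i][1] === 42; i++) {
  recordIds.push(index[i][2]);
}
console.log(recordIds); // [12, 13] — keysExamined equals nReturned
```

Every key touched qualifies for the result, which is exactly what keysExamined: 9 for nReturned: 9 shows in the real plan.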
In PostgreSQL, it is also advisable to use a compound index rather than relying on bitmap intersection:
DROP INDEX demo_a_idx, demo_b_idx;
CREATE INDEX demo_ab_idx ON demo(a, b);
postgres=# EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS off)
SELECT *
FROM demo
WHERE a = 42 AND b = 42;
QUERY PLAN
-------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.demo (actual time=0.026..0.038 rows=9.00 loops=1)
   Output: a, b, x, id
   Recheck Cond: ((demo.a = 42) AND (demo.b = 42))
   Heap Blocks: exact=9
   Buffers: shared hit=11
   ->  Bitmap Index Scan on demo_ab_idx (actual time=0.016..0.016 rows=9.00 loops=1)
         Index Cond: ((demo.a = 42) AND (demo.b = 42))
         Index Searches: 1
         Buffers: shared hit=2
 Planning Time: 0.072 ms
 Execution Time: 0.053 ms
(11 rows)
PostgreSQL still used a bitmap scan here, to avoid unordered random access to the heap table, but without the need for a BitmapAnd operation.
Conclusion
Although MongoDB's index intersection is powerful in theory, you'll rarely see it in practice, for the following reasons:
- The query planner's scoring method often underestimates index intersection because of the first-to-end bonus, biased productivity, and intersection penalties. These are known issues but given low priority, mainly because of the next reason.
- Compound indexes almost always provide the best performance for these queries and are favored by the query planner as a result. While you can force intersection plans for testing, production workloads should rely on well-designed compound indexes.
- You can't create compound indexes for all ad-hoc analytic queries. However, in that case, an Atlas Search index and a star query may be a better solution to avoid overhead on the operational database.
- Forcing index intersection isn't supported. Hints are limited to specifying a single index, and the force parameters are meant for testing only (they have names beginning with 'internal').
For similar cases, PostgreSQL relies on Bitmap Scan to run index intersections as a bitmap AND, and bitmap scans are widely used even with single indexes to optimize access to the heap table from one index, multiple indexes, or indexes with multiple entries per row.
