SQL databases offer a number of join algorithms. The query planner selects the most efficient one by evaluating cardinality and estimated cost. For example, a nested loop join is ideal when the outer table has few rows and an index allows fast access to the inner table. In contrast, a hash join is better suited to situations where the outer table contains many rows and the inner table must be fully scanned, resulting in fewer costly loops.
While MongoDB offers similar algorithms, adapted to flexible documents, being a NoSQL database it shifts more responsibility to the developer. Developers must design for optimal data access already in the data model, but this has the advantage of resulting in more predictable performance.
I'll base my example on a question asked on Reddit: Optimizing a MongoDB JOIN with $lookup and $limit. I use a collection of users, where each user references a profile. The profile has a status. The query lists the users with no profile, or with a profile whose status equals 2.
In my demo, I set up two profile keys: "_id", which is automatically indexed in MongoDB, and an "ID" field, which is not. This setup illustrates both situations: an indexed lookup table versus a non-indexed one. Typically, you would use only one of them, depending on which join algorithm you prefer.
db.profiles.drop()
db.users.drop()
db.profiles.insertMany([
{ _id:102, ID: 102, status: 2 },
{ _id:201, ID: 201, status: 1 },
{ _id:302, ID: 302, status: 2 },
{ _id:403, ID: 403, status: 3 }
]);
db.users.insertMany([
{ name: "Alice" , profileID: 403 }, // profile status = 3
{ name: "Bob", profileID: 102 }, // profile status = 2
{ name: "Charlie", profileID: 201 }, // profile status = 1
{ name: "Diana", profileID: 102 }, // profile status = 2
{ name: "Eve" }, // no profile
{ name: "Franck" , profileID: 403 }, // profile status = 3
{ name: "Gaspar" , profileID: 403 }, // profile status = 3
{ name: "Hans" , profileID: 403 }, // profile status = 3
{ name: "Iona" , profileID: 403 }, // profile status = 3
{ name: "Jane" , profileID: 403 }, // profile status = 3
{ name: "Karl" , profileID: 403 }, // profile status = 3
{ name: "Lili" }, // no profile
{ name: "Math" }, // no profile
{ name: "Niall" }, // no profile
{ name: "Oscar" , profileID: 403 }, // status = 3
{ name: "Paula" , profileID: 102 }, // status = 2
{ name: "Quentin" , profileID: 201 }, // status = 1
{ name: "Ravi" , profileID: 102 }, // status = 2
{ name: "Sofia" }, // no profile
{ name: "Takumi" , profileID: 403 }, // status = 3
{ name: "Uma" , profileID: 403 }, // status = 3
{ name: "Viktor" , profileID: 403 }, // status = 3
{ name: "Wafa" , profileID: 403 }, // status = 3
{ name: "Ximena" , profileID: 403 }, // status = 3
{ name: "Yara" }, // no profile
{ name: "Zubair" }, // no profile
]);
Here is my query on this small data set:
db.users.aggregate([
{
$lookup: {
from: "profiles",
localField: "profileID",
foreignField: "ID",
as: "profile"
}
},
{
$match: {
$or: [
{ profile: { $eq: [] } }, // no profile
{ profile: { $elemMatch: { status: 2 } } } // profile status = 2
]
}
},
{
$project: {
_id: 0,
name: 1,
"profile.status": 1 // keep only the status field from the joined profile
}
}
]);
Note that the first optimization I did is replacing the join expression (let: { userId: "$_id" }, pipeline: [ { $match: { $expr: { $eq: ["$userId", "$$userId"] } } } ]) with localField/foreignField, to allow the pushdown of the join predicate (EQ_LOOKUP).
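For comparison, the correlated form looks roughly like this (a sketch adapted to this article's field names, not the exact Reddit query); the join predicate is evaluated per document and cannot be pushed down as an EQ_LOOKUP:
// Correlated $lookup (sketch): no localField/foreignField, so no EQ_LOOKUP pushdown
db.users.aggregate([
  {
    $lookup: {
      from: "profiles",
      let: { pid: "$profileID" },                        // bind the local field to a variable
      pipeline: [
        { $match: { $expr: { $eq: ["$ID", "$$pid"] } } } // join predicate inside the sub-pipeline
      ],
      as: "profile"
    }
  }
])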
Here is the result:
[
{ name: 'Bob', profile: [ { status: 2 } ] },
{ name: 'Diana', profile: [ { status: 2 } ] },
{ name: 'Eve', profile: [] },
{ name: 'Lili', profile: [] },
{ name: 'Math', profile: [] },
{ name: 'Niall', profile: [] },
{ name: 'Paula', profile: [ { status: 2 } ] },
{ name: 'Ravi', profile: [ { status: 2 } ] },
{ name: 'Sofia', profile: [] },
{ name: 'Yara', profile: [] },
{ name: 'Zubair', profile: [] },
]
Kind "it" for extra
check>
To scale the variety of customers, I multiply every current person by 10,000 utilizing the next script:
const currentUsers = db.users.find({}, { _id: 0, name: 1, profileID: 1 });
currentUsers.forEach(userDoc => {
  print(`Inserting 10,000 documents for: ${JSON.stringify(userDoc)}`);
  const newUsers = [];
  for (let i = 0; i < 10000; i++) {
    const clone = { ...userDoc };
    clone.name = `${i}${clone.name}`;
    newUsers.push(clone);
  }
  // one batched insert per original user, rather than one insertMany call per clone
  db.users.insertMany(newUsers);
})
I now have 260,026 users.
Indexed nested loop join
I run my query with explain("executionStats") and extract the most important statistics about the time and the number of documents examined:
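This can be checked with a quick count in mongosh:
// 26 original users + 26 × 10,000 clones = 260,026
db.users.countDocuments()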
x=db.users.aggregate([
{ "$lookup" : {
"from" : "profiles",
"localField" : "profileID",
"foreignField" : "_id",
"as" : "profile" }},
{
$match: {
$or: [
{ "profile": { $eq: [] } },
{ "profile": { $elemMatch: { standing: 2 } } },
]
}
},
]).clarify("executionStats")
xs=x["stages"][0]["$cursor"]["executionStats"];
xp=x["stages"][0]["$cursor"]["queryPlanner"]["winningPlan"]["queryPlan"];
print("nReturned:", xs["nReturned"],
"executionTimeMillis:", xs["executionTimeMillis"],
"totalKeysExamined:", xs["totalKeysExamined"],
"totalDocsExamined:", xs["totalDocsExamined"],
xp["stage"]+" technique:", xp["strategy"],
)
The lookup stage has returned all documents, since they must be joined before filtering, in 2.5 seconds:
nReturned: 260026
executionTimeMillis: 2456
totalKeysExamined: 190019
totalDocsExamined: 450045
EQ_LOOKUP strategy: IndexedLoopJoin
The equality lookup used the indexed loop join strategy, with an index scan for each document:
- nReturned: 260026: all local documents with their joined profile arrays
- executionTimeMillis: 2456: total execution time, including both the join and the filter stages
- totalKeysExamined: 190019: only keys that found a match in the profiles collection's index on "_id" are counted (lookup_query_stats). The index can determine "key not found" without actually examining a key entry.
- totalDocsExamined: 450045: users collection scan (260,026) + profile documents fetched (190,019)
The number of profiles examined is high compared to the number of profiles in the collection. Another algorithm can be faster by reading all profiles once and looking them up from a hash table.
Hash join on a small unindexed table
MongoDB 8.0 chooses a hash join under the following conditions:
- allowDiskUse: true is set (required for spilling if the hash table exceeds memory)
- The foreign collection is small enough, controlled by these parameters (you can inspect them as shown after this list): internalQueryCollectionMaxNoOfDocumentsToChooseHashJoin (default: 10,000 docs), internalQueryCollectionMaxDataSizeBytesToChooseHashJoin (default: 100 MB), and internalQueryCollectionMaxStorageSizeBytesToChooseHashJoin (default: 100 MB)
- No suitable index exists, disk use is allowed, and the hash join is more efficient
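To check the thresholds on your own deployment (a quick sketch; it requires the privileges to run getParameter, and values may differ across versions or configurations):
// Read the current hash-join thresholds from the server
db.adminCommand({
  getParameter: 1,
  internalQueryCollectionMaxNoOfDocumentsToChooseHashJoin: 1,
  internalQueryCollectionMaxDataSizeBytesToChooseHashJoin: 1,
  internalQueryCollectionMaxStorageSizeBytesToChooseHashJoin: 1
})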
To show that, I use the "ID" field, which is not indexed, for the lookup:
x=db.users.aggregate([
{ "$lookup" : {
"from" : "profiles",
"localField" : "profileID",
"foreignField" : "ID",
"as" : "profile" }},
{
$match: {
$or: [
{ "profile": { $eq: [] } },
{ "profile": { $elemMatch: { standing: 2 } } },
]
}
},
],{ allowDiskUse: true } ).clarify("executionStats")
xs=x["stages"][0]["$cursor"]["executionStats"];
xp=x["stages"][0]["$cursor"]["queryPlanner"]["winningPlan"]["queryPlan"];
print("nReturned:", xs["nReturned"],
"executionTimeMillis:", xs["executionTimeMillis"],
"totalKeysExamined:", xs["totalKeysExamined"],
"totalDocsExamined:", xs["totalDocsExamined"],
xp["stage"]+" technique:", xp["strategy"],
)
The hash join completed 3.3x faster (750 ms vs 2,456 ms), with significantly different execution patterns:
nReturned: 260026
executionTimeMillis: 750
totalKeysExamined: 0
totalDocsExamined: 260030
EQ_LOOKUP strategy: HashJoin
The hash join works in two phases, with no index required (totalKeysExamined: 0):
- Build phase: scans the foreign collection once to build an in-memory hash table keyed by the foreignField values. It has read the 4 profiles.
- Probe phase: scans the local collection once, probing the hash table for each local key. It has read the 260,026 users.
The total is 260,030 documents examined.
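Conceptually, the two phases look roughly like this (an application-level illustration of the algorithm, not MongoDB's internal implementation):
// Build phase: read the 4 profiles once and hash them by the foreignField value ("ID")
const buckets = new Map();
db.profiles.find().forEach(p => {
  if (!buckets.has(p.ID)) buckets.set(p.ID, []);
  buckets.get(p.ID).push(p);
});
// Probe phase: scan the 260,026 users once and probe the hash table for each local key
db.users.find().forEach(u => {
  const profile = buckets.get(u.profileID) || [];  // [] when there is no match
  // ... the joined document would be { ...u, profile }
});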
Nested loop join without an index
A third option is a nested loop join that scans the collection in each loop, rather than using an index or building a hash table. I disable disk use to avoid the hash join and use the non-indexed field to avoid the indexed nested loop:
x=db.users.aggregate([
{ "$lookup" : {
"from" : "profiles",
"localField" : "profileID",
"foreignField" : "ID",
"as" : "profile" }},
{
$match: {
$or: [
{ "profile": { $eq: [] } },
{ "profile": { $elemMatch: { standing: 2 } } },
]
}
},
],{ allowDiskUse: false } ).clarify("executionStats")
xs=x["stages"][0]["$cursor"]["executionStats"];
xp=x["stages"][0]["$cursor"]["queryPlanner"]["winningPlan"]["queryPlan"];
print("nReturned:", xs["nReturned"],
"executionTimeMillis:", xs["executionTimeMillis"],
"totalKeysExamined:", xs["totalKeysExamined"],
"totalDocsExamined:", xs["totalDocsExamined"],
xp["stage"]+" technique:", xp["strategy"],
)
Here are the execution statistics:
nReturned: 260026
executionTimeMillis: 1618
totalKeysExamined: 0
totalDocsExamined: 1300130
EQ_LOOKUP strategy: NestedLoopJoin
Like with the other algorithms, 260,026 users have been read. Since there was no index on the foreign field, there is no index scan at all (totalKeysExamined: 0). This caused the system to scan the 4 profiles for each user, resulting in a total of 260,026 + 4 × 260,026 = 1,300,130 documents examined.
In this example, this approach is about three times more costly than the indexed loop join in terms of documents examined, and five times more than the hash join, because the nested loop join repeatedly scans the foreign collection. Interestingly, because the lookup collection is very small and sequential scans are cache-friendly, the execution time is actually better than that of the indexed nested loop in this case, as index seeks incur additional cost.
Indexed nested loop with a covering index
If your key is in a field other than "_id", like here with "ID", you can create an index on it and the plan will use an indexed nested loop join. This has the same performance as the query using "_id" because, apart from clustered collections, all indexes in MongoDB are secondary and access the document via an internal record identifier. Using "_id" avoids creating one more index. However, one reason to create another index is to have it cover all the necessary fields:
db.profiles.createIndex({ ID: 1, status: 1 })
Now, the previous query using "ID" as the foreign field will use the index and get the same performance as the indexed nested loop join above using "_id". However, it is not a covering index because no projection is defined.
x=db.users.aggregate([
{ "$lookup" : {
"from" : "profiles",
"localField" : "profileID",
"foreignField" : "ID",
"as" : "profile" }},
{
$match: {
$or: [
{ "profile": { $eq: [] } },
{ "profile": { $elemMatch: { standing: 2 } } },
]
}
},
]).clarify("executionStats")
xs=x["stages"][0]["$cursor"]["executionStats"];
xp=x["stages"][0]["$cursor"]["queryPlanner"]["winningPlan"]["queryPlan"];
print("nReturned:", xs["nReturned"],
"executionTimeMillis:", xs["executionTimeMillis"],
"totalKeysExamined:", xs["totalKeysExamined"],
"totalDocsExamined:", xs["totalDocsExamined"],
xp["stage"]+" technique:", xp["strategy"],
)
Here are the execution statistics, the same as with the index on "_id", running in 2.5 seconds:
nReturned: 260026
executionTimeMillis: 2484
totalKeysExamined: 190019
totalDocsExamined: 450045
EQ_LOOKUP strategy: IndexedLoopJoin
I can add a projection to the lookup pipeline, but be careful, it can be slower:
x=db.users.aggregate([
{ "$lookup" : {
"from" : "profiles",
"localField" : "profileID",
"foreignField" : "ID",
pipeline: [
{ $project: { _id: 0, ID: 1, status: 1 } }
],
"as" : "profile" }},
{
$match: {
$or: [
{ "profile": { $eq: [] } },
{ "profile": { $elemMatch: { standing: 2 } } },
]
}
},
] ).clarify("executionStats")
xs=x["stages"][1];
The index is used as a covering index but, unfortunately, this does not use the slot-based query execution engine (SBE) and ends up taking longer:
{
'$lookup': {
from: 'profiles',
as: 'profile',
localField: 'profileID',
foreignField: 'ID',
let: {},
pipeline: [ { '$project': { _id: 0, ID: 1, status: 1 } } ]
},
totalDocsExamined: Long('0'),
totalKeysExamined: Long('190019'),
collectionScans: Long('0'),
indexesUsed: [ 'ID_1_status_1' ],
nReturned: Long('260026'),
executionTimeMillisEstimate: Long('21309')
}
The absence of a collection scan (collectionScans: 0) confirms that an index is used, and it is the index I created (indexesUsed: [ 'ID_1_status_1' ]). The absence of documents examined (totalDocsExamined: 0) confirms that it uses a covering index. However, it took 21 seconds to execute.
When a pipeline is added to the lookup stage, SBE can no longer be used, so the classic engine plans each lookup query at runtime, which explains the longer elapsed time. SBE's optimized execution model for joins requires the join to be completely specified at planning time, with only localField/foreignField and no pipeline (determineSbeCompatibility()). In contrast, covering indexes need a pipeline for runtime projection control, making it impossible to use both optimizations together. Additionally, the current version (findSbeCompatibleStagesForPushdown) limits the lookup pushdown to fully local collections and considers whether an unwind follows. If you know there will be only one element, use { $arrayElemAt: ["$profiles.status", 0] } instead of $unwind: "$profiles" for better efficiency.
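For example, a sketch that flattens the single joined profile without $unwind, using this article's collections (field names as assumed here):
db.users.aggregate([
  { $lookup: { from: "profiles", localField: "profileID", foreignField: "_id", as: "profile" } },
  { $project: {
      _id: 0,
      name: 1,
      // pick the first (and only) element instead of unwinding the array
      status: { $arrayElemAt: ["$profile.status", 0] }
  } }
])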
Nested loop with materialization
There is one more possibility that can be faster than a nested loop, similar to a hash join but without building a hash table: just an array attached to each document, with the join done in the projection. It can be compared to PostgreSQL's Nested Loop with Materialize.
To read the lookup collection only once, we use a pipeline with no join condition. As I want to test the covering index, instead of an empty pipeline (pipeline: [ ]) I add a filter and a projection. This adds all profiles to each user. I define the join in a projection after the lookup, where I find the array index of the matching profile with $indexOfArray and get the status of that item:
x=db.users.aggregate([
{
$lookup: {
from: "profiles",
let: { pid: "$profileID" },
pipeline: [
{ $match: { ID: { $ne: null } } },
{ $project: { _id: 0, ID: 1, status: 1 } }
],
as: "profiles"
}
},
{
$project: {
status: {
$arrayElemAt: [
"$profiles.status",
{ $indexOfArray: ["$profiles.ID", "$profileID"] }
]
},
matchFound: {
$ne: [
{ $indexOfArray: ["$profiles.ID", "$profileID"] },
-1
]
}
}
},
{
$match: {
$or: [
{ matchFound: false }, // a true "no profile" case
{ status: 2 }
]
}
}
]).clarify("executionStats")
The execution takes 3.5 seconds, not utilizing the SBE, and the lookup stage reads the profile solely as soon as, from the masking index:
check> xs=x["stages"][1]
{
'$lookup': {
from: 'profiles',
as: 'profiles',
let: { pid: '$profileID' },
pipeline: [
{ '$match': { ID: { '$ne': null } } },
{ '$project': { _id: 0, ID: 1, status: 1 } }
]
},
totalDocsExamined: Long('0'),
totalKeysExamined: Long('4'),
collectionScans: Long('0'),
indexesUsed: [ 'ID_1_status_1' ],
nReturned: Long('260026'),
executionTimeMillisEstimate: Long('3447')
}
Rather than being a native join algorithm, this approach is similar to executing two separate queries and then joining the results in the application. However, with this method, the join is performed using aggregation pipeline operators. It reads fewer documents and uses the covering index, but does not benefit from the SBE optimizations.
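For comparison, a sketch of that application-side equivalent: two queries and a client-side map, applying the same filter (no profile, or status equal to 2):
// Query 1: read the small profiles collection once and index it by "ID" in memory
const statusByID = new Map(
  db.profiles.find({}, { _id: 0, ID: 1, status: 1 }).toArray().map(p => [p.ID, p.status])
);
// Query 2: scan the users once and apply the filter on the client side
db.users.find({}, { _id: 0, name: 1, profileID: 1 }).forEach(u => {
  const status = statusByID.get(u.profileID);  // undefined when there is no matching profile
  if (status === undefined || status === 2) {
    // this user is part of the result
  }
});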
Summary
Like with any database, it is important to understand the join algorithms. When you use a lookup in an aggregation pipeline, MongoDB selects the join strategy based on the query, the collections, and the existing indexes. The three algorithms are:
- Indexed Loop Join is used when a compatible index exists on the foreignField. It is best for a low to medium match rate, or when the foreign collection is large.
- Hash Join is used when allowDiskUse is set and there is no compatible index, but only when the foreign collection is small (< 10,000 docs, < 100 MB). It is best with a high match rate on a lot of documents.
- Nested Loop Join is a fallback when the previous ones can't be used. It is acceptable only for tiny foreign collections.
Unlike SQL databases, where the optimizer makes all decisions but can also lead to surprises, MongoDB shifts responsibility to developers. You must:
- Design your schema with join performance in mind. For bounded relationships, embedding may be the best solution (see the sketch after this list).
- Understand your data (collection sizes, match rates) to predict which strategy will be the best.
- Test different strategies by creating/dropping indexes and toggling allowDiskUse. Create an index only when you want it to be used.
- Measure performance using explain("executionStats") to validate your decisions.
This design favors predictability and control over relying on automatic optimization. So when you hear statements like "Joins are slow" or "Lookups are fast," take time to understand the facts and how these operations are actually executed before forming an opinion.
