Sunday, November 30, 2025

INNER JOIN and LEFT OUTER JOIN in MongoDB (with $lookup and $unwind)


We frequently clarify $lookup as much like a relational JOIN, however there’s a key distinction within the returned output format. In SQL, a JOIN returns a flat, tabular consequence set, which implies values from the “one” facet of a one‑to‑many relationship are repeated for every matching row on the “many” facet. In MongoDB, $lookup outputs associated knowledge as arrays embedded inside every doc, so the “one” facet seems solely as soon as and associated gadgets are grouped collectively.

Right here’s the humorous half: relational databases had been invented to keep away from duplication by way of relational normalization — but a JOIN instantly re‑introduces duplication within the question outcomes, consumed as one stream of data. That’s why utility code or ORMs typically have to interrupt these repeated values again right into a nested construction. In MongoDB, when knowledge is saved utilizing references, $lookup can resolve these references and embed the associated paperwork instantly within the output, avoiding duplication and producing a construction that’s naturally extra utility‑pleasant. MongoDB is a doc database, both you embed on write to retailer associated knowledge in a single doc, otherwise you embed on learn from a number of paperwork with references, with $lookup, however the consequence is similar: an object mixture in JSON, or reasonably BSON, that’s fetched as-is by the drivers.

Nonetheless, MongoDB can present the equal of an INNER JOIN or a LEFT OUTER JOIN, when including an $unwind stage after the $lookup.

I will reveal this with the normal division and worker schema, utilizing the names from the SCOTT instance schema of Oracle database, however as MongoDB collections:

db.createCollection("dept");
db.dept.insertMany([
  { deptno: 10, dname: "ACCOUNTING", loc: "NEW YORK" },
  { deptno: 20, dname: "RESEARCH", loc: "DALLAS" },
  { deptno: 30, dname: "SALES", loc: "CHICAGO" },
  { deptno: 40, dname: "OPERATIONS", loc: "BOSTON" }
]);
db.createCollection("emp");
db.emp.insertMany([
  { empno: 7839, ename: "KING",  deptno: 10 },
  { empno: 7698, ename: "BLAKE", deptno: 30 },
  { empno: 7782, ename: "CLARK", deptno: 10 },
  { empno: 7566, ename: "JONES", deptno: 20 },
  { empno: 7788, ename: "SCOTT", deptno: 20 },
  { empno: 7902, ename: "FORD",  deptno: 20 },
  { empno: 7369, ename: "SMITH", deptno: 20 },
  { empno: 7499, ename: "ALLEN", deptno: 30 },
  { empno: 7521, ename: "WARD",  deptno: 30 },
  { empno: 7654, ename: "MARTIN",deptno: 30 },
  { empno: 7844, ename: "TURNER",deptno: 30 },
  { empno: 7876, ename: "ADAMS", deptno: 20 },
  { empno: 7900, ename: "JAMES", deptno: 30 },
  { empno: 7934, ename: "MILLER",deptno: 10 }
]);
Enter fullscreen mode

Exit fullscreen mode

Word that the “OPERATIONS” division has no workers at this time limit.



$lookup: be part of with no knowledge duplication

In a SQL database, becoming a member of tables duplicates knowledge to make sure every document has all needed data for unbiased processing. When loading this knowledge as objects reasonably than data, the appliance should detect the place objects start and finish and remove duplicates—or depend on an ORM to do that mechanically. Right here is an instance of a be part of to indicate the departments with their workers:

SQL> choose dept.*,empno,ename 
     from dept left outer be part of emp on dept.deptno = emp.deptno
;

   DEPTNO DNAME         LOC            EMPNO ENAME
_________ _____________ ___________ ________ _________
       10 ACCOUNTING    NEW YORK       7,782 CLARK
       10 ACCOUNTING    NEW YORK       7,839 KING
       10 ACCOUNTING    NEW YORK       7,934 MILLER
       20 RESEARCH      DALLAS         7,369 SMITH
       20 RESEARCH      DALLAS         7,566 JONES
       20 RESEARCH      DALLAS         7,788 SCOTT
       20 RESEARCH      DALLAS         7,876 ADAMS
       20 RESEARCH      DALLAS         7,902 FORD
       30 SALES         CHICAGO        7,499 ALLEN
       30 SALES         CHICAGO        7,521 WARD
       30 SALES         CHICAGO        7,654 MARTIN
       30 SALES         CHICAGO        7,698 BLAKE
       30 SALES         CHICAGO        7,844 TURNER
       30 SALES         CHICAGO        7,900 JAMES
       40 OPERATIONS    BOSTON

15 rows chosen.

Enter fullscreen mode

Exit fullscreen mode

MongoDB offers a constant doc mannequin throughout each utility code and database storage, and returns the identical be part of consequence with out duplication:

db.dept.mixture([
  { $lookup: {
      from: "emp",
      localField: "deptno",
      foreignField: "deptno",
      as: "employees"
    }
  }
])

[
  {
    _id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
    employees: [
      { _id: ObjectId('691f1623f4a22d6613d4b0f1'), empno: 7839, ename: 'KING', deptno: 10 },
      { _id: ObjectId('691f1623f4a22d6613d4b0f3'), empno: 7782, ename: 'CLARK', deptno: 10 },
      { _id: ObjectId('691f1623f4a22d6613d4b0fe'), empno: 7934, ename: 'MILLER', deptno: 10 }
    ]
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'),deptno: 20,dname: 'RESEARCH',loc: 'DALLAS',
    workers: [
      { _id: ObjectId('691f1623f4a22d6613d4b0f4'), empno: 7566, ename: 'JONES', deptno: 20 },
      { _id: ObjectId('691f1623f4a22d6613d4b0f5'), empno: 7788, ename: 'SCOTT', deptno: 20 },
      { _id: ObjectId('691f1623f4a22d6613d4b0f6'), empno: 7902, ename: 'FORD', deptno: 20 },
      { _id: ObjectId('691f1623f4a22d6613d4b0f7'), empno: 7369, ename: 'SMITH', deptno: 20 },
      { _id: ObjectId('691f1623f4a22d6613d4b0fc'), empno: 7876, ename: 'ADAMS', deptno: 20 }
    ]
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'),deptno: 30,dname: 'SALES',loc: 'CHICAGO',
    workers: [
      { _id: ObjectId('691f1623f4a22d6613d4b0f2'), empno: 7698, ename: 'BLAKE', deptno: 30 },
      { _id: ObjectId('691f1623f4a22d6613d4b0f8'), empno: 7499, ename: 'ALLEN', deptno: 30 },
      { _id: ObjectId('691f1623f4a22d6613d4b0f9'), empno: 7521, ename: 'WARD', deptno: 30 },
      { _id: ObjectId('691f1623f4a22d6613d4b0fa'), empno: 7654, ename: 'MARTIN', deptno: 30 },
      { _id: ObjectId('691f1623f4a22d6613d4b0fb'), empno: 7844, ename: 'TURNER', deptno: 30 },
      { _id: ObjectId('691f1623f4a22d6613d4b0fd'), empno: 7900, ename: 'JAMES', deptno: 30 }
    ]
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dd'), deptno: 40,dname: 'OPERATIONS', loc: 'BOSTON',
    workers: []
  }
]
Enter fullscreen mode

Exit fullscreen mode

The $lookup operation in MongoDB is much like a LEFT OUTER JOIN besides that it retains the info in a structured schema, to be consumed by the appliance.



$lookup + $unwind : inside be part of (with duplication)

The results of $lookup is the simplest for the appliance, however when you want a record-like output to course of additional, $unwind can unnest the array into as many paperwork because the array gadgets:

db.dept.mixture([
  { $lookup: {
      from: "emp",
      localField: "deptno",
      foreignField: "deptno",
      as: "employees"
    }
  },
  { $unwind: 
       "$employees" 
  }
])

[
  {
    _id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f1'), empno: 7839, ename: 'KING', deptno: 10 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f3'), empno: 7782, ename: 'CLARK', deptno: 10 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fe'), empno: 7934, ename: 'MILLER', deptno: 10 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f4'), empno: 7566, ename: 'JONES', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f5'), empno: 7788, ename: 'SCOTT', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f6'), empno: 7902, ename: 'FORD', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f7'), empno: 7369, ename: 'SMITH', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fc'), empno: 7876, ename: 'ADAMS', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f2'), empno: 7698, ename: 'BLAKE', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f8'), empno: 7499, ename: 'ALLEN', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f9'), empno: 7521, ename: 'WARD', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fa'), empno: 7654, ename: 'MARTIN', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fb'), empno: 7844, ename: 'TURNER', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fd'), empno: 7900, ename: 'JAMES', deptno: 30 }
  }
]
Enter fullscreen mode

Exit fullscreen mode

Internally, the implementation avoids materializing the array and instantly returns one doc per match, the 2 phases $lookup and $unwind being optimized right into a single lookup-unwind stage that acts like a SQL be part of.

Word that as a result of there aren’t any gadgets within the workers array, there aren’t any paperwork within the output for division 40. That is precisely like an INNER JOIN in SQL.



$lookup + $unwind preserving empty arrays: left outer be part of

It’s attainable to protect the LEFT OUTER JOIN output of $lookup by way of $unwind with preserveNullAndEmptyArrays: true:

db.dept.mixture([
  { $lookup: {
      from: "emp",
      localField: "deptno",
      foreignField: "deptno",
      as: "employees"
    }
  },
  { $unwind: {
       path: "$employees" ,
       preserveNullAndEmptyArrays: true
     }
  }
])

[
  {
    _id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f1'), empno: 7839, ename: 'KING', deptno: 10 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f3'), empno: 7782, ename: 'CLARK', deptno: 10 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fe'), empno: 7934, ename: 'MILLER', deptno: 10 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f4'), empno: 7566, ename: 'JONES', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f5'), empno: 7788, ename: 'SCOTT', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f6'), empno: 7902, ename: 'FORD', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f7'), empno: 7369, ename: 'SMITH', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fc'), empno: 7876, ename: 'ADAMS', deptno: 20 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f2'), empno: 7698, ename: 'BLAKE', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f8'), empno: 7499, ename: 'ALLEN', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0f9'), empno: 7521, ename: 'WARD', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fa'), empno: 7654, ename: 'MARTIN', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fb'), empno: 7844, ename: 'TURNER', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
    employees: { _id: ObjectId('691f1623f4a22d6613d4b0fd'), empno: 7900, ename: 'JAMES', deptno: 30 }
  },
  {
    _id: ObjectId('691ef750f4a22d6613d4b0dd'), deptno: 40, dname: 'OPERATIONS', loc: 'BOSTON'
  }
]
Enter fullscreen mode

Exit fullscreen mode

With preserveNullAndEmptyArrays: true the empty array for division 40 nonetheless generated a doc, however with no “workers” fields from workers as there aren’t any workers. This is a bonus of the versatile schema: no want so as to add NULL values into fields for an outer be part of consequence.

That is one other humorous factor in SQL databases, a NULL is meant to mark an unknown worth, and a main secret’s presupposed to by no means be NULL, however the results of a left outer be part of returns nulls into all columns from the appropriate facet, together with columns from the first key. MongoDB versatile schema permits the absence of the fields to characterize this extra clearly.



$lookup from an array in localField

In SQL, a many-to-many relationship requires an affiliation desk and two joins. In MongoDB this associative construction will be embedded as an array, and be queried with a single lookup. When localField is an array, $lookup mechanically matches any worth in that array to the international subject. For instance, I retailer the earlier departments of an worker:

db.emp.updateOne(  
  { _id: ObjectId("691f1623f4a22d6613d4b0f5") }, // SCOTT
  { $set: { previousDepts: [ 30, 40 ] } }  
)  
Enter fullscreen mode

Exit fullscreen mode

I take advantage of the identical $lookup, however on this historical past, and it joins every worth:

db.emp.mixture([
  { $match: {
      "ename": "SCOTT"
    }
  },
  { $lookup: {
      from: "dept",
      localField: "previousDepts",
      foreignField: "deptno",
      as: "previousDepartments"
    }
  }
])

[
  {
    _id: ObjectId('691f1623f4a22d6613d4b0f5'),
    empno: 7788,
    ename: 'SCOTT',
    deptno: 20,
    deptHistory: [ { dept: 30 }, { dept: 40 } ],
    previousDepts: [ 30, 40 ],
    previousDepartments: [
      {
        _id: ObjectId('691ef750f4a22d6613d4b0dc'),
        deptno: 30,
        dname: 'SALES',
        loc: 'CHICAGO'
      },
      {
        _id: ObjectId('691ef750f4a22d6613d4b0dd'),
        deptno: 40,
        dname: 'OPERATIONS',
        loc: 'BOSTON'
      }
    ]
  }
]
Enter fullscreen mode

Exit fullscreen mode



Conclusion

Whereas MongoDB’s $lookup is usually in comparison with SQL JOINs, it really works otherwise and produces outcomes which might be higher suited to functions, embedding associated knowledge as arrays inside paperwork as an alternative of duplicating values in a flat desk. This avoids the additional processing that SQL functions or ORMs typically want to rework repeated rows again into nested constructions. Slightly than making an attempt to duplicate SQL patterns in MongoDB, it’s essential to assume from the appliance’s perspective—how the info will likely be consumed—and use MongoDB’s doc mannequin, versatile schema, and aggregation options like $lookup, $unwind, and array matching to form the ends in essentially the most helpful type in your code. This strategy leverages MongoDB’s strengths to ship structured, prepared‑to‑use JSON objects instantly from the database.

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles