There is an important distinction between MongoDB's BSON and PostgreSQL's JSONB. Both are binary JSON formats, but they serve completely different roles. JSONB is only an internal storage format for JSON data in PostgreSQL. BSON, on the other hand, is MongoDB's native data format: it is used by application drivers, over the network, in memory, and on disk.
JSONB: PostgreSQL internal storage format
JSONB is a storage format, as described in the PostgreSQL documentation:
PostgreSQL offers two types for storing JSON data: json and jsonb
PostgreSQL uses JSONB only for internal storage, requiring the whole structure to be read to access a field, as observed with JSONB DeTOASTing (read amplification).
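As a quick, hedged illustration (this query is my own sketch, assuming the items table created later in this post and default connection settings), even selecting a single field with the ->> operator makes the server read and detoast the entire JSONB value before extracting it:

# Sketch only: assumes the 'items' table from the demo below and PG*
# environment variables (or defaults) for the connection.
import psycopg2

conn = psycopg2.connect("")  # default connection parameters
with conn.cursor() as cur:
    # Extract one field; PostgreSQL still detoasts the whole JSONB datum
    cur.execute("SELECT data->>'name' FROM items WHERE id = %s", (1,))
    print(cur.fetchone())    # the field comes back to the client as plain text
conn.close()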
BSON: MongoDB storage and interchange format
BSON is used for storage and also as an interchange format between the application and the database, as described in the BSON specification:
BSON can be compared to binary interchange formats, like Protocol Buffers. BSON is more "schema-less" than Protocol Buffers
On the application side, the MongoDB driver converts application objects to BSON, which supports more data types than JSON or JSONB, including datetime and binary. This BSON is sent and received over the network and stored and manipulated on the server as-is, with no additional parsing. Both the driver and the database can efficiently access fields through the binary structure because BSON includes metadata such as field length prefixes and explicit type information, even for large or nested documents.
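To make this concrete, here is a minimal sketch (my own, not from the MongoDB documentation) using PyMongo's bson module to encode a Python dict, including a datetime, and inspect the binary layout with its int32 length prefix:

# Sketch only: requires `pip install pymongo`, which provides the bson module.
from datetime import datetime, timezone
import bson

doc = {"name": "widget", "price": 9.99, "created": datetime.now(timezone.utc)}
raw = bson.encode(doc)  # the same bytes travel on the wire and land on disk

# The first 4 bytes are a little-endian int32 with the total document size;
# each element then starts with a type byte and a null-terminated field name.
print(len(raw), int.from_bytes(raw[:4], "little"))
print(bson.decode(raw))  # round-trips back to a dict, datetime type preserved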
PostgreSQL protocol is JSON (text), not JSONB
To illustrate this, I've written a small Python program that inserts a document into a PostgreSQL table with a JSONB column, and queries that table to retrieve the document:
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Item(Base):
    __tablename__ = 'items'
    id = Column(Integer, primary_key=True)
    data = Column(JSONB)  # our JSONB column

# Connect to Postgres
engine = create_engine('postgresql+psycopg2://', echo=True)
Session = sessionmaker(bind=engine)
session = Session()

# Create the table
Base.metadata.create_all(engine)

# Insert an object into the JSONB column
obj = {"name": "widget", "price": 9.99, "tags": ["new", "sale"]}
session.add(Item(data=obj))
session.commit()

# Read back the table
for row in session.query(Item).all():
    print(row.id, row.data)
The program uses SQLAlchemy to send and retrieve Python objects to and from PostgreSQL via the Psycopg2 driver. I've saved it as demo.py.
When I run the program with python demo.py, before it displays the final result, it logs all SQL statements:
2025-12-21 12:50:22,484 INFO sqlalchemy.engine.Engine select pg_catalog.version()
2025-12-21 12:50:22,485 INFO sqlalchemy.engine.Engine [raw sql] {}
2025-12-21 12:50:22,486 INFO sqlalchemy.engine.Engine select current_schema()
2025-12-21 12:50:22,486 INFO sqlalchemy.engine.Engine [raw sql] {}
2025-12-21 12:50:22,486 INFO sqlalchemy.engine.Engine show standard_conforming_strings
2025-12-21 12:50:22,486 INFO sqlalchemy.engine.Engine [raw sql] {}
2025-12-21 12:50:22,487 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-21 12:50:22,488 INFO sqlalchemy.engine.Engine select relname from pg_class c join pg_namespace n on n.oid=c.relnamespace where pg_catalog.pg_table_is_visible(c.oid) and relname=%(name)s
2025-12-21 12:50:22,488 INFO sqlalchemy.engine.Engine [generated in 0.00015s] {'name': 'items'}
2025-12-21 12:50:22,489 INFO sqlalchemy.engine.Engine
CREATE TABLE items (
id SERIAL NOT NULL,
data JSONB,
PRIMARY KEY (id)
)
2025-12-21 12:50:22,489 INFO sqlalchemy.engine.Engine [no key 0.00011s] {}
2025-12-21 12:50:22,491 INFO sqlalchemy.engine.Engine COMMIT
2025-12-21 12:50:22,493 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-21 12:50:22,494 INFO sqlalchemy.engine.Engine INSERT INTO items (data) VALUES (%(data)s) RETURNING items.id
2025-12-21 12:50:22,494 INFO sqlalchemy.engine.Engine [generated in 0.00018s] {'data': '{"name": "widget", "price": 9.99, "tags": ["new", "sale"]}'}
2025-12-21 12:50:22,495 INFO sqlalchemy.engine.Engine COMMIT
2025-12-21 12:50:22,497 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-21 12:50:22,498 INFO sqlalchemy.engine.Engine SELECT items.id AS items_id, items.data AS items_data
FROM items
2025-12-21 12:50:22,498 INFO sqlalchemy.engine.Engine [generated in 0.00013s] {}
1 {'name': 'widget', 'tags': ['new', 'sale'], 'price': 9.99}
To see what is sent and received over the network by the PostgreSQL protocol, I run the program with strace, showing the sendto and recvfrom system calls with their arguments: strace -e trace=sendto,recvfrom -yy -s 1000 python demo.py.
Like most SQL database drivers, the protocol is basic: send SQL commands as text, and fetch a tabular result set. In the PostgreSQL protocol's messages, the first byte is the message type (Q for a Simple Query message, followed by the length of the message and the query text, X to terminate the session, C for command completion status, T and D for the result set).
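As a rough sketch of that framing (my own illustration, not part of demo.py), a Simple Query message is just the type byte 'Q', a big-endian int32 length that includes itself, and the null-terminated SQL text:

# Sketch of the PostgreSQL Simple Query ('Q') message framing.
import struct

def simple_query_message(sql: str) -> bytes:
    payload = sql.encode() + b"\x00"              # SQL text, null-terminated
    length = struct.pack("!i", 4 + len(payload))  # int32 length includes itself
    return b"Q" + length + payload

print(simple_query_message("SELECT items.id, items.data FROM items"))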
Here is the output: the lines starting with a timestamp are the logs from SQLAlchemy, and those starting with sendto and recvfrom are the network system calls carrying the message to the database and the result coming back from it.
Here is the trace when inserting one document:
2025-12-21 16:52:20,278 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-21 16:52:20,279 INFO sqlalchemy.engine.Engine INSERT INTO items (data) VALUES (%(data)s) RETURNING items.id
2025-12-21 16:52:20,279 INFO sqlalchemy.engine.Engine [generated in 0.00029s] {'data': '{"name": "widget", "price": 9.99, "tags": ["new", "sale"]}'}
sendto(3<TCPv6:[[::1]:41858->[::1]:5432]>, "Q