A Nightmare No DBA Wants
One night, I had a nightmare, the kind every DBA fears.
My manager called me and said,
“Hey… Mr. Smart Guy. We don’t want you on the team anymore.”
I panicked.
“What did I do wrong?!”
He replied,
“You tell me. Think about your ‘great’ work yesterday.”
Trying to defend myself, I said proudly:
“Sir, one node was down. The PostgreSQL service was inactive, so I started it… and the node came back up!
Pretty smart, right?”
He sighed, paused for dramatic effect, and said,
“Get well soon.”
Then he hung up.
I woke up instantly, my heart racing.
Not because of the boss,
But because the real horror hit me:
We’re running a Patroni cluster…
…and I started PostgreSQL manually in my dream.
That’s not “clever work.”
That’s a database-level crime, punishable by
“Why-is-my-cluster-broken-now?”
and
“Why-are-my-replicas-on-a-different-timeline?”
and probably
“Why-is-ETCD-looking-at-me-like-that?”
Fortunately, it was only a nightmare.
This blog aims to help prevent anyone from living that nightmare. It explains clearly what happened, why it’s dangerous, and how to avoid it.
Understanding Patroni’s Role in the Cluster
Patroni is responsible for managing the following (a short sketch of the Patroni-first workflow follows this list):
- Startup and shutdown of PostgreSQL
- Replication configuration
- Leader election and failover/fencing decisions
- WAL (Write-Ahead Logging) timeline coordination
- Recovery decisions and cluster consistency
- Coordinating cluster state with the DCS (e.g., etcd)
- Preventing divergence and corruption
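In practice this means the service you interact with on each node is patroni, not postgresql. A minimal sketch of the Patroni-first workflow, assuming a systemd unit named patroni and the config path used later in this post:

# Check cluster state through Patroni
patronictl -c /etc/patroni/patroni.yml list

# Start or stop the database on a node by starting or stopping Patroni;
# Patroni itself starts and stops PostgreSQL underneath
systemctl start patroni
systemctl stop patroni

# Never do this on a Patroni-managed node:
# systemctl start postgresql-17.service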
What Actually Happens When PostgreSQL Starts Outside Patroni
In a Patroni-based high-availability (HA) cluster, PostgreSQL should never be started manually, whether via the postgres systemd service or pg_ctl.
If you start PostgreSQL manually while Patroni is down, the node is no longer under Patroni’s control and runs as a standalone instance.
However, this does not immediately create a conflicting timeline.
- A conflicting (divergent) timeline only occurs if a failover or switchover happens while the node is standalone.
- If the manually started node was previously a replica and no failover occurred, it may still continue replicating from the primary, but it is invisible to Patroni and cannot participate in coordinated HA.
- If the manually started node was the former leader and Patroni elects a new leader elsewhere, only then does a true timeline divergence occur.
- Standalone replica + no failover → replication may continue, no new timeline
- Standalone leader + failover occurs → timeline divergence, requiring pg_rewind or a full reinit
Takeaway: a manually started primary is the most dangerous scenario; it can immediately lead to a divergent timeline if a new leader is elected. A manually started replica is still risky, but divergence happens only if a leader change occurs while it runs standalone.
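A quick way to spot divergence early is to compare each node's timeline. A minimal sketch, assuming the default RHEL paths for PostgreSQL 17 (adjust the binary and data-directory locations to your install):

# Timeline recorded in the control file (works even if PostgreSQL is stopped)
/usr/pgsql-17/bin/pg_controldata /var/lib/pgsql/17/data | grep TimeLineID

# Or, on a running node, via SQL
psql -U postgres -c "SELECT timeline_id FROM pg_control_checkpoint();"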
Before diving deeper, I’d like to share a test with you to provide a better understanding of the potential outcomes.
Detailed Test Case:
Operating System: Red Hat Enterprise Linux release 9.6 (Plow)

Cluster Topology:

Node     IP Address         Role (Initial)
node1    192.168.183.131    Leader
node2    192.168.183.129    Replica
node3    192.168.183.130    Replica

PostgreSQL Version: 17.7
Patroni: Active on all nodes
DCS: etcd 3-node cluster

Note: verify etcd major version compatibility. Older Linux distributions may still ship legacy etcd releases that are incompatible with modern Patroni versions.

Replication: Streaming
pg_rewind: Enabled
Capture Baseline State
[root@node1 ~]# patronictl -c /etc/patroni/patroni.yml list
+ Cluster: postgres (7575009857136431951) --------+-----------+----+-------------+-----+------------+-----+
| Member | Host            | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
| node1  | 192.168.183.131 | Leader  | running   |  4 |             |     |            |     |
| node2  | 192.168.183.129 | Replica | streaming |  4 | 0/5055240   |   0 | 0/5055240  |   0 |
| node3  | 192.168.183.130 | Replica | streaming |  4 | 0/5055240   |   0 | 0/5055240  |   0 |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
This confirms cluster health: the replicas have received and replayed WAL up to the leader’s current LSN, and all nodes are on the same timeline.
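If you want to cross-check this outside patronictl, the standard replication functions tell the same story (the psql connection details are an assumption about your setup):

# On the leader: the current write position
psql -U postgres -c "SELECT pg_current_wal_lsn();"

# On each replica: confirm recovery mode and how far WAL has been received and replayed
psql -U postgres -c "SELECT pg_is_in_recovery(), pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();"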
Crash Test & Failure Simulation:
Now, simulate a crash on node1:
1] Kill Patroni on node1 (the current leader)
2] Start PostgreSQL manually:

[root@node1 ~]# systemctl start postgresql-17.service

3] Node1 is now a standalone primary on timeline 4.
4] Meanwhile, Patroni promotes node3 → timeline 5.
Patroni now shows the changed cluster state:

[root@node1 ~]# patronictl -c /etc/patroni/patroni.yml list
+ Cluster: postgres (7575009857136431951) --------+-----------+----+-------------+-----+------------+-----+
| Member | Host            | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
| node2  | 192.168.183.129 | Replica | streaming |  5 | 0/5055380   |   0 | 0/5055380  |   0 |
| node3  | 192.168.183.130 | Leader  | running   |  5 |             |     |            |     |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+

From the output above, we can see that a timeline divergence has occurred and that node1 is missing from the list because its Patroni service is down: instead of starting Patroni, I started the PostgreSQL service directly, so PostgreSQL is running on node1 but is invisible to the cluster. The output also shows that node3 is now the leader and node2 is acting as a replica, both on the same timeline. Node1 is no longer part of the Patroni cluster.
Divergence Demonstration
Writes on standalone node1
postgres=# SELECT timeline_id FROM pg_control_checkpoint();
 timeline_id
-------------
           4
(1 row)

postgres=# SELECT pg_current_wal_lsn();
 pg_current_wal_lsn
--------------------
 0/50552F0
(1 row)
Now perform some transactions on the standalone node1:
postgres=# create table crash_patroni (id int);
CREATE TABLE

postgres=# SELECT pg_current_wal_lsn();
 pg_current_wal_lsn
--------------------
 0/506EDF8
(1 row)

postgres=# insert into crash_patroni values (1), (2), (3);
INSERT 0 3

postgres=# SELECT pg_current_wal_lsn();
 pg_current_wal_lsn
--------------------
 0/5070E68
(1 row)
Meanwhile, on the new leader (node3):
postgres=# create table crash_new_leader (id int);
CREATE TABLE

postgres=# SELECT pg_current_wal_lsn();
 pg_current_wal_lsn
--------------------
 0/5070B80
(1 row)
Node2 replicates node3’s changes normally:
postgres=# select * from crash_new_leader;
 id
----
  1
  2
  3
(3 rows)
Patroni listing on node3:
[root@node3 ~]# patronictl -c /etc/patroni/patroni.yml list
+ Cluster: postgres (7575009857136431951) --------+-----------+----+-------------+-----+------------+-----+
| Member | Host            | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
| node2  | 192.168.183.129 | Replica | streaming |  5 | 0/5070B80   |   0 | 0/5070B80  |   0 |
| node3  | 192.168.183.130 | Leader  | running   |  5 |             |     |            |     |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
Rejoining node1 (start Patroni)
When Patroni is restarted on node1, it detects the divergence and runs pg_rewind to synchronize the data directory with the new leader:
[root@node1 ~]# patronictl list
node1 | Replica | TL 5
Node1 is rewound and joins as a replica.
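For reference, the rejoin is nothing more than starting Patroni again on node1 and letting it decide how to recover; a minimal sketch, assuming a systemd unit named patroni that logs to journald:

# Start Patroni again on node1; it will choose between a plain start as replica,
# pg_rewind, or a full reinit based on the divergence it detects
systemctl start patroni

# Watch that decision happen in the Patroni log
journalctl -u patroni -f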
The full listing now shows all three nodes back on timeline 5:

[root@node1 ~]# patronictl -c /etc/patroni/patroni.yml list
+ Cluster: postgres (7575009857136431951) --------+-----------+----+-------------+-----+------------+-----+
| Member | Host            | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
| node1  | 192.168.183.131 | Replica | streaming |  5 | 0/5070CC0   |   0 | 0/5070CC0  |   0 |
| node2  | 192.168.183.129 | Replica | streaming |  5 | 0/5070CC0   |   0 | 0/5070CC0  |   0 |
| node3  | 192.168.183.130 | Leader  | running   |  5 |             |     |            |     |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
Now check the table created earlier on node1:
postgres=# select * from crash_patroni;
ERROR:  relation "crash_patroni" does not exist
The table created on node3 (crash_new_leader) is now present on node1:
postgres=# select * from crash_new_leader;
 id
----
  1
  2
  3
(3 rows)
What happened? Let me explain.
Patroni detected:
- Node1 was on timeline 4
- The cluster progressed to timeline 5 (through promotion of node3)
- Divergence occurred
Patroni invoked pg_rewind:
pg_rewind: servers diverged at WAL location ...
pg_rewind: rewinding ...
pg_rewind: Done!
As a result:
- Node1’s data directory was rewritten to match the new leader (node3)
- The table crash_patroni, created while node1 was standalone, was lost (data loss)
- Node1 successfully rejoined as a replica on timeline 5
- Writes performed during standalone mode were erased
pg_rewind: prerequisites and failure modes
pg_rewind is the safest recovery method for repairing a diverged node because it:
- Resets the diverging timeline
- Reconstructs the data directory to match the new leader
- Rejoins the node without requiring a full base backup
Important prerequisites:
- wal_log_hints = on OR data checksums must be enabled.
Patroni normally ensures wal_log_hints = on for you, but always verify the setting, especially on manually built clusters or upgraded nodes.
If pg_rewind:
- Is disabled
- Is misconfigured
- Fails due to missing WAL segments or a catalog mismatch
- Crashes unexpectedly
then reinitialization (a full base backup + restore) is the only recovery option.
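In a Patroni cluster you do not have to rebuild the node by hand in that case; patronictl can reinitialize the member from the current leader. A minimal sketch, using the cluster and member names from the test above (the config path is the same assumption as before):

# Wipe node1's data directory and rebuild it from the current leader
# (patronictl asks for confirmation before doing anything destructive)
patronictl -c /etc/patroni/patroni.yml reinit postgres node1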
Key Risks and Considerations
1. Patroni must be the sole controller
Manual PostgreSQL start = rogue instance.
2. Standalone writes WILL be lost
Every operation performed in standalone mode disappears.
3. Split-brain-like symptoms
Even if it is not a true split-brain, the symptoms include:
- Diverged timelines
- Orphaned tables
- Inconsistent LSNs
- Confusing recovery options
4. Operational confusion
Different nodes may disagree on whether a table exists.
Precautions: practical steps
Mask the OS service
Prevent an accidental systemctl start postgresql-XX.service by masking the unit:

sudo systemctl mask postgresql-XX.service
Monitor both the Patroni and PostgreSQL services
Create alerts for (a minimal check script follows this list):
- Unexpected postgresql-XX.service = active while patroni.service is down
- Missing Patroni heartbeat
- Timeline divergence detection
- Replicas not receiving WAL
- Unexpected PID or process-name mismatches
- pg_rewind events in Patroni logs
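A minimal check for the first alert, as a sketch only (the unit names postgresql-17.service and patroni are assumptions; wire the echo into whatever alerting you use):

#!/bin/bash
# Alert when PostgreSQL is active but Patroni is not: a likely rogue instance
if systemctl is-active --quiet postgresql-17.service && ! systemctl is-active --quiet patroni; then
    echo "ALERT: PostgreSQL is running outside Patroni on $(hostname)"
fi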
Validate pg_rewind prerequisites
Confirm that wal_log_hints = on or data checksums are enabled before taking cluster nodes into production.
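Both settings are easy to check (the psql connection details are an assumption):

# At least one of these must be on for pg_rewind to work
psql -U postgres -c "SHOW wal_log_hints;"
psql -U postgres -c "SHOW data_checksums;"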
Check etcd compatibility
Ensure your etcd major version is compatible with your Patroni release; older distributions may contain legacy etcd packages.
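A quick way to see what you are actually running, assuming etcd v3 tooling and the endpoints from the test topology above:

# Client/server versions of etcd and the installed Patroni
etcdctl version
patronictl version

# Health of the three-node etcd cluster
etcdctl --endpoints=192.168.183.129:2379,192.168.183.130:2379,192.168.183.131:2379 endpoint health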
Use port-based health checks
Check port 5432 with netstat or ss rather than matching process names across heterogeneous OS families.
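For example (5432 is the default PostgreSQL port; adjust if your cluster listens elsewhere):

# Is anything listening on the PostgreSQL port?
ss -ltn | grep ':5432'

# Or ask PostgreSQL directly whether it accepts connections
pg_isready -h 127.0.0.1 -p 5432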
Back up when in doubt
If you suspect standalone writes occurred, take a filesystem-level backup of the data directory before restarting Patroni if possible; this can help with forensic investigation.
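A crude but effective sketch, assuming the default RHEL data directory for PostgreSQL 17 and that the rogue instance is stopped first:

# Stop the rogue instance, then snapshot the data directory before Patroni rewinds it
systemctl stop postgresql-17.service
tar czf /root/node1-pgdata-$(date +%F).tar.gz -C /var/lib/pgsql/17 data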
Early detection is critical to prevent data loss.
Summary
Starting PostgreSQL manually in a Patroni-managed HA cluster is not a harmless mistake; it is a data-consistency risk. It can cause:
- Timeline divergence
- Split-brain-like behavior
- Lost transactions and DDL
- Cluster confusion and administrative overhead
- Forced pg_rewind or full reinitialization
Follow the precautions above, ensure the pg_rewind prerequisites are met, monitor both Patroni and PostgreSQL, and always start PostgreSQL via Patroni. Do that, and you’ll greatly reduce the chance of waking up to a DBA nightmare.
If you’re using a Patroni cluster, always start PostgreSQL through the Patroni service. Your database will appreciate it. Let’s keep everything running smoothly!
