In this second part of the blog post, we will explore how the PXC Replication Manager script handles source and replica failover in a multi-source replication topology.
Multi-source replication is typically used when data from several independent sources needs to be gathered into a single instance, which is often required for reporting, analytics, or specific ad-hoc business cases. In this post, we'll walk through how failover is managed in such a setup when integrated into a PXC/Galera-based environment.
For an initial understanding of the basic PXC replication manager setup, you can refer to the linked blog post.
Let's dive into the practical use case.
Topology:
DC1: 172.31.78.120 (DC1-1)   172.31.70.222 (DC1-2)
DC2: 172.31.75.60  (DC2-1)   172.31.71.144 (DC2-2)
DC3: 172.31.74.136 (DC3-1)   172.31.73.157 (DC3-2)
Async replication syncing flow:
- DC1 [DC1-1] will have multi-source replication channels, syncing from the DC2 [DC2-1] and DC3 [DC3-1] nodes.
- DC2 [DC2-1] will be syncing from DC1 [DC1-1].
- DC3 [DC3-1] will be syncing from DC1 [DC1-1].
DC2 <=> DC1 [multi-source] <=> DC3
Async multi-source topology
PXC/Async configurations
The configuration details for each DC node are available in the GitHub gist at https://gist.github.com/aniljoshi2022/7714c97a9c755e3d12c60e3ead21a55f .
At this stage, all three clusters should be bootstrapped and in a running state.
- First node (bootstrap):

shell> systemctl start mysql@bootstrap.service
- Second and remaining nodes:

shell> systemctl start mysql
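As an optional sanity check (not part of the original steps), you can confirm that every cluster has fully formed before moving on. In this topology each DC cluster has two nodes, so wsrep_cluster_size should report 2 on every member:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';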
We should also ensure that the replication user is created on the DC1 [mysql-DC1-1] node, with the same credentials that will be referenced in the replication manager tables and channel definitions below.

mysql> CREATE USER 'repl'@'%' IDENTIFIED BY 'replpass';
mysql> GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'repl'@'%';
Replication Manager configuration
Now we will add configuration entries to the replication manager related tables on DC1 [DC1-1]. I'm not covering what each table does here, as we already discussed that in the first part of this blog post.
mysql> INSERT INTO `cluster` VALUES ('DC1','172.31.78.120 172.31.70.222',"source_user='repl', source_password='replpass'");
mysql> INSERT INTO `cluster` VALUES ('DC2','172.31.75.60 172.31.71.144',"source_user='repl', source_password='replpass'");
mysql> INSERT INTO `cluster` VALUES ('DC3','172.31.74.136 172.31.73.157',"source_user='repl', source_password='replpass'");
mysql> select * from cluster;
+---------+-----------------------------+--------------------------------------------------+
| cluster | masterCandidates            | replCreds                                        |
+---------+-----------------------------+--------------------------------------------------+
| DC1     | 172.31.78.120 172.31.70.222 | source_user='repl', source_password='replpass'   |
| DC2     | 172.31.75.60 172.31.71.144  | source_user='repl', source_password='replpass'   |
| DC3     | 172.31.74.136 172.31.73.157 | source_user='repl', source_password='replpass'   |
+---------+-----------------------------+--------------------------------------------------+
mysql> INSERT INTO `link` VALUES ('DC1','DC2');
mysql> INSERT INTO `link` VALUES ('DC1','DC3');
mysql> INSERT INTO `link` VALUES ('DC2','DC1');
mysql> INSERT INTO `link` VALUES ('DC3','DC1');
mysql> select * from link;
+--------------+---------------+
| clusterSlave | clusterMaster |
+--------------+---------------+
| DC1          | DC2           |
| DC1          | DC3           |
| DC2          | DC1           |
| DC3          | DC1           |
+--------------+---------------+
mysql> INSERT INTO `weight` VALUES ('DC1','DC1-1',10);
mysql> INSERT INTO `weight` VALUES ('DC1','DC1-2',11);
mysql> INSERT INTO `weight` VALUES ('DC2','DC2-1',10);
mysql> INSERT INTO `weight` VALUES ('DC2','DC2-2',11);
mysql> INSERT INTO `weight` VALUES ('DC3','DC3-1',10);
mysql> INSERT INTO `weight` VALUES ('DC3','DC3-2',11);
mysql> select * from weight;
+---------+----------+--------+
| cluster | nodename | weight |
+---------+----------+--------+
| DC1     | DC1-1    |     10 |
| DC1     | DC1-2    |     11 |
| DC2     | DC2-1    |     10 |
| DC2     | DC2-2    |     11 |
| DC3     | DC3-1    |     10 |
| DC3     | DC3-2    |     11 |
+---------+----------+--------+
Asynchronous Replication Setup
- Taking a mysqldump from the DC1 [DC1-1] node.

mysql-DC1-1> mysqldump -u root -pRoot@1234 --source-data=2 --single-transaction -R -A -E > dump.sql
- Transferring the dump to DC2 [DC2-1] and DC3 [DC3-1].

mysql-DC1-1> sudo scp -i /root/.ssh/mykey dump.sql root@172.31.75.60:/root/
mysql-DC1-1> sudo scp -i /root/.ssh/mykey dump.sql root@172.31.74.136:/root/
- Restoring the dump on DC2 [DC2-1] and DC3 [DC3-1].

mysql-DC2-1> mysql -u root -p < dump.sql
mysql-DC3-1> mysql -u root -p < dump.sql
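One optional verification worth doing here (not part of the original steps): with GTID enabled, mysqldump embeds a SET @@GLOBAL.gtid_purged statement by default (--set-gtid-purged=AUTO), so after the restore the replicas should already carry the source's GTID history, which is what allows SOURCE_AUTO_POSITION=1 to work in the next step. You can confirm it with:

mysql-DC2-1> SELECT @@GLOBAL.gtid_executed;
mysql-DC3-1> SELECT @@GLOBAL.gtid_executed;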
- Setting up and starting the replication channels.
DC2-1
mysql-DC2-1> change replication source to GET_SOURCE_PUBLIC_KEY=1, source_host='172.31.78.120', source_user='repl', source_password='replpass', SOURCE_AUTO_POSITION = 1 FOR CHANNEL 'DC2-DC1';
mysql-DC2-1> start replica FOR CHANNEL 'DC2-DC1';
mysql-DC2-1> show replica status\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for source to send event
                  Source_Host: 172.31.78.120
                  Source_User: repl
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: binlog.000002
          Read_Source_Log_Pos: 742076
               Relay_Log_File: ip-172-31-75-60-relay-bin-dc2@002ddc1.000002
                Relay_Log_Pos: 3480
        Relay_Source_Log_File: binlog.000002
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
DC3-1
mysql-DC3-1> change replication source to GET_SOURCE_PUBLIC_KEY=1, source_host='172.31.78.120', source_user='repl', source_password='replpass', SOURCE_AUTO_POSITION = 1 FOR CHANNEL 'DC3-DC1';
mysql-DC3-1> start replica FOR CHANNEL 'DC3-DC1';
mysql-DC3-1> show replica status\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for source to send event
                  Source_Host: 172.31.78.120
                  Source_User: repl
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: binlog.000002
          Read_Source_Log_Pos: 797204
               Relay_Log_File: ip-172-31-74-136-relay-bin-dc3@002ddc1.000002
                Relay_Log_Pos: 70850
        Relay_Source_Log_File: binlog.000002
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
- Multi-source async replication setup on DC1 [DC1-1].

mysql-DC1-1> change replication source to GET_SOURCE_PUBLIC_KEY=1, source_host='172.31.75.60', source_user='repl', source_password='replpass', SOURCE_AUTO_POSITION = 1 FOR CHANNEL 'DC1-DC2';
mysql-DC1-1> start replica FOR CHANNEL 'DC1-DC2';
mysql-DC1-1> SHOW REPLICA STATUS FOR CHANNEL "dc1-dc2"\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for source to send event
                  Source_Host: 172.31.75.60
                  Source_User: repl
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: binlog.000001
          Read_Source_Log_Pos: 153605
               Relay_Log_File: ip-172-31-78-120-relay-bin-dc1@002ddc2.000002
                Relay_Log_Pos: 4898
        Relay_Source_Log_File: binlog.000001
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
mysql-DC1-1> change replication source to GET_SOURCE_PUBLIC_KEY=1, source_host='172.31.74.136', source_user='repl', source_password='replpass', SOURCE_AUTO_POSITION = 1 FOR CHANNEL 'DC1-DC3';
mysql-DC1-1> start replica FOR CHANNEL 'DC1-DC3';
mysql-DC1-1> SHOW REPLICA STATUS FOR CHANNEL "dc1-dc3"\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for source to send event
                  Source_Host: 172.31.74.136
                  Source_User: repl
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: binlog.000001
          Read_Source_Log_Pos: 144468
               Relay_Log_File: ip-172-31-78-120-relay-bin-dc1@002ddc3.000002
                Relay_Log_Pos: 413
        Relay_Source_Log_File: binlog.000001
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
Now, all the clusters are connected source-to-source.
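As an optional check (not from the original post), the state of all configured channels can be summarized in a single query from performance_schema, instead of running SHOW REPLICA STATUS per channel:

mysql-DC1-1> SELECT channel_name, service_state FROM performance_schema.replication_connection_status;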
Replication Manager Cron Setup
We need to enable the replication manager cron job across all PXC/Async nodes.
shell> crontab -l
* * * * * /usr/local/bin/replication_manager.sh
For any errors or issues, we can look inside the /tmp/replication_manager.log log file.
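For example, while running the failover tests below, you can follow the script's activity from another terminal on the node (a simple illustration, not from the original post):

shell> tail -f /tmp/replication_manager.log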
Testing Source Failover For Multi-Source Channel
DC1-1:
mysql-DC1-1> select * from percona.replication;
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| host  | weight | localIndex | isReplica | lastUpdate          | lastHeartbeat       | connectionName | currentSource |
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| DC1-1 |     10 |          0 | Yes       | 2025-12-20 16:03:02 | 2025-12-20 16:03:02 | DC1-DC2        | 172.31.75.60  |
| DC1-2 |     11 |          1 | No        | 2025-12-20 16:03:02 | 2025-12-20 16:03:02 | DC1-DC2        |               |
| DC1-1 |     10 |          0 | Yes       | 2025-12-20 16:03:02 | 2025-12-20 16:03:02 | DC1-DC3        | 172.31.74.136 |
| DC1-2 |     11 |          1 | No        | 2025-12-20 16:03:02 | 2025-12-20 16:03:02 | DC1-DC3        | NULL          |
| DC2-1 |     10 |          0 | Yes       | 2025-12-20 16:03:02 | 2025-12-20 16:03:02 | DC2-DC1        | 172.31.78.120 |
| DC3-1 |     10 |          1 | Yes       | 2025-12-20 16:03:02 | 2025-12-20 16:03:02 | DC3-DC1        | 172.31.78.120 |
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
mysql-DC1-1> show replica status for channel 'dc1-dc2'\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for source to send event
                  Source_Host: 172.31.75.60
                  Source_User: repl
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: binlog.000001
          Read_Source_Log_Pos: 165024
               Relay_Log_File: ip-172-31-78-120-relay-bin-dc1@002ddc2.000002
                Relay_Log_Pos: 11267
        Relay_Source_Log_File: binlog.000001
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
Now we will stop the source DC2 [DC2-1].

mysql-DC2-1> systemctl stop mysql
Below, we can see that the connectionName "DC1-DC2" is in a "Failed" state.
mysql-DC1-1> select * from percona.replication;
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| host  | weight | localIndex | isReplica | lastUpdate          | lastHeartbeat       | connectionName | currentSource |
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| DC1-1 |     10 |          0 | Failed    | 2025-12-20 16:10:22 | 2025-12-20 16:10:22 | DC1-DC2        | 172.31.75.60  |
...
| DC1-1 |     10 |          0 | Yes       | 2025-12-20 18:49:02 | 2025-12-20 18:49:02 | DC1-DC3        | 172.31.74.136 |
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
After a couple of minutes, when the script runs its next monitoring cycle, DC1 [DC1-1] is connected to another source node of DC2, which is [DC2-2].

mysql-DC1-1> select * from percona.replication;
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| host  | weight | localIndex | isReplica | lastUpdate          | lastHeartbeat       | connectionName | currentSource |
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| DC1-1 |     10 |          0 | Yes       | 2025-12-20 19:06:01 | 2025-12-20 19:06:01 | DC1-DC2        | 172.31.71.144 |
...
mysql-DC1-1> show replica status for channel 'dc1-dc2'\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for source to send event
                  Source_Host: 172.31.71.144
                  Source_User: repl
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: binlog.000004
          Read_Source_Log_Pos: 421885
               Relay_Log_File: ip-172-31-78-120-relay-bin-dc1@002ddc2.000005
                Relay_Log_Pos: 5137
        Relay_Source_Log_File: binlog.000004
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
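A compact way to confirm which source host each channel now points to, instead of reading the full SHOW REPLICA STATUS output (an optional check, not from the original post):

mysql-DC1-1> SELECT channel_name, host, port FROM performance_schema.replication_connection_configuration;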
Testing Replica Failover For Multi-Source Channel
We will stop DC1 (DC1-1), which is the current multi-source replica connected to both the DC2 and DC3 nodes.
mysql-DC1-1> select * from percona.replication;
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| host  | weight | localIndex | isReplica | lastUpdate          | lastHeartbeat       | connectionName | currentSource |
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| DC1-1 |     10 |          0 | Yes       | 2025-12-21 04:15:02 | 2025-12-21 04:15:02 | DC1-DC2        | 172.31.71.144 |
| DC1-2 |     11 |          1 | Failed    | 2025-12-20 18:48:22 | 2025-12-20 18:48:22 | DC1-DC2        |               |
| DC1-1 |     10 |          0 | Yes       | 2025-12-21 04:15:02 | 2025-12-21 04:15:02 | DC1-DC3        | 172.31.74.136 |
| DC1-2 |     11 |          1 | No        | 2025-12-20 18:48:22 | 2025-12-20 18:48:22 | DC1-DC3        | NULL          |
| DC2-1 |     10 |          0 | No        | 2025-12-20 15:31:02 | 2025-12-20 15:31:02 | DC2-DC1        | 172.31.78.120 |
| DC2-2 |     11 |          0 | Yes       | 2025-12-21 04:15:02 | 2025-12-21 04:15:02 | DC2-DC1        | 172.31.78.120 |
| DC3-1 |     10 |          1 | Yes       | 2025-12-20 19:05:01 | 2025-12-20 19:05:01 | DC3-DC1        | 172.31.78.120 |
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
Once we stop the database service on DC1 [DC1-1] and wait for a while, we can check the status again, and it will show DC1 [DC1-2] as the new multi-source replica.

mysql-DC1-2> select * from percona.replication;
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| host  | weight | localIndex | isReplica | lastUpdate          | lastHeartbeat       | connectionName | currentSource |
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
| DC1-1 |     10 |          0 | No        | 2025-12-21 04:48:02 | 2025-12-21 04:45:02 | DC1-DC2        | 172.31.71.144 |
| DC1-2 |     11 |          0 | Yes       | 2025-12-21 04:50:02 | 2025-12-21 04:50:02 | DC1-DC2        | 172.31.71.144 |
| DC1-1 |     10 |          0 | No        | 2025-12-21 04:48:02 | 2025-12-21 04:45:02 | DC1-DC3        | 172.31.74.136 |
| DC1-2 |     11 |          0 | Yes       | 2025-12-21 04:50:02 | 2025-12-21 04:50:02 | DC1-DC3        | 172.31.74.136 |
| DC2-1 |     10 |          0 | No        | 2025-12-20 15:31:02 | 2025-12-20 15:31:02 | DC2-DC1        | 172.31.78.120 |
| DC2-2 |     11 |          0 | Yes       | 2025-12-21 04:50:02 | 2025-12-21 04:50:02 | DC2-DC1        | 172.31.70.222 |
| DC3-1 |     10 |          1 | Yes       | 2025-12-21 04:50:01 | 2025-12-21 04:50:01 | DC3-DC1        | 172.31.70.222 |
+-------+--------+------------+-----------+---------------------+---------------------+----------------+---------------+
mysql-DC1-2> show replica status\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for source to send event
                  Source_Host: 172.31.71.144
                  Source_User: repl
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: binlog.000004
          Read_Source_Log_Pos: 1286708
               Relay_Log_File: ip-172-31-70-222-relay-bin-dc1@002ddc2.000002
                Relay_Log_Pos: 240223
        Relay_Source_Log_File: binlog.000004
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
...
                 Channel_Name: dc1-dc2
             Replica_IO_State: Waiting for source to send event
                  Source_Host: 172.31.74.136
                  Source_User: repl
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: binlog.000001
          Read_Source_Log_Pos: 2325764
               Relay_Log_File: ip-172-31-70-222-relay-bin-dc1@002ddc3.000002
                Relay_Log_Pos: 285267
        Relay_Source_Log_File: binlog.000001
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
...
                 Channel_Name: dc1-dc3
Important consideration:
The topology and scenario discussed above are intended solely for demonstration purposes, to observe how the PXC Replication Manager handles failover in complex topologies. In a production environment, such architectures should be avoided, as performing writes across both clusters (multiple nodes concurrently) can lead to inconsistencies. For any similar use cases, thorough and in-depth testing is strongly recommended beforehand.
Summary
The replication manager script can be particularly helpful in complex PXC/Galera topologies that require multi-source replication. It automates source and replica failover to ensure all replication channels remain healthy and in sync. If certain nodes should not take part in async/multi-source replication, we can disable the replication manager script on those nodes. Alternatively, node participation can be controlled by adjusting the weights in the percona.weight table, allowing replication behavior to be managed more precisely.
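For illustration only, a node's preference can be changed by updating its entry in the weight table; the values below are hypothetical, and the exact effect of a given weight depends on the candidate-selection logic of the replication manager script:

mysql-DC1-1> UPDATE percona.weight SET weight = 1 WHERE cluster = 'DC1' AND nodename = 'DC1-2';
mysql-DC1-1> SELECT * FROM percona.weight WHERE cluster = 'DC1';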
