When a reproduction fails because of corruption or drift, the usual resolution is to rebuild it from a recent copy of the grasp when pt-table-sync is just not an choice. Historically, when we have to construct a brand new duplicate, we use a bodily backup for velocity, however there are some instances the place you continue to want logical backups. As an example, if you migrate to a particular vendor (i.e.: MariaDB to MySQL) or storage engines (previously MyISAM to InnoDB and these days from InnoDB to RocksDB), improve to a brand new database model or transfer to a cloud based mostly resolution. That is the place a logical backup shines, providing portability and ease, however provided that it may be carried out rapidly. MyDumper emerges because the important, trendy resolution, delivering the most effective of each worlds: the cross-platform, cross-version flexibility of a logical dump mixed with the parallel, multi-threaded velocity beforehand reserved for bodily strategies, making it the clear alternative for quickly rebuilding a constant duplicate.
Step one is to take the backup. We’ve a number of arguments which we are able to use with mydumper, for this instance we’re going to use:
|
mydumper –v 4 –o information —clear —regex ‘^(?!(mysql.|sys.))’ —supply–information |
The primary 3 are associated to logging and backup dir, the second line is for ignoring mysql and sys schemas and at last –source-data will instruct mydumper to save lots of within the metadata file, all the information required for the replication configuration after the restore underneath the part [source].
That is an instance of the output:
|
[source] # Channel_Name=”” # It may be used to setup replication FOR CHANNEL # SOURCE_LOG_FILE = “binlog.000020” # SOURCE_LOG_POS = 6803936 #SOURCE_HOST = “172.17.0.3” #SOURCE_PORT = #SOURCE_USER = “” #SOURCE_PASSWORD = “” #SOURCE_SSL = 1 executed_gtid_set = “941fdce6-47c4-11f0-87b2-0242ac110006:1-52” SOURCE_LOG_FILE = “binlog.000020” SOURCE_LOG_POS = 6803936 #SOURCE_AUTO_POSITION = 1 myloader_exec_reset_replica = 0 myloader_exec_change_source = 0 myloader_exec_start_replica = 0 |
As we are able to see, this choices are enabled:
|
executed_gtid_set = “941fdce6-47c4-11f0-87b2-0242ac110006:1-52” SOURCE_LOG_FILE = “binlog.000020” SOURCE_LOG_POS = 6803936 |
Nonetheless, the execution of the instructions are disabled:
|
myloader_exec_reset_replica = 0 myloader_exec_change_source = 0 myloader_exec_start_replica = 0 We can allow them, if we set —supply–information=7, then the metadata will change to: myloader_exec_reset_replica = 1 myloader_exec_change_source = 1 myloader_exec_start_replica = 1 |
Which is required to routinely configure the replication.
By default, the SOURCE_LOG_FILE and SOURCE_LOG_POS will probably be used, however you’ll be able to set the GTID place should you configure SOURCE_AUTO_POSITION = 1.
As you would possibly know, to arrange replication we have to execute CHANGE SOURCE. Nonetheless, relying in your use case, you would possibly must RESET REPLICA and after executing the CHANGE SOURCE you often execute START REPLICA. This may be performed by myloader routinely should you set it up within the metadata file utilizing:
|
myloader_exec_reset_replica = 1 myloader_exec_change_source = 1 myloader_exec_start_replica = 1 |
Or you should utilize –source-data=7 as a parameter in myloader. Sure! myloader additionally accepts –source-data.
Relying your use case, you would possibly must configure within the metadata file, this different choices:
|
#SOURCE_HOST = “172.17.0.3” #SOURCE_PORT = #SOURCE_USER = “” #SOURCE_PASSWORD = “” #SOURCE_SSL = 1 executed_gtid_set = “941fdce6-47c4-11f0-87b2-0242ac110006:1-52” SOURCE_LOG_FILE = “binlog.000020” SOURCE_LOG_POS = 6803936 #SOURCE_AUTO_POSITION = 1 |
As there are a number of use instances:
If you’d like wish to rebuild the duplicate from scratch then you will have to configure like this:
|
[source] SOURCE_HOST = “172.17.0.3” SOURCE_PORT = 3306 SOURCE_USER = “duplicate” SOURCE_PASSWORD = “r3pl1c4” executed_gtid_set = “941fdce6-47c4-11f0-87b2-0242ac110006:1-52” SOURCE_LOG_FILE = “binlog.000020” SOURCE_LOG_POS = 6803936 myloader_exec_reset_replica = 1 myloader_exec_change_source = 1 myloader_exec_start_replica = 1 |
You have already got a replication up and operating and also you wish to rebuild it with out altering the host or credentials, then you’ll be able to configure simply on this means:
|
[source] executed_gtid_set = “941fdce6-47c4-11f0-87b2-0242ac110006:1-52” SOURCE_LOG_FILE = “binlog.000020” SOURCE_LOG_POS = 6803936 myloader_exec_reset_replica = 0 myloader_exec_change_source = 1 myloader_exec_start_replica = 1 |
SSL is another choice that’s doable to arrange in –source-data on myloader, as an alternative of utilizing SOURCE_SSL on the metadata file. The total record of choices are: exec_start_slave (1), exec_change_master (2), exec_reset_slave (4), SSL (8), auto_position (16) and exec_start_replica_until (32). Relying on the configuration that you just wish to arrange and the statements that you just wish to execute, you will have to sum the values and cross it to –source-data.
After configure the metadata file, you’ll be able to execute the myloader, which is able to appear like:
|
myloader –d information –v 4 –o —max–threads–for–schema–creation=1 –h replica_host |
Within the log, you will see that myloader has ship the instructions:
|
2025–12–18 16:57:09 [INFO] – Schema create checksum confirmed for sakila 2025–12–18 16:57:09 [INFO] – Sending reset duplicate 2025–12–18 16:57:09 [INFO] – Sending change replication supply 2025–12–18 16:57:09 [INFO] – Sending begin duplicate 2025–12–18 16:57:09 [INFO] – Restore accomplished |
mydumper is sending the command however is just not checking the output which signifies that if the replication did not configure or it was not capable of begin, you will have to manually verify it and repair it. Nonetheless, it should detect if the command failed, for instance if SUORCE_USER is used as an alternative of SOURCE_USER:
|
2025–12–18 17:02:56 [WARNING] – Sending replication command: CHANGE REPLICATION SOURCE TO SOURCE_HOST = “172.17.0.4”, SUORCE_USER = “root”, SOURCE_PASSWORD = “”, SOURCE_LOG_FILE = “binlog.000020”, SOURCE_LOG_POS = 1362220 FOR CHANNEL ”; – ERROR 1064: You have an error in your SQL syntax; verify the handbook that corresponds to your MySQL server model for the proper syntax to use close to ‘SUORCE_USER = “root”, SOURCE_PASSWORD = “”, SOURCE_LOG_FILE = “binlog.000020”, SO’ at line 1 |
There’s an fascinating use case the place we are able to use START REPLICA UNTIL to repair drift on some tables the place pt-table-sync or rebuilding the entire duplicate will not be doable.
Let’s say now we have a supply and a reproduction, and we discovered that information has drifted on the duplicate and the replication course of has stopped with an error like this:
|
LAST_ERROR_MESSAGE: Employee 1 failed executing transaction ‘ANONYMOUS’ at supply log binlog.000020, end_log_pos 1369103; May not execute Update_rows occasion on desk check.test_table; Can‘t discover file in ‘check_desk‘, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the occasion’s supply log binlog.000020, end_log_pos 1369103 |
We checked the binary log and it’s failing due updates over a set of rows:
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
# at 1368995 #251218 19:34:59 server id 1 end_log_pos 1369103 CRC32 0x60a481d6 Update_rows: desk id 344 flags: STMT_END_F ### UPDATE `check`.`test_table` ### WHERE ### @1=12 /* INT meta=0 nullable=0 is_null=0 */ ### @2=7062 /* INT meta=0 nullable=1 is_null=0 */ ### SET ### @1=12 /* INT meta=0 nullable=0 is_null=0 */ ### @2=7063 /* INT meta=0 nullable=1 is_null=0 */ ### UPDATE `check`.`test_table` ### WHERE ### @1=15 /* INT meta=0 nullable=0 is_null=0 */ ### @2=7521 /* INT meta=0 nullable=1 is_null=0 */ ### SET ### @1=15 /* INT meta=0 nullable=0 is_null=0 */ ### @2=7522 /* INT meta=0 nullable=1 is_null=0 */ ### UPDATE `check`.`test_table` ### WHERE ### @1=17 /* INT meta=0 nullable=0 is_null=0 */ ### @2=8706 /* INT meta=0 nullable=1 is_null=0 */ ### SET ### @1=17 /* INT meta=0 nullable=0 is_null=0 */ ### @2=8707 /* INT meta=0 nullable=1 is_null=0 */ ### UPDATE `check`.`test_table` ### WHERE ### @1=18 /* INT meta=0 nullable=0 is_null=0 */ ### @2=8108 /* INT meta=0 nullable=1 is_null=0 */ ### SET ### @1=18 /* INT meta=0 nullable=0 is_null=0 */ ### @2=8109 /* INT meta=0 nullable=1 is_null=0 */ # at 1369103 |
We verify the database and it’s true, the info has drifted:
Supply:
|
mysql> choose rely(*) from check.test_table; +—————+ | rely(*) | +—————+ | 15 | +—————+ 1 row in set (0.00 sec) |
Duplicate:
|
mysql> choose rely(*) from check.test_table; +—————+ | rely(*) | +—————+ | 14 | +—————+ 1 row in set (0.00 sec) |
With MyDumper we are able to rebuild the desk following this process:
We have to ignore the desk to permit the duplicate to catchup
|
mysql–duplicate> STOP REPLICA; Question OK, 0 rows affected (0.00 sec)
mysql–duplicate> CHANGE REPLICATION FILTER REPLICATE_IGNORE_TABLE= (check.test_table); Question OK, 0 rows affected (0.00 sec)
mysql–duplicate> START REPLICA; Question OK, 0 rows affected (0.00 sec) |
As soon as that the duplicate is updated we have to cease the duplicate:
|
mysql–duplicate> STOP REPLICA; Question OK, 0 rows affected (0.00 sec) |
and take the backup on the Supply server:
|
mydumper –v 4 –o information —clear –T check.test_table —supply–information |
We’re utilizing -T to take the backup of the problematic desk and –source-data will allow the replication variables that we want on the metadata file.
Then, we restore the desk with the proper worth on –source-data
|
myloader –d information –v 4 –o —max–threads–for–schema–creation=1 –h replica_host —supply–information=32 |
The 32 is to execute the START REPLICA UNTIL.
Lastly, we take away the ignore desk choice and begin the duplicate once more:
|
mysql–duplicate> CHANGE REPLICATION FILTER REPLICATE_IGNORE_TABLE= (); Question OK, 0 rows affected (0.00 sec)
mysql–duplicate> START REPLICA; Question OK, 0 rows affected (0.00 sec) |
The START REPLICA UNTIL executed initially of backup by myloader will drive the duplicate to cease on the level the place we took the backup of the desk, permitting us to proceed the replication in a constant situation.
The shift from legacy dumping strategies to MyDumper represents greater than only a efficiency increase; it represents a modernized method to information integrity and mobility. By decoupling the backup course of from the constraints of single-threaded execution, DBAs can now deal with large datasets with the identical agility they as soon as reserved for small check environments.
Incorporating MyDumper into your customary operational playbook ensures you are ready for the unpredictable—whether or not it’s an emergency duplicate rebuild or a deliberate architectural migration. In an period the place information volumes proceed to scale exponentially, having a software that balances logical flexibility with parallelized velocity, MyDumper turns into a necessity. Preserve it in your toolkit, and the following time you face a “logical-only” restoration situation, you’ll be doing so with a major aggressive benefit.
