MySQL Replication: Errant Transactions in GTID Based Replication

GTID, or Global Transaction Identifier, was introduced in MySQL 5.6.5. A GTID is a globally unique ID assigned to every transaction committed on a GTID-enabled MySQL server. Each GTID combines the UUID of the server where the transaction was originally committed with the sequence number of that transaction on that server, which makes it unique across the whole replication topology.
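To make the two-part structure concrete, here is a minimal Python sketch that splits a single GTID of the form source_uuid:sequence into its components. The helper name parse_gtid and the Gtid tuple are made up for illustration and are not part of MySQL or any connector; the UUID is only a sample value.

```python
import uuid
from typing import NamedTuple


class Gtid(NamedTuple):
    source_uuid: uuid.UUID  # UUID of the server where the transaction was committed
    sequence: int           # sequence number of the transaction on that server


def parse_gtid(text: str) -> Gtid:
    """Split a single 'source_uuid:sequence' GTID into its two parts."""
    source, _, seq = text.strip().rpartition(":")
    if not source:
        raise ValueError(f"not a GTID: {text!r}")
    sequence = int(seq)  # raises ValueError if the sequence part is not a number
    if sequence < 1:
        raise ValueError(f"sequence number must be positive: {text!r}")
    return Gtid(uuid.UUID(source), sequence)


gtid = parse_gtid("3e11fa47-71ca-11e1-9e33-c80aa9429562:23")
print(gtid.source_uuid, gtid.sequence)
# prints: 3e11fa47-71ca-11e1-9e33-c80aa9429562 23
```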

MySQL Replication

GTID-based replication is much more flexible than the older replication based on binary log file and position. In a GTID-based setup, the slave does not need a master binlog file name and position to start replication. Read more about GTID-based replication. In this blog post, we discuss errant transactions, a common MySQL replication issue in GTID-based replica sets.

Errant transactions are transactions that were executed directly on one or more slaves and were never meant to be replicated to other nodes. These could be one-off fixes applied on the slave, or accidental writes to the slave by an application.

The problem with errant transactions arises when the slave that contains one is promoted to master. With GTID-based replication, the new master compares its executed GTIDs with those of each slave and sees that the slaves have not executed the errant transaction. One of two things can happen:

(1) The errant transaction is still present in the master’s binlog, so the master sends it to the slaves. This can corrupt the data or cause an error.
(2) The transaction is no longer present in the binlog and hence cannot be sent to the slaves, which causes a replication error.

Prevention

Errant transactions can be actively prevented by following these steps. If you have to apply a fix to a slave, one way to avoid creating an errant transaction is to temporarily turn off binary logging for your session on the slave. Executing SET sql_log_bin = 0 before running the fix should do the trick, because statements that are not written to the binlog are not assigned a GTID. You can re-enable binary logging afterwards by running SET sql_log_bin = 1. To prevent application writes to slaves, read_only should be enabled on a server whenever it is configured as a slave.
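As a rough sketch of the statement sequence described above, the following Python helper wraps a fix in the two sql_log_bin statements and lists the statement that puts a slave into read-only mode. The helper name without_binlog and the table orders_tmp are hypothetical. The script only builds and prints SQL text; actually running the statements requires a session with sufficient privileges on the slave.

```python
def without_binlog(fix_statements):
    """Return the statements to run, in order, so the fix is not written to the binlog."""
    return [
        "SET SESSION sql_log_bin = 0",  # stop binary logging for this session only
        *fix_statements,                # the manual fix (no GTID is generated for it)
        "SET SESSION sql_log_bin = 1",  # restore binary logging for this session
    ]


# Statement that rejects writes from ordinary application accounts on a slave.
READ_ONLY_STATEMENTS = ["SET GLOBAL read_only = ON"]

# 'orders_tmp' is a made-up table used only for illustration.
for statement in without_binlog(["DELETE FROM orders_tmp WHERE id = 42"]):
    print(statement + ";")
for statement in READ_ONLY_STATEMENTS:
    print(statement + ";")
```

Note that read_only does not stop accounts with the SUPER privilege; on MySQL 5.7.8 and later, super_read_only can be enabled as well to cover those accounts.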

Detection

Detecting an errant transaction in a GTID-based MySQL replica set is easy. MySQL exposes all executed GTIDs through the gtid_executed system variable, which you can read with SELECT @@GLOBAL.gtid_executed or, depending on the MySQL version you are using, from the global variables table in Information Schema or Performance Schema. Subtracting the GTIDs executed on the current master from the GTIDs executed on a slave leaves the transactions that exist only on that slave, which are its errant transactions. The built-in GTID_SUBTRACT() function performs this set subtraction. Utilities such as mysqlfailover or mysqlrpladmin can also help in detecting errant transactions.
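The subtraction can be illustrated offline with a small Python sketch that takes the two gtid_executed strings as text. The function names parse_gtid_set and errant_transactions and both UUIDs are invented for this example. It expands every interval into individual sequence numbers, so it is only meant for small sets; on a real server GTID_SUBTRACT() does the same job.

```python
from collections import defaultdict


def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: {1, 2, 3, 4, 5, 7}, uuid2: {3}}."""
    result = defaultdict(set)
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        source, *intervals = part.split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A bare number such as '7' is an interval of length one.
            result[source.lower()].update(range(int(start), int(end or start) + 1))
    return dict(result)


def errant_transactions(slave_executed, master_executed):
    """Return GTIDs executed on the slave but not on the master (slave minus master)."""
    slave = parse_gtid_set(slave_executed)
    master = parse_gtid_set(master_executed)
    errant = {}
    for server_uuid, sequences in slave.items():
        extra = sequences - master.get(server_uuid, set())
        if extra:
            errant[server_uuid] = sorted(extra)
    return errant


# Sample UUIDs for illustration only.
MASTER_UUID = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
SLAVE_UUID = "7c1a2b3c-71ca-11e1-9e33-c80aa9429562"

master_executed = f"{MASTER_UUID}:1-100"
slave_executed = f"{MASTER_UUID}:1-98,\n{SLAVE_UUID}:1-2"  # two local writes on the slave

print(errant_transactions(slave_executed, master_executed))
# prints: {'7c1a2b3c-71ca-11e1-9e33-c80aa9429562': [1, 2]}
```

The slave in this example lags the master by two transactions, but lag does not show up in the result because only GTIDs missing from the master are reported.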

Solution

Once an errant transaction has been detected, there are two ways to avoid the replication errors it would cause after a failover. One way is to delete the GTID of the errant transaction from the slave’s executed GTID history (typically by running RESET MASTER on that slave and then setting gtid_purged to the remaining GTID set, which also discards that slave’s binary logs). This way, when the slave gets promoted to master, the errant transaction will not be replicated to the other nodes. The other way is to tell all the other slaves to skip the errant transaction. This is done by injecting an empty transaction with the same GTID as the errant transaction on every other node in the replica set. Those nodes then consider the transaction already applied and skip it. MySQL Utilities includes a tool called mysqlslavetrx dedicated to this task; it injects empty transactions with the given GTIDs. Adding empty transactions can have other uses as well, as discussed here.
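For the second approach, the statements that inject an empty transaction follow a fixed pattern, sketched here as a Python helper that generates them for a list of errant GTIDs. The function name empty_transaction_statements and the sample GTID are assumptions for illustration. The generated SQL would have to be run on each node that is missing the transaction, by a suitably privileged account.

```python
import re

# One GTID: a 36-character UUID, a colon, and a positive sequence number.
GTID_PATTERN = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:[1-9][0-9]*$"
)


def empty_transaction_statements(gtids):
    """Build the SQL that marks each GTID as executed without changing any data."""
    statements = []
    for gtid in gtids:
        if not GTID_PATTERN.match(gtid):
            raise ValueError(f"not a single GTID: {gtid!r}")
        statements += [
            f"SET GTID_NEXT = '{gtid}'",  # the next transaction gets this exact GTID
            "BEGIN",
            "COMMIT",                     # empty transaction: nothing between BEGIN and COMMIT
        ]
    if statements:
        # Hand GTID assignment back to the server for later transactions.
        statements.append("SET GTID_NEXT = 'AUTOMATIC'")
    return statements


# Sample GTID for illustration only.
for statement in empty_transaction_statements(["7c1a2b3c-71ca-11e1-9e33-c80aa9429562:1"]):
    print(statement + ";")
```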



Neeraj is a member of technical staff at Scalegrid Inc

