Analysis of MySQL GTID Replication Inconsistency and the Role of Auto_Position
This article examines a MySQL 5.7 replication anomaly where a slave reports no errors despite GTID mismatches, demonstrates how to reproduce the issue, explains the underlying cause related to missing Auto_Position=1, and provides remediation steps and source‑code insights.
1 Fault Phenomenon
The master‑slave data become inconsistent while the replication status shows both Slave_IO_Running and Slave_SQL_Running as Yes . The slave should have reported error 1062 or 1032, but it does not.
MySQL version: 5.7.35
2 Fault Reproduction
Check master status
MySQL [xuzong]> show master status;
+------------------+----------+--------------+------------------+--------------------------------------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+--------------------------------------------------------------------------+
| mysql-bin.000008| 39349641 | | | c233aec0-58d3-11ec-a74a-a0a33ba8b3ff:1-104345 |
+------------------+----------+--------------+------------------+--------------------------------------------------------------------------+On the slave set a GTID range larger than the master’s:
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> reset slave all;
Query OK, 0 rows affected (0.01 sec)
mysql> reset master;
Query OK, 0 rows affected (0.01 sec)
mysql> set @@GLOBAL.GTID_PURGED='c233aec0-58d3-11ec-a74a-a0a33ba8b3ff:1-1000000'; #1000000>104345
Query OK, 0 rows affected (0.00 sec)Start replication in POSITION MODE and run show slave status \G . The output shows both Slave_IO_Running: Yes and Slave_SQL_Running: Yes , but the Retrieved_Gtid_Set on the slave already includes the range that the master has not produced, so any subsequent write on the master is ignored.
Master write
MySQL [xuzong]> insert into test(passtext) values('test');
Query OK, 1 row affected (0.00 sec)Slave query
mysql> select * from xuzong.test;
+----+--------------------------------------------+
| id | passtext |
+----+--------------------------------------------+
| 1 | 7e5a44af63552be3f2f819cebbe0832a |
| 2 | 7e5a44af63552be3f2f819cebbe0832a |
| 3 | 7e5a44af63552be3f2f819cebbe0832a |
| 4 | d13baf7019431e0c75dee85bc923a91b |
| 5 | 11 |
+----+--------------------------------------------+
5 rows in set (0.00 sec)The issue is that the slave’s GTID set is larger than the master’s, yet no error is raised.
Guess
The root cause is the missing Auto_Position=1 setting on the slave. Enabling it forces the slave to use GTID‑based binlog dumping.
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has...'
... Auto_Position: 1Indeed, after setting Auto_Position=1 the slave reports error 1236, confirming the hypothesis.
3 Problem Handling
In this situation the only practical remedy is to rebuild the slave. Because the replication threads show double “Yes”, the inconsistency cannot be captured by monitoring tools, and the exact point of divergence is unknown, making binlog‑based recovery impossible.
The environment also performed a master‑slave switchover, so the only reliable approach is a full data comparison with the business side, followed by a fresh slave reconstruction.
4 Summary
The Auto_Position parameter works as follows:
It immediately clears the existing relay log.
It locates the correct binlog by taking the union of the slave’s Executed_Gtid_Set and Retrieved_Gtid_Set , ignoring the values of MASTER_LOG_FILE and MASTER_LOG_POS that may be stale.
In production, when GTID is enabled, it is recommended to use AUTO_POSITION mode ( Auto_Position=1 ) to avoid silent replication failures that can lead to data loss.
5 Extension: Source Code Walkthrough
rel_slave.cc
--> request_dump
# Partial source code:
enum_server_command command = mi->is_auto_position() ?
COM_BINLOG_DUMP_GTID : COM_BINLOG_DUMP;
if (command == COM_BINLOG_DUMP_GTID)
{
if (gtid_executed.add_gtid_set(mi->rli->get_gtid_set()) != RETURN_STATUS_OK ||
gtid_executed.add_gtid_set(gtid_state->get_executed_gtids()) != RETURN_STATUS_OK)
// add Retrieved_Gtid_Set and Executed_Gtid_Set
...
int2store(ptr_buffer, binlog_flags);
ptr_buffer+= ::BINLOG_FLAGS_INFO_SIZE;
int4store(ptr_buffer, server_id);
ptr_buffer+= ::BINLOG_SERVER_ID_INFO_SIZE;
int4store(ptr_buffer, static_cast
(BINLOG_NAME_INFO_SIZE));
ptr_buffer+= ::BINLOG_NAME_SIZE_INFO_SIZE;
memset(ptr_buffer, 0, BINLOG_NAME_INFO_SIZE); // set MASTER_LOG_FILE to 0
ptr_buffer+= BINLOG_NAME_INFO_SIZE;
int8store(ptr_buffer, 4LL); // set MASTER_LOG_POS to 4
ptr_buffer+= ::BINLOG_POS_INFO_SIZE; // store gtid_set
}If relay_log_recovery is enabled, the Retrieved_Gtid_Set value is ignored.
if (!is_relay_log_recovery && !gtid_retrieved_initialized &&
relay_log.init_gtid_sets(>id_set, NULL, opt_slave_sql_verify_checksum,
true/*true=need lock*/, &mi->transaction_parser, >id_partial_trx))
{
sql_print_error("Failed in init_gtid_sets() called from Relay_log_info::rli_init_info().");
DBUG_RETURN(1);
}
gtid_retrieved_initialized = true;Key takeaway: enable Auto_Position=1 for GTID‑based replication to ensure the slave correctly tracks the master’s transaction set and avoids silent data loss.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.