Analysis of Large Transaction Binlog Behavior in MySQL 5.6/5.7
This article analyzes why a large MySQL transaction can occupy an entire binlog file without triggering a binlog switch, why the final transaction appears small, and why the commit timestamps of the large and small transactions can differ by up to fifteen minutes, providing detailed explanations, test procedures, and code examples.
Background : A user observed a MySQL 5.6 binlog file of about 3 GB (expected size 1 GB) containing a huge transaction followed by a small one, with a ~15‑minute gap between their XID_EVENT commit timestamps.
Problem : Why is the last transaction small instead of the largest one, why does the binlog not switch after the large transaction, and why is there a large time difference between the commits?
Analysis :
1. Commit Flow Diagram
The diagram (from the author’s book) shows that binlog switching is marked only after all events of a transaction are written to the binlog (flush queue), not after the transaction is committed.
2. Why the large transaction stays in a single binlog
According to step 10 of the diagram, the binlog switch marker is set only after the flush queue finishes writing the transaction events, so even a massive transaction must finish writing before a switch can occur.
3. Why the final transaction is small
Step 10 only sets the switch flag; the actual switch occurs after the commit queue of the transaction completes (see step 28). During the long sync phase of a large transaction, new small transactions can enter the flush queue and be written to the same binlog.
Large transaction may be positioned anywhere in the flush queue, with smaller ones behind it.
The sync phase of a large transaction can take many seconds (e.g., 30 s), allowing other transactions to be written to the binlog before the switch.
Therefore, in a busy production environment the last transaction in a binlog is often not the large one.
4. Why the commit timestamps differ by ~15 minutes
Two possible reasons:
For autocommit transactions, the XID_EVENT timestamp reflects the command issuance time, which can cause large gaps. For explicit BEGIN‑COMMIT transactions, the timestamp reflects the commit command time, but a long fsync can also produce a similar effect.
The author’s book explains these cases in detail (chapters 12 and 14).
5. Test in MySQL 5.7.22
Test setup:
max_binlog_size = 1048576 (small size for testing)
binlog_group_commit_sync_delay = 1000000 (1 s delay to stretch the commit process)
binlog_transaction_dependency_tracking = COMMITORDER (default)
A breakpoint was set on MYSQL_BIN_LOG::ordered_commit to observe when the commit actually occurs.
After creating a table with ~700 k rows and deleting all rows, the resulting binlog (≈3.5 MB) contained many small DELETE statements after the large transaction, confirming the analysis.
### DELETE FROM `testmts`.`testnnn`
### WHERE
### @1=10 /* INT meta=0 nullable=1 is_null=0 */
# at 3626617
#190804 22:56:10 server id 413340 end_log_pos 3626648 CRC32 0xfc5b79e7 Xid = 143
COMMIT/*!*/;
#190804 23:02:26 server id 413340 end_log_pos 3626713 CRC32 0xa2399157 GTID last_committed=0 sequence_number=2 rbr_only=yes
...The log shows a 6‑minute gap between 22:56:10 and 23:02:26 , and the last_committed values remain consistent, matching the author's claim.
6. Precautions
Avoid excessively large transactions in MySQL because they can block other transactions' commits, a point emphasized throughout the author's book.
For a deeper dive, refer to the author's series "深入理解MySQL主从原理 32讲".
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.