Why Master‑Slave Replication Lag Erased My MySQL Data – A Real Debugging Story
A developer recounts a MySQL master‑slave replication lag issue that caused approved activity data to disappear, walks through the investigation steps, shows the faulty PHP code, and offers practical fixes to avoid similar pitfalls in future database designs.
Scenario
A feature requires an activity to be approved before it becomes effective. Because activities can be edited after approval, the system stores the pending version in a temporary table activity_tmp and only copies it to the main activity table after approval. The simplified schema is:
activity_tmp()
    id
    status   // approval status
    content  // activity content submitted for approval
activity()
    id
    content  // content displayed after approval
Problem Traceback
Checked activity_tmp – the content was correct and the status had been updated to approved, suggesting the write to activity failed.
Queried activity – the approved content was missing.
Reviewed the code – no obvious logic error, so suspected a database operation failure and examined the logs.
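The log showed an INSERT into activity with an empty content field. That content came from the preceding SELECT. Run by hand against the standby replica afterwards, the same SELECT returned the row, but during the original request it returned nothing, which pointed to master‑slave replication delay.
The failure was partial: only some approvals lost their content. That pattern is what replication lag would produce and a deterministic logic bug would not, so lag was taken as the root cause.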
Problematic Code
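$intStatus = $arrInput['status'];
// Write: the status update goes to the master.
$this->objActTmp->updateInfoByAId($intActId, $intStatus);
// Read immediately afterwards: served by the replica, which may not have the update yet.
$arrActContent = $this->objActTmp->getActByStatus($intStatus);
The immediate read after the update triggers the master‑slave lag bug: the update may not have propagated to the replica yet, so the subsequent read by the new status finds no matching row and returns empty data, which is then inserted into activity.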
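The race can be reproduced without a database. The following is a minimal sketch, assuming a hypothetical in-memory LaggingCluster class that stands in for the master/replica pair. It is not the original system's data layer, and the constants and row values are made up for illustration. Writes land on the master array, reads come from the replica array, and replicate() plays the role of the replica catching up.

```php
<?php
// Hypothetical in-memory stand-in for a master/replica pair; not real MySQL.
final class LaggingCluster
{
    private array $master = [];   // id => ['status' => int, 'content' => string]
    private array $replica = [];  // same shape, but only what has been replicated

    public function insert(int $id, int $status, string $content): void
    {
        $this->master[$id] = ['status' => $status, 'content' => $content];
    }

    // Writes always go to the master.
    public function updateStatus(int $id, int $status): void
    {
        $this->master[$id]['status'] = $status;
    }

    // Reads go to the replica, which only sees replicated rows.
    public function selectByStatus(int $status): array
    {
        return array_filter(
            $this->replica,
            fn(array $row): bool => $row['status'] === $status
        );
    }

    // Simulates the replica catching up with the master.
    public function replicate(): void
    {
        $this->replica = $this->master;
    }
}

const STATUS_PENDING = 0;
const STATUS_APPROVED = 1;

$db = new LaggingCluster();
$db->insert(7, STATUS_PENDING, 'Spring sale banner');
$db->replicate();                       // the pending row has reached the replica

$db->updateStatus(7, STATUS_APPROVED);  // the write lands on the master only
$staleRead = $db->selectByStatus(STATUS_APPROVED);

$db->replicate();                       // replication catches up later
$freshRead = $db->selectByStatus(STATUS_APPROVED);

echo count($staleRead), ' row(s) before replication, ', count($freshRead), " after\n";
```

The first read finds nothing even though the update succeeded, which matches the empty content seen in the INSERT log.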
Solution
Adjust the code logic:
If the second step does not need the updated status, read first, then update.
If the second step depends on the updated status, check the read result; if empty, raise an error and retry later.
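Consider changing the system architecture so that a read never hits a replica immediately after the write it depends on, for example by routing read‑after‑write queries to the master, using synchronous replication, using a write‑through cache, or setting explicit transaction boundaries around the write and the dependent read.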
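The second code-level option (check the read result, then retry or raise an error) might be sketched as follows. The helper name readWithRetry, its parameters, and the retry counts are assumptions for illustration and do not come from the original code. The lagging replica is faked with a closure that returns the row only on its third call.

```php
<?php
// Hypothetical helper: names and retry parameters are illustrative only.

/**
 * Calls $read until it returns a non-empty result or the attempts run out.
 * Throws instead of letting an empty result flow into a later INSERT.
 */
function readWithRetry(callable $read, int $maxAttempts = 3, int $delayMs = 100): array
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $rows = $read();
        if (!empty($rows)) {
            return $rows;
        }
        if ($attempt < $maxAttempts) {
            usleep($delayMs * 1000); // give the replica time to catch up
        }
    }
    throw new RuntimeException("Read returned no rows after {$maxAttempts} attempts");
}

// Stand-in for a replica that only returns the row on the third read.
$calls = 0;
$laggingRead = function () use (&$calls): array {
    $calls++;
    return $calls < 3 ? [] : [['id' => 7, 'content' => 'Spring sale banner']];
};

$rows = readWithRetry($laggingRead, 5, 1);
echo "Got content after {$calls} read(s): {$rows[0]['content']}\n";

// A replica that never catches up surfaces as an error, not as an empty INSERT.
try {
    readWithRetry(fn(): array => [], 2, 1);
    $failedLoudly = false;
} catch (RuntimeException $e) {
    $failedLoudly = true;
}
```

The first option needs no helper: when the read does not depend on the new status, calling the read before updateInfoByAId removes the race entirely. Retrying only narrows the window, so the exception path matters as much as the retry itself.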
Takeaways
Replication lag can silently corrupt business logic when writes are followed by immediate reads from a replica. Detailed logging is essential for root‑cause analysis; treat logs like a black‑box flight recorder that captures every relevant event.
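As one possible shape for such logging, the sketch below records which role served a query and how many rows came back, so an empty read on a replica is visible in the log next to the INSERT that follows it. The function name, field names, and sample SQL are assumptions for illustration, not the original system's log format.

```php
<?php
// Hypothetical log helper; field names are assumptions, not the original log format.
function formatQueryLog(string $role, string $sql, int $rowCount, float $timestamp): string
{
    return json_encode([
        'ts'   => sprintf('%.3f', $timestamp), // seconds since the epoch, ms precision
        'role' => $role,                       // 'master' or 'replica'
        'sql'  => $sql,
        'rows' => $rowCount,                   // 0 here is the signal worth alerting on
    ]);
}

$line = formatQueryLog(
    'replica',
    'SELECT content FROM activity_tmp WHERE status = 1',
    0,
    1700000000.25
);
echo $line, "\n";
```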
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
