Seven Reliable Methods for Solving Distributed Transaction Problems
This article, excerpted from the book "Software Architecture Design: Integrating Large‑Scale Website Technical and Business Architecture," systematically presents seven practical solutions—including two eventual‑consistency approaches, two compromise methods, TCC, state‑machine retry with idempotence, and reconciliation—to address the pervasive challenges of distributed transactions in micro‑service systems.
The excerpt introduces the concept of distributed transaction consistency problems, illustrating common scenarios such as cache‑database mismatches, referential integrity violations, atomicity failures, master‑slave replication loss, and message loss, and classifies them into transaction consistency and multi‑replica consistency.
Distributed Transaction Solutions Overview
Seven reliable methods are summarized: two eventual‑consistency schemes, two compromise approaches, two state‑machine + retry + idempotence methods (TCC and a state‑machine based retry), and one reconciliation method.
1. Two‑Phase Commit (2PC)
2PC involves a coordinator and participants, with a prepare phase and a commit phase. It requires all participants to implement Prepare, Commit, and Rollback interfaces (e.g., XAResource in Java). The main drawbacks are performance bottlenecks, participants left in an uncertain state if the coordinator fails between the two phases, and limited applicability: it works mainly between databases that implement the XA protocol, not between services.
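The two phases can be sketched in a few lines of Java. This is a minimal in-memory illustration, not a real XA integration: the `Participant` interface, `RecordingParticipant`, and `TwoPhaseCoordinator` are hypothetical names, and the sketch ignores timeouts and coordinator crashes, which are exactly the cases that make 2PC hard in practice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical participant contract. Real XA drivers expose
// javax.transaction.xa.XAResource, which has more methods than this.
interface Participant {
    boolean prepare(String txId);  // phase 1: true = vote yes, false = vote no
    void commit(String txId);      // phase 2 when every vote was yes
    void rollback(String txId);    // phase 2 when any vote was no
}

// In-memory stand-in for a database; it only records the calls it receives.
class RecordingParticipant implements Participant {
    final boolean voteYes;
    final List<String> calls = new ArrayList<>();

    RecordingParticipant(boolean voteYes) { this.voteYes = voteYes; }

    public boolean prepare(String txId) { calls.add("prepare"); return voteYes; }
    public void commit(String txId) { calls.add("commit"); }
    public void rollback(String txId) { calls.add("rollback"); }
}

class TwoPhaseCoordinator {
    // Returns true if the transaction committed, false if it was rolled back.
    static boolean execute(String txId, List<? extends Participant> participants) {
        List<Participant> asked = new ArrayList<>();
        boolean allYes = true;
        // Phase 1: collect votes, stopping at the first "no".
        for (Participant p : participants) {
            asked.add(p);
            if (!p.prepare(txId)) { allYes = false; break; }
        }
        // Phase 2: commit everywhere, or roll back everyone who was asked.
        for (Participant p : asked) {
            if (allYes) p.commit(txId); else p.rollback(txId);
        }
        return allYes;
    }
}
```

Calling `TwoPhaseCoordinator.execute` with one participant that votes no leaves every asked participant with a `prepare` followed by a `rollback`, which is the all-or-nothing outcome the protocol aims for.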
2. Eventual Consistency via Message Middleware
This approach uses a message broker to achieve eventual consistency. The key issue is the non‑atomicity of the database update and message send, leading to scenarios where either the DB update succeeds but the message fails, or vice‑versa. Various implementations are discussed, including a naïve erroneous scheme, a business‑side implementation with a local message table and retry loop, and RocketMQ transactional messages that split sending into Prepare and Confirm phases.
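The business-side variant with a local message table might look roughly like the following sketch. Everything here is an assumption made for illustration: the `Broker` interface, the `LocalMessageTableDemo` class, and the use of in-memory maps in place of two tables in the same database. The `synchronized` methods stand in for a local database transaction.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical broker client; send returns false when delivery fails.
interface Broker { boolean send(String msgId, String payload); }

class LocalMessageTableDemo {
    // Both maps stand in for tables in the SAME database, so a real
    // implementation would write them inside one local transaction.
    final Map<String, Integer> balances = new LinkedHashMap<>();
    final Map<String, String> pendingMessages = new LinkedHashMap<>();  // msgId -> payload

    // Step 1: business update and message row are written together.
    synchronized void debit(String account, int amount, String msgId) {
        balances.merge(account, -amount, Integer::sum);
        pendingMessages.put(msgId, account + ":" + amount);
    }

    // Step 2: background relay. A row is removed only after the broker accepts it,
    // so a failed send is simply retried on the next pass.
    synchronized int relayOnce(Broker broker) {
        int sent = 0;
        for (String id : new ArrayList<>(pendingMessages.keySet())) {
            if (broker.send(id, pendingMessages.get(id))) {
                pendingMessages.remove(id);
                sent++;
            }
        }
        return sent;
    }
}

// Consumer side: the relay may deliver a message more than once,
// so the consumer deduplicates by message ID.
class IdempotentConsumer {
    final Set<String> seen = new HashSet<>();
    int credited = 0;

    void onMessage(String msgId, int amount) {
        if (seen.add(msgId)) credited += amount;
    }
}
```

The design choice worth noting is that the message is never sent inside the business transaction. The transaction only records the intent to send, and the retry loop turns that record into at-least-once delivery.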
3. Try‑Confirm‑Cancel (TCC)
TCC extends the 2PC idea to service-level transactions through Try, Confirm, and Cancel interfaces. Confirm and Cancel must be idempotent, because the TCC framework retries them continuously until they succeed.
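A rough sketch of the Try, Confirm, and Cancel flow follows. The `TccParticipant` interface, `DebitParticipant`, and `TccCoordinator` are hypothetical names, not the API of any particular TCC framework, and the bounded retry count is a simplification: a real framework would persist the transaction and keep retrying.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface TccParticipant {
    boolean tryReserve(String txId);
    void confirm(String txId);  // must be idempotent
    void cancel(String txId);   // must be idempotent
}

// Payer side: Try freezes the amount, Confirm spends it, Cancel returns it.
class DebitParticipant implements TccParticipant {
    int available;
    final int amount;
    int confirmFailuresLeft;  // simulates transient Confirm errors
    final Map<String, Integer> frozen = new HashMap<>();  // txId -> reserved amount

    DebitParticipant(int available, int amount, int confirmFailuresLeft) {
        this.available = available;
        this.amount = amount;
        this.confirmFailuresLeft = confirmFailuresLeft;
    }

    public boolean tryReserve(String txId) {
        if (frozen.containsKey(txId)) return true;  // repeated Try is a no-op
        if (available < amount) return false;
        available -= amount;
        frozen.put(txId, amount);
        return true;
    }

    public void confirm(String txId) {
        if (confirmFailuresLeft > 0) {
            confirmFailuresLeft--;
            throw new IllegalStateException("simulated transient failure");
        }
        frozen.remove(txId);  // removing an absent key is harmless, so retries are safe
    }

    public void cancel(String txId) {
        Integer reserved = frozen.remove(txId);
        if (reserved != null) available += reserved;  // a second Cancel finds nothing to release
    }
}

class TccCoordinator {
    // Returns true if every Try succeeded and all Confirms completed.
    static boolean run(String txId, List<? extends TccParticipant> parts, int maxConfirmAttempts) {
        for (TccParticipant p : parts) {
            if (!p.tryReserve(txId)) {
                for (TccParticipant q : parts) q.cancel(txId);  // Cancel is safe even where Try never ran
                return false;
            }
        }
        for (TccParticipant p : parts) confirmWithRetry(p, txId, maxConfirmAttempts);
        return true;
    }

    private static void confirmWithRetry(TccParticipant p, String txId, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                p.confirm(txId);
                return;
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) throw e;  // give up only in this sketch
            }
        }
    }
}
```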
4. Transaction Status Table + Caller Retry + Callee Idempotence
A global transaction log records each step and its status. A background task scans for incomplete transactions and retries them, while services use the transaction ID to guarantee idempotent operations.
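The interplay between the status table, the retrying caller, and the idempotent callee can be sketched as below. The names `StockService` and `TxStatusTable`, the single-step transaction, and the `loseNextReply` flag that simulates a lost response are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Callee: applies each transaction ID at most once.
class StockService {
    int stock;
    boolean loseNextReply = false;  // simulates "applied, but the caller never heard back"
    final Set<String> appliedTxIds = new HashSet<>();

    StockService(int stock) { this.stock = stock; }

    boolean deduct(String txId, int qty) {
        if (appliedTxIds.add(txId)) stock -= qty;  // a repeated txId changes nothing
        if (loseNextReply) { loseNextReply = false; return false; }
        return true;
    }
}

// Caller: records every transaction and retries the ones not yet marked DONE.
class TxStatusTable {
    enum Status { PENDING, DONE }

    final Map<String, Status> status = new LinkedHashMap<>();  // txId -> status
    final Map<String, Integer> quantity = new HashMap<>();     // txId -> qty to deduct

    void begin(String txId, int qty) {
        status.put(txId, Status.PENDING);
        quantity.put(txId, qty);
    }

    // Body of the background task; returns how many transactions are still pending.
    int scanAndRetry(StockService service) {
        int stillPending = 0;
        for (Map.Entry<String, Status> row : status.entrySet()) {
            if (row.getValue() != Status.PENDING) continue;
            if (service.deduct(row.getKey(), quantity.get(row.getKey()))) {
                row.setValue(Status.DONE);
            } else {
                stillPending++;
            }
        }
        return stillPending;
    }
}
```

Because the callee deduplicates by transaction ID, the caller can retry blindly after a lost reply without deducting stock twice.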
5. Reconciliation
Instead of enforcing strict atomicity, reconciliation compares the final results (e.g., order status, inventory, or follower/fan tables) to detect inconsistencies and perform compensating actions, using either full‑batch or incremental checks.
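For the follower/fan example, a reconciliation pass could be sketched as follows. The `Reconciler` class, the `"follower->followee"` string encoding of a row, and the choice to treat the follow table as authoritative are assumptions for this illustration only.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

class Reconciler {
    // Full-batch check: every row of `source` that `mirror` lacks.
    static Set<String> missingFrom(Set<String> source, Set<String> mirror) {
        Set<String> diff = new TreeSet<>(source);
        diff.removeAll(mirror);
        return diff;
    }

    // Incremental check: only rows changed since the last run are examined.
    static Set<String> missingAmong(Collection<String> recentlyChanged, Set<String> mirror) {
        Set<String> diff = new TreeSet<>();
        for (String row : recentlyChanged) {
            if (!mirror.contains(row)) diff.add(row);
        }
        return diff;
    }

    // Compensating action, assuming the follow table is the source of truth:
    // copy the missing rows into the fan table and report how many were repaired.
    static int repairFans(Set<String> followTable, Set<String> fanTable) {
        Set<String> missing = missingFrom(followTable, fanTable);
        fanTable.addAll(missing);
        return missing.size();
    }
}
```

The full-batch check is simpler but scans everything, while the incremental check depends on having a reliable list of recently changed rows.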
6. Weak Consistency + State‑Based Compensation
For high‑concurrency e‑commerce ordering, a weak‑consistency scheme is proposed: either deduct inventory first then create the order, or vice‑versa, accepting occasional over‑deduction but never under‑deduction, and later reconciling excess deductions via inventory‑usage logs.
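The deduct-first variant might be sketched like this. `WeakOrderFlow` and its fields are hypothetical, the `orderWriteFails` flag simulates a failed order insert, and a production job would presumably wait for a grace period before treating a logged deduction as orphaned, which this sketch omits.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class WeakOrderFlow {
    int stock;
    final Map<String, Integer> deductionLog = new LinkedHashMap<>();  // orderId -> qty deducted
    final Set<String> orders = new HashSet<>();

    WeakOrderFlow(int stock) { this.stock = stock; }

    // Deduct inventory first, then create the order. If the order write fails,
    // stock stays over-deducted until the compensation job runs.
    boolean placeOrder(String orderId, int qty, boolean orderWriteFails) {
        if (stock < qty) return false;
        stock -= qty;
        deductionLog.put(orderId, qty);
        if (orderWriteFails) return false;
        orders.add(orderId);
        return true;
    }

    // Compensation: any logged deduction without a matching order is returned to stock.
    // Returns the total quantity released.
    int releaseOrphanedDeductions() {
        int released = 0;
        Iterator<Map.Entry<String, Integer>> it = deductionLog.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (!orders.contains(entry.getKey())) {
                stock += entry.getValue();
                released += entry.getValue();
                it.remove();
            }
        }
        return released;
    }
}
```

Between the failed order and the compensation run, the system sells less than it could but never oversells, which matches the "over-deduct, never under-deduct" rule.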
7. Compromise: Retry + Rollback + Alert + Manual Fix
If retries and rollbacks still fail, the system raises alerts for human intervention, relying on complete logs to facilitate manual correction.
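The escalation order described above (retry, then roll back, then alert) can be sketched as a small helper. `CompromiseExecutor`, its `Outcome` enum, and the in-memory `alerts` list are hypothetical; a real system would send the alert to an on-call channel and keep full logs for the manual fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

class CompromiseExecutor {
    enum Outcome { COMMITTED, ROLLED_BACK, NEEDS_MANUAL_FIX }

    // Stand-in for an alerting channel; each entry is one alert raised for humans.
    final List<String> alerts = new ArrayList<>();

    Outcome run(String txId, BooleanSupplier action, BooleanSupplier rollback, int maxAttempts) {
        if (attempt(action, maxAttempts)) return Outcome.COMMITTED;
        if (attempt(rollback, maxAttempts)) return Outcome.ROLLED_BACK;
        alerts.add("tx " + txId + ": action and rollback both failed after "
                + maxAttempts + " attempts each");
        return Outcome.NEEDS_MANUAL_FIX;
    }

    // Runs op up to maxAttempts times; an exception counts as a failed attempt.
    private static boolean attempt(BooleanSupplier op, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                if (op.getAsBoolean()) return true;
            } catch (RuntimeException e) {
                // treated as a failed attempt; fall through and retry
            }
        }
        return false;
    }
}
```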
Conclusion
The chapter concludes that the seven methods—two eventual‑consistency solutions, two compromise tactics, two state‑machine‑based approaches (TCC and state‑machine + retry + idempotence), and reconciliation—cover most practical distributed transaction scenarios, with compromise and reconciliation being the easiest to implement, while TCC is the most complex.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.