Ensuring Data Consistency Across Microservices: Saga, Reconciliation, Event Sourcing, and Change Data Capture
The article explains why achieving data consistency across multiple microservices is challenging, reviews the limitations of two‑phase commit, and presents practical techniques such as the Saga pattern, reconciliation, event logs, orchestration vs. choreography, and change data capture to keep distributed systems eventually consistent.
Saga Pattern
One of the best‑known ways to handle consistency across multiple microservices is the Saga pattern: a sequence of local transactions coordinated at the application level. If a step fails, the saga runs compensating actions to undo the work of the steps that already completed.
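The idea can be sketched as follows; the step names (reserving inventory, charging a payment) and the `Saga` class are illustrative assumptions, not a standard API:

```python
# Minimal saga sketch: each step pairs an action with a compensating
# action. On failure, completed steps are compensated in reverse order.
class Saga:
    def __init__(self):
        self.steps = []  # list of (action, compensation) pairs

    def add_step(self, action, compensation):
        self.steps.append((action, compensation))

    def run(self):
        completed = []
        try:
            for action, compensation in self.steps:
                action()
                completed.append(compensation)
        except Exception:
            # Roll back by running compensations in reverse order.
            for compensation in reversed(completed):
                compensation()
            return False
        return True

log = []

def fail_payment():
    raise RuntimeError("payment failed")  # hypothetical failing step

saga = Saga()
saga.add_step(lambda: log.append("reserve_inventory"),
              lambda: log.append("release_inventory"))
saga.add_step(fail_payment,
              lambda: log.append("refund_payment"))

ok = saga.run()  # payment fails, so the inventory reservation is released
```

Note that compensations run in reverse order of the completed steps, and only for steps that actually succeeded; the failed step itself is never compensated.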
Reconciliation
If the system responsible for invoking compensating actions crashes, reconciliation is needed to locate the failed transaction and either trigger compensation or resume processing, a technique familiar from financial accounting.
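A reconciliation job boils down to comparing two records of the same transactions. The sketch below assumes the coordinator keeps a list of started transaction ids and the downstream service keeps a list of confirmed ones; the data shapes are illustrative:

```python
# Reconciliation sketch: a periodic job compares the coordinator's
# record of started transactions with downstream confirmations and
# reports stuck transactions that need compensation or a retry.
def reconcile(started, confirmed):
    """Return transaction ids that began but were never confirmed."""
    return sorted(set(started) - set(confirmed))

started_txns = ["tx-1", "tx-2", "tx-3"]
confirmed_txns = ["tx-1", "tx-3"]

stuck = reconcile(started_txns, confirmed_txns)
```

Each id returned by `reconcile` then needs a decision: trigger compensation, or resume the transaction from its last completed step.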
Event Log
Event logs provide a simple yet powerful mechanism for recovery and visibility; they can be stored as a table or collection in a coordinating service and are often used by distributed systems for write‑ahead logging.
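As a sketch of the write‑ahead idea, the coordinator can append a record before and after each transaction; on restart, entries without a matching "done" record identify work to resume or compensate. The in‑memory list and record schema here stand in for a real table or collection:

```python
# Event-log sketch: append-only records; recovery scans for
# transactions that were started but never marked done.
import json

class EventLog:
    def __init__(self):
        self._entries = []  # stand-in for a database table or collection

    def append(self, txn_id, status):
        self._entries.append(json.dumps({"txn": txn_id, "status": status}))

    def pending(self):
        started, done = set(), set()
        for raw in self._entries:
            entry = json.loads(raw)
            (done if entry["status"] == "done" else started).add(entry["txn"])
        return sorted(started - done)

log = EventLog()
log.append("tx-1", "started")
log.append("tx-1", "done")
log.append("tx-2", "started")   # crash before completion

unfinished = log.pending()
```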
Orchestration vs. Choreography
Sagas can be implemented either as orchestration, where a central coordinator drives the workflow, or as choreography, where each microservice knows only its part of the process.
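The contrast can be shown in a few lines. In orchestration, one coordinator calls each step directly; in choreography, each service subscribes to an event and reacts on its own. The service names and the toy event bus below are hypothetical:

```python
trace = []

# --- Orchestration: a central coordinator drives the workflow ---
def orchestrate(order_id):
    trace.append(f"payment:charge:{order_id}")
    trace.append(f"inventory:reserve:{order_id}")

# --- Choreography: services react to events, no central coordinator ---
handlers = {}  # event name -> list of subscriber callbacks

def subscribe(event, handler):
    handlers.setdefault(event, []).append(handler)

def publish(event, order_id):
    for handler in handlers.get(event, []):
        handler(order_id)

def on_order_placed(order_id):
    trace.append(f"payment:charge:{order_id}")
    publish("payment_charged", order_id)  # next service picks this up

subscribe("order_placed", on_order_placed)
subscribe("payment_charged",
          lambda oid: trace.append(f"inventory:reserve:{oid}"))

orchestrate("o1")
publish("order_placed", "o2")
```

Orchestration centralizes the workflow logic (easier to trace, a single point of failure); choreography spreads it across subscribers (looser coupling, but the overall flow is implicit).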
Single‑Write Event (Change Data Capture)
A simpler approach is to modify only one data source at a time. Change Data Capture (CDC) tools such as Kafka Connect or Debezium capture database changes and emit events, while some databases expose an oplog or can be polled by timestamp.
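The timestamp‑polling variant mentioned above can be sketched as follows. Log‑based tools such as Debezium read the database's change log instead, which avoids missing intermediate updates; the table shape and poller here are illustrative assumptions:

```python
# CDC-by-polling sketch: each poll fetches rows changed since the last
# high-water mark and emits them as events.
rows = [
    {"id": 1, "name": "alice", "updated_at": 100},
    {"id": 2, "name": "bob",   "updated_at": 150},
]

def poll_changes(table, since):
    """Return rows changed after `since` and the new high-water mark."""
    changed = [r for r in table if r["updated_at"] > since]
    new_mark = max((r["updated_at"] for r in changed), default=since)
    return changed, new_mark

events, mark = poll_changes(rows, since=120)  # only bob changed since 120
```

One caveat of polling: a row updated twice between polls is seen only in its final state, whereas a log‑based CDC stream captures every change.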
Event‑First Approach
When events are treated as the sole source of truth, the system follows a CQRS style: writes produce events, while reads query a separate model built from them. This approach must address concurrency control (optimistic or pessimistic) and event ordering, which can be mitigated with partitioned streams in Kafka or similar platforms.
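The ordering guarantee works because all events for the same entity are routed to the same partition, as Kafka does with a partition key, so per‑entity order survives even though partitions are consumed independently. The hashing scheme below is a simplified assumption:

```python
# Ordering sketch: route events by key so that events for one entity
# always land in the same partition and keep their relative order.
def partition_for(key, num_partitions):
    # Stable hash: the same key always maps to the same partition.
    return sum(key.encode()) % num_partitions

partitions = {i: [] for i in range(3)}

for key, payload in [("acct-1", "opened"), ("acct-2", "opened"),
                     ("acct-1", "credited"), ("acct-1", "closed")]:
    partitions[partition_for(key, 3)].append(payload)

p = partition_for("acct-1", 3)  # all acct-1 events share this partition
```

There is no ordering guarantee *across* partitions; the design only promises that events for a given key are seen in the order they were produced.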
Designing for Consistency
Service boundaries should align with transaction boundaries where possible. When they cannot, reduce the need for distributed consistency, favor event‑driven architectures, design service operations to be reversible, and define failure‑handling strategies early in the design phase.
Accepting Inconsistency
Some use cases (e.g., analytics) can tolerate occasional data loss without harming business value, allowing more relaxed consistency models.
Choosing a Solution
Try to design a system that avoids distributed consistency requirements.
Prefer modifying a single data source at a time to limit inconsistency.
Consider an event‑driven architecture, using events as the single source of truth or CDC‑generated events.
For complex scenarios, combine synchronous calls, compensation, and reconciliation as needed.
Make service operations reversible and define clear failure‑handling strategies early.
Architects Research Society
A daily treasure trove for architects, expanding your view and depth. We share enterprise, business, application, data, technology, and security architecture, discuss frameworks, planning, governance, standards, and implementation, and explore emerging styles such as microservices, event‑driven, micro‑frontend, big data, data warehousing, IoT, and AI architecture.