Distributed systems replicate data — and, increasingly, services and processes — for three fundamental reasons: reliability, performance, and scalability. A single server is a single point of failure and a performance bottleneck. Replication spreads both risk and load.
Yet replication introduces a deep problem: consistency. When data lives in multiple places, how do you keep all copies in agreement? The naive answer — propagate every update to every replica before allowing the next operation — leads to global synchronisation, which is expensive and sometimes impossible at scale. This lesson explores the entire landscape of consistency models, from the strictest to the most relaxed, and the replication techniques that implement them.
Replication is not free. Every copy you create buys you something (reliability, locality, throughput) at the cost of something else (bandwidth, staleness, coordination complexity). Consistency models are the language we use to describe and control this trade-off.
If a file system or database has been replicated, system operation may continue after one replica crashes by simply switching to one of the other replicas. Maintaining multiple copies also allows better protection against corrupted data. For instance, if we have three copies of a file and perform every read and write on each copy, a single failing write can be masked by considering as correct the value returned by at least two copies (a simple form of majority voting).
Replication for performance is critical when a distributed system needs to scale in terms of size or geographical area. When an increasing number of processes needs to access data managed by a single server, replicating the server and dividing the workload among replicas improves throughput. When scaling over a geographic area, access time decreases by placing a copy of data near the processes that use it — the perceived performance for those processes increases.
Scalability problems generally appear as performance problems. Replication and caching are widely applied as scaling techniques: placing copies of data close to the processes using them reduces access time. However, keeping copies up to date consumes network bandwidth. The question is not whether replication is good — it is whether the benefits outweigh the costs for a given workload and access pattern.
If the access-to-update ratio is very low (processes read far less often than the replica is updated), most updated versions of the local replica will never be accessed, making the network communication for those versions useless. In such cases, a different update strategy — or no replication at all — may be preferable.
Consider a concrete scenario: fetching a Web page from a remote server can take seconds. To improve performance, browsers cache a local copy of previously fetched pages. If the user requests that page again, the browser returns the local copy — excellent access time.
The problem: what if the page was modified on the server in the meantime? The cached copy is stale, yet the user may need the latest version.
Two extreme solutions exist, both unsatisfactory:
Between these extremes lies a spectrum of consistency guarantees. When is staleness acceptable? How much staleness? This is exactly the question that consistency models answer.
The replication and consistency dilemma can be stated simply:
Is the cure worse than the disease? There is no general answer valid for every distributed system — yet practical solutions exist. The key insight is that relaxing consistency constraints is often the only real solution. By relaxing the requirement that updates need to be executed as atomic operations, we avoid (instantaneous) global synchronisations and gain performance. The consequence is that replicas may not always be identical everywhere.
When and to what extent can consistency be relaxed? The answer typically depends on the access and update patterns of replicated data, as well as the purpose of data usage. You should be prepared to reason about this for any given application scenario.
This leads to a crucial design question: is a collection of copies consistent only when they are always identical? Intuitively, a collection of copies is consistent when copies are always the same — a read at any copy always returns the same result. Achieving this requires that when an update is performed on any copy, the update is propagated to all copies before the next operation takes place. That is synchronous replication, giving what is informally called tight consistency. An update is performed at all copies as a single atomic operation, or transaction. Unfortunately, implementing atomicity involving many replicas is inherently difficult when operations must also be fast — replicas need to agree on the global ordering of distributed operations, which requires a great deal of communication and runs into fundamental impossibility results.
Instead of a single notion of consistency, we define consistency models — different sorts of consistency, each fitting different application scenarios. This lets engineers explore the trade-off between costs and benefits of consistency by relaxing requirements according to the specific scenario at hand.
A consistency model is essentially a contract between the processes and the data store, ensuring the correctness of data given a set of rules that processes have to follow. What is "correct" also depends on what processes expect — which is problematic to define in the absence of a global notion of system time.
This definition shifts the focus from the replicated data to the processes using the data. We are no longer asking "are these two copies equal?" but rather "does the process see a view of the data that is acceptable for its application?" Different application needs lead to different useful notions of consistency, and therefore to different consistency models.
Consistency models fall into two broad families:
One approach to relaxing consistency is to impose limits on deviations between replicas rather than requiring exact equality. The level of consistency is defined over three independent axes:
| Deviation type | Description | Example |
|---|---|---|
| Numerical | Absolute or relative difference between replica values | Two copies of a stock price should not deviate by more than $0.01 |
| Staleness | How old a replica value is allowed to be | Weather data should be no more than 4 hours stale |
| Ordering | How many operations may appear out of order | A bulletin board allows at most 6 messages out of order |
This three-dimensional notion defines continuous consistency — consistency as a continuum rather than a binary property.
To measure deviation, we need a unit of consistency, called a conit. The nature of each data store suggests its conit: a stock price, a weather reading, a news item. The conit should be chosen carefully — it determines the granularity at which consistency is tracked.
A larger conit includes more data in one consistency unit (e.g., an entire database row rather than a single field), which means more propagation is needed to keep it consistent. A smaller conit creates less need for propagation (only the changed field needs updating) but requires more metadata to track individual units.
There is no absolute measure for deviation: the conit must be chosen depending on the resource and the problem at hand. Choosing the right granularity is a key engineering decision in any replicated system.
Sequential consistency is one of the most important data-centric models. The main idea: all update operations are seen by all processes in the same order.
A data store is sequentially consistent when the result of any execution is the same as if the read and write operations by all processes on the data store were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program.
This means there exists a global interleaving of all operations that (a) is consistent with the program order of each process, and (b) produces the same results as the actual execution. Crucially, sequential consistency does not require that the global order respects real time — two operations that are far apart in real time could appear in either order in the sequential sequence, as long as all processes see the same order.
Consider two processes P1 and P2, both writing to variables x and y:
P1: write(x, 1); read(y)
P2: write(y, 1); read(x)
Under sequential consistency, it is acceptable for both reads to return 0 (the initial value), provided all processes see the writes in the same order — e.g., P1's write before P2's write, or vice versa. What is not allowed is for P1 to see x=1,y=0 while P2 sees x=0,y=1, because no single global order could produce both observations simultaneously.
Causal consistency weakens sequential consistency by ordering only operations that are in a cause–effect relation. Operations that are unrelated (concurrent) may be seen in different orders by different processes.
A data store is causally consistent when all processes see write operations that are in a cause–effect relation in the same order. Writes that are causally unrelated (concurrent) may be seen in different orders by different processes.
The key notion is the causal precedence relation (happens-before). An operation A causally precedes operation B if: (i) A and B are on the same process and A comes before B; (ii) A is a write and B is a read that returns the value written by A; or (iii) there exists a chain of causal relationships from A to B.
Suppose P1 writes x=1, then P2 reads x and writes y=2. Since the read of x causally depends on P1's write, and P2's write of y causally follows that read, all processes must see P1's write before P2's write. However, two independent writes (by different processes to different variables) could appear in any order.
Can you construct an execution that is sequentially consistent but not causally consistent? What about one that is causally consistent but not sequentially consistent? Understanding the difference — and the exact role of concurrent operations — is a common exam topic.
Eventual consistency is a client-centric model designed for large-scale distributed data stores with infrequent update conflicts. The typical scenario: a single authority performs updates, while many processes only read. The only conflict is read–write: one process updates a data item while another concurrently reads it.
A data store is eventually consistent when, if no updates take place for a sufficiently long period, all replicas will gradually become consistent. Between updates, replicas may diverge and stale values may be returned to readers.
Eventual consistency is the weakest meaningful consistency guarantee for a replicated store. It is often the default in systems that prioritise availability and partition tolerance under the CAP theorem. The practical question is always: how long until eventual convergence, and what anomalies can occur in the meantime?
This is where the client-centric models come in — they refine eventual consistency by adding specific guarantees for individual clients.
A data store provides monotonic-read consistency if the following condition holds: if a process reads the value of a data item x, any successive read operation on x by the same process will always return that same value or a more recent value.
In other words, once a client has seen a particular version of data, it will never "go back in time" to an older version on subsequent reads — even if those reads hit different replicas that are not yet fully updated.
A user checks their e-mail from different devices. On the phone, they see an inbox with 10 messages (including a new message just delivered). Later, they open the laptop, which reads from a different replica. Monotonic reads guarantees that the laptop will show at least those 10 messages — possibly more if new e-mails arrived, but never fewer. Without this guarantee, the laptop might temporarily show only 9 messages if its replica has not yet received the latest update.
This is enforced by ensuring that when a client moves to a new replica, that replica is at least as up-to-date as the previous one the client was using.
A data store provides monotonic-write consistency if the following condition holds: a write operation by a process on a data item x is completed before any successive write operation on x by the same process.
The idea is that the order of updates issued by a single process is maintained across distributed replicas. If a process writes x=1 and then x=2, no other process should ever see x=2 without first seeing x=1 — even if the two writes go to different replicas.
A developer pushes two commits to a distributed version control system: commit A adds a new function, and commit B uses it. Monotonic writes ensures that commit B is never applied to a repository before commit A. Without this guarantee, a developer checking out the latest version might see a build that references a function that does not yet exist.
A data store provides read-your-writes consistency if the following condition holds: the effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process.
This is perhaps the most intuitive client-centric guarantee. It avoids the "web page failed update" effect: a user updates their profile picture, refreshes the page, and sees the old picture because the read hit a replica that had not yet received the update.
A user changes their password on a website. The next request must authenticate with the new password. Read-your-writes guarantees that the authentication request — the read following the write — will use the updated password, even if the read is served by a different replica than the one that processed the password change.
A data store provides writes-follow-reads consistency if the following condition holds: a write operation by a process on data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x that was read.
Essentially, writes are performed on a version that is at least as recent as the version the client previously read. This prevents the lost-update anomaly where a client reads a value, makes a decision based on it, writes a new value, but the write is applied to an older version — overwriting someone else's intermediate update.
A user reads a post and then writes a comment. Writes-follow-reads ensures that the comment is recorded against the post version the user actually saw — not against a stale or deleted version. This prevents the confusing situation where a user's reply references content that no longer exists because it was applied to an older snapshot.
Supporting replication in a distributed system means deciding where, when, and by whom replicas should be placed, and which mechanisms should be adopted to keep them consistent. This breaks into two subproblems: placing replica servers (deciding the physical or logical locations) and placing content (deciding which data goes where). These are not the same problem.
Replication does not mean only replicating data — services can be replicated as well, for the same reasons as data stores. This means replicating functions, which may or may not operate on the same underlying data store. Service replication introduces two layers, each with its own consistency and replication model.
In a distributed mobile setting, processes can also be replicated. This may require cloning mechanisms, but also higher-level approaches such as goal-passing (used in agent-oriented programming).
The general field of replication is vast. Dedicated books exist on the topic: "Replication: Theory and Practice" (Charron-Bost et al., 2010) and "Database Replication" (Kemme et al., 2010) are canonical references. The goal of this course is to equip you with the foundational understanding to read those books proficiently.
A widely used approach to balance consistency and performance is quorum-based replication. The idea: instead of requiring all N replicas to agree on every operation, we require only a subset. A write requires acknowledgment from W replicas (the write quorum), and a read requires responses from R replicas (the read quorum).
The key condition for strong consistency is R + W > N. When this holds, any read quorum and any write quorum are guaranteed to overlap in at least one replica. That overlapping replica will have the most recent write, ensuring the read returns the latest value. If R + W ≤ N, reads and writes may not overlap, yielding only eventual consistency.
Why does R + W > N guarantee that a read returns the latest written value? What happens in the edge case where W > N/2 but R is small?
Explore the parameter space yourself with the interactive quorum simulator below, then step through a concrete execution with the stepper.
Adjust the sliders to explore how R + W > N determines consistency.
7 > N = 5Consistency models form a spectrum from the weakest (no guarantees) to the strongest (linearisability). The state explorer below lets you navigate this spectrum, showing how each model relates to the others and what guarantees it provides. Use the buttons to move between models and see the definition of each.
| Consistency type | Key mechanism / protocol | Typical systems / tech |
|---|---|---|
| Strong (Linearizable) | Raft, Paxos | etcd, Spanner, CockroachDB |
| Sequential | Consensus-based replication | ZooKeeper, Chubby |
| Causal | Vector clocks, CRDTs | AntidoteDB, Orleans, Akka |
| Eventual | Gossip, Merkle trees | DynamoDB, Cassandra, Riak |
| Tunable | Quorums (R/W) | Cassandra, Dynamo-style DBs |
| Transactional | Consensus + TrueTime | Spanner, CockroachDB |
| Log-based | ISR, CDC streams | Kafka, Debezium, Aurora |
Replication is useful in distributed systems. Replication requires consistency. Consistency is not an absolute notion — different application needs mandate different models of consistency. We talk about data, yet nowadays we replicate whatever resource we need: data, services, processes — and soon, mobile agents with cloning and goal-passing.
Data-centric models focus on the state of the data across all replicas and define constraints on how operations are ordered globally — e.g., sequential consistency requires all processes to see writes in the same order. Client-centric models focus on the guarantees provided to an individual client as it moves between replicas — e.g., monotonic reads ensure that a client never sees an older version after having seen a newer one.
If a write confirms on W replicas, and a later read queries R replicas, the pigeonhole principle ensures that when R + W > N, at least one replica is in both sets. That replica holds the latest write timestamp, so the read always sees the newest value. If R + W ≤ N, the read and write sets may be disjoint, and the read could return a stale value.
Consider two processes P1 and P2 with concurrent writes to unrelated variables x and y: P1 writes x=1; P2 writes y=1. Under causal consistency (no causal relationship between the writes), different processes may see the writes in different orders (x=1 before y=1, or vice versa). Under sequential consistency, all processes must agree on a single total order. A system like a news feed that does not require total ordering across unrelated updates could safely use causal consistency and gain better performance.
A conit (consistency unit) is the unit of data over which consistency is measured — e.g., a stock price, a database field, or an entire row. A larger conit includes more data, so more propagation is needed when any part of it changes. A smaller conit reduces propagation costs (only changed sub-units are propagated) but increases metadata overhead.
Browsers cache Web pages for performance, but the cached copy may become stale. Forbidding caching eliminates staleness but hurts performance. Server-managed invalidation (the server notifies all caches when a page changes) keeps caches fresh but does not scale — the server must track every cache. Relaxed consistency models offer a middle ground: bounded staleness, conditional caching, or client-visible freshness guarantees.
Essential: password change — after updating the password, the next login must use the new value; reading a stale password from an out-of-date replica would lock the user out. Not important: reading weather data — if a user reads a cached temperature of 22 degrees and the server updates to 23, the user is unlikely to notice or care about the stale value.
Global synchronisation requires all replicas to reach agreement on the exact ordering of operations and the moment to apply updates locally. This demands extensive communication: replicas must exchange messages to agree on a global ordering, coordinate commit points, and handle failures. The FLP impossibility result shows that in an asynchronous system, consensus (and thus global synchronisation) cannot be guaranteed in bounded time even with a single faulty process.
Monotonic reads: once a client reads value V of x, future reads return V or a newer value. Read-your-writes: after a client writes x, a subsequent read returns that write or a newer value. Yes, a system can provide one without the other. For example, a system with read-your-writes but not monotonic reads could let a client read its own write, then later read a stale value from a different replica — violating monotonic reads while preserving read-your-writes.