DS-C5 — Logical Clocks

1. Actions, Events, and Causes

Any computational system can be modelled as a sequence of executed actions. An action is defined by a change in the state of the system — or, more generally, in the world: reading from memory, writing to a file, switching on a lamp. In a distributed system, actions execute in multiple locations, and in this context they are often represented as events: sending or receiving messages, changing a record in a database.

The physical distribution of events can vary hugely: events may occur on different processes running on the same machine, or be geographically spread across the globe. The key property of a distributed system is that no process has immediate access to the state of another — information spreads only through communication.

Key idea

Events are the observable units of a distributed computation. Understanding how they relate to one another — especially which events could have caused which — is essential to designing correct distributed algorithms.

Not all events are related, but some events can cause or influence how other events occur. For example, a reply to a received email is influenced by that message — and possibly by other prior messages received. Detecting potential cause-and-effect relations between events is fundamental to the design of distributed algorithms: maintaining consistency in replicated databases, performing garbage collection, building checkpoints, and analysing concurrency.

2. Internal vs External Causality

To make sense of cause-and-effect relations within a distributed system, we should limit their scope to what can be perceived inside the system itself — this is internal causality. Processes observe events such as message sends and receives, and from these observations they can deduce causal links.

However, a distributed system also interacts with the outside physical world, where other cause-and-effect relations exist at large — external causality. Consider Alice reserving a table for dinner for two and then telling Bob; Bob then reserves two movie tickets for after dinner. Even if the restaurant and cinema booking systems are part of the same distributed system, they know nothing of the external causal relation ("Alice told Bob"). The two reservation systems see concurrent events even though a human observer knows they are causally linked.

Editor's note

External causality cannot be detected by a distributed system without domain-specific knowledge. The only general approximation available is physical time — if event A happened at 7 PM and event B at 8 PM, A might have caused B — but this is heuristic at best.

3. The Happens-Before Relation

In 1978, Leslie Lamport introduced the happens-before relation, denoted with the arrow operator a → b, meaning "a happens before b" — all processes agree that a occurs first, then b occurs. This relation is the foundation of all logical clock theory.

The relation is directly observable in two situations:

Same process: If a and b are events of the same process, and a comes before b, then a → b. Local events are ordered by local sequential execution.
Message propagation: If a message is sent by an event a (send event) and received by another event b (receive event), then a → b. A message always takes a finite, positive, non-zero amount of time to propagate.

The relation is transitive: if a → b and b → c, then a → c. This gives us a partial order over events in the system.

Important

When neither a → b nor b → a can be established, the events are said to be concurrent, written a ∥ b. Concurrency does not mean the events happened at the same physical time; it means we have no way to determine a causal ordering between them.

sequenceDiagram
    participant P1 as Process P1 (Alice)
    participant P2 as Process P2 (Bob)
    participant P3 as Process P3 (Chris)
    Note over P1: a1 → Check fridge
    P1->>P2: a2: "Dinner?" (ts=C(a2))
    Note over P2: b1 → Receive msg
    P2->>P1: b2: "Yes, let's do it" (ts=C(b2))
    Note over P3: c1 → "Bored..."
    P3->>P1: c2: "Can I join?" (ts=C(c2))
    Note over P1: a3 → Receive replies

In this diagram, the arrows represent message sends. The happens-before relation includes: a1 → a2 → b1 → b2 → a3, and also c1 → c2 → a3. But c1 is concurrent with a1 and a2 — no message connects Chris's boredom to Alice's initial check.

4. Causal Paths and Causal Histories

In a distributed computation, a simple way to check whether an event c could have caused another event e is to find at least one directed path from c to e following the happens-before relation. If such a path exists, then c → e and c is a possible cause for e.

The examiner will ask

Given a space-time diagram of a distributed execution, identify which events are causally related and which are concurrent. For any pair of events, determine whether there exists a causal path connecting them.

Causal Histories

A causal history is a richer representation: each process keeps track of the set of all events it knows about. When a message is sent, the sender includes its causal history; the receiver merges the received history with its own. The causal history of an event is the set of all events that happened before it.

For the scenario above:

After a1: P1's causal history = {a1}
After a2: P1's history = {a1, a2}
After b1 (receiving a2): P2's history = {b1, a1, a2}
After b2: P2's history = {b1, b2, a1, a2}
After a3 (receiving b2): P1's history = {a1, a2, a3, b1, b2}

Causal histories grow without bound, which makes them impractical as a data structure. However, the idea of tracking what each process knows about every other process directly motivates vector clocks.

6. Logical Clocks: Definition and Consistency

Let H be the domain of events and T the domain of (logical) time values. A logical clock is a function C: H → T assigning a time value to each event, such that all processes agree on the value assigned to each event.

The fundamental property that every logical clock must satisfy is the clock consistency condition (monotonicity):

∀ a, b ∈ H, a → b ⇒ C(a) < C(b)

This ensures that if a causally precedes b, the logical clock reflects this by giving a a smaller timestamp. However, the converse does not necessarily hold: C(a) < C(b) does not imply a → b, because unrelated events can still receive ordered timestamps.

If the system additionally satisfies:

∀ a, b ∈ H, a → b ⇔ C(a) < C(b)

...then the clock system is said to be strongly consistent. Scalar (Lamport) clocks are not strongly consistent; vector clocks are.

8. Lamport's Algorithm: Scalar Time

Lamport's logical clock algorithm is the simplest form of logical time. Each process p_i maintains a single integer lc_i (its local scalar clock) and a local increment δ_i (typically 1).

The algorithm's beauty is its minimalism: a single integer per process, with three straightforward rules. Despite its simplicity, it guarantees the clock consistency condition.

Because lc_i acts as both the local clock and the global timestamp, every event at p_i gets C(a) = lc_i(a). This merging of local and global time is what makes the system efficient but also what limits it — as we will see.

Key idea

The max operation on receive is what propagates causal information. When P2 receives a message with timestamp 3 but its own clock is only at 2, the max operation pulls the clock forward to 3, acknowledging that the sender had seen more events.

11. The Limits of Scalar Time

Although scalar clocks satisfy a → b ⇒ C(a) < C(b), the converse is false:

C(a) < C(b) ✘⇒ a → b

Scalar clocks are not strongly consistent. This is because each process squashes its local logical clock and its view of the global clock into a single integer. When process p_2 receives a message from p_1 with timestamp 2, it updates its clock to max(lc2, 2) = 2 — but it "forgets" that this value of 2 came from p_1's event, not from its own execution. Causal dependency information among events at different processes is lost.

Example

Consider three processes. p_1's third event has a scalar timestamp of 3, while p_3's fourth event has a timestamp of 4. Despite 3 < 4, the events may be concurrent — no message path connects them. The scalar clock gives the illusion of ordering where none exists causally.

This limitation motivates the need for a richer structure: vector clocks.

12. Vector Clocks: Motivation and Definition

In essence, with scalar clocks a → b implies C(a) < C(b), but C(a) < C(b) does not imply a → b. This means time values can be totally ordered when events are not — comparing scalar timestamps of concurrent events is meaningless.

What we need is a clock system that can answer two questions for any pair of events a, b:

Does a causally precede b?
Are a and b concurrent?

A vector clock for a system of n processes is an array (vector) of n logical clocks, one for each process. Each process p_i maintains a vector vc_i where:

vc_i[i] is the number of events that have occurred so far at p_i — essentially p_i's own scalar clock.
vc_i[j] = k means that p_i knows that at least k events have occurred at process p_j.

Editor's note

The hypothesis is that the number of processes n is known and fixed. We could use smaller vectors for subsets of processes, but all correctness properties are proven when every process maintains an n-element vector.

13. Vector Clock Rules

The protocol for managing vector clocks follows the same structure as scalar clocks but operates on entire vectors:

The critical difference from scalar clocks is on receive: the element-wise max operation merges the sender's entire knowledge into the receiver's vector. This preserves causal dependency information: after receiving, p_j knows not just the sender's latest event count, but also what the sender knew about all other processes.

Evolution of Vector Time

sequenceDiagram
    participant P1 as p₁
    participant P2 as p₂
    participant P3 as p₃
    Note over P1: vc₁=[1,0,0]
    Note over P1: vc₁=[2,0,0]
    P1->>P2: msg with ts=[2,0,0]
    Note over P2: vc₂=[2,1,0] after receive
    Note over P2: vc₂=[2,2,0]
    P2->>P3: msg with ts=[2,2,0]
    Note over P3: vc₃=[2,2,1] after receive

Notice how the vector preserves the chain: p_3 ends up knowing that p_1 has had 2 events and p_2 has had 2 events. With a scalar clock, p_3 would have a single number (say, 5) and could not distinguish which process contributed which events.

14. Vector Clock Ordering and Strong Consistency

Vector timestamps are compared component-wise. For two vector timestamps vh and vk:

vh < vk ⇔ (∀i, vh_i ≤ vk_i) ∧ (∃i', vh_{i'} < vk_{i'})

This ordering is antisymmetric and transitive. But most importantly, it captures causality exactly:

vc(m1) < vc(m2) ⇔ m1 → m2

Vector clocks provide strong consistency — the bidirectional implication that scalar clocks lack. If two vector timestamps are incomparable (neither < nor > nor =), the corresponding events are concurrent.

The examiner will ask

Given two vector timestamps, determine the causal relationship between their events. For example, [2,1,0] and [2,2,0]: is the first event earlier, later, or concurrent with the second?

Property	Scalar (Lamport)	Vector
Data per process	1 integer	`n` integers
Message overhead	1 integer	`n` integers
Consistency	Weak: a→b ⇒ C(a)<C(b)	Strong: a→b ⇔ vc(a)<vc(b)
Concurrency detection	Not possible	Incomparable ⇒ concurrent

Check Your Understanding

Explain the happens-before relation. What are the two ways to directly observe a → b?

The happens-before relation (a → b) means all processes agree that event a occurs before event b. It is directly observed in two situations: (1) a and b are events in the same process, and a comes before b in that process's sequential execution; (2) a is the sending of a message and b is the reception of that same message, since messages take finite positive time to propagate. The relation is transitive: a → b and b → c implies a → c.

What does it mean for two events to be concurrent (a ∥ b) in the context of happens-before?

Two events a and b are concurrent (a ∥ b) when neither a → b nor b → a can be established. This does NOT mean they happened at the same physical time. It means that there is no causal path connecting them in either direction: no chain of same-process ordering and message propagation links one to the other. They are causally independent.

State the clock consistency condition. Why is it called a "weak" condition?

The clock consistency condition is: a → b ⇒ C(a) < C(b). It is called "weak" because the implication goes only one way. If C(a) < C(b), we cannot conclude that a → b — the timestamps may be ordered while the events are actually concurrent. This is the case with scalar (Lamport) clocks, which are weakly consistent but not strongly consistent.

Describe the three rules of Lamport's logical clock algorithm.

(1) Before executing any local event (internal, send, or receive), process p_i increments its local clock lc_i by δ_i (typically 1). (2) When sending a message m, p_i piggybacks the current value of lc_i on the message as its timestamp. (3) Upon receiving a message m with timestamp ts, p_j adjusts its clock: lc_j ← max(lc_j, ts). This ensures the clock always moves forward to reflect observed causal information.

Why are scalar clocks NOT strongly consistent? Give a concrete example.

Scalar clocks merge the local clock and the view of global time into a single integer. When p₂ receives a message from p₁ with timestamp 3, it sets lc₂ = max(lc₂, 3) = 3, but "forgets" that this 3 represents p₁'s events, not p₂'s own. Later, p₁'s fourth event might get timestamp 4 and p₂'s third event also gets 4. Even if 4 = 4, the events might be concurrent — we cannot recover the causal relationship. More generally, C(a) < C(b) does not imply a → b because unrelated events can receive ordered timestamps.

How does a vector clock fix the strong consistency problem?

A vector clock maintains one counter per process (n counters for n processes). Instead of squashing all information into a single integer, the receive operation applies element-wise max: ∀k, vc_j[k] ← max(vc_j[k], ts(m)[k]). This preserves what p_j knows about every process after incorporating the sender's knowledge. The ordering vc(a) < vc(b) is defined component-wise and holds if and only if a → b, giving strong consistency. If two vectors are incomparable, the events are concurrent.

Prove that if a → b, then for vector clocks, vc(a) < vc(b).

By induction on the definition of a → b. Base cases: (1) Same process: a occurs before b at p_i. Each event increments vc_i[i], so vc(a)_i < vc(b)_i and all other components are ≥ (they may have been updated in between). Thus vc(a) < vc(b). (2) Message send/receive: a sends a message with timestamp ts = vc(a). Upon receive at b, p_j sets vc(b)[k] ← max(vc(b)[k], ts[k]) for all k, and also increments its own counter. So vc(b) ≥ ts = vc(a) element-wise, and at least one component (p_j's own) strictly increased. Thus vc(a) < vc(b). Transitive closure follows from transitivity of the < relation on vectors.

What is the practical implication of the trade-off between scalar and vector clocks?

Scalar clocks use O(1) space per process and O(1) message overhead but cannot detect concurrency. Vector clocks use O(n) space and O(n) message overhead but can detect concurrency and provide strong consistency. In systems with many processes, the O(n) overhead of vector clocks can be significant, so some practical systems use scalar clocks with additional mechanisms (like version vectors in CRDTs) or hybrid approaches.

Explain the concept of event height with scalar clocks when δ = 1.

When the increment δ is always 1, the scalar timestamp h of an event a has the property that h − 1 represents the minimum logical duration in terms of number of events — the number of events that causally precede a along the longest causal path ending at a. This is called the height of the event. For example, if an event has timestamp 5, at least 4 other events must have occurred sequentially before it, regardless of which processes produced them.

How could you use total ordering of scalar timestamps for liveness in a distributed algorithm?

Many distributed algorithms (e.g., distributed mutual exclusion, totally ordered multicast) rely on processing events in a consistent global order. Scalar timestamps with tie-breaking by process ID provide a total order. Since concurrent events are causally independent, ordering them arbitrarily does not violate causality. This total order ensures liveness: processes can always decide which event to handle next, even when events occur concurrently at different processes.

Logical Clocks

In this lesson

1. Actions, Events, and Causes

2. Internal vs External Causality

3. The Happens-Before Relation

4. Causal Paths and Causal Histories

Causal Histories

5. Logical Time: From Causality to Clocks

6. Logical Clocks: Definition and Consistency

7. Implementing Logical Clocks

8. Lamport's Algorithm: Scalar Time

9. Lamport Clock Interactive Simulator

10. Total Ordering via Scalar Clocks

11. The Limits of Scalar Time

Example

12. Vector Clocks: Motivation and Definition

13. Vector Clock Rules

Evolution of Vector Time

14. Vector Clock Ordering and Strong Consistency

15. Interactive Vector Clock Explorer

P1

P2

P3

16. Matrix Time

17. Summary

Check Your Understanding