DS-M2: Roots of Distributed Systems — Computation in Space & Time

1. Prologue — Why Foundations Matter

The study of distributed systems is built upon impossibility results — theorems that tell us what cannot be done, no matter how clever our engineering. Brewer's CAP theorem, the FLP impossibility, the consensus hierarchy — these are the landmarks that define the field.

Key insight

Impossibility results are theorems. They are essential because they delimit the space of admissible moves. But a theorem requires a well-defined ontology: a shared, unambiguous notation, a common definition of what a proof is, and a precise understanding of the objects the theorem talks about.

Do we have that in computer science? Brewer stated his CAP theorem in 2000 without a formal proof; a first proof came two years later from Gilbert and Lynch, under specific assumptions. Surveying the literature of impossibility results for distributed computing, we find heterogeneous notations, diverse ontologies, and different notions of proof — nothing like the uniform foundation of modern mathematics.

The examiner will ask

Why do impossibility results matter more to a field than possibility results? How do they define the boundaries of engineering?

This lesson therefore takes a step back. Before we can study distributed computation, we need to understand computation itself. And before that, we need to ask: what kind of science is computer science? What is its object of study? Only with a clear ontology can we give crisp definitions of parallel, concurrent, and distributed systems.

2. What is Science? Phenomena & Noumena

Foundational Questions

As computer scientists and engineers, we are supposed to ground our technical knowledge upon solid foundations. Yet the basic questions — what is science? what is a machine? what is a computer? what is a system? what is computation? — require deep thinking.

The Science Council defines science as "the pursuit and application of knowledge and understanding of the natural and social world following a systematic methodology based on evidence." This is a good starting point, but for our purposes we need a sharper philosophical distinction.

Phenomena vs. Noumena

From Kant's philosophy: a phenomenon is any object, fact, or occurrence perceived or observed — the thing as it appears to an observer. A noumenon is the thing-in-itself, as opposed to the phenomenon.

Science works in both directions:

We observe phenomena — we record, track, and measure them. We call them facts.
We seek the noumena that give reasons to phenomena — the models that explain the observed facts.

Core idea

There may exist many different models of a given set of facts. They are not equivalent. We typically prefer models with fewer assumptions and more facts explained — more expressive scientific models.

Concept	Definition	Role in science
Phenomenon	An observed fact or occurrence	The raw data science explains
Noumenon	The thing-in-itself, the underlying model	The explanatory theory
Explanation	Connecting phenomena to noumena	Making sense of observations
Prediction	Using noumena to anticipate phenomena	Validating the model

3. Explanation & Prediction

A working scientific model does not just explain observations — it also tells us what happens next. It predicts not-yet-observed phenomena.

Classic examples:

The gravitational redshift was predicted by Einstein's general theory of relativity before it was measured.
The Higgs boson was predicted in 1964 and finally observed in 2012 at CERN.

Editor's note

The predictive power of a scientific theory is an essential part of what makes it a scientific one. A model that only post-hoc explains but never predicts is closer to a narrative than a science.

For computer science, this raises a fundamental challenge. Do our models of computation predict the behaviour of distributed systems? Can we explain phenomena like the CAP theorem's trade-off between consistency and availability as a necessary consequence of a deeper model — much like Einstein explained gravity as a consequence of spacetime curvature?

This is what a mature science of distributed systems should aim for.

4. What is Computer Science?

The classic answer comes from Newell, Perlis, and Simon (1967):

Definition

"There are computers. Ergo, computer science is the study of computers." The phenomena surrounding computers are varied, complex, rich. Wherever there are phenomena, there can be a science to describe and explain them.

But what is a "computer"? The term is not well-defined, and its meaning changes with new developments. A science may bear a shifting object of study — astronomy did not originally include interstellar gases, physics did not include radioactivity, psychology did not include animal behaviour.

What Does a Computer Do?

Whatever a computer is, what it does is compute. A computer is a machine that computes. Computer systems (or computational systems) are systems made of computers. Computers and computational systems produce the evidence, the facts, the phenomena studied by computer science.

The examiner will ask

What is the noumenon at the core of computer science? (Answer: computation is the core object of study — the essence of what computation is represents the noumenon at the heart of the discipline.)

The definitions of computation and computer science go hand in hand. Over time, the definition of computer science has been a moving target, reflecting increasingly sophisticated understandings of computation.

5. The Evolution of a Discipline

Denning (2008, 2010) traces the progression of definitions for computer science across decades:

1940s — Study of automatic computing. The first electronic computers were built to automate numerical calculations for ballistics, cryptography, and science.
1950s — Study of information processing. Computers were seen as machines that process information, not just compute numbers.
1960s — Study of phenomena surrounding computers. Newell, Perlis, and Simon's definition broadened the field beyond the machine itself.
1970s — Study of what can be automated. The rise of AI, algorithm analysis, and the limits of computation.
1980s — Study of computation. A more abstract view: computation itself is the object, independent of the machine.
2000s — Study of information processes, both natural and artificial. Computation appears in biology, physics, social systems — not just in electronic devices.

Editor's note

This widening of scope is crucial for distributed systems. If computation can be natural, it can occur across space and time without a single clock or shared memory — precisely the conditions of distributed computing.

6. What is Computation?

The question "What is computation?" has been called "considered harmful" (Freeman, 2011) — a rigid standard adopted too early impedes progress. Yet it is the basic question around which all computer science revolves, and any well-founded study of distributed systems must be grounded on it.

Computation as Symbol Manipulation

Conery (2010) defines computation as symbol manipulation: a computation is a sequence of simple, well-defined steps that lead to the solution of a problem. The problem and its solution must be encoded in symbols; each step transforms one set of symbols into a new set.

Computation as Process

Frailey (2010) argues that computation can be found in any form of process. Every process is also a computation. A process is defined (Oxford) as "a series of actions or steps taken in order to achieve a particular end" or "a natural series of changes." In operating systems, it is "an instance of a program being executed in a multitasking operating system."

Key insight

Defining computation as process liberates it from the machine. A computation is not what a computer does — it is what a process is. This abstraction is essential for distributed systems, where processes span multiple machines, networks, and time zones.

Five Points from Denning (2011)

Claim	Implication
The computational model matters	Different models capture different phenomena
Many important computations are natural	Computation is not limited to electronic devices
Many important computations are non-terminating	Servers, operating systems, protocols run indefinitely
Many important computations are continuous	Not all computation is discrete step-by-step
Computational thinking can be defined	The intellectual framework is teachable and transferable

7. Computational Models & Machines

Computing Machines

Two canonical models dominate our understanding:

Turing's computing machine (1937): an abstract device manipulating symbols on an infinite tape according to a finite set of rules. The foundation of computability theory.
von Neumann's computing machine (Burks, Goldstine, von Neumann, 1982): a stored-program architecture with a central processing unit, memory, and I/O — the blueprint for nearly all real computers.

Critical question

These models are artificial — do they work when we include natural computations? They are discrete — do they work when we include continuous computations? They represent a single computing device — do they work when we deal with computational systems (multiple devices)?

What is a Machine?

Britannica defines a machine as a device, having a unique purpose, that augments or replaces human or animal effort for the accomplishment of physical tasks. All machines have:

Input: what affects a machine from the outside
Output: how a machine affects the outside
State at time t: whatever is necessary to understand the evolution of a machine after t given some input — or, more generally, given the context where the machine operates

What is a Computing Machine?

A computing machine is a different sort of machine: its task is cognitive instead of physical; its input and output are basically information in some form. If we choose not to stick with one specific computational model, it is appropriate to abstract away from the machinery and focus instead on the computational process.

8. Anatomy of a Computational Process

We now build the basic ontology for distributed systems. Our starting assumption: the elementary computational process is sequential. It represents the phenomenal expression of the dynamics of a computing machine, and has both input/output and context.

A computing machine is the place where a computational process occurs. Visually, we represent a sequential computation as happening inside a closed shape:

flowchart LR
  subgraph Process["Computational Process"]
    direction LR
    P["• Sequential steps
    • Symbol manipulation
    • State transitions"]
  end

Computational Process with Input and Output

The process receives information from outside (input) and sends information to the outside (output). This is the basic I/O model:

flowchart LR
  Input["Input"] --> Process["Process"]
  Process --> Output["Output"]

Computational Process with Context

The process operates within a context that provides the environment for its execution:

flowchart LR
  subgraph Context["Context"]
    direction LR
    C1["Resources, Time, Space, Machine"]
  end
  subgraph Process["Computational Process"]
    P["Sequential computation"]
  end
  Context -.-> Process

The examiner will ask

Why is the assumption of sequential computation the starting point for our ontology? What happens when we relax it?

9. Context for Computation

What is context when computation is concerned? Four dimensions matter:

Dimension	Role	Example
Computing machine	The device that executes the process	CPU, virtual machine, container
Resources	Memory, bandwidth, energy, files	RAM, disk space, network throughput
Time	When computation occurs; ordering of events	Clock rate, scheduling, synchrony
Space	Where computation occurs; physical location	Data center, network node, core

We need to represent context whenever the machine, resources, time, space — or all of them — are relevant to model, represent, or understand computation. In other words, whenever understanding the dynamics of the computational process depends on the environment it lives in.

Key idea

For most of computing history, we have abstracted away from context. We assumed a single machine with abundant resources, a single clock, and no need to know where computation happens. Distributed computing is precisely the subfield where context cannot be abstracted away.

10. Sorts of Computations

Depending on which contextual dimension is essential, we classify computations as:

Type	Essential dimension	Definition
Timed computation	Time	The time of the computing machine is relevant or essential for the computing process
Spatial computation	Space	The spatial features of the computing machine are relevant or essential for the computing process
Situated computation (Suchman, 1987)	Environment	The environment of the computing machine is relevant or essential — any meaningful combination of temporal and spatial features with the resources required by the computation

Editor's note

The notion of situated computation originates in human-computer interaction (Suchman's "Plans and Situated Actions"), but is profoundly relevant here. A distributed computation is the quintessential situated computation: its meaning and correctness depend on where and when each part executes.

Graphically, understanding a computational process requires the precise definition of its computational context. The process (blue area) depends on how we define the features of the context (grey area). This dependency is what we explore next.

12. Context for Computational Systems

When we have multiple processes, how should we represent the context? This is the critical design question that will lead us to the definitions of parallel, concurrent, and distributed systems.

Three possible approaches exist:

A different context for each process

Each computational process has its own independent context. This means each process has its own machine, its own resources, its own sense of time, its own spatial location. This is the most general model — it encompasses all distributed systems where processes are truly independent.

flowchart LR
  subgraph C1["Context 1 (T,S,R)"]
    P1["Process 1"]
  end
  subgraph C2["Context 2 (T',S',R')"]
    P2["Process 2"]
  end
  P1 <--> P2

One context for all processes

A single shared context for the entire computational system. All processes share the same machine, the same clock, the same spatial location, the same resources. This is the model of a single computer running multiple threads — parallelism.

flowchart LR
  subgraph C["Shared Context (T,S,R)"]
    P1["Process 1"]
    P2["Process 2"]
    P3["Process 3"]
  end

A different context for each process plus one context for all processes

A hybrid model where each process has its own individual context, but there is also an overarching context shared by all processes. This captures systems where processes are on different machines (separate spatial contexts) but share a common time reference (e.g., synchronized clocks) or common resources.

flowchart LR
  subgraph Shared["Shared Context (e.g., global time)"]
    subgraph C1["Context 1 (S, R)"]
      P1["Process 1"]
    end
    subgraph C2["Context 2 (S', R')"]
      P2["Process 2"]
    end
  end
  P1 <--> P2

Core insight

The choice of the sort of context defines the sort of computational system. Temporal context determines parallelism vs. concurrency. Spatial context determines distribution.

13. Parallel vs. Concurrent Systems

In the literature, the terms parallel and concurrent are often used interchangeably, but they represent fundamentally different relationships between events.

Parallel Computing

Typically refers to non-sequential computing where more than one computation can be performed at the same time (Shonkwiler & Lefton, 2006). Requires multi-core architectures. Used for computationally-intensive scientific/mathematical problems.

Concurrency

Used with a twofold acceptation (Degano & Montanari, 1987):

Interleaving: events occurring in separate concurrent processes could occur in any relative order — temporal/causal relations between events are not relevant.
True concurrency: partial orderings are used to explicitly capture temporal/causal relations between events.

Relative Ordering of Events is the Key

System type	Event ordering	Temporal context
Parallel	Totally ordered	Same for all processes
Concurrent	At most partially ordered	Different across processes

Critical distinction

In a parallel system, we know exactly when every instruction happens relative to every other instruction (total order). In a concurrent system, events from different processes have no guaranteed temporal order — we only know causal relationships (partial order). This difference is not a limitation of our measurement tools; it is a fundamental ontological property of the system.

16. Combined Contexts

The temporal and spatial dimensions are independent. A system can be distributed and parallel, or distributed and concurrent:

Distributed Parallel Computing

Processes have different spatial contexts (S ≠ S') but the same temporal context (T = T'). This applies to tightly-coupled cluster computing where nodes are geographically separate but share a synchronized clock (e.g., via NTP or hardware clock sync).

flowchart LR
  subgraph T["Shared Time (T)"]
    subgraph S1["Space S"]
      P1["Process 1"]
    end
    subgraph S2["Space S'"]
      P2["Process 2"]
    end
  end
  P1 <-->|"Message passing"| P2

Distributed Concurrent Computing

Processes have different spatial contexts (S ≠ S') and different temporal contexts (T ≠ T'). This is the most general and most common case for modern distributed systems — nodes in different data centres, with independent clocks, communicating asynchronously.

flowchart LR
  subgraph C1["Context 1 (T,S)"]
    P1["Process 1"]
  end
  subgraph C2["Context 2 (T',S')"]
    P2["Process 2"]
  end
  P1 <-->|"Asynchronous messaging"| P2

The examiner will ask

Is a distributed system necessarily concurrent? Or can a distributed system be parallel? Explain using the temporal and spatial context framework.

17. Conclusion

Summing Up

This lesson established the foundational ontology for distributed systems:

Foundations We began by understanding what science is in general — phenomena and noumena, explanation and prediction — and where computer science fits.
Computation We explored definitions of computation (symbol manipulation, process) and examined canonical computational models (Turing, von Neumann), noting their limitations for natural, continuous, and multi-device computation.
Basic ontology We built a simple but rigorous ontology: computational process and device, context, computational system. Each process is sequential; processes interact within a context defined by machine, resources, time, and space.
Definitions We derived reference definitions of parallel, concurrent, and distributed computation and systems based on temporal and spatial context. These definitions allow us to navigate the literature fruitfully, against all odds.

With this foundation in place, we can proceed to study the fundamental results of distributed computing — impossibility results, consensus algorithms, consistency models, and the like — with a shared, precise vocabulary.

Looking ahead

The next lessons will build on this ontology to explore distributed algorithms, logical clocks, consensus protocols, replication strategies, and the CAP theorem. Each of these topics depends on the precise definitions we have established here.

Check Your Understanding

1. Explain the relationship between phenomena and noumena in the context of computer science. What are the "phenomena" and what is the "noumenon" for our field?

In Kant's philosophy, phenomena are observed facts or occurrences — the things as they appear to an observer. Noumena are the things-in-themselves, the underlying models that explain phenomena. In computer science, the phenomena are the observed behaviour of computers and computational systems: program outputs, network traffic, system crashes, performance metrics. The noumenon at the core of computer science is computation itself — the abstract essence of what it means to compute, which gives rise to all the observable phenomena. A mature science of distributed systems seeks to explain distributed phenomena (like consensus failures or consistency violations) in terms of fundamental models of distributed computation.

2. Derive the definition of a distributed system from the concept of context. Why is spatial context the defining dimension?

We start from the basic ontology: a computational process is sequential and operates within a context defined by machine, resources, time, and space. For a computational system of multiple processes, we must decide how to represent the context for each process. If we give each process its own spatial context (different location), and those processes interact via message passing (no shared memory), then the system is distributed. Spatial context is the defining dimension because the separation in space forces processes to communicate asynchronously through messages rather than through shared state — and this has profound consequences for consistency, availability, and fault tolerance.

3. Compare and contrast parallel and concurrent systems using the temporal context framework.

In a parallel system, all processes share the same temporal context (same clock). Events across processes are totally ordered — we know exactly what happens when. In a concurrent system, at least two processes have different temporal contexts. Events across those processes are at most partially ordered — we cannot determine the relative timing of events from different contexts. This indeterminacy is not a measurement problem; it is an ontological property. Parallel computing is about doing multiple things simultaneously; concurrent computing is about dealing with multiple things whose relative timing is unknown. A parallel system is always concurrent in the interleaving sense, but a concurrent system is not necessarily parallel — the processes might execute on a single core with time-slicing.

4. Why do impossibility results (like CAP) require a well-defined ontology before they can be properly stated and proved?

An impossibility result is a theorem about what a system cannot do. For it to be provable, we need precise definitions of: what a system is (components, interactions), what the system model includes (synchrony assumptions, failure models, communication primitives), and what the claimed property means (consistency, availability, partition tolerance). Without a shared ontology, different researchers might mean different things by the same terms, and proofs cannot be compared or verified. The history of Brewer's CAP theorem illustrates this: it was originally stated without formal proof, and the first proof (Gilbert & Lynch, 2002) had to first define the system model precisely before proving the trade-off. A mature field requires the kind of ontological clarity that, say, modern mathematics has achieved with set theory as its common foundation.

5. Give an example of a computation that is non-terminating and explain why the traditional Turing machine model is insufficient to capture it.

A web server is a classic example of a non-terminating computation: it listens for requests indefinitely, processing each one in a loop. Turing machines, by contrast, are defined to compute a function and halt — the computation ends when the machine reaches a halting state. A web server never halts by design. This means the Turing model, while fundamental for computability theory, is insufficient for modeling reactive, interactive, or distributed systems that maintain ongoing behaviour. Models like I/O automata, process algebras (CSP, CCS, pi-calculus), and state machines with liveness conditions are needed to capture non-terminating computations. This connects to Denning's (2011) observation that "many important computations are non-terminating."

6. How does the choice of context representation define the sort of computational system? Walk through the three possible choices.

We have three choices for representing context in a multi-process computational system: (1) A different context for each process — this gives the most general model where each process has its own machine, time, space, and resources. This encompasses distributed concurrent systems. (2) A single context for all processes — all processes share the same machine, clock, and location. This is the model of a parallel system on a single computer. (3) Individual contexts plus one shared context — processes have their own spatial/resource contexts but share a temporal context (e.g., synchronized clocks). This yields distributed parallel systems. The granularity and structure of context directly determines the system classification. This framework shows that parallel, concurrent, and distributed are not arbitrary categories but emerge from a systematic analysis of context.

In this lesson

1. Prologue — Why Foundations Matter

2. What is Science? Phenomena & Noumena

Foundational Questions

Phenomena vs. Noumena

3. Explanation & Prediction

4. What is Computer Science?

What Does a Computer Do?

5. The Evolution of a Discipline

6. What is Computation?

Computation as Symbol Manipulation

Computation as Process

Five Points from Denning (2011)

7. Computational Models & Machines

Computing Machines

What is a Machine?

What is a Computing Machine?

8. Anatomy of a Computational Process

Computational Process with Input and Output

Computational Process with Context

9. Context for Computation

10. Sorts of Computations

11. Computational Systems & Interaction

12. Context for Computational Systems

A different context for each process

One context for all processes

A different context for each process plus one context for all processes

13. Parallel vs. Concurrent Systems

Parallel Computing

Concurrency

Relative Ordering of Events is the Key

14. Distributed Systems & Spatial Context

Distributed Computing

Distributed Systems

15. Formal Definitions

System Classifier

16. Combined Contexts

Distributed Parallel Computing

Distributed Concurrent Computing

17. Conclusion

Summing Up

Check Your Understanding