The Thinking Machine Chronicles #0001: The Paper That Started Everything: Turing's "On Computable Numbers"

Mushroom cloud of the Trinity nuclear test, 16 July 1945 — Trinity, 16 July 1945: the first atomic detonation, the defining image of the era in which Turing's abstract machine became an urgent practical necessity. U.S. Department of Energy, public domain.

Era 1 · The Foundations (1936–1955) Mathematics breaks itself open. Something new stirs in the wreckage.

The World in 1936

The year 1936 was one of gathering storms. Adolf Hitler remilitarised the Rhineland in March; the Spanish Civil War erupted in July; at the Berlin Olympics in August, Jesse Owens won four gold medals as the Nazi hierarchy looked on in silence. In Britain, King Edward VIII abdicated in December. The world was fracturing along fault lines that would, within three years, ignite the largest war in human history. Meanwhile, in the quieter theatre of pure mathematics, a parallel crisis had been building since Kurt Gödel's 1931 incompleteness theorems demolished Hilbert's program: not all mathematical truth could be proved. One question from Hilbert's 1928 agenda remained open: the Entscheidungsproblem, the decision problem. Could there exist a mechanical procedure that, given any well-formed mathematical statement, would determine in finite time whether it was provable?

It was this question that a 24-year-old Cambridge graduate student, working largely in isolation during the summer of 1935, sat down to answer. His name was Alan Mathison Turing.

The Problem Turing Was Solving

David Hilbert's Entscheidungsproblem was not merely an academic puzzle. It was a foundational wager: that mathematics was, at its core, a finite formal game that could, in principle, be mechanised. If a decision procedure existed, then any mathematical question was, in theory, answerable by rote, by a sufficiently tireless clerk following rules.

To answer this, Turing had to do something radical: he needed a precise mathematical definition of what an algorithm actually is. Before 1936, "algorithm" was intuitive: a sequence of well-defined steps. Turing's genius was to make the concept rigorous by describing the most minimal possible machine that could carry out any such procedure. His answer was so clean, so austere, that it remains the standard model of computation nearly ninety years later.

The Machine on Infinite Paper

A Turing machine is not a physical device. It is a mathematical abstraction. Picture three components:

An infinite tape divided into cells, each containing one symbol from a finite alphabet, padded with a special blank symbol $\sqcup$ wherever no input has been written.
A read/write head positioned over one cell at a time, capable of moving one cell left or right.
A finite state control: a set of internal states and a transition table governing what to do in each state when reading each symbol.

At every step the machine reads the symbol under the head; consults the transition table; writes a new symbol, moves the head one step, and transitions to a new internal state. It halts when it enters one of two terminal states: accept or reject.

Formally, a Turing machine is the 7-tuple:

$M = (Q,\; \Sigma,\; \Gamma,\; \delta,\; q_0,\; q_{\text{accept}},\; q_{\text{reject}})$

where:

$Q$ : finite, non-empty set of states
$\Sigma$ : input alphabet (symbols that appear in the initial input; $\sqcup \notin \Sigma$ )
$\Gamma \supseteq \Sigma \cup \{\sqcup\}$ : tape alphabet
$\delta : Q \times \Gamma \rightarrow Q \times \Gamma \times \{L,\, R\}$ : transition function
$q_0 \in Q$ : initial state
$q_{\text{accept}} \in Q$ : accepting state
$q_{\text{reject}} \in Q$ : rejecting state ( $q_{\text{accept}} \neq q_{\text{reject}}$ )

The transition function $\delta$ is the machine's soul. Given a state $q$ and a tape symbol $a$ , the tuple $\delta(q,\, a) = (q',\, b,\, D)$ says: move to state $q'$ , write symbol $b$ , shift the head in direction $D$ .

A configuration is the complete snapshot of the machine at any instant: the current state, the full tape contents, and the head position. Conventionally written as $u\, q\, v$ , where $u$ is the string to the left of the head, $q$ is the current state, and $v$ is everything at and to the right of the head. A computation is a sequence of configurations $C_1 \vdash C_2 \vdash \cdots \vdash C_n$ , each derived from the previous by applying $\delta$ once.

A Worked Example: Parity Checker

The simplest non-trivial example: decide whether the input (over the alphabet $\{1\}$ ) consists of an even number of 1s. Two states suffice: one for "even count so far" and one for "odd count so far".

State	Read	Write	Move	→ State
`even`	`1`	`1`	R	`odd`
`even`	`⊔`	`⊔`	R	ACCEPT
`odd`	`1`	`1`	R	`even`
`odd`	`⊔`	`⊔`	R	REJECT

Tracing on input 1111:

$\text{even}\_1111 \vdash \text{odd}\_111 \vdash \text{even}\_11 \vdash \text{odd}\_1 \vdash \text{even}\_\sqcup \vdash \text{ACCEPT}$

The machine read four 1s, the parity flipped four times back to even, and it accepts. On input 111 the final state would be odd, leading to REJECT.

The Universal Turing Machine

The most transformative insight in the 1936 paper is not the Turing machine itself; it is the Universal Turing Machine (UTM).

Turing showed that any Turing machine $M$ can be encoded as a finite string $\langle M \rangle$ over a fixed alphabet. This encoding is essentially a textual description of $M$ 's transition table. He then described a special machine $U$ , the universal machine, that takes the pair $\langle M,\, w \rangle$ as input and simulates $M$ running on $w$ :

$U(\langle M,\, w \rangle) = M(w)$

$U$ accepts iff $M$ would accept, rejects iff $M$ would reject, and loops iff $M$ would loop.

This was the first precise description of a stored-program computer. The program (the encoding $\langle M \rangle$ ) is just data on the tape. The hardware ( $U$ ) interprets it. This is the architecture of every computer built since 1945. John von Neumann cited Turing's UTM directly when he designed the EDVAC stored-program architecture in 1945. The iPhone in your pocket is a physical instantiation of a universal machine.

The Halting Problem: The First Undecidable Question

With the UTM established, Turing turned to the Entscheidungsproblem. His answer was no: there is no mechanical decision procedure for all mathematical statements. He proved this by identifying a specific, concrete problem that no Turing machine can decide.

Define the halting language:

$A_{TM} = \{\, \langle M,\, w \rangle \;:\; M \text{ is a Turing machine that accepts } w \,\}$

Theorem (Turing, 1936). $A_{TM}$ is undecidable.

Proof. Suppose, for contradiction, that a machine $H$ decides $A_{TM}$ . That is:

$H(\langle M,\, w \rangle) = \begin{cases} \text{accept} & \text{if } M \text{ accepts } w \\ \text{reject} & \text{if } M \text{ does not accept } w \end{cases}$

Construct a new machine $D$ that takes the encoding $\langle M \rangle$ as input, calls $H$ on the pair $\langle M,\, \langle M \rangle \rangle$ , and does the opposite:

$D(\langle M \rangle) = \begin{cases} \text{reject} & \text{if } H(\langle M,\, \langle M \rangle \rangle) = \text{accept} \\ \text{accept} & \text{if } H(\langle M,\, \langle M \rangle \rangle) = \text{reject} \end{cases}$

Now ask: what does $D$ do on input $\langle D \rangle$ , its own encoding?

If $D$ accepts $\langle D \rangle$ : then $H$ reported that $D$ rejects $\langle D \rangle$ . So $D$ must reject $\langle D \rangle$ . Contradiction.
If $D$ rejects $\langle D \rangle$ : then $H$ reported that $D$ accepts $\langle D \rangle$ . So $D$ must accept $\langle D \rangle$ . Contradiction.

In both cases we have a contradiction. Therefore $H$ cannot exist, and $A_{TM}$ is undecidable. $\square$

This is the diagonalisation argument, borrowed directly from Cantor's 1891 proof that the real numbers are uncountable. Turing recognised that the same logical structure applied to computation. It was devastating and elegant in equal measure.

ℹ

Note on recognisability vs. decidability. $A_{TM}$ is recognisable (Turing-enumerable): the universal machine $U$ can simulate $M$ on $w$ and accept whenever $M$ accepts, but it cannot always halt and reject. A language is decidable only if a TM always halts with the right answer; recognisable if a TM accepts everything in the language but may loop on strings outside it.

What This Means for Computation and AI

The Turing machine model established several concepts that remain foundational:

The Church-Turing thesis. Independently, Alonzo Church published a different formalism, the lambda calculus, capturing the same class of functions. The conjecture that every "reasonable" model of computation captures exactly the same class is the Church-Turing thesis. It is not a provable theorem, but a claim about the nature of effective computation. Decades of failed counterexamples have made it one of the most robust theses in mathematics.

Computational complexity. Once we know what is computable, we ask how efficiently. The time complexity classes, $P$ , $NP$ , and $PSPACE$ , are all defined in terms of resource-bounded Turing machines. The $P$ vs $NP$ question, whether every problem whose solution can be verified quickly can also be found quickly, is the central open problem in computer science.

Limits of AI. Any AI system running on a conventional computer is, at the level of computability, a Turing machine. Turing's undecidability results apply directly: no AI can solve all problems, verify all properties, or decide all questions. The undecidability of program verification, the impossibility of a perfect bug detector, and the frame problem in planning all trace back to 1936. When modern researchers ask "can we formally verify that a neural network is safe?", they are navigating the landscape Turing mapped.

The Code

The Python implementation below faithfully models the formal definition. The TuringMachine class encodes the transition function $\delta$ as a dictionary mapping (state, symbol) pairs to (new_state, write_symbol, direction) triples. The run method simulates a computation from the initial configuration to a halting state.

from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class TuringMachine:
    """
    M = (Q, Σ, Γ, δ, q₀, q_accept, q_reject)
    transitions: {(state, symbol): (new_state, write_symbol, direction)}
    """
    transitions: Dict[Tuple[str, str], Tuple[str, str, str]]
    initial_state: str
    accept_state:  str = "ACCEPT"
    reject_state:  str = "REJECT"
    blank:         str = "_"

    def run(self, input_str: str, max_steps: int = 100_000):
        tape  = [self.blank] + list(input_str) + [self.blank]
        head  = 1
        state = self.initial_state

        while state not in (self.accept_state, self.reject_state):
            if head < 0:
                tape.insert(0, self.blank); head = 0
            elif head >= len(tape):
                tape.append(self.blank)

            key = (state, tape[head])
            if key not in self.transitions:
                return False, "".join(tape).strip(self.blank)

            new_state, write, direction = self.transitions[key]
            tape[head] = write
            state      = new_state
            head      += 1 if direction == "R" else -1

        return state == self.accept_state, "".join(tape).strip(self.blank)

# Parity checker: accept iff input has an even number of 1s
parity_machine = TuringMachine(
    transitions={
        ("even", "1"): ("odd",    "1", "R"),
        ("even", "_"): ("ACCEPT", "_", "R"),
        ("odd",  "1"): ("even",   "1", "R"),
        ("odd",  "_"): ("REJECT", "_", "R"),
    },
    initial_state="even",
)

The full project includes a palindrome checker, a unary adder, a visualiser that prints tape snapshots at each step, and a demonstration of encoding a machine as JSON: the stored-program principle in miniature.

Why It Mattered

Turing's 1936 paper accomplished three things simultaneously, each of which would have been enough to secure a place in history:

It answered the Entscheidungsproblem. The answer was no: there is no decision procedure for all of mathematics. This was already known to Church (who proved it weeks earlier via lambda calculus), but Turing's proof was the more conceptually powerful of the two. His machine was a device, not an abstraction. People could picture it.

It invented the stored-program computer. The UTM: a program that reads another program as data and simulates it: is the conceptual blueprint for every programmable computer ever built. The distinction between hardware and software is Turing's distinction between the universal machine and its input tape.

It defined computation. Before 1936, "algorithm" was intuitive. After Turing, it was mathematical. You could ask: is this problem computable? Is it decidable? How many steps does it require? All of modern computer science, and most of modern AI theory, lives inside the formal framework Turing established at age 24.

Turing completed this paper in 1936 while a second-year graduate student at Cambridge. He had not yet started his PhD formally. He had not yet met von Neumann, built a machine, or broken a single cipher. This was just the beginning.

What Came Next

In 1937, Claude Shannon: then a 21-year-old master's student at MIT: would show that Boolean algebra maps directly onto electrical switching circuits. Where Turing defined what computation is, Shannon would show how to build it in hardware. That story is next: The Thinking Machine Chronicles #0002: Electricity Learns to Think.

ℹ

The Original Paper

A. M. Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem," Proceedings of the London Mathematical Society, Series 2, Vol. 42 (1936–37), pp. 230–265. A correction appears in Vol. 43 (1937), pp. 544–546.

A scanned PDF is freely available via the University of Virginia: cs.virginia.edu/~robins/Turing_Paper_1936.pdf

References

Turing, A. M. (1936). On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42(1), 230–265.
Sipser, M. (2012). Introduction to the Theory of Computation (3rd ed.). Cengage Learning. Chapters 3–4 cover Turing machines and decidability with exceptional clarity.
Hodges, A. (1983). Alan Turing: The Enigma. Burnett Books. The definitive biography; Chapter 2 covers the 1936 paper in detail.
Church, A. (1936). An Unsolvable Problem of Elementary Number Theory. American Journal of Mathematics, 58(2), 345–363. The contemporaneous lambda-calculus proof of undecidability.
Davis, M. (2000). The Universal Computer: The Road from Leibniz to Turing. Norton. A readable history of the mathematical ideas leading to Turing's paper.
Cantor, G. (1891). Über eine elementare Frage der Mannigfaltigkeitslehre. The diagonalisation argument Turing adapted.