Week 9: 2018/03/26-30

Monday reading

Skim Section 4.1 but take a good look at Figure 4.10; read Section 4.2.

Tuesday class

Section 4.1 is long. It shows:

  • Regular, context-free, decidable, and Turing-recognizable languages form a hierarchy (Figure 4.10).
  • Some results about regular and context-free languages that are interesting in their own right, like the fact that you can test whether two DFAs are equivalent. (We used this fact to partially automate the grading of one exam question.)
  • A more subtle point is that a priori we might think that some context-free languages are not decidable. For a PDA can have a computation branch that goes on forever. But in fact, it's possible to simulate the PDA so that infinite computation branches can be detected and cut short.

The universal Turing machine

Our focus for today is the universal TM $U$ defined at the beginning of the proof of Theorem 4.11. It is a TM that can simulate another TM. We've talked many times in this class about machines that simulate other machines (for example, the DFA for the intersection of two regular languages simulates the two machines that recognize the two languages), but whereas in those cases, the simulated machine was always hard-coded into the simulator, the universal TM is different because the code of the simulated machine is part of the input. That is, it takes as input both the code of a TM $M$ and an input string $w$, and simulates what $M$ would do on $w$.

For some reason, the book deemphasizes this idea. It's significant not only because of its relevance to Theorem 4.11, but also because it gave rise to (or at least anticipated) the idea of a stored program computer.

First, we have to show how to encode a TM as a string. This is easy to do and the details are not important, but it's worth at least mentioning one way to do it for concreteness. Turing's original encoding used an alphabet $\{\mathtt{A}, \mathtt{C}, \mathtt{D}, \mathtt{L}, \mathtt{R}, \mathtt{N}\}$. If the states in $Q$ are numbered $q_1, q_2, \ldots$, then state $q_i$ is encoded as $\mathtt{DA}^i$. If the symbols in $\Gamma$ are numbered $a_0, a_1, \ldots$, where $a_0$ is the blank symbol, then symbol $a_i$ is encoded as $\mathtt{DC}^i$. Then, the transition $\delta(q_i, a_j) = (q_k, a_\ell, \textrm{L})$ is encoded as $\mathtt{DA}^i \mathtt{DC}^j \mathtt{DC}^\ell \mathtt{L} \mathtt{DA}^k$, and similarly if the move is R or N (for "no move," equivalent to the book's S). We also need some convention for indicating the start, accept, and reject states: perhaps $q_1$ is always the start state, $q_2$ the accept state, and $q_3$ the reject state.

Second, we have to show the universal TM itself. It is often constructed as a TM with three tapes:

  1. An encoding of $M$, the machine being simulated.
  2. The tape of $M$.
  3. The state of $M$.

An implementation description would be: On input $\langle M, w\rangle$, where $M$ is a TM and $w$ is a string:

  1. Copy $M$ onto tape 1 and $w$ onto tape 2.
  2. Initialize tape 3 to the start state of $M$.
  3. Repeat:
    1. If the state (tape 3) is the accept state, accept; if it is the reject state, reject.
    2. Search on tape 1 for an instruction that matches the current state (encoded on tape 3) and current input symbol (encoded on tape 2).
    3. Write the new state to tape 3 and the new symbol to tape 2.
    4. Move the head on tape 2 to the left or right as indicated by the instruction.


There's a cottage industry of seeing who can make the smallest universal TM. The current record holder, due to Rogozhin, has only a couple dozen instructions.

Wednesday reading

Re-read Section 4.2.

Thursday class

Today's topic is undecidability: is there such a thing as a language that can't be decided by a Turing machine (and therefore, by any computer program as we know it)?

Sipser's explanation has three parts. First ("The Diagonalization Method"), there is a long digression about countable and uncountable infinities. If you are not familiar with these concepts and Cantor's diagonalization argument, pay careful attention to this subsection, even though it may not seem relevant at first. The point of this digression is that the vast majority of languages are not Turing-recognizable (or decidable).

Second, he gives an actual example of an undecidable language, $A_{\mathsf{TM}}$. In the 2nd edition of the book, this subsection (beginning on page 179) is misleadingly called "The Halting Problem." But the language $A_{\mathsf{TM}}$ is not the halting problem, and this subsection was rightly retitled "An Undecidable Language" in the 3rd edition. What would it mean for $A_{\mathsf{TM}}$ to be decidable? We saw previously that even if a NFA or PDA has an infinite loop, we can convert it into one that doesn't have an infinite loop; so we didn't make a distinction between NFA-recognizable and NFA-decidable, or PDA-recognizable and PDA-decidable. Similarly, if $A_{\mathsf{TM}}$ were decidable, we'd be able to convert any TM into a decider TM, and we wouldn't make a distinction between Turing-recognizable and Turing-decidable.

Third (starting with "Where is the diagonalization...?"), he gives a visual explanation that connects back with the Cantor diagonal argument. If you're a visual learner, you may want to start here.

The core of the argument

The proof has several layers, of which the innermost is the contradiction. Perhaps it's easier to start with the innermost contradiction.

We saw in the last lecture that any Turing machine can be encoded as a string (its "source code"). We also saw that this opens up the possibility of Turing machines that operate on other Turing machines.

Imagine a table whose rows are Turing machines and whose columns are also Turing machines. In the cell at row $i$ and column $j$, we write "yes" if the $i$'th Turing machine, when run on the $j$'th Turing machine, halts and accepts. And we write "no" if it halts and rejects or if it loops.

Then, as in the argument about the cardinality of the reals, let's take the diagonal of this table and invert every cell, which gives the language of Turing machines that do not accept themselves:

$$\overline{\mathit{SELF}}_{\mathsf{TM}} = \{ \langle M \rangle \mid \text{$M$ is a TM and $M$ does not accept $\langle M \rangle$}\}.$$

Is this a decidable language? No. We can see this by imagining that if there is a TM $D$ that decides it, then $\overline{\mathit{SELF}}_{\mathsf{TM}}$ must be a row in the table, but this is a contradiction because we constructed it to differ from every row in the table.

Another way of arguing this is: Suppose that there is a TM $D$ that decides it. We arrive at a contradiction by asking whether $\langle D \rangle$ itself belongs to the language:

  • If $\langle D \rangle \in \overline{\mathit{SELF}}_{\mathsf{TM}}$, then by the definition of $\overline{\mathit{SELF}}_{\mathsf{TM}}$, $D$ does not accept $\langle D \rangle$. But because $D$ decides $\overline{\mathit{SELF}}_{\mathsf{TM}}$, it must be that $D$ does accept $\langle D \rangle$. This is a contradiction.

  • Similarly, if $\langle D \rangle \not\in \overline{\mathit{SELF}}_{\mathsf{TM}}$, then by the definition of $\overline{\mathit{SELF}}_{\mathsf{TM}}$, $D$ does accept $\langle D \rangle$. But because $D$ decides $\overline{\mathit{SELF}}_{\mathsf{TM}}$, it must be that $D$ rejects $\langle D \rangle$. This is a contradiction.

Therefore, $D$ cannot exist, and $\overline{\mathit{SELF}}_{\mathsf{TM}}$ is not decidable.

The language $A_{\mathsf{TM}}$

Now we can tackle

$$A_{\mathsf{TM}} = \{ \langle \langle M \rangle, w \rangle \mid \text{$M$ is a TM and $M$ accepts $w$} \}.$$

Suppose that this language was decidable, that is, there is a TM $H$ that decides it. Then we could implement $D$ as follows:

$D = $"On input $\langle M \rangle$:

  1. Run $H$ on input $\langle M, \langle M \rangle \rangle$.
  2. If $H$ accepts, reject; if $H$ rejects, accept.

But we already saw that $D$ cannot exist. Therefore, $H$ cannot exist and $A_{\mathsf{TM}}$ is undecidable.

The halting problem

Most books use the following language as their first undecidable language:

$$\mathit{HALT}_{\mathsf{TM}} = \{ \langle\langle M \rangle, w \rangle \mid \text{$M$ is a TM and $M$ halts on $w$} \}.$$

This is probably the most well-known undecidable problem: Can you write a program that looks at another program $M$ and input $w$ and decides whether $M$ halts or loops on $w$?

The proof of undecidability is similar to the one above. Suppose that this language was decidable, that is, there is a TM $P$ that decides it. Then we could implement a Turing machine $Q$:

$Q = $"On input $\langle M \rangle$:

  1. Run $P$ on input $\langle M, \langle M \rangle \rangle$.
  2. If $P$ accepts, loop; if $H$ rejects, halt.

But -- by a similar to that for $D$ -- $Q$ cannot exist. Therefore, $P$ cannot exist and $\mathit{HALT}_{\mathsf{TM}}$ is undecidable.

In verse

Read Geoff Pullum's "Scooping the Loop Snooper: A proof that the Halting Problem is undecidable."

In Python

We can also demonstrate the argument in any programming language, here Python. Suppose (for the sake of contradiction) that we have a function called accepts that takes two arguments, M and w, and supposedly can decide whether M accepts w (even if M does not halt on w). Here, we provide a supposed implementation of accepts, but we will see that it is broken.

In [20]:
def accepts(M, w): # this is H
    """Supposedly returns True if M(w) is True
                          False if M(w) is False or does not halt."""
        return M(w) == True
    except (RuntimeError, KeyboardInterrupt): # catch stack overflow or ctrl-C in case of infinite loop
        return False

Then we can design a function and an input that "breaks" accepts.

In [21]:
def does_not_accept_self(M): # this is D
    return not accepts(M, M)

Let's test it out! When we use accepts to test whether D accepts w_bad, we get:

In [22]:
accepts(does_not_accept_self, does_not_accept_self)

But the correct answer is:

In [23]:

No matter how you modify accepts, the above two expressions will disagree, because there is no such thing as a correct implementation of accepts.

Turing-recognizable and co-Turing-recognizable

The last subsection exhibits a language that is not even Turing-recognizable. It is, frankly, not that interesting. The more important thing we learn here is Theorem 4.22,

\begin{align*} \text{decidable} &= \text{Turing-recognizable} \cap \text{co-Turing-recognizable}. \end{align*}