Week 9: 2017/03/27-31

Monday reading

Skim Section 4.1 but take a good look at Figure 4.10.

Read the beginning of Section 4.2, up to but not including "The Diagonalization Method."

Tuesday class

Section 4.1 is long. It shows:

  • Regular, context-free, decidable, and Turing-recognizable languages form a hierarchy (Figure 4.10).
  • Some results about regular and context-free languages that are interesting in their own right, like the fact that you can test whether two DFAs are equivalent. (We used this fact to partially automate the grading of one exam question.)
  • A more subtle point is that a priori we might think that some context-free languages are not decidable. For a PDA can have a computation branch that goes on forever. But in fact, it's possible to simulate the PDA so that infinite computation branches can be detected and cut short.

The universal Turing machine

Our focus for today is the universal TM $U$ defined at the beginning of the proof of Theorem 4.11. It is a TM that can read in the description of another TM and simulate it. For some reason, the book deemphasizes this idea. It's significant not only because of its relevance to Theorem 4.11, but also because it gave rise to (or at least anticipated) the idea of a stored program computer.

First, we have to show how to encode a TM as a string. This is easy to do and the details are not important, but it's worth at least mentioning one way to do it for concreteness. Turing's original encoding used an alphabet $\{\mathtt{A}, \mathtt{C}, \mathtt{D}, \mathtt{L}, \mathtt{R}, \mathtt{N}\}$. If the states in $Q$ are numbered $q_1, q_2, \ldots$, then state $q_i$ is encoded as $\mathtt{DA}^i$. If the symbols in $\Gamma$ are numbered $a_0, a_1, \ldots$, where $a_0$ is the blank symbol, then symbol $a_i$ is encoded as $\mathtt{DC}^i$. Then, the transition $\delta(q_i, a_j) = (q_k, a_\ell, \textrm{L})$ is encoded as $\mathtt{DA}^i \mathtt{DC}^j \mathtt{DC}^\ell \mathtt{L} \mathtt{DA}^k$, and similarly if the move is R or N. We also need some convention for indicating the start, accept, and reject states: perhaps $q_1$ is always the start state, $q_2$ the accept state, and $q_3$ the reject state.

Second, we have to show the universal TM itself. It is often constructed as a TM with three tapes:

  1. An encoding of $M$, the machine being simulated.
  2. The tape of $M$.
  3. The state of $M$.

An implementation-level description would be: On input $\langle M, w\rangle$, where $M$ is a TM and $w$ is a string:

  1. Copy $M$ onto tape 1 and $w$ onto tape 2.
  2. Initialize tape 3 to the start state of $M$.
  3. Repeat:
    1. If the state (tape 3) is the accept state, accept; if it is the reject state, reject.
    2. Search on tape 1 for an instruction that matches the current state (encoded on tape 3) and current input symbol (encoded on tape 2).
    3. Write the new state to tape 3 and the new symbol to tape 2.
    4. Move the head on tape 2 to the left or right as indicated by the instruction.

Miscellany

There's a cottage industry of seeing who can make the smallest universal TM. The current record holder, due to Rogozhin, has only a couple dozen instructions.

Wednesday reading

Read Section 4.2 and, for fun, the poem "Scooping the Loop Snooper: A proof that the Halting Problem is undecidable."

Thursday class

Countable and uncountable

The book has a long digression about countable and uncountable infinities. If you are not familiar with these concepts and Cantor's diagonalization argument, pay careful attention to this part of the section, even though it may not seem relevant at first. The point of this digression is that the vast majority of languages are not Turing-recognizable (or decidable).

An undecidable language

The next subsection gives an actual example of an undecidable language. In the 2nd edition of the book, this subsection (beginning on page 179) is misleadingly called "The Halting Problem." But the language $A_{\mathsf{TM}}$ is not the halting problem, and this subsection was retitled "An Undecidable Language" in the 3rd edition.

What would it mean for $A_{\mathsf{TM}}$ to be decidable? If $A_{\mathsf{TM}}$ were decidable, then that would mean we could simulate any non-decider TM in such a way that looping could be avoided (as is the case for PDAs).

A simpler proof

The proof that $A_{\mathsf{TM}}$ is undecidable is a mind-bender. Perhaps this version of the proof is conceptually easier (similar to the proof in Hopcroft and Ullman).

Suppose that there exists a decider $H$ that decides $A_{\mathsf{TM}}$. Then we can imagine a big table of all possible outcomes of $H$ (assuming an alphabet $\{\mathtt{0}, \mathtt{1}\}$ for illustration's sake):

$\varepsilon$ $\mathtt{0}$ $\mathtt{1}$ $\mathtt{00}$ $\cdots$
$M_1$ reject accept reject accept
$M_2$ reject accept accept reject
$M_3$ accept reject accept reject
$M_4$ reject reject reject accept
$\vdots$

We should be able to compute the value of any cell of this table since $H$ is supposedly a decider. Therefore, we should be able to compute a function $D$ that is the opposite of every diagonal entry of this table. But this is a contradiction, because $D$ must itself have a row in the table, yet it differs in at least one column from every row in the table.

More precisely, let $M_i$ be the $i$th machine in some ordering of Turing machines, and let $w_i$ be the $i$th string in some ordering of strings in $\Sigma^\ast$ (page 14). Then $D$ is defined as follows. On input $w$,

  1. Find the column in the table corresponding to $w$, that is, find $i$ such that $w = w_i$.
  2. Find the $i$th row in the table, which corresponds to machine $M_i$.
  3. Run $H$ on input $\langle M_i, w_i \rangle$.
  4. If $H$ accepts, reject, and if $H$ rejects, accept.

This new TM $D$ must be different from every single row of the table, which is a contradiction, because we said that the table contains every possible TM. We conclude that $H$ does not exist, and $A_{\mathsf{TM}}$ is undecidable.

To modify this version of the proof into the version in the book, observe that we can label the columns of the table with whatever strings we want as long as there's an infinite number of them. So label $i$th column with the encoding of the $i$th machine, that is, $w_i = \langle M_i \rangle$. The fact that this labeling leaves out some strings is not a problem. Then $D$ is defined as follows. On input $w$,

  1. Find the column in the table corresponding to $w$, that is, find $i$ such that $w=\langle M_i\rangle$. If there is no such $i$, reject. (Or accept -- it doesn't matter.)
  2. There is no step 2.
  3. Run $H$ on input $\langle M_i, \langle M_i \rangle\rangle$.
  4. If $H$ accepts, reject, and if $H$ rejects, accept.

Again, this new TM $D$ must be different from every single row of the table, which is a contradiction.

A Python demo

We can also demonstrate the argument in any programming language, here Python. Assume that we have a function called accepts that takes two arguments, function and input, and supposedly can decide whether function accepts input (even if function does not halt on input).

In [3]:
def accepts(M, w):
    """Supposedly returns True if M(w) is True
                          False if M(w) is False or does not halt."""
    # Note: this code is nonsense. You can replace it with whatever you want.
    return (hash(M.__code__.co_code) + hash(w)) % 2 == 0

Then we can design a function and an input that "breaks" accepts.

In [4]:
def D(w):
    if callable(w): # if w is a function, 
        return not accepts(w, w)
    else:
        return False # arbitrary return value for non-callables
w_bad = D

Let's test it out! When we use accepts to test whether D accepts w_bad, we get:

In [5]:
accepts(D, w_bad)
Out[5]:
False

But the correct answer is:

In [6]:
D(w_bad)
Out[6]:
True

Turing-recognizable and co-Turing-recognizable

The last subsection exhibits a language that is not even Turing-recognizable. It is, frankly, not that interesting. The more important thing we learn here is Theorem 4.22,

\begin{align*} \text{decidable} &= \text{Turing-recognizable} \cap \text{co-Turing-recognizable}. \end{align*}