Week 10: 2018/04/01-05

Monday

Read Section 5.1. This is a difficult section, so read it carefully. But we probably won't do Theorems 5.10, 5.13 in detail.

Tuesday class

Now that we've proven that one language ($A_{\mathsf{TM}}$) is undecidable, we can use it to prove that other languages are undecidable.

The halting problem

The first example is the halting problem. In many textbooks (and the poem "Scooping the Loop Snooper"), the halting problem is actually the prototypical undecidable language, but Sipser does things a little differently.

The proof is not difficult but it has a weird feel to it. The first thing that you have to remember is that the direction of the reduction is the opposite of what most people intuitively think of first. If you want to show that the halting problem is undecidable, you do not reduce the halting problem to $A_{\mathsf{TM}}$. You assume that the halting problem is decidable, then show via a reduction from $A_{\mathsf{TM}}$ to the halting problem that $A_{\mathsf{TM}}$ would also be decidable, which is a contradiction.

Recall that we designed $U$, a universal TM, that recognizes $A_{\mathsf{TM}}$, but we showed that we cannot design a TM $S$, a "universal decider," that decides $A_{\mathsf{TM}}$. Practically, the sticking point is that on input $\langle M, w\rangle$, if $M$ loops on $w$, we have no way to detect it so that we can reject. Enter $R$, a hypothetical TM that decides the halting problem $\mathit{HALT}_{\mathsf{TM}}$. If we had such a machine, then designing $S$ would be easy: use $R$ to check for looping. if a loop is detected, reject; if no loop is detected, we can safely simulate $M$ and return what it returns. But last time we showed (by diagonalization) that $S$ does not exist. Therefore, $R$ cannot exist either.

Reducibility

In general, to show that some problem P is undecidable, you assume that P is decidable by some TM $R$ and use that to implement a universal decider $S$. That implementation usually has three steps:

  1. Convert an input to $S$ (a pair $\langle M, w\rangle$) into an instance of P.
  2. Run $R$ on the converted instance.
  3. Interpret the result of $R$ to decide whether $M$ accepts $w$.

In the reduction from $A_{\textsf{TM}}$ to the halting problem, step 1 was trivial: $\langle M, w \rangle$ maps to itself. But in other cases, this step may be more complicated.

For example, let's show that it is undecidable whether a given Python program $P$ writes to the filesystem.

Suppose, for the sake of contradiction, that this is decidable. That is, there exists a TM $R$ that accepts $\langle P\rangle$ if and only if $P$ would write to the filesystem. Let's call it the "write-detector."

We're going to build a universal decider $S$ that somehow uses the write-detector $R$ to decide $A_{\mathsf{TM}}$. We can't possibly feed $\langle M, w\rangle$ to $R$ because $R$ wants a Python program. So we need to convert $M$ and $w$ into a Python program. We assume a Python function simulate that simulates a TM and returns True for accept or False for reject; this function can also potentially loop. It should be obvious that such a function exists, although it's long enough that we don't bother to write it out.

Then $S = $ “On input $\langle M, w\rangle$:

  1. Construct the Python program, called $P$:
     import os
     if simulate(M, w):
         os.system("rm -rf /")
    where M and w are filled in with data structures representing $M$ and $w$, respectively.
  2. Run $R$, the write-detector, on $P$.
  3. If $R$ detected a write, then accept.
  4. Otherwise, reject.

The program $P$ acts as an "adapter" between the property we need to detect ($M$ accepting $w$) and the property that we can detect ($P$ writing to the filesystem).

To see that $S$ is a universal decider, let's walk through the possible cases:

  • If $M$ accepts $w$, then $P$ would wipe out your files. $R$ detects this, and so $S$ accepts.
  • If $M$ rejects $w$, then $P$ does not wipe out your files. $R$ does not detect any writes, and so $S$ rejects.
  • Similarly, if $M$ loops on $w$, then $P$ would run forever but would not wipe out your files. $R$ does not detect any writes, and so $S$ rejects.

Thus $S$ decides $A_{\mathsf{TM}}$ as desired, which is a contradiction. We conclude that it's undecidable whether a given Python program writes to the filesystem.

In general, when you're trying to prove that detecting property P is undecidable, this "adapter" usually has to do the following things:

  • Simulate $M$ on $w$.
  • If $M$ accepts, then exhibit property P.
  • If $M$ rejects, then don't exhibit property P.
  • You must also set it up so that if $M$ loops, property P is not exhibited.

Question. Try Problem 5.12: Show that it is undecidable whether a Turing machine ever erases a symbol (that is, overwrites a non-blank symbol with a blank symbol).

Sometimes, the existence of a Turing machine simulator is not so obvious, and we have to think more carefully about it.

Question. An unrestricted grammar is like a CFG, but the left-hand sides can be strings. A derivation starts with $S$ and continues until no more rules can be applied. Show that it is undecidable whether an unrestricted grammar generates a nonempty language. Hint: Convert $\langle M, w\rangle$ to a grammar whose starting rule is $$ S \rightarrow q_0 w $$ and for each transition of $M$, create a grammar rule (or rules) to simulate that transition.

Some other examples of undecidable problems are:

  • Do two context-free grammars generate the same language (Exercise 5.1)?
  • Post's Correspondence Problem (Section 5.2).
  • Does a polynomial equation in several variables have an integral solution?

See the survey by Poonen for more examples.

Thursday class

Properties of languages

The next two examples in the book, Theorems 5.2 and 5.3, are slightly different. They are not questions about a single run of a machine, but about the entire language that the machine recognizes. Theorem 5.2 is about whether the language is empty; Theorem 5.3 is about whether the language is regular.

The basic shape of the proof is still the same. Assume that the question is decidable, so there is a TM $R$ that decides it. As before, we need to construct a new TM $S$ that decides $A_{\mathsf{TM}}$. The first thing that $S$ needs to do is to convert $\langle M, w\rangle$ to just a machine (called $M_1$ or $M_2$ in the book). We have to take care to construct this "adapter" machine to link the question that $S$ wants to decide (does $M$ accept $w$?) with the question that $R$ has the magic ability to answer.

Let's look at Theorem 5.2 first. Given $M$ and $w$, we can construct an adapter $M_1$ that recognizes a nonempty language iff $M$ accepts $w$. We're going to construct $M_1$ differently from the book; if you like the book's version better, that's fine too.

$M_1 =$ “On input $x$:

  1. Run $M$ on input $w$.
  2. If $M$ accepts $w$, then accept $x$.
  3. If $M$ rejects $w$, then reject $x$.

Let's think carefully about what $M_1$ would do.

  • If $M$ accepts $w$, then $M_1$ accepts all strings $x$. That is, $M_1$ recognizes $\Sigma^\ast$, a nonempty language.
  • If $M$ rejects $w$, then $M_1$ rejects all strings $x$. That is, $M_1$ recognizes $\emptyset$.
  • If $M$ loops on $w$, then $M_1$ also loops on all strings $x$. So, $M_1$ recognizes $\emptyset$.

Finally, we define a universal decider $S =$ “On input $\langle M, w\rangle$:

  1. Construct $M_1$ as described above.
  2. Run $R$, the emptiness-detector, on $M_1$.
  3. If $R$ detected an empty language, reject.
  4. Otherwise, accept.

Here's how $S$ works:

  • If $M$ accepts $w$, then $M_1$ would recognize $\{w\}$, a non-empty language, and so $S$ accepts.
  • If $M$ rejects or loops on $w$, then $M_1$ would recognize $\emptyset$, and so $S$ rejects.

Thus $S$ decides $A_{\mathsf{TM}}$ as desired, which is a contradiction. We conclude that it's undecidable whether a given Turing machine recognizes the empty language.

The proof of Theorem 5.3 (regularity) is quite similar. Given $M$ and $w$, we can construct an adapter $M_2$ that recognizes a regular language iff $M$ accepts $w$. Again, we'll do this a little differently from the book.

$M_2 =$ “On input $x$:

  1. Run $M$ on input $w$.
  2. If $M$ accepts $w$, and $x$ has the form $\mathtt{0}^n\mathtt{1}^n$, accept $x$.
  3. Otherwise, reject $x$.

Let's think carefully about what $M_2$ does.

  • If $M$ accepts $w$, then $M_2$ accepts strings $x$ of the form $\mathtt{0}^n\mathtt{1}^n$. So it recognizes a non-regular language.
  • If $M$ rejects $w$, then $M_2$ rejects all strings $x$. So it recognizes $\emptyset$, a regular language.
  • If $M$ loops on $w$, then $M_2$ also loops on all strings $x$. So it recognizes $\emptyset$, a regular language.

The rest of the proof is very similar to the last one. We define a universal decider $S =$ “On input $\langle M, w\rangle$:

  1. Construct $M_2$ as described above.
  2. Run $R$, the regularity-detector, on $M_2$.
  3. If $R$ detected a regular language, reject.
  4. Otherwise, accept.

Here's how $S$ works:

  • If $M$ accepts $w$, then $M_2$ would recognize $\{\mathtt{0}^n\mathtt{1}^n\}$, a non-regular language, and so $S$ accepts.
  • If $M$ rejects or loops on $w$, then $M_2$ would recognize $\emptyset$, a regular language, and so $S$ rejects.

Thus $S$ decides $A_{\mathsf{TM}}$ as desired, which is a contradiction. We conclude that it's undecidable whether a given Turing machine recognizes a regular language.

Question. Try proving: It is undecidable whether two given Turing machines $M_1$ and $M_2$ recognize the same language.

We've tried to follow a general recipe for proving undecidability results. It turns out that this recipe can be distilled into a catch-all theorem called Rice's Theorem: any property of a language recognized by a Turing machine, as long as the property is not always true or always false, is undecidable.

There are some properties of the internal working of a Turing machine that are decidable (you'll see one in HW7). But broadly speaking, questions about behaviors of computer programs tend to be undecidable. When you see problems like this in real life -- and you definitely will -- you should smell undecidability and think twice before trying to implement a solution to the problem.