Week 8: 2017/03/20-24

In [1]:
from tock import *

Monday reading

Read Section 3.1 and 3.3.

Tuesday class

Turing machines, finally

By now, the definition of Turing machines (TMs) should be easy to understand. But pay special attention to the distinction between Turing-recognizable (a.k.a. "recursively enumerable" or "computably enumerable") and (Turing-)decidable (a.k.a. "recursive" or "computable"). When a TM runs on an input string, there are three possible outcomes: halt and accept, halt and reject, and loop (which is treated as a reject).

We distinguish between two kinds of TMs: TMs and decider TMs. A TM can accept, reject, or loop on a string, whereas a decider TM must either accept or reject, but never loop. Then, a language $L$ is Turing-recognizable iff there is a TM that accepts all and only the strings in $L$. And $L$ is (Turing-)decidable iff there is a decider TM that accepts all and only the strings in $L$.

This is a distinction we haven't worried about in the past. Why? Deterministic FAs and PDAs always halt, so there's no distinction between accepting and deciding. Nondeterministic FAs and PDAs can have $\varepsilon$ loops, but it's easy to change such machines into an equivalent ones that do not loop. So the distinction is unimportant. With Turing machines (which are deterministic unless stated otherwise), the distinction is extremely important.

There are many variations out there in the definition of TM. Be careful if you refer to other sources. We're going to allow one small extension to Sipser's definition. In Sipser's definition, $\delta(q, a) = (r, b, d)$, where $d \in \{\text{L}, \text{R}\}$. Following Turing's original definition, we'll also allow $d = \text{N}$, which means "no move".

Examples: formal descriptions

Here are the book's two examples of formal descriptions. Apologies that the notation used here is slightly different from Sipser's: it uses a caret (^) to specify the movement of the head. So $a \rightarrow \hat{} b$ means "read $a$, write $b$, and move left," while $a \rightarrow b \hat{}$ means "read $a$, write $b$, and move right." And $a \rightarrow b$ means "read $a$, write $b$, and don't move."

In [2]:
m2 = read_csv("tm-m2.csv")
m2
%3 _START 0 q1 _START->0 2 q2 0->2 0 → _ ^ 3 reject 0->3 x → x ^ _ → _ ^ 1 accept 2->1 _ → _ ^ 2->2 x → x ^ 4 q3 2->4 0 → x ^ 4->4 x → x ^ 5 q4 4->5 0 → 0 ^ 6 q5 4->6 _ → ^ _ 5->3 _ → _ ^ 5->4 0 → x ^ 5->5 x → x ^ 6->2 _ → _ ^ 6->6 0 → ^ 0 x → ^ x
In [3]:
run(m2, "0 0 0 0")
%3 _START 10 q1,[0] 0 0 0 _START->10 0 q2,_ x [0] x 11 q3,_ x x [x] 0->11 1 q5,[_] x 0 x 2 q2,_ [x] 0 x 1->2 2->0 3 q5,_ x 0 [x] 4 q5,_ x [0] x 3->4 5 q5,_ [x] 0 x 4->5 5->1 6 q3,_ x 0 x ^ 6->3 7 q4,_ x 0 [0] 7->6 8 q3,_ x [0] 0 8->7 9 q2,_ [0] 0 0 9->8 10->9 12 q3,_ x x x ^ 11->12 13 q5,_ x x [x] 12->13 15 q5,_ x [x] x 13->15 14 q5,[_] x x x 17 q2,_ [x] x x 14->17 16 q5,_ [x] x x 15->16 16->14 18 q2,_ x [x] x 17->18 19 q2,_ x x [x] 18->19 20 q2,_ x x x ^ 19->20 21 accept,_ x x x _ ^ 20->21

Since the machine is deterministic, the run is easier to view as a list of configurations, so we'll write a little function to do that:

In [4]:
def print_run(*args):
    r = run(*args)
    for c in r.only_path():
        print(c)
In [5]:
print_run(m2, "0 0 0 0")
q1,[0] 0 0 0
q2,_ [0] 0 0
q3,_ x [0] 0
q4,_ x 0 [0]
q3,_ x 0 x ^
q5,_ x 0 [x]
q5,_ x [0] x
q5,_ [x] 0 x
q5,[_] x 0 x
q2,_ [x] 0 x
q2,_ x [0] x
q3,_ x x [x]
q3,_ x x x ^
q5,_ x x [x]
q5,_ x [x] x
q5,_ [x] x x
q5,[_] x x x
q2,_ [x] x x
q2,_ x [x] x
q2,_ x x [x]
q2,_ x x x ^
accept,_ x x x _ ^
In [6]:
print_run(m2, "0 0 0")
q1,[0] 0 0
q2,_ [0] 0
q3,_ x [0]
q4,_ x 0 ^
reject,_ x 0 _ ^
In [7]:
m1 = read_csv("tm-m1.csv")
m1
%3 _START 8 q1 _START->8 0 q7 0->0 0 → ^ 0 1 → ^ 1 0->8 x → x ^ 1 accept 2 q2 2->2 0 → 0 ^ 1 → 1 ^ 5 q4 2->5 # → # ^ 3 q3 3->3 0 → 0 ^ 1 → 1 ^ 6 q5 3->6 # → # ^ 4 q8 4->1 _ → _ ^ 4->4 x → x ^ 5->5 x → x ^ 7 q6 5->7 0 → ^ x 6->6 x → x ^ 6->7 1 → ^ x 7->0 # → ^ # 7->7 0 → ^ 0 1 → ^ 1 x → ^ x 8->2 0 → x ^ 8->3 1 → x ^ 8->4 # → # ^

Note that in this example, the reject state has been omitted. If at any point in the run, there is no transition defined for the current state and symbol, the machine rejects. You can do this when you write formal descriptions of TMs also.

In [8]:
print_run(m1, "0 1 0 # 0 1 0")
q1,[0] 1 0 # 0 1 0
q2,x [1] 0 # 0 1 0
q2,x 1 [0] # 0 1 0
q2,x 1 0 [#] 0 1 0
q4,x 1 0 # [0] 1 0
q6,x 1 0 [#] x 1 0
q7,x 1 [0] # x 1 0
q7,x [1] 0 # x 1 0
q7,[x] 1 0 # x 1 0
q1,x [1] 0 # x 1 0
q3,x x [0] # x 1 0
q3,x x 0 [#] x 1 0
q5,x x 0 # [x] 1 0
q5,x x 0 # x [1] 0
q6,x x 0 # [x] x 0
q6,x x 0 [#] x x 0
q7,x x [0] # x x 0
q7,x [x] 0 # x x 0
q1,x x [0] # x x 0
q2,x x x [#] x x 0
q4,x x x # [x] x 0
q4,x x x # x [x] 0
q4,x x x # x x [0]
q6,x x x # x [x] x
q6,x x x # [x] x x
q6,x x x [#] x x x
q7,x x [x] # x x x
q1,x x x [#] x x x
q8,x x x # [x] x x
q8,x x x # x [x] x
q8,x x x # x x [x]
q8,x x x # x x x ^
accept,x x x # x x x _ ^
In [9]:
print_run(m1, "0 1 0 # 0 1 1")
q1,[0] 1 0 # 0 1 1
q2,x [1] 0 # 0 1 1
q2,x 1 [0] # 0 1 1
q2,x 1 0 [#] 0 1 1
q4,x 1 0 # [0] 1 1
q6,x 1 0 [#] x 1 1
q7,x 1 [0] # x 1 1
q7,x [1] 0 # x 1 1
q7,[x] 1 0 # x 1 1
q1,x [1] 0 # x 1 1
q3,x x [0] # x 1 1
q3,x x 0 [#] x 1 1
q5,x x 0 # [x] 1 1
q5,x x 0 # x [1] 1
q6,x x 0 # [x] x 1
q6,x x 0 [#] x x 1
q7,x x [0] # x x 1
q7,x [x] 0 # x x 1
q1,x x [0] # x x 1
q2,x x x [#] x x 1
q4,x x x # [x] x 1
q4,x x x # x [x] 1
q4,x x x # x x [1]

Question. Try writing a (formal description of) a TM that decides the language $\{\texttt{a}^{2n} \mid n \geq 0\}$.

Question. Try writing a TM (first informal, then formal description) that decides the language $\{ww^R \mid w \in \{\texttt{0},\texttt{1}\}^\ast\}$.

Macros

Writing Turing machines is extremely tedious -- much more so than assembly language. To reduce the tedium, many textbooks introduce the idea of macros. In the state transition diagram of a TM, a node can be labeled with the name of another TM. If machine $M$ reaches a node labeled with another TM $M'$, then $M'$ is run until it reaches its accept state; then control returns to $M$ right where it left off. Note that this is not really an extension of the TM formalism; it's just a shorthand for writing repetitive machines more compactly.

For example, in $M_1$ above, the same subgraph appears twice (states $q_2, q_4$ are the same as states $q_3, q_5$). So we could simplify it to: m1a where $M'$ is: m1b

However, Sipser does not use notation like this for macros. Instead, he quickly graduates you from formal descriptions of machines to implementation-level descriptions. These are like pseudocode. But they still speak in terms of the head moving on the tape and reading and writing symbols on the tape. They shouldn't, for example, make use of variables or arithmetic. See the examples in the book to get an idea of what is allowed and what isn't.

It's common to use macros in implementation-level descriptions, too: if you define a TM $M'$, then in the course of defining $M$, you can write "simulate $M'$."

The Church-Turing thesis

Recall that Turing's original paper (1936) proposed Turing machines as a model of what humans do when they compute. He imagined a computing person in an idealized scenario:

  • He has an infinitely long paper tape, divided into squares.
  • He can write one symbol in each square, and the number of possible symbols is finite (e.g., 0 to 9).
  • He can only look at a finite number of squares at a time.
  • He can only move a finite distance at a time.
  • He has only a finite number of “states of mind.”

Basically this is an appeal to intuition that when people compute, this is what they do. Furthermore, he had to argue (quite briefly) that Turing machines can perform any such computation. This came to be known as the Church-Turing thesis (CTT).

One brief statement of the thesis: "[Turing] machines can do anything that could be described as 'rule of thumb' or 'purely mechanical'." Elsewhere, he characterized these problems as "those problems which can be solved by human clerical labour, working to fixed rules, and without understanding."

Wednesday reading

Read Section 3.2.

Thursday class

An additional argument made for the Church-Turing thesis is "robustness" -- lots of different extensions of Turing machines have been proposed, and while they make writing Turing machine programs more convenient, they can be shown not to increase their power.

Question. Name every possible thing you could imagine adding to a Turing machine to make it more powerful.

First of all, there are some extensions in Turing's image of a human computer:

  • He is allowed to look at multiple (but finitely many) squares at a time.
  • He is allowed to move multiple (but finitely many) squares to the left or right.
  • Furthermore, Turing also mentions the possibility of a two-dimensional grid.

It is easy to convert a TM which can look at multiple squares or move multiple squares into a standard TM by adding more states. A two dimensional grid can be simulated in a manner similar to multiple tapes (see below).

The book discusses two bigger extensions which are used later: multiple tapes, and nondeterminism. Multiple tapes are described clearly enough in the book; a few comments here on nondeterminism.

Nondeterminism

We can make Turing machines nondeterministic, similarly to how we made finite automata nondeterministic. Remarkably, NTMs are no more powerful than TMs. But if it were possible to build a real NTM, it would be useful for all kinds of very very practical things, because it would be (we think) so much faster than a deterministic TM. That speed difference is the focus of Unit Four.

When defining NTMs, we must pay attention to the distinction between recognizing and deciding. A NTM accepts a string iff at least one branch of the computation accepts it. The other branches can reject or loop. A NTM is a decider iff all of its branches halt (not loop).

So a NTM recognizes a language $L$ iff for each string $w \in L$, there is at least one branch that accepts $w$, and the other branches either reject or loop on $w$; and for each string $w \notin L$, all branches either reject or loop on $w$. And a decider NTM decides a language $L$ iff for each string $w \in L$, there is at least one branch that accepts $w$, and the other branches reject $w$; and for each string $w \notin L$, all branches reject $w$.

The proof of Theorem 3.16 treats the computation as a directed tree. Each node is a configuration, and each edge is a step of the computation. The simulating DTM does a breadth-first search (BFS) on this tree. Each node in the tree is given an address (known as a Gorn address), which is stored on tape 3. The proof is a little bit unclear on one point: Stage 4 just says "Replace the string on tape 3 with the next string in the string ordering," but doesn't explain what the ordering is -- it's the order defined on page 14, which causes the addresses to be listed in breadth-first order.

Alternatives

Taking the robustness argument further, we can argue that many alternative definitions of computability have been proposed, alternatives that look radically different from Turing machines, and those, too, can be shown to be equivalent to Turing machines. These are mentioned at the end of Section 3.2. The most important ones are probably:

Accidentally Turing Complete is a collection of other weird things that are equivalent to TMs.