Week 1: 2017/01/16-20

Monday reading

Read 0.1, which is an overview of the whole course.

Tuesday class

Welcome to Theory of Computing! I'd like to introduce the course by discussing a particular problem: given two functions, for example:

In [1]:
def fib1(n):
    if n <= 1:
        return n
        return fib1(n-2) + fib1(n-1)
In [2]:
def fib2(n):
    a = 0
    b = 1
    for i in range(n):
        b_new = a + b
        a = b
        b = b_new
    return a

Are they equivalent? As a programmer, you have to answer this kind of question all the time -- either implicitly, as when you are writing a function and you need to ensure that the function you write is equivalent to the idea in your head, or explicitly, as when you are optimizing or refactoring code and need to ensure that you don't introduce new bugs.

Proofs and programs

The standard practice is to write test cases:

In [3]:
print(fib1(0) == fib2(0))
print(fib1(1) == fib2(1))
print(fib1(2) == fib2(2))

But the only way to be sure that the two functions are equivalent in all cases is a logical proof. Though you won't write many programs in this course, you will write a lot of proofs about programs, and I suggest that if you get good at writing proofs about programs, you will become a better programmer!

Computers and computation

If checking program equivalence is so important, you might be thinking, can we write another program that automatically does it for us? The answer is yes and no. We'll study some simple types of computers, and for the first of these (which are equivalent to regular expressions), we can indeed automatically check equivalence. But as we move to more complex types of computers, things change. In general, there is no such thing as a program that can check the equivalence of other programs.

The most complex type of computer we'll study is still quite rudimentary. A Turing machine has a head that moves left and right on a tape and follows instructions of the form "if the current symbol is $a$, then write $b$, move left/right, and goto $r$". Turing machines have been constructed out of wood or Legos, and it's possible to compile C code into Turing machines. But primarily, they exist only in theory.

The reason they are important is that, as we will show, any current digital computer (under some reasonable assumptions) can be emulated by a Turing machine; we will further argue that even computers of the future can be emulated by Turing machines. Thus, Turing machines serve as a definition of what a computer is. If something can be done on a computer, it can be done on a Turing machine; if something cannot be done on a Turing machine, it cannot be done on any other computer.

Machines and minds

It's not too hard to show that the example functions above are equivalent (for nonnegative integers $n$), but are there examples that would stump even the smartest people? In fact, Turing machines were invented as an answer to a very similar question. Turing introduced them in 1936, years before the first computer was built, to formalize what it means for humans to compute (for example, to do arithmetic or to write a formal proof). He proposed some assumptions about how a human (with pencil, eraser, and paper) computes, and showed that everything he or she does could be emulated by a Turing machine (a proposition now known as the Church-Turing Thesis). The implication is that there are pairs of programs out there whose equivalence or nonequivalence humans will never be able to prove.

Many people (especially computer scientists) take this thesis further, saying that all human thinking is computation and can therefore be emulated by a Turing machine, or any other computer with sufficient memory. Some people take the thesis much further and say that the universe can be simulated by, or even is, a computer. In order to understand and interact with such ideas, you need to understand theory of computing. So, the theory of computing is a point of contact between computer science and other fields like mathematics, physics, philosophy, and even theology. If you've wondered what your core courses have to do with computer science, I hope you'll see some connections here.

Wednesday reading

Skim Sections 0.2-4, which cover mathematical preliminaries that you should have gotten in Discrete Math. If anything seems unfamiliar to you, study it a little more carefully. The subsection "Strings and Languages" is especially important (and surprisingly short); we'll be focusing on it in class.

Thursday class

In this course we want to study all the things that a computer can do. That seems very diverse, and it would seem that we have to study many different kinds of computations separately. But maybe we can reduce them all to one kind of problem, and focus our study on that one kind of problem. And we claim that any computational problem can be reduced to problems of the form: Does string $w$ belong to language $L$?

People are fairly comfortable these days with the idea that any kind of object we might want to compute about can be represented as a string. We deal every day with messages, music, pictures, movies, books, etc., in digital form, with the awareness that these are just strings of 0’s and 1’s.

Less obvious, perhaps, is that it makes sense to reduce all kinds of computations down to questions about whether strings belong to languages. First of all, our experience with computers today is highly interactive. But user-computer interaction can be thought of as a sequence of inputs and outputs. Second, the output from a computer is much more than just a "yes" or "no" answer. But if you have any function $f(x)$ that takes an input $x$ and outputs another string, we can think of it as the related function $f'(x,y)$ that takes an input $x$ and a possible output $y$, and returns "yes" iff $f(x) = y$.

Caveat: The function $f'$ might be much slower than the original $f$. This will matter when we get to Unit IV, and we'll deal with this issue when the time comes.


An alphabet is a nonempty finite set of symbols. We use $\Sigma$ (uppercase sigma) or sometimes $\Gamma$ (uppercase gamma) to stand for an alphabet. We write actual symbols as $\texttt{a}, \texttt{b},$ etc., but write variables standing for symbols as $a, b$, etc.

A string is a finite sequence of symbols. Unlike with sets, order matters and duplicates matter. We usually use $w$ for a variable that stands for a string and $u,v$ or $x,y,z$ when we need more variables.

Note the following properties:

  • If $a$ is a symbol, it is also a string.
  • We write either $vw$ or $v \circ w$ for the concatenation of $v$ and $w$. (Some authors write $v \cdot w$.) Concatenation is associative: $(uv)w = u(vw) = uvw$.
  • We write $\varepsilon$ for the empty string: $\varepsilon \circ w = w \circ \varepsilon = w$.

Note that $\varepsilon$ (epsilon) is not a symbol! It is a "meta-symbol" that stands for the empty string. (Also, it is more commonly written $\epsilon$, but we'll try to stick to the book's style.)

It might be helpful to compare this notation for strings with C++. A symbol is a char. Just like a constant 'a' and a variable a are not the same in C++, so in math, the symbol $\mathtt{a}$ and the variable $a$ are not the same. A string is a C++ string. In C++, the constants 'a' and "a" are different; in math (and in Python), they are identical. In C++, the empty string is ""; in math, $\varepsilon$. In C++, you concatenate strings with v+w; in math, we write $v\circ w$ or $vw$.

Question. (This question and all questions below are not for credit; they will be discussed in class.) True or false? Assume $\Sigma = \{\mathtt{a},\mathtt{b}\}$.

\begin{align} \mathtt{ab}&=\mathtt{ba} \\ \mathtt{aa}&=\mathtt{a} \\ \varepsilon &\in \Sigma \\ \varepsilon \varepsilon &= \varepsilon \end{align}


A set of strings is called a language. We write $\Sigma^\ast$ for the language of all strings over $\Sigma$. As usual, $\emptyset$ stands for the empty language.

Question. True or false?

\begin{align} \{\mathtt{a}, \mathtt{b}\} &= \{\mathtt{b}, \mathtt{a}\} \\ \{\mathtt{a}, \mathtt{a}\} &= \{\mathtt{a}\} \\ \{\texttt{a}\} \cup \emptyset &= \{\texttt{a}\} \\ \{\texttt{a}\} \cup \{\varepsilon\} &= \{\texttt{a}\} \end{align}

Question. Is there such a thing (according to the definitions above) as

  • an infinite alphabet?
  • an infinite string?
  • an infinite language of finite strings?
  • an infinite language of infinite strings?

Question. Prove that the length of strings in a infinite language is unbounded. That is, if $L$ is infinite, there does not exist an $N$ such that for all $w \in L$, $|w| \leq N$.

Operations on strings often induce analogous operations on languages. For example, if $L$ and $L'$ are languages,

\begin{align} L^R &= \{ w^R \mid w \in L \} \\ L \circ L' = L L' &= \{ w w' \mid w \in L, w' \in L'\}. \end{align}

The $\ast$ operator that we saw above, known as the Kleene star, can be applied to any language. Thus $L^\ast$ is defined as the smallest language such that:

  • $L \subseteq L^\ast$.
  • If $v \in L^\ast$ and $w \in L^\ast$, then $vw \in L^\ast$.
  • $\varepsilon \in L^\ast$.

Language classes

Since every language corresponds to a computational problem (given a string $w$, does $w$ belong to $L$?), and since some problems are harder than others, it follows that some languages are harder to recognize than others. Some languages can be recognized by a Turing machine and some can't. Some can be recognized by a Turing machine in polynomial time ($O(n^k)$ time for some $k$) and some can't. So we can talk about sets of languages, commonly called language classes.

There are only a handful of language classes we will learn this semester. They are: Overview of language classes Each of the six boxes is a language class, with the name of the class written above; inside each box are kinds of machines that accept that language class. Each class is a strict superset of the previous one, except NP is not known to be a strict superset of P.

In [ ]: