A matrix $A$ is a Markov matrix if its entries are all nonnegative ($a_{ij} \ge 0$) and each of its columns sums to 1.
Typically, a Markov matrix's entries represent transition probabilities from one state to another.
For example, consider the $2 \times 2$ Markov matrix:
A = [0.9 0.2
0.1 0.8]
2×2 Array{Float64,2}: 0.9 0.2 0.1 0.8
Let us suppose that this represents the fraction of people switching majors each year between math and English literature.
Let $$ x = \begin{pmatrix} m \\ e \end{pmatrix} $$
represent the number of math majors $m$ and English majors $e$. Suppose that each year, 10% of math majors and 20% of English majors switch majors. After one year, the new number of math and English majors is:
$$ \begin{aligned} m' &= 0.9 m + 0.2 e \\ e' &= 0.1 m + 0.8 e \end{aligned} $$

But this is equivalent to a matrix multiplication! i.e. the vector $x'$ of majors after one year is
$$ x' = A x $$

Note that the two Markov properties are critical: we never have negative numbers of majors (or negative probabilities), and the probabilities must sum to 1 (the net number of majors is not changing: we're not including new students or people that graduate in this silly model).
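As a quick sanity check (a sketch in Python/NumPy rather than the notebook's Julia, with hypothetical starting counts $m = 100$, $e = 80$), the matrix–vector product reproduces the componentwise update:

```python
import numpy as np

# transition matrix from the example above
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])

m, e = 100.0, 80.0           # hypothetical starting counts of majors
x = np.array([m, e])

x_new = A @ x                # one year of transitions: x' = A x
assert np.isclose(x_new[0], 0.9*m + 0.2*e)   # m' = 0.9 m + 0.2 e
assert np.isclose(x_new[1], 0.1*m + 0.8*e)   # e' = 0.1 m + 0.8 e
assert np.isclose(x_new.sum(), m + e)        # total number of majors is conserved
```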
There are two key questions about Markov matrices that can be answered by analysis of their eigenvalues:
Is there a steady state?
Does the system tend toward a steady state?
The answers are YES for any Markov matrix $A$, and YES for any positive Markov matrix (Markov matrices with entries $> 0$, not just $\ge 0$). For any Markov matrix, all of the λ satisfy $|\lambda| \le 1$, but if there are zero entries in the matrix we may have multiple $|\lambda|=1$ eigenvalues (though this doesn't happen often in practical Markov problems).
eigvals(A)
2-element Array{Float64,1}: 1.0 0.7
Let's try just multiplying a "random" vector by $A$ many times and see whether the result converges to a steady state:
A^100 * [17, 4]
2-element Array{Float64,1}: 14.0 7.0
Yes, it seems to be giving a vector that is not changing, which should be a multiple $c x_0$ of a steady-state eigenvector:
cx₀ = A^1000 * [17, 4]
2-element Array{Float64,1}: 14.0 7.0
Let's check that this is an eigenvector of $A$ with eigenvalue $\lambda=1$:
A * cx₀
2-element Array{Float64,1}: 14.0 7.0
To see why, the key idea is to write the columns-sum-to-one property of Markov matrices in linear-algebra terms. It is equivalent to the statement:
$$ \underbrace{\begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \end{pmatrix}}_{o^T} A = o^T $$

since this is just the operation that sums all of the rows of $A$. Equivalently, if we transpose both sides:
$$ A^T o = o $$

i.e. $o$ is an eigenvector of $A^T$ (called a left eigenvector of $A$) with eigenvalue $\lambda = 1$.
But since $A$ and $A^T$ have the same eigenvalues (they have the same characteristic polynomial $\det (A - \lambda I) = \det (A^T - \lambda I)$, because transposition doesn't change determinants), this means that $A$ also has an eigenvalue $1$, but with a (generally) different eigenvector.
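Spelled out, the identity of the characteristic polynomials follows from $\det M^T = \det M$ applied to $M = A - \lambda I$:

$$ \det(A^T - \lambda I) = \det\left[ (A - \lambda I)^T \right] = \det(A - \lambda I) . $$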
o = [1,1]
o' * A
1×2 RowVector{Float64,Array{Float64,1}}: 1.0 1.0
A' * o
2-element Array{Float64,1}: 1.0 1.0
The eigenvectors of $A$ with eigenvalue $1$ are exactly the nonzero vectors in the nullspace $N(A - I)$, so we can find one from a basis for that nullspace:
A - 1*I
2×2 Array{Float64,2}: -0.1 0.2 0.1 -0.2
By inspection, $A - I$ is singular here: the second column is -2 times the first. So, $x_0 = (2,1)$ is a basis for its nullspace, and is the steady state:
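Explicitly, a vector $x = (x_1, x_2)$ in the nullspace must satisfy

$$ \begin{pmatrix} -0.1 & 0.2 \\ 0.1 & -0.2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0 \implies -0.1 x_1 + 0.2 x_2 = 0 \implies x_1 = 2 x_2 , $$

so every nullspace vector is a multiple of $(2,1)$.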
(A - I) * [2,1]
2-element Array{Float64,1}: 5.55112e-17 5.55112e-17
Let's check if some arbitrary starting vector $(3,0)$ tends towards the steady state:
using Interact
@manipulate for n in slider(0:100,value=0)
A^n * [3,0]
end
2-element Array{Float64,1}: 3.0 0.0
Yes! In fact, it tends to exactly $(2,1)$, because the other eigenvalue is $< 1$ (and hence that eigenvector component decays exponentially fast).
An interesting property is that the sum of the vector components is conserved when we multiply by a Markov matrix. Given a vector $x$, $o^T x$ is the sum of its components. But $o^T A = o^T$, so:
$$ o^T A x = o^T x = o^T A^n x $$

for any $n$! This is why $(3,0)$ must tend to $(2,1)$, and not to any other multiple of $(2,1)$, because both of them sum to 3. (The "number of majors" is conserved in this problem.)
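A numerical version of the same conservation argument (again a Python/NumPy sketch):

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
x = np.array([3.0, 0.0])

# o^T A^n x = o^T x: the component sum is unchanged by any power of A
for n in range(1, 8):
    xn = np.linalg.matrix_power(A, n) @ x
    assert np.isclose(xn.sum(), 3.0)
```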
Why are all $|\lambda| \le 1$ for a Markov matrix?
The key fact is that the product $AB$ of two Markov matrices $A$ and $B$ is also Markov. Reasons:
If $A$ and $B$ have nonnegative entries, $AB$ does as well: matrix multiplication uses only $\times$ and $+$, and can't introduce a minus sign.
If $o^T A = o^T$ and $o^T B = o^T$ (both have columns summing to 1), then $o^T AB = o^T B = o^T$: the columns of $AB$ sum to 1.
For example, $A^n$ is a Markov matrix for any $n$ if $A$ is Markov.
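Both closure properties are easy to check numerically; a Python/NumPy sketch for the product of the example matrix with itself:

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
B = A @ A   # product of two Markov matrices (here, A with itself)

assert (B >= 0).all()                 # no negative entries can appear
assert np.allclose(B.sum(axis=0), 1)  # columns still sum to 1
```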
Now, if there were an eigenvalue $|\lambda| > 1$, the matrix $A^n$ would have to blow up exponentially as $n\to \infty$ (since the matrix times that eigenvector, or any vector with a nonzero component of that eigenvector, would blow up). But since $A^n$ is Markov, all of its entries must be between 0 and 1. It can't blow up! So we must have all $|\lambda| \le 1$.
A^100
2×2 Array{Float64,2}: 0.666667 0.666667 0.333333 0.333333
(In fact, $A^n$ is pretty boring for large $n$: it just takes in any vector and redistributes it to the steady state.)
We have just shown that there is at least one eigenvalue $\lambda = 1$, and that all eigenvalues satisfy $|\lambda| \le 1$. But can there be more than one independent eigenvector with $\lambda = 1$?
Yes! For example, the identity matrix $I$ is a Markov matrix, and all of its eigenvectors have eigenvalue $1$. Since $Ix = x$ for any $x$, every vector is a steady state for $I$!
But this does not usually happen for interesting Markov matrices coming from real problems. In fact, there is a theorem (a special case of the Perron–Frobenius theorem): a positive Markov matrix (all entries $> 0$) has $\lambda = 1$ as an eigenvalue with a one-dimensional eigenspace, and all of its other eigenvalues satisfy $|\lambda| < 1$, so there is a unique steady state that every starting vector tends to.
I'm not going to prove this in 18.06, however.
If you have a Markov matrix with zero entries, then there might be more than one eigenvalue with $|\lambda| = 1$, but these additional solutions might be oscillating solutions rather than steady states.
For example, consider the permutation matrix $$ P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} $$ that simply swaps the first and second entries of any 2-component vector.
If $x = (1,0)$, then $P^n x$ will oscillate forever, never reaching a steady state! It simply oscillates between $(1,0)$ (for even $n$) and $(0,1)$ (for odd $n$):
P = [0 1
1 0]
2×2 Array{Int64,2}: 0 1 1 0
[P^n * [1,0] for n = 0:5]
6-element Array{Array{Int64,1},1}: [1, 0] [0, 1] [1, 0] [0, 1] [1, 0] [0, 1]
But this is a Markov matrix, so all $|\lambda|$ are $\le 1$:
eigvals(P)
2-element Array{Float64,1}: -1.0 1.0
The problem is that the $\lambda = -1$ eigenvalue corresponds to an oscillating solution:
$$ P^n \begin{pmatrix} 1 \\ -1 \end{pmatrix} = (-1)^n \begin{pmatrix} 1 \\ -1 \end{pmatrix} $$

for the eigenvector $(1,-1)$.
The steady state still exists, corresponding to the eigenvector $(1,1)$:
$$ P^n \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} $$

eig(P)[2] # the eigenvectors
2×2 Array{Float64,2}: -0.707107 0.707107 0.707107 0.707107
Since $(1,0) = [(1,1) + (1,-1)]/2$, we have:
$$ P^n \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2} \left[ \begin{pmatrix} 1 \\ 1 \end{pmatrix} + (-1)^n \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right] $$which alternates between $(1,0)$ and $(0,1)$.
Let's generate a random $5 \times 5$ Markov matrix:
M = rand(5,5) # random entries in [0,1]
5×5 Array{Float64,2}: 0.738225 0.72637 0.554436 0.472905 0.170035 0.575379 0.641102 0.421929 0.142563 0.897826 0.992513 0.0505591 0.682035 0.959969 0.297188 0.692958 0.408806 0.101324 0.526985 0.310136 0.121889 0.395139 0.527851 0.710186 0.318843
sum(M,1) # not Markov yet
1×5 Array{Float64,2}: 3.12096 2.22198 2.28758 2.81261 1.99403
M = M ./ sum(M,1)
5×5 Array{Float64,2}: 0.236537 0.326903 0.242368 0.168138 0.0852724 0.184359 0.288528 0.184444 0.050687 0.450257 0.318015 0.0227541 0.298147 0.341309 0.149039 0.222033 0.183983 0.0442933 0.187365 0.155532 0.0390548 0.177832 0.230747 0.252501 0.159899
sum(M,1)
1×5 Array{Float64,2}: 1.0 1.0 1.0 1.0 1.0
eigvals(M)
5-element Array{Complex{Float64},1}: 1.0+0.0im -0.217733+0.0im 0.116733+0.211049im 0.116733-0.211049im 0.154743+0.0im
abs.(eigvals(M))
5-element Array{Float64,1}: 1.0 0.217733 0.241182 0.241182 0.154743
x = rand(5)
x = x / sum(x)
5-element Array{Float64,1}: 0.338351 0.142161 0.139727 0.16293 0.21683
M^100 * x
5-element Array{Float64,1}: 0.222706 0.231999 0.220523 0.157424 0.167348
λ, X = eig(M)
X[:,1] / sum(X[:,1])
5-element Array{Complex{Float64},1}: 0.222706+0.0im 0.231999+0.0im 0.220523+0.0im 0.157424+0.0im 0.167348+0.0im