A matrix $A$ is a Markov matrix if its entries are all nonnegative ($a_{ij} \ge 0$) and each of its columns sums to 1.
Typically, a Markov matrix's entries represent transition probabilities from one state to another.
For example, consider the $2 \times 2$ Markov matrix:
A = [0.9 0.2
0.1 0.8]
2×2 Array{Float64,2}: 0.9 0.2 0.1 0.8
Let us suppose that this represents the fraction of people switching majors each year between math and English literature.
Let $$ x = \begin{pmatrix} m \\ e \end{pmatrix} $$
represent the number of math majors $m$ and English majors $e$. Suppose that each year, 10% of math majors and 20% of English majors switch majors. After one year, the new number of math and English majors is:
$$ \begin{aligned} m' &= 0.9 m + 0.2 e \\ e' &= 0.1 m + 0.8 e \end{aligned} $$

But this is equivalent to a matrix multiplication! i.e. the vector $x'$ of majors after one year is
$$ x' = A x $$

Note that the two Markov properties are critical: we never have negative numbers of majors (or negative probabilities), and the probabilities must sum to 1 (the net number of majors is not changing: we're not including new students or people that graduate in this silly model).
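As a quick sanity check (a sketch in Python/NumPy rather than the notebook's Julia, with hypothetical starting counts $m = 100$, $e = 80$), the matrix–vector product reproduces the componentwise update:

```python
import numpy as np

# transition matrix from the example above
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])

m, e = 100.0, 80.0           # hypothetical starting counts of majors
x = np.array([m, e])

x_new = A @ x                # one year of transitions: x' = A x
assert np.isclose(x_new[0], 0.9*m + 0.2*e)   # m' = 0.9 m + 0.2 e
assert np.isclose(x_new[1], 0.1*m + 0.8*e)   # e' = 0.1 m + 0.8 e
assert np.isclose(x_new.sum(), m + e)        # total number of majors is conserved
```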
There are two key questions about Markov matrices that can be answered by analysis of their eigenvalues:
Is there a steady state?
Does the system tend toward a steady state?
The answers are YES for any Markov matrix $A$, and YES for any positive Markov matrix (Markov matrices with entries $> 0$, not just $\ge 0$). For any Markov matrix, all of the λ satisfy $|\lambda| \le 1$, but if there are zero entries in the matrix we may have multiple $|\lambda|=1$ eigenvalues (though this doesn't happen often in practical Markov problems).
eigvals(A)
2-element Array{Float64,1}: 1.0 0.7
Let's try just multiplying a "random" vector by $A$ many times and see whether the result converges to a steady state:
A^100 * [17, 4]
2-element Array{Float64,1}: 14.0 7.0
Yes, it seems to be giving a vector that is not changing, which should be a multiple $c x_0$ of a steady-state eigenvector:
cx₀ = A^1000 * [17, 4]
2-element Array{Float64,1}: 14.0 7.0
Let's check that this is an eigenvector of $A$ with eigenvalue $\lambda=1$:
A * cx₀
2-element Array{Float64,1}: 14.0 7.0
To see why, the key idea is to write the columns-sum-to-one property of Markov matrices in linear-algebra terms. It is equivalent to the statement:
$$ \underbrace{\begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \end{pmatrix}}_{o^T} A = o^T $$

since this is just the operation that sums all of the rows of $A$. Equivalently, if we transpose both sides:
$$ A^T o = o $$

i.e. $o$ is an eigenvector of $A^T$ (called a left eigenvector of $A$) with eigenvalue $\lambda = 1$.
But since $A$ and $A^T$ have the same eigenvalues (they have the same characteristic polynomial $\det (A - \lambda I) = \det (A^T - \lambda I)$, because transposition doesn't change determinants), this means that $A$ also has an eigenvalue $1$, but with a (generally) different eigenvector.
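Spelled out, the identity of the characteristic polynomials follows from $\det M^T = \det M$ applied to $M = A - \lambda I$:

$$ \det(A^T - \lambda I) = \det\left[ (A - \lambda I)^T \right] = \det(A - \lambda I) . $$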
o = [1,1]
o' * A
1×2 RowVector{Float64,Array{Float64,1}}: 1.0 1.0
A' * o
2-element Array{Float64,1}: 1.0 1.0
The eigenvectors of $A$ with eigenvalue $1$ are exactly the nonzero vectors in the nullspace $N(A - I)$, so we can find one from a basis for that nullspace:
A - 1*I
2×2 Array{Float64,2}: -0.1 0.2 0.1 -0.2
By inspection, $A - I$ is singular here: the second column is -2 times the first. So, $x_0 = (2,1)$ is a basis for its nullspace, and is the steady state:
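Explicitly, a vector $x = (x_1, x_2)$ in the nullspace must satisfy

$$ \begin{pmatrix} -0.1 & 0.2 \\ 0.1 & -0.2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0 \implies -0.1 x_1 + 0.2 x_2 = 0 \implies x_1 = 2 x_2 , $$

so every nullspace vector is a multiple of $(2,1)$.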
(A - I) * [2,1]
2-element Array{Float64,1}: 5.55112e-17 5.55112e-17
Let's check if some arbitrary starting vector $(3,0)$ tends towards the steady state:
using Interact
@manipulate for n in slider(0:100,value=0)
A^n * [3,0]
end
2-element Array{Float64,1}: 3.0 0.0
Yes! In fact, it tends to exactly $(2,1)$, because the other eigenvalue is $< 1$ (and hence that eigenvector component decays exponentially fast).
An interesting property is that the sum of the vector components is conserved when we multiply by a Markov matrix. Given a vector $x$, $o^T x$ is the sum of its components. But $o^T A = o^T$, so:
$$ o^T A x = o^T x = o^T A^n x $$

for any $n$! This is why $(3,0)$ must tend to $(2,1)$, and not to any other multiple of $(2,1)$, because both of them sum to 3. (The "number of majors" is conserved in this problem.)
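A numerical version of the same conservation argument (again a Python/NumPy sketch):

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
x = np.array([3.0, 0.0])

# o^T A^n x = o^T x: the component sum is unchanged by any power of A
for n in range(1, 8):
    xn = np.linalg.matrix_power(A, n) @ x
    assert np.isclose(xn.sum(), 3.0)
```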
Why are all $|\lambda| \le 1$ for a Markov matrix?
The key fact is that the product $AB$ of two Markov matrices $A$ and $B$ is also Markov. Reasons:
If $A$ and $B$ have nonnegative entries, $AB$ does as well: matrix multiplication uses only $\times$ and $+$, and can't introduce a minus sign.
If $o^T A = o^T$ and $o^T B = o^T$ (both have columns summing to 1), then $o^T AB = o^T B = o^T$: the columns of $AB$ sum to 1.
For example, $A^n$ is a Markov matrix for any $n$ if $A$ is Markov.
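Both closure properties are easy to check numerically; a Python/NumPy sketch for the product of the example matrix with itself:

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
B = A @ A   # product of two Markov matrices (here, A with itself)

assert (B >= 0).all()                 # no negative entries can appear
assert np.allclose(B.sum(axis=0), 1)  # columns still sum to 1
```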
Now, if there were an eigenvalue $|\lambda| > 1$, the matrix $A^n$ would have to blow up exponentially as $n\to \infty$ (since the matrix times that eigenvector, or any vector with a nonzero component of that eigenvector, would blow up). But since $A^n$ is Markov, all of its entries must be between 0 and 1. It can't blow up! So we must have all $|\lambda| \le 1$.
A^100
2×2 Array{Float64,2}: 0.666667 0.666667 0.333333 0.333333
(In fact, $A^n$ is pretty boring for large $n$: it just takes in any vector and redistributes it to the steady state.)
We have just shown that there is at least one eigenvalue $\lambda = 1$, and that all eigenvalues satisfy $|\lambda| \le 1$. But can there be more than one independent eigenvector with $\lambda = 1$?
Yes! For example, the identity matrix $I$ is a Markov matrix, and all of its eigenvectors have eigenvalue $1$. Since $Ix = x$ for any $x$, every vector is a steady state for $I$!
But this does not usually happen for interesting Markov matrices coming from real problems. In fact, there is a theorem (a special case of the Perron–Frobenius theorem): a positive Markov matrix (all entries $> 0$) has $\lambda = 1$ as an eigenvalue with a one-dimensional eigenspace, and all of its other eigenvalues satisfy $|\lambda| < 1$, so there is a unique steady state that every starting vector tends to.
I'm not going to prove this in 18.06, however.
If you have a Markov matrix with zero entries, then there might be more than one eigenvalue with $|\lambda| = 1$, but these additional solutions might be oscillating solutions rather than steady states.
For example, consider the permutation matrix $$ P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} $$ that simply swaps the first and second entries of any 2-component vector.
If $x = (1,0)$, then $P^n x$ will oscillate forever, never reaching a steady state! It simply oscillates between $(1,0)$ (for even $n$) and $(0,1)$ (for odd $n$):
P = [0 1
1 0]
2×2 Array{Int64,2}: 0 1 1 0
[P^n * [1,0] for n = 0:5]
6-element Array{Array{Int64,1},1}: [1, 0] [0, 1] [1, 0] [0, 1] [1, 0] [0, 1]
But this is a Markov matrix, so all $|\lambda|$ are $\le 1$:
eigvals(P)
2-element Array{Float64,1}: -1.0 1.0
The problem is that the $\lambda = -1$ eigenvalue corresponds to an oscillating solution:
$$ P^n \begin{pmatrix} 1 \\ -1 \end{pmatrix} = (-1)^n \begin{pmatrix} 1 \\ -1 \end{pmatrix} $$

for the eigenvector $(1,-1)$.
The steady state still exists, corresponding to the eigenvector $(1,1)$:
$$ P^n \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} $$

eig(P)[2] # the eigenvectors
2×2 Array{Float64,2}: -0.707107 0.707107 0.707107 0.707107
Since $(1,0) = [(1,1) + (1,-1)]/2$, we have:
$$ P^n \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2} \left[ \begin{pmatrix} 1 \\ 1 \end{pmatrix} + (-1)^n \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right] $$which alternates between $(1,0)$ and $(0,1)$.
Let's generate a random $5 \times 5$ Markov matrix:
M = rand(5,5) # random entries in [0,1]
5×5 Array{Float64,2}: 0.738225 0.72637 0.554436 0.472905 0.170035 0.575379 0.641102 0.421929 0.142563 0.897826 0.992513 0.0505591 0.682035 0.959969 0.297188 0.692958 0.408806 0.101324 0.526985 0.310136 0.121889 0.395139 0.527851 0.710186 0.318843
sum(M,1) # not Markov yet
1×5 Array{Float64,2}: 3.12096 2.22198 2.28758 2.81261 1.99403
M = M ./ sum(M,1)
5×5 Array{Float64,2}: 0.236537 0.326903 0.242368 0.168138 0.0852724 0.184359 0.288528 0.184444 0.050687 0.450257 0.318015 0.0227541 0.298147 0.341309 0.149039 0.222033 0.183983 0.0442933 0.187365 0.155532 0.0390548 0.177832 0.230747 0.252501 0.159899
sum(M,1)
1×5 Array{Float64,2}: 1.0 1.0 1.0 1.0 1.0
eigvals(M)
5-element Array{Complex{Float64},1}: 1.0+0.0im -0.217733+0.0im 0.116733+0.211049im 0.116733-0.211049im 0.154743+0.0im
abs.(eigvals(M))
5-element Array{Float64,1}: 1.0 0.217733 0.241182 0.241182 0.154743
x = rand(5)
x = x / sum(x)
5-element Array{Float64,1}: 0.338351 0.142161 0.139727 0.16293 0.21683
M^100 * x
5-element Array{Float64,1}: 0.222706 0.231999 0.220523 0.157424 0.167348
λ, X = eig(M)
X[:,1] / sum(X[:,1])
5-element Array{Complex{Float64},1}: 0.222706+0.0im 0.231999+0.0im 0.220523+0.0im 0.157424+0.0im 0.167348+0.0im