This small notebook shows naive implementations of algorithms to compute the eigenvalues and eigenvectors of a matrix, written in the Julia programming language.
For a real- or complex-valued square matrix $A \in \mathbb{M}_{n,n}(\mathbb{K})$ (over the field $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$), an eigenvalue $\lambda$ (of multiplicity $k\in\mathbb{N}^*$) and an associated (generalized, if $k > 1$) eigenvector $v \neq 0$ satisfy $$ (A - \lambda I)^k v = 0.$$
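For instance, for the diagonal matrix $A = \mathrm{diag}(2, 3)$, the eigenpairs are $(\lambda_1, v_1) = (2, (1, 0)^T)$ and $(\lambda_2, v_2) = (3, (0, 1)^T)$, each of multiplicity $k = 1$: indeed $(A - 2I) v_1 = 0$ and $(A - 3I) v_2 = 0$.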
We will need at least two example matrices to test our algorithms.
A = [ 1. 2 3; 4 5 6; 7 8 9 ]
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0
B = [12 -51 4; 6 167 -68; -4 24 -41]
3×3 Array{Int64,2}:
 12  -51    4
  6  167  -68
 -4   24  -41
Based on the definition, it is not difficult to test whether a pair $(\lambda, v)$ is an (eigenvalue, eigenvector) pair.
We can write a small function that checks if the eigenvalues and eigenvectors are correct.
function check_DV(A, D, V)
    I = eye(size(A)[1])
    for i = 1:size(A)[1]
        # Residual of the eigen equation (A - lambda_i I) v_i: should be numerically zero
        check = (A - D[i] * I) * V[:,i]
        println("For the ", i, "th eigen value and vector, (A - lambda I)^k * v = ", check, "\n\twhich has a L2 norm = ", norm(check))
        assert(norm(check) <= 1e-14)
    end
end
check_DV (generic function with 1 method)
D_A, V_A = eig(A)
([16.1168, -1.11684, -1.30368e-15], [-0.231971 -0.78583 0.408248; -0.525322 -0.0867513 -0.816497; -0.818673 0.612328 0.408248])
check_DV(A, D_A, V_A)
For the 1th eigen value and vector, (A - lambda I)^k * v = [5.32907e-15, -1.77636e-15, -8.88178e-16]
	which has a L2 norm = 5.687116766346677e-15
For the 2th eigen value and vector, (A - lambda I)^k * v = [-6.66134e-16, -8.88178e-16, -2.66454e-15]
	which has a L2 norm = 2.886579864025407e-15
For the 3th eigen value and vector, (A - lambda I)^k * v = [-2.22045e-16, -1.77636e-15, 4.44089e-16]
	which has a L2 norm = 1.8444410139024814e-15
A second example, showing that the check can fail for numerical reasons: the residuals for $B$ are very close to $0$, but the first one has a norm slightly above our (too strict) absolute tolerance of $10^{-14}$.
D_B, V_B = eig(B)
([156.137, 16.06, -34.1967], [0.328147 -0.990526 0.254758; -0.936881 0.0871754 0.302793; -0.120717 0.106104 0.918376])
check_DV(B, D_B, V_B)
For the 1th eigen value and vector, (A - lambda I)^k * v = [-2.88658e-15, -1.95399e-14, 3.55271e-15]
	which has a L2 norm = 2.0068951041893117e-14

AssertionError:
Stacktrace:
 [1] assert at ./error.jl:68 [inlined]
 [2] check_DV(::Array{Int64,2}, ::Array{Float64,1}, ::Array{Float64,2}) at ./In[3]:6
 [3] include_string(::String, ::String) at ./loading.jl:515
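A possible fix is to make the tolerance relative to the norm of $A$. Here is a minimal sketch (the function check_DV_rel is not part of the original notebook):

"Like check_DV, but with a tolerance relative to the norm of A (a sketch)."
function check_DV_rel(A, D, V, tol=1e-14)
    I = eye(size(A)[1])
    for i = 1:size(A)[1]
        check = (A - D[i] * I) * V[:,i]
        # Scale-aware test: residuals are compared to tol * ||A||
        assert(norm(check) <= tol * norm(A))
    end
end
check_DV_rel(B, D_B, V_B)  # now passes: 2.0e-14 <= 1e-14 * norm(B)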
Let's compute the eigenpair with the largest eigenvalue (in modulus), using the power iteration method.
"Power iteration method on A, to find eigenpair with largest eigen value."
function poweriteration(A, maxiter=20, userandomstart=true)
n = size(A)[1]
if userandomstart
X = randn((n, 1))
else
X = ones((n, 1))
end
for i = 1:maxiter
next_X = A * X
X = next_X / norm(next_X)
end
# lambda = (X^* ⋅ A ⋅ X) / (X^* ⋅ X)
lambda = (transpose(conj(X)) * (A * X)) / (transpose(conj(X)) * X)
lambda, X
end
poweriteration
For instance, we correctly obtain the largest eigenpair for our example $A$ (up to a $-1$ sign in $v$).
poweriteration(A)
([16.1168], [0.231971; 0.525322; 0.818673])
D_A, V_A = eig(A)
([16.1168, -1.11684, -1.30368e-15], [-0.231971 -0.78583 0.408248; -0.525322 -0.0867513 -0.816497; -0.818673 0.612328 0.408248])
(D_A[1], -1 * V_A[:,1])
(16.116843969807043, [0.231971, 0.525322, 0.818673])
What is the typical dependency on the number of iterations? The convergence is geometric, with ratio $|\lambda_2| / |\lambda_1|$ (the ratio of the two largest eigenvalues in modulus).
poweriteration
poweriteration (generic function with 3 methods)
println(poweriteration(A, 1))
println(poweriteration(A, 2))
println(poweriteration(A, 5))
println(poweriteration(A, 10))
([15.3427], [0.130291; 0.494751; 0.859212])
([16.1581], [-0.239638; -0.527385; -0.815131])
([16.1168], [-0.23197; -0.525322; -0.818674])
([16.1168], [0.231971; 0.525322; 0.818673])
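To observe this convergence concretely, here is a small sketch (not in the original notebook) measuring the error of the estimated eigenvalue against the value $16.116843969807043$ found above with eig(A), using the deterministic all-ones start:

lambda_max = 16.116843969807043  # largest eigenvalue of A, from eig(A) above
for k in (1, 2, 5, 10, 20)
    lambda_k, _ = poweriteration(A, k, false)  # deterministic start, k iterations
    println("k = ", k, "\terror = ", abs(lambda_k[1] - lambda_max))
end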
The QR method can be used to compute the eigenvalues, and then the inverse iteration method to compute an approximation of a corresponding eigenvector for each eigenvalue.
First, we need to implement the QR decomposition, which can be based on the Gram-Schmidt process.
It requires a generic inner-product function:
"Inner product of two vectors v, w. Can be vertical (n,1) or horizontal (1,n) vectors."
function inner(v, w)
    assert(size(v) == size(w))
    nm = size(v)
    if length(nm) == 1
        # 1-d vectors: scalar result
        return transpose(conj(v)) * w
    elseif nm[1] == 1
        # horizontal (1,n) vectors
        return conj(v) * transpose(w)
    else
        # vertical (n,1) vectors
        return transpose(conj(v)) * w
    end
end
inner
It works for both $(1,n)$ and $(n,1)$ vectors:
inner([1 1 1], [2 3 4])
1×1 Array{Int64,2}:
 9
inner([1; 1; 1], [2; 3; 4])
9
Then a projection operator:
"projection(a, e): projection operator of vector a onto base vector e. Uses inner(e, a) and inner(e, e)."
function projection(a, e)
    return (inner(e, a) / inner(e, e)) * e
end
projection
Then the Gram-Schmidt process:
"gramschmidt(A): Gram-Schmidt orthogonalization operator, returns us, es (unormalized matrix, base matrix)."
function gramschmidt(A, verbose=false)
assert(size(A)[1] == size(A)[2])
n = size(A)[1]
if verbose
println("n = ", n)
end
us = zeros((n, n))
es = zeros((n, n))
for i = 1:n
if verbose
println("i = ", i)
end
us[:,i] = A[:,i]
for j = 1:i-1
if verbose
println("\tj = ", j)
println("\tus[:,i] = ", us[:,i])
end
us[:,i] -= projection(A[:,i], us[:,j])
end
if verbose
println("us[:,i] = ", us[:,i])
end
es[:,i] = us[:,i] / norm(us[:,i])
if verbose
println("es[:,i] = ", es[:,i])
end
end
return us, es
end
gramschmidt
Example:
A
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0
us, es = gramschmidt(A)
([1.0 0.818182 7.99361e-15; 4.0 0.272727 3.44169e-15; 7.0 -0.272727 -7.77156e-16], [0.123091 0.904534 0.914844; 0.492366 0.301511 0.393891; 0.86164 -0.301511 -0.0889431])
us
3×3 Array{Float64,2}:
 1.0   0.818182   7.99361e-15
 4.0   0.272727   3.44169e-15
 7.0  -0.272727  -7.77156e-16
es
3×3 Array{Float64,2}:
 0.123091   0.904534   0.914844
 0.492366   0.301511   0.393891
 0.86164   -0.301511  -0.0889431
We can check that $a_1 = \langle e_1, a_1 \rangle e_1$, $a_2 = \langle e_1, a_2 \rangle e_1 + \langle e_2, a_2 \rangle e_2$, and so on:
inner(es[:,1], A[:,1]) * es[:,1]
3-element Array{Float64,1}:
 1.0
 4.0
 7.0
inner(es[:,1], A[:,2]) * es[:,1] + inner(es[:,2], A[:,2]) * es[:,2]
3-element Array{Float64,1}:
 2.0
 5.0
 8.0
It does not hold for the last column, as $A$ is not full rank: the third unnormalized vector $u_3$ is numerically zero, so its normalization $e_3$ is a meaningless direction.
rank(A)
2
inner(es[:,1], A[:,3]) * es[:,1] + inner(es[:,2], A[:,3]) * es[:,2] + inner(es[:,3], A[:,3]) * es[:,3]
3-element Array{Float64,1}:
 6.94059
 7.69664
 8.61689
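Indeed, the third unnormalized direction is numerically the zero vector (its entries above were of order $10^{-15}$):

norm(us[:,3])  # of order 1e-14: u_3 is numerically zero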
The columns of $E$ are indeed normalized, and the first two are orthogonal (we ignore the third one, which is meaningless here):
[ norm(es[:,i]) for i = 1:size(es)[1] ]
3-element Array{Float64,1}:
 1.0
 1.0
 1.0
(es' * es)[1:2,1:2]
2×2 Array{Float64,2}:
  1.0          -7.77156e-16
 -7.77156e-16   1.0
Second example:
us, es = gramschmidt(B)
([12.0 -69.0 -11.6; 6.0 158.0 1.2; -4.0 30.0 -33.0], [0.857143 -0.394286 -0.331429; 0.428571 0.902857 0.0342857; -0.285714 0.171429 -0.942857])
[ norm(es[:,i]) for i = 1:size(es)[1] ]
3-element Array{Float64,1}:
 1.0
 1.0
 1.0
es * es'
3×3 Array{Float64,2}:
  1.0          -5.89806e-17   5.55112e-17
 -5.89806e-17   1.0          -6.93889e-17
  5.55112e-17  -6.93889e-17   1.0
Last example:
V = [3 2; 1 2]
2×2 Array{Int64,2}:
 3  2
 1  2
us, es = gramschmidt(V)
([3.0 -0.4; 1.0 1.2], [0.948683 -0.316228; 0.316228 0.948683])
es' * es
2×2 Array{Float64,2}:
  1.0          -2.77556e-16
 -2.77556e-16   1.0
Writing the QR decomposition is then very easy from the Gram-Schmidt process:
"QR decomposition, returns (Q, R): Q is orthogonal and R is upper-triangular, such that A = Q R."
function QR(A)
    assert(size(A)[1] == size(A)[2])
    n = size(A)[1]
    us, es = gramschmidt(A)
    Q = copy(es)
    R = zeros((n, n))
    # R is upper-triangular, with R[i,j] = <e_i, a_j>
    for i = 1:n
        for j = i:n
            R[i,j] = inner(es[:,i], A[:,j])
        end
    end
    return Q, R
end
QR
First example
Q, R = QR(A)
([0.123091 0.904534 0.914844; 0.492366 0.301511 0.393891; 0.86164 -0.301511 -0.0889431], [8.12404 9.60114 11.0782; 0.0 0.904534 1.80907; 0.0 0.0 4.30739])
A
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0
Q
3×3 Array{Float64,2}:
 0.123091   0.904534   0.914844
 0.492366   0.301511   0.393891
 0.86164   -0.301511  -0.0889431
R
3×3 Array{Float64,2}:
 8.12404  9.60114   11.0782
 0.0      0.904534   1.80907
 0.0      0.0        4.30739
Q * R
3×3 Array{Float64,2}:
 1.0  2.0  6.94059
 4.0  5.0  7.69664
 7.0  8.0  8.61689
Note that $Q R$ does not reproduce the third column of $A$: the Gram-Schmidt process failed on this rank-deficient matrix, as observed above.
Second example
Q2, R2 = QR(B)
([0.857143 -0.394286 -0.331429; 0.428571 0.902857 0.0342857; -0.285714 0.171429 -0.942857], [14.0 21.0 -14.0; 0.0 175.0 -70.0; 0.0 0.0 35.0])
B
3×3 Array{Int64,2}:
 12  -51    4
  6  167  -68
 -4   24  -41
Q2
3×3 Array{Float64,2}:
  0.857143  -0.394286  -0.331429
  0.428571   0.902857   0.0342857
 -0.285714   0.171429  -0.942857
R2
3×3 Array{Float64,2}:
 14.0   21.0  -14.0
  0.0  175.0  -70.0
  0.0    0.0   35.0
Q2 * R2
3×3 Array{Float64,2}:
 12.0  -51.0    4.0
  6.0  167.0  -68.0
 -4.0   24.0  -41.0
Due to numerical errors, $B \neq Q_2 R_2$ exactly, but almost:
B == Q2 * R2
false
assert(norm(B - (Q2 * R2)) <= 1e-13)
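Equivalently, one can use the built-in isapprox (a small aside, not in the original notebook):

isapprox(B, Q2 * R2)  # true: equal up to the default relative tolerance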
"Apply the QR method for maxiter steps. Should return a triangular matrix similar to A (same eigenvalues)."
function QR_method(A, maxiter=50)
    Ak = A
    for k = 1:maxiter
        Qk, Rk = QR(Ak)
        # A_{k+1} = R_k Q_k = Q_k' A_k Q_k is similar to A_k (same eigenvalues)
        Ak = Rk * Qk
    end
    return Ak
end
QR_method
It should produce matrices that are almost upper-triangular!
Ak = QR_method(A)
3×3 Array{Float64,2}:
 18.0249        -7.91186       -1.81805
  2.23294e-35   -3.41599       -1.67461
  2.70327e-164  -4.65436e-129   0.00799986
"Truncate to zero values under the diagonal (have to be smaller than tolerance)"
function truncate_to_zero_below_diag(A, tolerance=1e-12)
assert(size(A)[1] == size(A)[2])
n = size(A)[1]
for j = 1:n
for i = j+1:n
assert(norm(A[i,j]) <= tolerance)
A[i,j] = 0
end
end
return A
end
truncate_to_zero_below_diag
truncate_to_zero_below_diag(Ak)
3×3 Array{Float64,2}:
 18.0249  -7.91186  -1.81805
  0.0     -3.41599  -1.67461
  0.0      0.0       0.00799986
Ak = QR_method(B)
3×3 Array{Float64,2}:
 156.137        62.2705       -87.4604
   3.54467e-31 -34.1967        15.8136
  -2.29384e-47   1.53496e-14   16.06
truncate_to_zero_below_diag(Ak)
3×3 Array{Float64,2}:
 156.137   62.2705  -87.4604
   0.0    -34.1967   15.8136
   0.0      0.0      16.06
Finally, we can extract the eigenvalues from the diagonal:
"Extract the diagonal from the QR method."
function QR_eigenvalues(A, maxiter=50)
    return diag(QR_method(A, maxiter))
end
QR_eigenvalues
It does not work for matrices that are not full rank:
QR_eigenvalues(A)
3-element Array{Float64,1}:
 18.0249
 -3.41599
  0.00799986
eigvals(A)
3-element Array{Float64,1}:
 16.1168
 -1.11684
 -1.30368e-15
But it works fine for full-rank matrices:
QR_eigenvalues(B)
3-element Array{Float64,1}:
 156.137
 -34.1967
  16.06
eigvals(B)
3-element Array{Float64,1}:
 156.137
  16.06
 -34.1967
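Up to the ordering, the two spectra agree (a quick check, not in the original notebook):

isapprox(sort(QR_eigenvalues(B)), sort(eigvals(B)))  # true once both are sorted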
Now we have a numerical algorithm that gives an approximation of the eigenvalues. For each eigenvalue $\lambda$, we can use inverse iteration to find the corresponding eigenvector.
"Inverse iteration method to find the eigen vector corresponding to a given eigen value."
function inverseIteration(A, val, maxiter=20, userandomstart=true)
mu_I = val * eye(A)
inv_A_minus_muI = inv(A - mu_I)
n = size(A)[1]
if userandomstart
X = randn((n, 1))
else
X = ones((n, 1))
end
for i = 1:maxiter
next_X = inv_A_minus_muI * X
X = next_X / norm(next_X)
end
X
end
inverseIteration
Now, putting it all together:
"Approximation of D, V: D contains the eigen values (vector) and V the eigen vectors (column based)."
function QR_eigen(A, maxiter=20, userandomstart=true)
n = size(A)[1]
D = QR_eigenvalues(A, maxiter)
V = zeros((n,n))
for i = 1:n
V[:,i] = inverseIteration(A, D[i], maxiter, userandomstart)
end
return D, V
end
QR_eigen
D2, V2 = QR_eigen(B)
([156.137, -34.1966, 16.06], [0.328147 0.254758 -0.990526; -0.936881 0.302793 0.0871754; -0.120717 0.918376 0.106104])
D2s, V2s = eig(B)
([156.137, 16.06, -34.1967], [0.328147 -0.990526 0.254758; -0.936881 0.0871754 0.302793; -0.120717 0.106104 0.918376])
Up to a permutation, and up to multiplication of some eigenvectors by $-1$, we obtained the correct values!
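We can make this check explicit. Here is a small sketch (not in the original notebook) that sorts both spectra and compares the eigenvectors up to their sign (assuming nonzero first components), with a loose tolerance since QR_eigen only ran $20$ iterations:

perm, perm_s = sortperm(D2), sortperm(D2s)
println(norm(D2[perm] - D2s[perm_s]) <= 1e-3)  # same eigenvalues, up to ordering
for (i, j) in zip(perm, perm_s)
    # Normalize the sign so that both vectors have a positive first component
    v = V2[:,i] * sign(V2[1,i])
    w = V2s[:,j] * sign(V2s[1,j])
    println(norm(v - w) <= 1e-3)  # same eigenvectors, up to sign
end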
It is well known that finding eigenpairs and finding roots of a polynomial are equivalent problems (thanks to companion matrices). The Abel-Ruffini theorem states that there is no general solution in radicals for polynomials of degree five or higher, and therefore there is no hope of finding an exact (finite, algebraic) algorithm for the eigenpair problem.
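To illustrate the first half of this equivalence, here is a minimal sketch (the companion function is hypothetical, not part of the notebook): the eigenvalues of the companion matrix of a monic polynomial are exactly its roots.

"Companion matrix of the monic polynomial x^n + c[n] x^(n-1) + ... + c[2] x + c[1]."
function companion(c)
    n = length(c)
    C = zeros(n, n)
    C[2:n, 1:n-1] = eye(n - 1)  # ones on the sub-diagonal
    C[:,n] = -c                 # last column carries (minus) the coefficients
    return C
end

eigvals(companion([2.0, -3.0]))  # roots of x^2 - 3x + 2, namely 1.0 and 2.0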
Hence, we focused only on numerical (i.e., iterative) methods.
It is a shame not to give more details about general algorithms for the eigenpair problem.
That's it for today, folks! See here for other notebooks I wrote.