This small notebook shows naive implementations of algorithms to compute the eigenvalues and eigenvectors of a matrix, written in the Julia programming language.
For a real- or complex-valued square matrix $A \in \mathbb{M}_{n,n}(\mathbb{K})$ (over the field $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$), an eigenvalue $\lambda$ (of multiplicity $k\in\mathbb{N}^*$) and an associated (generalized, if $k > 1$) eigenvector $v \neq 0$ satisfy $$ (A - \lambda I)^k v = 0.$$
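For instance, for the diagonal matrix $A = \mathrm{diag}(2, 3)$, the eigenpairs are $(\lambda_1, v_1) = (2, (1, 0)^T)$ and $(\lambda_2, v_2) = (3, (0, 1)^T)$, each of multiplicity $k = 1$: indeed $(A - 2I) v_1 = 0$ and $(A - 3I) v_2 = 0$.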
We will need at least two example matrices to test our algorithms.
A = [ 1. 2 3; 4 5 6; 7 8 9 ]
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0
B = [12 -51 4; 6 167 -68; -4 24 -41]
3×3 Array{Int64,2}:
 12  -51    4
  6  167  -68
 -4   24  -41
Based on the definition, it is not difficult to test whether a pair $(\lambda, v)$ is an (eigenvalue, eigenvector) pair.
We can write a small function that checks if the eigenvalues and eigenvectors are correct.
function check_DV(A, D, V)
    I = eye(size(A)[1])
    for i = 1:size(A)[1]
        # Residual of the eigen equation (A - lambda_i I) v_i: should be numerically zero
        check = (A - D[i] * I) * V[:,i]
        println("For the ", i, "th eigen value and vector, (A - lambda I)^k * v = ", check, "\n\twhich has a L2 norm = ", norm(check))
        assert(norm(check) <= 1e-14)
    end
end
check_DV (generic function with 1 method)
D_A, V_A = eig(A)
([16.1168, -1.11684, -1.30368e-15], [-0.231971 -0.78583 0.408248; -0.525322 -0.0867513 -0.816497; -0.818673 0.612328 0.408248])
check_DV(A, D_A, V_A)
For the 1th eigen value and vector, (A - lambda I)^k * v = [5.32907e-15, -1.77636e-15, -8.88178e-16]
	which has a L2 norm = 5.687116766346677e-15
For the 2th eigen value and vector, (A - lambda I)^k * v = [-6.66134e-16, -8.88178e-16, -2.66454e-15]
	which has a L2 norm = 2.886579864025407e-15
For the 3th eigen value and vector, (A - lambda I)^k * v = [-2.22045e-16, -1.77636e-15, 4.44089e-16]
	which has a L2 norm = 1.8444410139024814e-15
A second example, showing that the check can fail for numerical reasons: the residuals for $B$ are very close to $0$, but the first one has a norm slightly above our (too strict) absolute tolerance of $10^{-14}$.
D_B, V_B = eig(B)
([156.137, 16.06, -34.1967], [0.328147 -0.990526 0.254758; -0.936881 0.0871754 0.302793; -0.120717 0.106104 0.918376])
check_DV(B, D_B, V_B)
For the 1th eigen value and vector, (A - lambda I)^k * v = [-2.88658e-15, -1.95399e-14, 3.55271e-15]
	which has a L2 norm = 2.0068951041893117e-14

AssertionError:
Stacktrace:
 [1] assert at ./error.jl:68 [inlined]
 [2] check_DV(::Array{Int64,2}, ::Array{Float64,1}, ::Array{Float64,2}) at ./In[3]:6
 [3] include_string(::String, ::String) at ./loading.jl:515
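A possible fix is to make the tolerance relative to the norm of $A$. Here is a minimal sketch (the function check_DV_rel is not part of the original notebook):

"Like check_DV, but with a tolerance relative to the norm of A (a sketch)."
function check_DV_rel(A, D, V, tol=1e-14)
    I = eye(size(A)[1])
    for i = 1:size(A)[1]
        check = (A - D[i] * I) * V[:,i]
        # Scale-aware test: residuals are compared to tol * ||A||
        assert(norm(check) <= tol * norm(A))
    end
end
check_DV_rel(B, D_B, V_B)  # now passes: 2.0e-14 <= 1e-14 * norm(B)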
Let's compute the eigenpair with the largest eigenvalue (in modulus), using the power iteration method.
"Power iteration method on A, to find eigenpair with largest eigen value."
function poweriteration(A, maxiter=20, userandomstart=true)
n = size(A)[1]
if userandomstart
X = randn((n, 1))
else
X = ones((n, 1))
end
for i = 1:maxiter
next_X = A * X
X = next_X / norm(next_X)
end
# lambda = (X^* ⋅ A ⋅ X) / (X^* ⋅ X)
lambda = (transpose(conj(X)) * (A * X)) / (transpose(conj(X)) * X)
lambda, X
end
poweriteration
For instance, we correctly obtain the largest eigenpair for our example $A$ (up to a $-1$ sign in $v$).
poweriteration(A)
([16.1168], [0.231971; 0.525322; 0.818673])
D_A, V_A = eig(A)
([16.1168, -1.11684, -1.30368e-15], [-0.231971 -0.78583 0.408248; -0.525322 -0.0867513 -0.816497; -0.818673 0.612328 0.408248])
(D_A[1], -1 * V_A[:,1])
(16.116843969807043, [0.231971, 0.525322, 0.818673])
What is the typical dependency on the number of iterations? The convergence is geometric, with ratio $|\lambda_2| / |\lambda_1|$ (the ratio of the two largest eigenvalues in modulus).
poweriteration
poweriteration (generic function with 3 methods)
println(poweriteration(A, 1))
println(poweriteration(A, 2))
println(poweriteration(A, 5))
println(poweriteration(A, 10))
([15.3427], [0.130291; 0.494751; 0.859212])
([16.1581], [-0.239638; -0.527385; -0.815131])
([16.1168], [-0.23197; -0.525322; -0.818674])
([16.1168], [0.231971; 0.525322; 0.818673])
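To observe this convergence concretely, here is a small sketch (not in the original notebook) measuring the error of the estimated eigenvalue against the value $16.116843969807043$ found above with eig(A), using the deterministic all-ones start:

lambda_max = 16.116843969807043  # largest eigenvalue of A, from eig(A) above
for k in (1, 2, 5, 10, 20)
    lambda_k, _ = poweriteration(A, k, false)  # deterministic start, k iterations
    println("k = ", k, "\terror = ", abs(lambda_k[1] - lambda_max))
end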
The QR method can be used to compute the eigenvalues, and then the inverse iteration method to compute an approximation of a corresponding eigenvector for each eigenvalue.
First, we need to implement the QR decomposition, which can be based on the Gram-Schmidt process.
It requires a generic inner-product function:
"Inner product of two vectors v, w. Can be vertical (n,1) or horizontal (1,n) vectors."
function inner(v, w)
    assert(size(v) == size(w))
    nm = size(v)
    if length(nm) == 1
        # 1-d vectors: scalar result
        return transpose(conj(v)) * w
    elseif nm[1] == 1
        # horizontal (1,n) vectors
        return conj(v) * transpose(w)
    else
        # vertical (n,1) vectors
        return transpose(conj(v)) * w
    end
end
inner
It works for both $(1,n)$ and $(n,1)$ vectors:
inner([1 1 1], [2 3 4])
1×1 Array{Int64,2}:
 9
inner([1; 1; 1], [2; 3; 4])
9
Then a projection operator:
"projection(a, e): projection operator of vector a onto base vector e. Uses inner(e, a) and inner(e, e)."
function projection(a, e)
    return (inner(e, a) / inner(e, e)) * e
end
projection
Then the Gram-Schmidt process:
"gramschmidt(A): Gram-Schmidt orthogonalization operator, returns us, es (unormalized matrix, base matrix)."
function gramschmidt(A, verbose=false)
assert(size(A)[1] == size(A)[2])
n = size(A)[1]
if verbose
println("n = ", n)
end
us = zeros((n, n))
es = zeros((n, n))
for i = 1:n
if verbose
println("i = ", i)
end
us[:,i] = A[:,i]
for j = 1:i-1
if verbose
println("\tj = ", j)
println("\tus[:,i] = ", us[:,i])
end
us[:,i] -= projection(A[:,i], us[:,j])
end
if verbose
println("us[:,i] = ", us[:,i])
end
es[:,i] = us[:,i] / norm(us[:,i])
if verbose
println("es[:,i] = ", es[:,i])
end
end
return us, es
end
gramschmidt
Example:
A
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0
us, es = gramschmidt(A)
([1.0 0.818182 7.99361e-15; 4.0 0.272727 3.44169e-15; 7.0 -0.272727 -7.77156e-16], [0.123091 0.904534 0.914844; 0.492366 0.301511 0.393891; 0.86164 -0.301511 -0.0889431])
us
3×3 Array{Float64,2}:
 1.0   0.818182   7.99361e-15
 4.0   0.272727   3.44169e-15
 7.0  -0.272727  -7.77156e-16
es
3×3 Array{Float64,2}:
 0.123091   0.904534   0.914844
 0.492366   0.301511   0.393891
 0.86164   -0.301511  -0.0889431
We can check that $a_1 = \langle e_1, a_1 \rangle e_1$, $a_2 = \langle e_1, a_2 \rangle e_1 + \langle e_2, a_2 \rangle e_2$, and so on:
inner(es[:,1], A[:,1]) * es[:,1]
3-element Array{Float64,1}:
 1.0
 4.0
 7.0
inner(es[:,1], A[:,2]) * es[:,1] + inner(es[:,2], A[:,2]) * es[:,2]
3-element Array{Float64,1}:
 2.0
 5.0
 8.0
It does not hold for the last column, as $A$ is not full rank: the third unnormalized vector $u_3$ is numerically zero, so its normalization $e_3$ is a meaningless direction.
rank(A)
2
inner(es[:,1], A[:,3]) * es[:,1] + inner(es[:,2], A[:,3]) * es[:,2] + inner(es[:,3], A[:,3]) * es[:,3]
3-element Array{Float64,1}:
 6.94059
 7.69664
 8.61689
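Indeed, the third unnormalized direction is numerically the zero vector (its entries above were of order $10^{-15}$):

norm(us[:,3])  # of order 1e-14: u_3 is numerically zero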
The columns of $E$ are indeed normalized, and the first two are orthogonal (we ignore the third one, which is meaningless here):
[ norm(es[:,i]) for i = 1:size(es)[1] ]
3-element Array{Float64,1}:
 1.0
 1.0
 1.0
(es' * es)[1:2,1:2]
2×2 Array{Float64,2}:
  1.0          -7.77156e-16
 -7.77156e-16   1.0
Second example:
us, es = gramschmidt(B)
([12.0 -69.0 -11.6; 6.0 158.0 1.2; -4.0 30.0 -33.0], [0.857143 -0.394286 -0.331429; 0.428571 0.902857 0.0342857; -0.285714 0.171429 -0.942857])
[ norm(es[:,i]) for i = 1:size(es)[1] ]
3-element Array{Float64,1}:
 1.0
 1.0
 1.0
es * es'
3×3 Array{Float64,2}:
  1.0          -5.89806e-17   5.55112e-17
 -5.89806e-17   1.0          -6.93889e-17
  5.55112e-17  -6.93889e-17   1.0
Last example:
V = [3 2; 1 2]
2×2 Array{Int64,2}:
 3  2
 1  2
us, es = gramschmidt(V)
([3.0 -0.4; 1.0 1.2], [0.948683 -0.316228; 0.316228 0.948683])
es' * es
2×2 Array{Float64,2}:
  1.0          -2.77556e-16
 -2.77556e-16   1.0
Writing the QR decomposition is then very easy from the Gram-Schmidt process:
"QR decomposition, returns (Q, R): Q is orthogonal and R is upper-triangular, such that A = Q R."
function QR(A)
    assert(size(A)[1] == size(A)[2])
    n = size(A)[1]
    us, es = gramschmidt(A)
    Q = copy(es)
    R = zeros((n, n))
    # R is upper-triangular, with R[i,j] = <e_i, a_j>
    for i = 1:n
        for j = i:n
            R[i,j] = inner(es[:,i], A[:,j])
        end
    end
    return Q, R
end
QR
First example
Q, R = QR(A)
([0.123091 0.904534 0.914844; 0.492366 0.301511 0.393891; 0.86164 -0.301511 -0.0889431], [8.12404 9.60114 11.0782; 0.0 0.904534 1.80907; 0.0 0.0 4.30739])
A
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0
Q
3×3 Array{Float64,2}:
 0.123091   0.904534   0.914844
 0.492366   0.301511   0.393891
 0.86164   -0.301511  -0.0889431
R
3×3 Array{Float64,2}:
 8.12404  9.60114   11.0782
 0.0      0.904534   1.80907
 0.0      0.0        4.30739
Q * R
3×3 Array{Float64,2}:
 1.0  2.0  6.94059
 4.0  5.0  7.69664
 7.0  8.0  8.61689
Note that $Q R$ does not reproduce the third column of $A$: the Gram-Schmidt process failed on this rank-deficient matrix, as observed above.
Second example
Q2, R2 = QR(B)
([0.857143 -0.394286 -0.331429; 0.428571 0.902857 0.0342857; -0.285714 0.171429 -0.942857], [14.0 21.0 -14.0; 0.0 175.0 -70.0; 0.0 0.0 35.0])
B
3×3 Array{Int64,2}:
 12  -51    4
  6  167  -68
 -4   24  -41
Q2
3×3 Array{Float64,2}:
  0.857143  -0.394286  -0.331429
  0.428571   0.902857   0.0342857
 -0.285714   0.171429  -0.942857
R2
3×3 Array{Float64,2}:
 14.0   21.0  -14.0
  0.0  175.0  -70.0
  0.0    0.0   35.0
Q2 * R2
3×3 Array{Float64,2}:
 12.0  -51.0    4.0
  6.0  167.0  -68.0
 -4.0   24.0  -41.0
Due to numerical errors, $B \neq Q_2 R_2$ exactly, but almost:
B == Q2 * R2
false
assert(norm(B - (Q2 * R2)) <= 1e-13)
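Equivalently, one can use the built-in isapprox (a small aside, not in the original notebook):

isapprox(B, Q2 * R2)  # true: equal up to the default relative tolerance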
"Apply the QR method for maxiter steps. Should return a triangular matrix similar to A (same eigenvalues)."
function QR_method(A, maxiter=50)
    Ak = A
    for k = 1:maxiter
        Qk, Rk = QR(Ak)
        # A_{k+1} = R_k Q_k = Q_k' A_k Q_k is similar to A_k (same eigenvalues)
        Ak = Rk * Qk
    end
    return Ak
end
QR_method
It should produce matrices that are almost upper-triangular!
Ak = QR_method(A)
3×3 Array{Float64,2}:
 18.0249        -7.91186       -1.81805
  2.23294e-35   -3.41599       -1.67461
  2.70327e-164  -4.65436e-129   0.00799986
"Truncate to zero values under the diagonal (have to be smaller than tolerance)"
function truncate_to_zero_below_diag(A, tolerance=1e-12)
assert(size(A)[1] == size(A)[2])
n = size(A)[1]
for j = 1:n
for i = j+1:n
assert(norm(A[i,j]) <= tolerance)
A[i,j] = 0
end
end
return A
end
truncate_to_zero_below_diag
truncate_to_zero_below_diag(Ak)
3×3 Array{Float64,2}:
 18.0249  -7.91186  -1.81805
  0.0     -3.41599  -1.67461
  0.0      0.0       0.00799986
Ak = QR_method(B)
3×3 Array{Float64,2}:
 156.137        62.2705       -87.4604
   3.54467e-31 -34.1967        15.8136
  -2.29384e-47   1.53496e-14   16.06
truncate_to_zero_below_diag(Ak)
3×3 Array{Float64,2}:
 156.137   62.2705  -87.4604
   0.0    -34.1967   15.8136
   0.0      0.0      16.06
Finally, we can extract the eigenvalues from the diagonal:
"Extract the diagonal from the QR method."
function QR_eigenvalues(A, maxiter=50)
    return diag(QR_method(A, maxiter))
end
QR_eigenvalues
It does not work for matrices that are not full rank:
QR_eigenvalues(A)
3-element Array{Float64,1}:
 18.0249
 -3.41599
  0.00799986
eigvals(A)
3-element Array{Float64,1}:
 16.1168
 -1.11684
 -1.30368e-15
But it works fine for full-rank matrices:
QR_eigenvalues(B)
3-element Array{Float64,1}:
 156.137
 -34.1967
  16.06
eigvals(B)
3-element Array{Float64,1}:
 156.137
  16.06
 -34.1967
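Up to the ordering, the two spectra agree (a quick check, not in the original notebook):

isapprox(sort(QR_eigenvalues(B)), sort(eigvals(B)))  # true once both are sorted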
Now we have a numerical algorithm that gives an approximation of the eigenvalues. For each eigenvalue $\lambda$, we can use inverse iteration to find the corresponding eigenvector.
"Inverse iteration method to find the eigen vector corresponding to a given eigen value."
function inverseIteration(A, val, maxiter=20, userandomstart=true)
mu_I = val * eye(A)
inv_A_minus_muI = inv(A - mu_I)
n = size(A)[1]
if userandomstart
X = randn((n, 1))
else
X = ones((n, 1))
end
for i = 1:maxiter
next_X = inv_A_minus_muI * X
X = next_X / norm(next_X)
end
X
end
inverseIteration
Now, putting it all together:
"Approximation of D, V: D contains the eigen values (vector) and V the eigen vectors (column based)."
function QR_eigen(A, maxiter=20, userandomstart=true)
n = size(A)[1]
D = QR_eigenvalues(A, maxiter)
V = zeros((n,n))
for i = 1:n
V[:,i] = inverseIteration(A, D[i], maxiter, userandomstart)
end
return D, V
end
QR_eigen
D2, V2 = QR_eigen(B)
([156.137, -34.1966, 16.06], [0.328147 0.254758 -0.990526; -0.936881 0.302793 0.0871754; -0.120717 0.918376 0.106104])
D2s, V2s = eig(B)
([156.137, 16.06, -34.1967], [0.328147 -0.990526 0.254758; -0.936881 0.0871754 0.302793; -0.120717 0.106104 0.918376])
Up to a permutation, and up to multiplication of some eigenvectors by $-1$, we obtained the correct values!
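We can make this check explicit. Here is a small sketch (not in the original notebook) that sorts both spectra and compares the eigenvectors up to their sign (assuming nonzero first components), with a loose tolerance since QR_eigen only ran $20$ iterations:

perm, perm_s = sortperm(D2), sortperm(D2s)
println(norm(D2[perm] - D2s[perm_s]) <= 1e-3)  # same eigenvalues, up to ordering
for (i, j) in zip(perm, perm_s)
    # Normalize the sign so that both vectors have a positive first component
    v = V2[:,i] * sign(V2[1,i])
    w = V2s[:,j] * sign(V2s[1,j])
    println(norm(v - w) <= 1e-3)  # same eigenvectors, up to sign
end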
It is well known that finding eigenpairs and finding roots of a polynomial are equivalent problems (thanks to companion matrices). The Abel-Ruffini theorem states that there is no general solution in radicals for polynomials of degree five or higher, and therefore there is no hope of finding an exact (finite, algebraic) algorithm for the eigenpair problem.
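To illustrate the first half of this equivalence, here is a minimal sketch (the companion function is hypothetical, not part of the notebook): the eigenvalues of the companion matrix of a monic polynomial are exactly its roots.

"Companion matrix of the monic polynomial x^n + c[n] x^(n-1) + ... + c[2] x + c[1]."
function companion(c)
    n = length(c)
    C = zeros(n, n)
    C[2:n, 1:n-1] = eye(n - 1)  # ones on the sub-diagonal
    C[:,n] = -c                 # last column carries (minus) the coefficients
    return C
end

eigvals(companion([2.0, -3.0]))  # roots of x^2 - 3x + 2, namely 1.0 and 2.0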
Hence, we focused only on numerical (i.e., iterative) methods.
It is a shame not to give more details about general algorithms for the eigenpair problem.
That's it for today, folks! See here for other notebooks I wrote.