QR decomposition: the classical Gram-Schmidt algorithm

The QR decomposition is the matrix factorization of the form

\begin{equation*} A=QR \end{equation*}

where $Q$ is orthogonal and $R$ is upper triangular. If $A$ is $m \times n$, the reduced QR decomposition gives $Q$ as $m \times n$ and $R$ as $n \times n$.
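Since $R$ is upper triangular, reading $A = QR$ column by column gives

\begin{equation*} a_j = \sum_{i=1}^{j} q_i r_{ij} \end{equation*}

so each column $a_j$ of $A$ is a linear combination of the first $j$ columns of $Q$. The Gram-Schmidt formulae derived below simply invert this relation one column at a time.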

The QR decomposition is computed most robustly using Householder reflections. However, the classical Gram-Schmidt algorithm makes a better first introduction to the QR decomposition: it is simpler to derive and understand, and it clearly illustrates several key properties of the QR decomposition, particularly how the column vectors of $Q$ span the same subspaces as the column vectors of $A$.
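In symbols: assuming $A$ has full column rank,

\begin{equation*} \text{span}\{q_1, \ldots, q_j\} = \text{span}\{a_1, \ldots, a_j\}, \text{ for } j = 1, \ldots, n \end{equation*}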

In lecture we derived the following formulae for the column vectors $q_j$ of $Q$ and the elements $r_{ij}$ of $R$ in terms of the column vectors $a_j$ of $A$.

\begin{equation*} r_{ij} = q_i^T a_j, \text{ for } i < j \end{equation*}

\begin{equation*} r_{jj} = \left\| a_j - \sum_{i=1}^{j-1} q_i r_{ij} \right\| \end{equation*}

\begin{equation*} q_j = \frac{a_j - \sum_{i=1}^{j-1} q_i r_{ij}}{r_{jj}} \end{equation*}
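For example, for $j=1$ the sums are empty, so these formulae reduce to $r_{11} = \|a_1\|$ and $q_1 = a_1/r_{11}$: the first column of $Q$ is just the first column of $A$, normalized. For $j=2$, we subtract from $a_2$ its component along $q_1$, namely $q_1 r_{12}$ with $r_{12} = q_1^T a_2$, and normalize the remainder.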

Pseudocode

The following pseudocode calculates $Q$ and $R$ from $A$ according to these formulae. Note that the pseudocode uses a variable $v_j = a_j - \sum_{i=1}^{j-1} q_i r_{ij}$ to store the quantity common to the formulae for both $r_{jj}$ and $q_j$.

In [1]:
# Inputs:  m x n matrix A
# Outputs: m x n orthogonal matrix Q, and n x n upper-triangular matrix R

# for j = 1 to n        (loop over columns of A and Q)
#   vⱼ = aⱼ              (initialize vⱼ)
#   for i = 1 to j-1
#      rᵢⱼ = qᵢᵀ aⱼ
#      vⱼ = vⱼ - qᵢ rᵢⱼ
#   end
#   rⱼⱼ = || vⱼ ||
#   qⱼ = vⱼ/rⱼⱼ
# end

Julia implementation

Now to implement that pseudocode in Julia. As usual, we have to be more explicit about function inputs and outputs, allocate memory for matrices of the appropriate sizes, and translate subscripted quantities like $r_{ij}$ into square-bracket element-access syntax R[i,j], and column vectors $q_i$ into Q[:,i].

In [1]:
using LinearAlgebra   # provides dot and norm (needed in Julia ≥ 0.7)

function qr_cgs(A)
    (m,n) = size(A)
    Q = zeros(m,n)
    R = zeros(n,n)
    
    # loop over columns of A
    for j=1:n
        
        # compute vⱼ = aⱼ - ∑ᵢ₌₁ʲ⁻¹ qᵢ rᵢⱼ
        vj = A[:,j]
        for i=1:j-1
            R[i,j] = dot(Q[:,i], A[:,j])
            vj -= Q[:,i]*R[i,j]
        end
        
        # compute rⱼⱼ and normalize vⱼ
        R[j,j] = norm(vj)
        Q[:,j] = vj/R[j,j]
    end
    (Q,R)  # return Q and R
end
Out[1]:
qr_cgs (generic function with 1 method)

Testing

Let's test this QR algorithm on a random $5 \times 5$ matrix.

In [2]:
A = randn(5,5)
Out[2]:
5×5 Array{Float64,2}:
 -1.44306  -0.61375    0.580981  1.26641    1.14068 
  1.74684  -1.37988    0.58241   0.320656   0.259255
  1.44289   1.07749    1.02144   0.891422   0.475755
 -1.30716   0.236091   0.146343  0.675819  -0.633998
  3.18121  -2.36817   -2.66886   1.24314    1.19592 
In [3]:
(Q,R) = qr_cgs(A);
In [4]:
Q
Out[4]:
5×5 Array{Float64,2}:
 -0.330672  -0.496341   0.359639    0.549528   0.461498 
  0.400282  -0.273054   0.776945   -0.249697  -0.314996 
  0.330634   0.687214   0.283991    0.577266   0.0673151
 -0.299533  -0.123549  -0.0784139   0.453596  -0.826509 
  0.728966  -0.437676  -0.424519    0.310942   0.0121658
In [5]:
R
Out[5]:
5×5 Array{Float64,2}:
 4.36401  -1.79017  -1.61061   0.708095   0.94557 
 0.0       2.4292    1.40457  -0.731117  -0.755105
 0.0       0.0       2.07303   0.37701    0.288797
 0.0       0.0       0.0       1.82354    0.92102 
 0.0       0.0       0.0       0.0        1.01534 

Great! $Q$ is plausibly orthogonal (no elements greater than 1 in absolute value) and $R$ is definitely upper-triangular. Let's check that $Q$ is actually orthogonal.

In [6]:
Q'*Q
Out[6]:
5×5 Array{Float64,2}:
  1.0          -1.64543e-16  -3.64219e-17   6.66211e-18  -5.14229e-17
 -1.64543e-16   1.0          -1.12728e-16   1.83302e-16   5.12308e-17
 -3.64219e-17  -1.12728e-16   1.0          -1.03826e-16   1.01141e-16
  6.66211e-18   1.83302e-16  -1.03826e-16   1.0           2.71973e-18
 -5.14229e-17   5.12308e-17   1.01141e-16   2.71973e-18   1.0        

Yes, $Q$ is orthogonal to 16 digits. Now let's check that $QR = A$.

In [7]:
A
Out[7]:
5×5 Array{Float64,2}:
 -1.44306  -0.61375    0.580981  1.26641    1.14068 
  1.74684  -1.37988    0.58241   0.320656   0.259255
  1.44289   1.07749    1.02144   0.891422   0.475755
 -1.30716   0.236091   0.146343  0.675819  -0.633998
  3.18121  -2.36817   -2.66886   1.24314    1.19592 
In [8]:
Q*R
Out[8]:
5×5 Array{Float64,2}:
 -1.44306  -0.61375    0.580981  1.26641    1.14068 
  1.74684  -1.37988    0.58241   0.320656   0.259255
  1.44289   1.07749    1.02144   0.891422   0.475755
 -1.30716   0.236091   0.146343  0.675819  -0.633998
  3.18121  -2.36817   -2.66886   1.24314    1.19592 
In [9]:
norm(A-Q*R)
Out[9]:
2.406669721963077e-16

Oh yes indeed, $A=QR$ to sixteen digits of accuracy!
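As a final sanity check, we can compare qr_cgs against Julia's built-in QR factorization, which is computed with Householder reflections. Here is a minimal sketch, assuming Julia ≥ 1.0, where qr is provided by the LinearAlgebra standard library (already loaded above). The built-in factors can differ from ours in the signs of corresponding columns of $Q$ and rows of $R$, so we compare the reconstructed products rather than the factors themselves.

In [ ]:
F = qr(A)                  # built-in Householder QR factorization object
Qh, Rh = Matrix(F.Q), F.R  # materialize the factors (Qh, Rh are illustrative names)

# Both products should reproduce A to machine precision, even though
# (Q,R) and (Qh,Rh) may differ by signs of corresponding columns/rows.
norm(Q*R - Qh*Rh)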
