Lecture 14: QR factorization and least squares

In this lecture we see that Given's rotations can be used to calculate the QR decomposition line-by-line, and how it can also be used for rectangular matrices.

We consider a (random) $4 \times 4$ matrix:

In [77]:
A=rand(4,4)
Out[77]:
4x4 Array{Float64,2}:
 0.779135  0.463434  0.792271  0.960972
 0.74052   0.994196  0.990579  0.486781
 0.052372  0.794275  0.532929  0.943658
 0.176553  0.697142  0.609449  0.402935

We need to construct a 4 x 4 orthogonal matrix that acts only the first two rows. We can do so by modifying the identity:

In [79]:
a,b=A[1:2,1]

U=[a b;
    -b a]/sqrt(a^2+b^2)

Q1=eye(4)
Q1[1:2,1:2] = U

Q1
Out[79]:
4x4 Array{Float64,2}:
  0.724841  0.688916  0.0  0.0
 -0.688916  0.724841  0.0  0.0
  0.0       0.0       1.0  0.0
  0.0       0.0       0.0  1.0

This satisfies the property that Q1*A has a zero in the (2,1) entry. We assign this updated matrix to B:

In [82]:
Q1*A
Out[82]:
4x4 Array{Float64,2}:
  1.0749       1.02083   1.2567     1.0319  
 -1.11022e-16  0.401366  0.172204  -0.309191
  0.052372     0.794275  0.532929   0.943658
  0.176553     0.697142  0.609449   0.402935

We continue the process to insert a zero in the (3,1) entry by acting on the first and third rows of B=Q1*A:

In [83]:
B=Q1*A   # 
a,b=B[[1,3],1]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q2=eye(4)
Q2[[1,3],[1,3]] = U

B=Q2*B
Out[83]:
4x4 Array{Float64,2}:
  1.07618      1.05828   1.28114    1.0766  
 -1.11022e-16  0.401366  0.172204  -0.309191
  0.0          0.743655  0.47114    0.892323
  0.176553     0.697142  0.609449   0.402935

We now act on the 1st and 4th rows to introduce one more zero:

In [84]:
row1=1
row2=4

a,b=B[[row1,row2],1]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q3=eye(4)
Q3[[row1,row2],[row1,row2]] = U

B=Q3*B
Out[84]:
4x4 Array{Float64,2}:
  1.09057      1.15718   1.36291    1.12763 
 -1.11022e-16  0.401366  0.172204  -0.309191
  0.0          0.743655  0.47114    0.892323
  0.0          0.51662   0.394004   0.223327

We now proceed to the 2nd column, introducing zeros entry by entry. The final result is an upper triangular matrix R:

In [85]:
col=2
row1,row2=2,3

a,b=B[[row1,row2],col]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q4=eye(4)
Q4[[row1,row2],[row1,row2]] = U

B=Q4*B

col=2
row1,row2=2,4

a,b=B[[row1,row2],col]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q5=eye(4)
Q5[[row1,row2],[row1,row2]] = U

B=Q5*B

col=3
row1,row2=3,4

a,b=B[[row1,row2],col]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q6=eye(4)
Q6[[row1,row2],[row1,row2]] = U

R=Q6*B
Out[85]:
4x4 Array{Float64,2}:
  1.09057       1.15718      1.36291    1.12763 
 -4.49897e-17   0.990461     0.629033   0.661164
  8.68201e-17  -4.05456e-17  0.105754   0.371273
 -5.25751e-17  -3.79149e-17  0.0       -0.605585

In otherwords,

R=Q6*Q5*Q4*Q3*Q2*Q1*A
In [86]:
Q6*Q5*Q4*Q3*Q2*Q1*A
Out[86]:
4x4 Array{Float64,2}:
  1.09057      1.15718      1.36291    1.12763 
 -2.77556e-17  0.990461     0.629033   0.661164
  2.77556e-17  0.0          0.105754   0.371273
 -1.38778e-17  1.11022e-16  0.0       -0.605585

Thus the Q in QR is

Q=inv(Q6*Q5*Q4*Q3*Q2*Q1)=Q1'*Q2'*Q3'*Q4'*Q5'*Q6'
In [35]:
Q=Q1'*Q2'*Q3'*Q4'*Q5'*Q6'

norm(Q'*Q-eye(4))
Out[35]:
2.6923844779914502e-16

QR on rectangular matrices

We can apply the same procedure to a rectangular matrix, for example:

In [87]:
A[:,1:3]
Out[87]:
4x3 Array{Float64,2}:
 0.779135  0.463434  0.792271
 0.74052   0.994196  0.990579
 0.052372  0.794275  0.532929
 0.176553  0.697142  0.609449

Given's rotations proceed just as before:

In [88]:
B=A[:,1:3]


a,b=B[1:2,1]

U=[a b;
    -b a]/sqrt(a^2+b^2)

Q1=eye(4)
Q1[1:2,1:2] = U


B=Q1*B


a,b=B[[1,3],1]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q2=eye(4)
Q2[[1,3],[1,3]] = U

B=Q2*B

row1=1
row2=4

a,b=B[[row1,row2],1]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q3=eye(4)
Q3[[row1,row2],[row1,row2]] = U

B=Q3*B


col=2
row1,row2=2,3

a,b=B[[row1,row2],col]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q4=eye(4)
Q4[[row1,row2],[row1,row2]] = U

B=Q4*B

col=2
row1,row2=2,4

a,b=B[[row1,row2],col]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q5=eye(4)
Q5[[row1,row2],[row1,row2]] = U

B=Q5*B

col=3
row1,row2=3,4

a,b=B[[row1,row2],col]

U=[a b;
    -b a]/sqrt(a^2+b^2)
Q6=eye(4)
Q6[[row1,row2],[row1,row2]] = U

R=Q6*B
Out[88]:
4x3 Array{Float64,2}:
  1.09057       1.15718      1.36291 
 -4.49897e-17   0.990461     0.629033
  8.68201e-17  -4.05456e-17  0.105754
 -5.25751e-17  -3.79149e-17  0.0     

Now the notion of upper triangular has been changed to allow for rectangular R. We can verify that Q*R returns the desired result:

In [93]:
Q=Q1'*Q2'*Q3'*Q4'*Q5'*Q6'
norm(Q*R-A[:,1:3])
Out[93]:
6.92215038654262e-16

QR and least squares

We can use the fact that the 2-norm is invariant under orthogonal transformations to reduce the problem of least squares to a triangular system:

$$\|{A\mathbf{x} - \mathbf{b}}\|_2 = \|{QR\mathbf{x} - \mathbf{b}}\|_2 = \|Q({R\mathbf{x} - Q^\top\mathbf{b}})\|_2=\|{R\mathbf{x} - Q^\top\mathbf{b}}\|_2$$

Recall that \ is the inbuilt routine for solving least squares:

In [96]:
A=rand(4,3)
b=rand(4)

x=A\b
Out[96]:
3-element Array{Float64,1}:
  0.607407
  0.377505
 -0.49233 

We can create a QR decomposition as follows:

In [98]:
Q,Rsmall=qr(A;thin=false)
R=zeros(4,3)
R[1:3,1:3]=Rsmall

Q,R
Out[98]:
(
4x4 Array{Float64,2}:
 -0.495448   0.600443    0.545061  -0.3113   
 -0.413077  -0.0192636  -0.692123  -0.591576 
 -0.600523   0.137519   -0.276001   0.737756 
 -0.472516  -0.787519    0.384315  -0.0940497,

4x3 Array{Float64,2}:
 -1.63043  -1.14771   -1.2136  
  0.0      -0.500113  -0.221979
  0.0       0.0        0.654625
  0.0       0.0        0.0     )

Indeed, we have:

In [99]:
(R\(Q'*b))-x
Out[99]:
3-element Array{Float64,1}:
 -1.11022e-16
  2.77556e-16
 -3.88578e-16

While R is rectangular, since the bottom rows are zero they do not contribute at all to R*x. Thus R*x is equivalent to [R[1:3,1:3]*x; 0]. And so we can determine that x can be found by inverting a square upper triangular matrix (using back substitution):

In [101]:
R[1:3,1:3]\(Q'*b)[1:3]
Out[101]:
3-element Array{Float64,1}:
  0.607407
  0.377505
 -0.49233 

Thin QR

An alternative version of QR is for Q to be rectangular and R to be square. This is is equivalent to dropping the zero rows of R and taking the corresponding columns of Q: if

$$A = Q R = \left(q_1 | \cdots | q_m | q_{m+1} | \cdots | q_n\right) \begin{pmatrix} \bar R \cr 0_{n-m \times m} \end{pmatrix}$$

where $A$ is $n \times m$ and $\bar R$ is $m \times m$ then we also have

$$A = \bar Q \bar R \qquad \hbox{for} \qquad \bar Q = \left(q_1 | \cdots | q_n\right)$$

where $\bar Q$ is $n \times m$. This thin QR is what is returned by default by qr:

In [2]:
A=rand(5,3)

Q,R=qr(A)

size(Q),size(R)
Out[2]:
((5,3),(3,3))

Q is orthogonal in the sense that $Q^\top Q = I$:

In [3]:
norm(Q'*Q-eye(3))
Out[3]:
4.483099266212884e-16

On the other hand, $QQ^\top \neq I$:

In [76]:
Q*Q'
Out[76]:
4x4 Array{Float64,2}:
  0.942685    0.191028   -0.130566    0.0221612
  0.191028    0.363317    0.435167   -0.0738617
 -0.130566    0.435167    0.702567    0.0504838
  0.0221612  -0.0738617   0.0504838   0.991431