AUTHOR
MD SHORIFUL ISLAM
EMAIL: ISLAM5@BYU.EDU
MODIFIED
DALLIN SKOUSON
Changelog, December 2020:
- Updated code to work with Python 3 by default
- Added information about other methods of calculating the QR factorization
- Added code example for calculation using Gram-Schmidt
- Clarified some language
- Added additional exploration to existing code
In linear algebra, a QR factorization of a matrix $A$ is a factorization into a product $A = QR$ of an orthogonal matrix $Q$ and an upper triangular matrix $R$. QR factorization is often used to solve the linear least squares problem, and it is the basis for a particular eigenvalue algorithm, the QR algorithm.
The $Q$ in the QR factorization will be unitary (orthogonal when dealing with real numbers), which means that $Q^HQ = I$. Additionally, it will be shown below that $QR$ does in fact yield $A$.
One of the key benefits of using QR factorization over other methods for solving linear least squares is that it is more numerically stable, albeit at the expense of being slower to execute. Hence, if we are performing a large number of regressions as part of a trading backtest, for instance, we need to consider carefully whether QR factorization is the best fit.
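As a brief illustration of the least squares use case, the sketch below fits a straight line to a handful of points by factoring the design matrix and back-substituting; the data values are made up for this example.

import numpy as np
from scipy.linalg import qr, solve_triangular

# Fit y = c0 + c1*x in the least squares sense (example data made up here)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.9])
A = np.column_stack([np.ones_like(x), x])  # design matrix

# Economy-size QR: A = QR with Q having orthonormal columns
Q, R = qr(A, mode='economic')
# The normal equations reduce to R c = Q^T y, solved by back substitution
c = solve_triangular(R, Q.T @ y)
print(c)  # [intercept, slope]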
For a square matrix $A$ the QR factorization converts $A$ into the product of an orthogonal matrix $Q$ (i.e. $Q^TQ = I$) and an upper triangular matrix $R$. Hence:

$$A = QR$$

The first step is to create the vector $x$, which is the $k$-th column of the matrix $A$, for step $k$. We define $\alpha = -\mathrm{sgn}(x_k)\,\|x\|$, where $\mathrm{sgn}$ is the sign operator. The norm $\|\cdot\|$ used here is the Euclidean norm. Given the first column vector of the identity matrix $I$ of equal size to $A$, $e_1 = (1, 0, \ldots, 0)^T$, we create the vector $u$:

$$u = x + \alpha e_1$$

Once we have the vector $u$, we need to convert it to a unit vector, which we denote as $v$:

$$v = u / \|u\|$$

Now we form the matrix $Q$ out of the identity matrix $I$ and the outer product of $v$ with itself:

$$Q = I - 2vv^T$$

$Q$ is now an $m \times m$ Householder matrix, with $Qx = (\alpha, 0, \ldots, 0)^T$. We will use $Q$ to transform $A$ to upper triangular form, giving us the matrix $R$. Denote the Householder matrix produced at step $k$ by $Q'_k$; since $k = 1$ in this first step, $Q'_1$ is our first Householder matrix. For subsequent steps the Householder matrix acts only on the lower-right minor of the matrix, so we define the full-size $Q_k$ as the block matrix

$$Q_k = \begin{pmatrix} I_{k-1} & 0 \\ 0 & Q'_k \end{pmatrix}$$

Once we have carried out $t$ iterations of this process we have $R$ as an upper triangular matrix:

$$R = Q_t \cdots Q_2 Q_1 A$$

$Q$ is then fully defined as the product of the transposes of each $Q_k$:

$$Q = Q_1^T Q_2^T \cdots Q_t^T$$

This gives $A = QR$, the QR factorization of $A$.
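To make this concrete, the following sketch carries out the first step ($k = 1$) numerically for the example matrix used later in this article. NumPy is used here purely for brevity; a pure-Python implementation appears further below.

import numpy as np

A = np.array([[12., -51., 4.], [6., 167., -68.], [-4., 24., -41.]])
x = A[:, 0]                                 # first column of A
alpha = -np.sign(x[0]) * np.linalg.norm(x)  # alpha = -sgn(x_1)*||x|| = -14
e1 = np.array([1.0, 0.0, 0.0])
u = x + alpha * e1
v = u / np.linalg.norm(u)
Q1 = np.eye(3) - 2.0 * np.outer(v, v)       # Householder matrix
print(Q1 @ x)                               # approximately (-14, 0, 0)^T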
The Gram-Schmidt method creates an orthonormal basis that spans the columns of $A$. This orthonormal basis can be used to create an upper triangular $R$ matrix and, by extension, a unitary $Q$ matrix. Of the methods highlighted here, Gram-Schmidt is the most poorly numerically conditioned; however, it is also the least computationally intensive method of calculating the QR factorization.
Givens rotations are similar to Householder transformations, but they zero out individual elements of the matrix one at a time instead of zeroing out portions of columns all at once. Additionally, Givens rotations can be implemented using CORDIC rotations, which allows the solution to be parallelized and pipelined. Carefully selected rotations can also simplify the computation, by ensuring that multiplications are by powers of 2 and that fewer multiplications need to be done.
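As a small illustration of the idea, the sketch below applies a single Givens rotation to zero out one subdiagonal element of the example matrix; the helper function givens is introduced here for the example and is not part of any library.

import numpy as np

def givens(a, b):
    """Return c, s such that [[c, s], [-s, c]] rotates (a, b) to (r, 0).
    Assumes (a, b) != (0, 0)."""
    r = np.hypot(a, b)
    return a / r, b / r

A = np.array([[12., -51., 4.], [6., 167., -68.], [-4., 24., -41.]])
c, s = givens(A[0, 0], A[1, 0])
G = np.eye(3)
G[0, 0] = c; G[0, 1] = s
G[1, 0] = -s; G[1, 1] = c
print(G @ A)  # the (1, 0) entry is now zero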
First, the QR factorization is found using a common library. So that the results of each method can be compared, the same matrix A is used throughout.
A =
[[12, -51, 4],
[6, 167, -68],
[-4, 24, -41]]
# Finding the QR factorization using SciPy
# Print A, Q and R at the output
import pprint
import numpy as np
import scipy.linalg  # SciPy Linear Algebra Library

# Create a matrix (scipy.array is deprecated/removed in modern SciPy,
# so numpy.array is used instead)
A = np.array([[12, -51, 4], [6, 167, -68], [-4, 24, -41]])  # From the Wikipedia article on QR factorization
Q, R = scipy.linalg.qr(A)
print("A:")
pprint.pprint(A)
A:
array([[ 12, -51,   4],
       [  6, 167, -68],
       [ -4,  24, -41]])
print ("Q:")
pprint.pprint(Q)
Q:
array([[-0.85714286,  0.39428571,  0.33142857],
       [-0.42857143, -0.90285714, -0.03428571],
       [ 0.28571429, -0.17142857,  0.94285714]])
print ("R:")
pprint.pprint(R)
R:
array([[ -14.,  -21.,   14.],
       [   0., -175.,   70.],
       [   0.,    0.,  -35.]])
QR is in fact a factorization of A, as demonstrated by multiplying the matrices together.
# demonstrate that QR yields A
def mult_matrix(X, Y):
    """Multiply square matrices X and Y of same dimension"""
    # Nested list comprehension to calculate matrix multiplication
    return [[sum(a*b for a, b in zip(X_row, Y_col)) for Y_col in zip(*Y)] for X_row in X]
print("QR:")
pprint.pprint(np.array(mult_matrix(Q, R)))
QR:
array([[ 12., -51.,   4.],
       [  6., 167., -68.],
       [ -4.,  24., -41.]])
Additionally, Q is a unitary (and in this case orthogonal) matrix, as shown by computing $Q^TQ$ and $QQ^T$, both of which should yield $I$.
# show that Q is unitary
def mult_matrix(X, Y):
    """Multiply square matrices X and Y of same dimension"""
    # Nested list comprehension to calculate matrix multiplication
    return [[sum(a*b for a, b in zip(X_row, Y_col)) for Y_col in zip(*Y)] for X_row in X]

def trans_matrix(M):
    """Take the transpose of a square matrix."""
    n = len(M)
    return [[M[i][j] for i in range(n)] for j in range(n)]

print("Q transpose(Q)")
pprint.pprint(np.array(mult_matrix(Q, trans_matrix(Q))))
print("transpose(Q) Q")
pprint.pprint(np.array(mult_matrix(trans_matrix(Q), Q)))
Q transpose(Q)
array([[ 1.00000000e+00,  8.84708973e-17,  0.00000000e+00],
       [ 8.84708973e-17,  1.00000000e+00, -3.46944695e-17],
       [ 0.00000000e+00, -3.46944695e-17,  1.00000000e+00]])
transpose(Q) Q
array([[ 1.00000000e+00, -2.08166817e-17, -5.55111512e-17],
       [-2.08166817e-17,  1.00000000e+00,  2.77555756e-17],
       [-5.55111512e-17,  2.77555756e-17,  1.00000000e+00]])
Note that although the values off the diagonal are not exactly zero, this is explained by floating-point rounding error.
Next, the QR factorization is computed using Householder transformations.
from math import sqrt
from pprint import pprint

def mult_matrix(X, Y):
    """Multiply square matrices X and Y of same dimension"""
    # Nested list comprehension to calculate matrix multiplication
    return [[sum(a*b for a, b in zip(X_row, Y_col)) for Y_col in zip(*Y)] for X_row in X]

def trans_matrix(M):
    """Take the transpose of a square matrix."""
    n = len(M)
    return [[M[i][j] for i in range(n)] for j in range(n)]

def norm(x):
    """Return the Euclidean norm of the vector x."""
    return sqrt(sum([x_i**2 for x_i in x]))

def Q_i(Q_min, i, j, k):
    """Construct the Q_t matrix by left-top padding the matrix Q
    with elements from the identity matrix."""
    if i < k or j < k:
        return float(i == j)
    else:
        return Q_min[i-k][j-k]

def householder(A):
    """Performs a Householder Reflections based QR Decomposition of the
    matrix A. The function returns Q, an orthogonal matrix, and R, an
    upper triangular matrix, such that A = QR."""
    n = len(A)
    # Set R equal to A, and create Q as a zero matrix of the same size
    R = A
    Q = [[0.0] * n for i in range(n)]

    # The Householder procedure
    for k in range(n-1):  # We don't perform the procedure on a 1x1 matrix, so we reduce the index by 1
        # Create identity matrix of same size as A
        I = [[float(i == j) for i in range(n)] for j in range(n)]

        # Create the vectors x, e and the scalar alpha
        x = [row[k] for row in R[k:]]
        e = [row[k] for row in I[k:]]
        # (x[0] > 0) - (x[0] < 0) computes the sign of x[0];
        # Python 2 code used -cmp(x[0], 0) here
        alpha = -((x[0] > 0) - (x[0] < 0)) * norm(x)

        # Using anonymous functions, we create u and v
        u = list(map(lambda p, q: p + alpha * q, x, e))
        norm_u = norm(u)
        v = list(map(lambda p: p/norm_u, u))

        # Create the Q minor matrix
        Q_min = [[float(i == j) - 2.0 * v[i] * v[j] for i in range(n-k)] for j in range(n-k)]

        # "Pad out" the Q minor matrix with elements from the identity
        Q_t = [[Q_i(Q_min, i, j, k) for i in range(n)] for j in range(n)]

        # If this is the first run through, right multiply by A,
        # else right multiply by Q
        if k == 0:
            Q = Q_t
            R = mult_matrix(Q_t, A)
        else:
            Q = mult_matrix(Q_t, Q)
            R = mult_matrix(Q_t, R)

    # Since Q is defined as the product of transposes of Q_t,
    # we need to take the transpose upon returning it
    return trans_matrix(Q), R
#creating a matrix
A = [[12, -51, 4], [6, 167, -68], [-4, 24, -41]]
#finding the QR factorization
Q, R = householder(A)
print ("A:")
pprint(A)
A:
[[12, -51, 4], [6, 167, -68], [-4, 24, -41]]
print ("Q:")
pprint(Q)
Q:
[[0.8571428571428571, 0.39428571428571435, -0.33142857142857135],
 [0.4285714285714286, -0.9028571428571429, 0.034285714285714114],
 [-0.28571428571428575, -0.17142857142857126, -0.942857142857143]]
print ("R:")
pprint(R)
R:
[[13.999999999999998, 21.00000000000001, -14.000000000000004],
 [-5.506706202140776e-16, -175.00000000000003, 70.0],
 [3.0198066269804245e-16, -3.552713678800501e-14, 35.000000000000014]]
This result is comparable to the result achieved with SciPy: the two differ only by rounding error and by sign (the QR factorization is unique only up to the signs of the rows of R and the corresponding columns of Q). Casting the resulting matrices to numpy.array will aid in comparison because arrays print with fewer significant digits.
import numpy as np
print("Q:")
pprint(np.array(Q))
print("R:")
pprint(np.array(R))
Q:
array([[ 0.85714286,  0.39428571, -0.33142857],
       [ 0.42857143, -0.90285714,  0.03428571],
       [-0.28571429, -0.17142857, -0.94285714]])
R:
array([[ 1.40000000e+01,  2.10000000e+01, -1.40000000e+01],
       [-5.50670620e-16, -1.75000000e+02,  7.00000000e+01],
       [ 3.01980663e-16, -3.55271368e-14,  3.50000000e+01]])
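Since the SciPy and Householder results differ in sign, one way to compare them directly is to normalize both factorizations first. The helper below (canonical_qr is a name introduced for this example, not a library function) flips signs so that the diagonal of R is positive; it assumes A is nonsingular.

import numpy as np

def canonical_qr(Q, R):
    """Flip signs of Q's columns and R's rows so that diag(R) is positive.
    Q @ R is unchanged because the sign flips cancel."""
    signs = np.sign(np.diag(R))
    return Q * signs, signs[:, None] * R

Applying canonical_qr to both results above should make the Q and R matrices agree to within rounding error.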
Finally, the QR factorization is computed using the Gram-Schmidt process.

import numpy as np

def gram_schmidt(A):
    """Performs a Gram-Schmidt based QR factorization of the matrix A,
    orthonormalizing the columns of A one at a time."""
    m = A.shape[0]
    n = A.shape[1]
    R = np.zeros([m, n])
    Q = np.zeros([m, m])
    R[0, 0] = np.linalg.norm(A[:, 0])
    if R[0, 0] == 0:
        print("Could not compute the QR factorization")
    else:
        Q[:, 0] = A[:, 0] / R[0, 0]
    for i in range(1, n):
        # Start with the i-th column of A, then subtract off its
        # projections onto the columns already orthonormalized
        Q[:, i] = A[:, i]
        for j in range(i):
            R[j, i] = np.dot(Q[:, j].T, Q[:, i])
            Q[:, i] -= R[j, i] * Q[:, j]
        R[i, i] = np.linalg.norm(Q[:, i])
        if R[i, i] == 0:
            print("Could not compute the QR factorization")
        else:
            Q[:, i] = Q[:, i] / R[i, i]
    return Q, R
A = np.array([[12, -51, 4], [6, 167, -68], [-4, 24, -41]])
Q, R = gram_schmidt(A)
print("Q = ", Q)
print("R = ", R)
Q =  [[ 0.85714286 -0.39428571 -0.33142857]
 [ 0.42857143  0.90285714  0.03428571]
 [-0.28571429  0.17142857 -0.94285714]]
R =  [[  14.   21.  -14.]
 [   0.  175.  -70.]
 [   0.    0.   35.]]
In numerical analysis, different factorizations are used to implement efficient matrix algorithms. For instance, when solving a system of linear equations $Ax = b$, the matrix $A$ can be factorized via the LU factorization, which factors a matrix into a lower triangular matrix $L$ and an upper triangular matrix $U$. Solving the system requires fewer additions and multiplications with the LU factorization, though one might require significantly more digits in inexact arithmetic such as floating point. Similarly, the QR factorization expresses $A$ as $QR$ with $Q$ a unitary matrix and $R$ an upper triangular matrix. The system $Q(Rx) = b$ is solved by $Rx = Q^Tb = c$, and the system $Rx = c$ is solved by back substitution. The number of additions and multiplications required is about twice that of using the LU solver, but because the QR factorization is numerically stable, no additional significant digits are required in inexact arithmetic.
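A minimal sketch of that procedure, using SciPy's solve_triangular for the back-substitution step (the right-hand side b is chosen arbitrarily for the example):

import numpy as np
from scipy.linalg import qr, solve_triangular

A = np.array([[12., -51., 4.], [6., 167., -68.], [-4., 24., -41.]])
b = np.array([1.0, 2.0, 3.0])  # arbitrary right-hand side

Q, R = qr(A)
c = Q.T @ b                    # Rx = Q^T b = c
x = solve_triangular(R, c)     # back substitution
print(np.allclose(A @ x, b))   # True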
Even though the QR factorization can be used to solve a system of linear equations, and is in fact better than Gaussian elimination in terms of stability, it is rarely used for that purpose because its computational cost is roughly twice that of Gaussian elimination. Another important use of the QR factorization is that it inspires the QR algorithm for the calculation of eigenvalues.
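A minimal sketch of the unshifted QR iteration is shown below; practical implementations add shifts and deflation, which are omitted here.

import numpy as np

def qr_algorithm(A, iterations=100):
    """Unshifted QR iteration: factor A_k = Q_k R_k, form A_{k+1} = R_k Q_k.
    Each A_k has the same eigenvalues as A; for many matrices A_k
    converges toward upper triangular form, so the eigenvalues
    appear on the diagonal."""
    Ak = np.array(A, dtype=float)
    for _ in range(iterations):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return np.diag(Ak)

A = np.array([[2., 1.], [1., 2.]])
print(qr_algorithm(A))  # approximately [3., 1.]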
Householder transformations and Gram-Schmidt are both appropriate methods for calculating the QR factorization of a matrix. Gram-Schmidt is much simpler computationally, but it is less robust to poorly conditioned matrices. As shown above, all of the methods yield similar results.