Author: Daniel Free, daniel.free816@gmail.com
The LU factorization method provides a way to break down a square matrix into two triangular matrices, one lower and one upper (hence L and U). Triangular matrices have very desirable properties. For example, systems of equations with a triangular matrix are easily solved with forward or backward substitution. Solving a system in this manner is numerically superior to computing the inverse of a matrix explicitly.
A square matrix A can be factored as PA=LU where P is a permutation matrix, L is a lower triangular matrix with ones along the diagonal, and U is an upper triangular matrix. Permutation matrices are orthogonal (i.e. PTP=PPT=I), so the factorization can also be written as A=PTLU.
The main idea for computing the LU factorization is to use Gauss elimination or row reduction on the matrix A and keep track of each row-reducing step. To aid in this, we define an elementary matrix E as an identify matrix with a few elements in a single column off the diagonal that are non-zero. This represents the row reduction step. Doing these reduction steps to progressively make an upper triangular matrix results in the matrix U. For example, a 3×3 matrix will have 2 E matrices, each one zeroing out the lower elements of a column to make the upper triangular matrix, as shown below.
A=[×××××××××]where × is a non-zero element.
To find the off diagonal elements of E, simply divide the current element by the negative of the diagonal element. For example, if the first column of a 3×3 matrix was [49−1], then the first elementary matrix would be [100−94101401]. However, since finding the elements of the E matrices involves division, there is a possibility of accruing numerical error. To mititgate the risk of dividing by a large number by a small number and causing numerical issues, we use a permutation matrix Pij to reorder the rows of the current column so that the largest absolute value in the lower portion of the column is on the diagonal, where Pij is the identity matrix with rows i and j swapped (e.g. P13=[001010100],P23=[100001010]). Following the same example just given, we would create a permutation matrix P12 and then multiply the resulting matrix by E=[100−49101901]. In the general case, this would yield E2PcdE1PabA=[×××0××00×]=U
The matrix V is not neccessarily lower triangular. However, by applying the same permutation matrices in the original order, we can obtain the lower triangular matrix L.
PcdPabA=PA=PcdPabVU=LUwhere P=PcdPab.
Suppose you have a system of equations to solve in the form of Ax=b with A being a square n×n matrix. Multiplying both sides of the equation by P and substituting in the LU factorization, we obtain LUx=Pb. This can be rewritten as Ly=c, where y=Ux and c=Pb. Written out, this looks like [100⋯0l2110⋯0l31l321⋯0⋮⋮lk(n−1)⋱⋮ln1ln2⋯ln(n−1)1][y1y2y3⋮yn]=[c1c2c3⋮cn]
From this matrix equation, we can see that we will end up with n linear equations. However, since L is lower triangular, solving for y can be done using forward substitution. For example, the first equation will be y1=c1
Now that we know y we can solve the equation y=Ux for x. Similar to the lower triangular case, solving a matrix equation with an upper triangular matrix makes use of backwards substitution. The equation for the jth element of x can be solved by xj=1ujj(yj−n∑k=j+1ujkxk)
In addition to solving matrix equations, the LU factorization can be used to compute the determinate of a square matrix. Substituting in the definition of the factorization, we get det(A)=det(PTLU)
Suppose we have a matrix A and we want to find the LU factorization of it. The example below shows the step by step process.
import numpy as np
A = np.array([[9,4,8],[1,2,7],[-6,3,5]])
print("A =\n", A)
E1 = np.array([[1.,0.,0.],[-1/9,1.,0.],[6/9,0.,1.]])
print("E1 =\n", E1)
E1A = np.dot(E1,A)
print("E1*A =\n",E1A)
P23 = np.array([[1,0,0],[0,0,1],[0,1,0]])
P23E1A = np.dot(P23,E1A)
print("P23*E1*A = \n",P23E1A)
E2 = np.array([[1,0,0],[0,1,0],[0,-P23E1A[2,1]/P23E1A[1,1],1]])
print("E2 =\n",E2)
U = np.dot(E2,P23E1A)
print("E2*P23*E1*A = U =\n", U)
V = np.dot(np.dot((2*np.identity(3) - E1),P23),(2*np.identity(3)-E2))
print("V = \n",V)
L = np.dot(P23,V)
print("L = \n",L)
A = [[ 9 4 8] [ 1 2 7] [-6 3 5]] E1 = [[ 1. 0. 0. ] [-0.11111111 1. 0. ] [ 0.66666667 0. 1. ]] E1*A = [[ 9. 4. 8. ] [ 0. 1.55555556 6.11111111] [ 0. 5.66666667 10.33333333]] P23*E1*A = [[ 9. 4. 8. ] [ 0. 5.66666667 10.33333333] [ 0. 1.55555556 6.11111111]] E2 = [[ 1. 0. 0. ] [ 0. 1. 0. ] [ 0. -0.2745098 1. ]] E2*P23*E1*A = U = [[ 9. 4. 8. ] [ 0. 5.66666667 10.33333333] [ 0. 0. 3.2745098 ]] V = [[ 1. 0. 0. ] [ 0.11111111 0.2745098 1. ] [-0.66666667 1. 0. ]] L = [[ 1. 0. 0. ] [-0.66666667 1. 0. ] [ 0.11111111 0.2745098 1. ]]
Suppose you are sending data, in this case an image, but it has been encrypted or scrambled in some way. Through other techniques, you have determined the matrix that can help unscramble the image, given in the form Ax=b. However, this matrix applies only to a 50x50 grid of pixels at a time (i.e. b is the 50x50 matrix of pixel values reshaped into a single vector of length 2500). This means that the matrix A is 2500x2500 and an explicit inverse could be costly. Instead, lets do the LU factorization and use forward and backward substitution to resolve the image. The code below loads in the mixed data and the determined filter, computes the LU factorization, then resolves each 50x50 grid of pixels for each color. Finally, the image is converted to the right data type and displayed.
# Import libraries necessary to run the code
import matplotlib.pyplot as plt
import numpy as np
import scipy.linalg as alg
import scipy.io as sio
# Plot the images inline
%matplotlib inline
# Load the data. Comes with an A matrix and a matrix mx with size (300,150,3) representing rgb values. This is the scrambled image
data = sio.loadmat('mixed_img_andFilter.mat')
A = data['A']
mx = data['mx']
# Plot the scrambled image
fig1 = plt.imshow(mx)
plt.show()
# Compute the LU factorization. the scipy.linalg library computes it as A = P*L*U as opposed to Matlab's A = P'*L*U
P,L,U = alg.lu(A)
# Transpose the permutation matrix P for easier readability later
Pt = np.transpose(P)
# Loop through the scambled image doing 50x50 grids at a time
for i in range(3):
for j in range(0,251,50):
for k in range(0,101,50):
# Extract the current data
temp = mx[j:j+50,k:k+50,i]
tt = np.transpose(temp)
# Reshape the matrix into a vector
b = np.reshape(tt,2500)
Pb = np.dot(Pt,b)
# Solve the first triangular system (Ly = Pb)
y = alg.solve_triangular(L,Pb,lower=True)
# Solve the second triangular system (Ux = y)
x = alg.solve_triangular(U,y,lower = False)
# Reshape the output to be a 50x50 matrix again
mx[j:j+50,k:k+50,i] = np.transpose(np.reshape(x,(50,50)))
# Convert the matrix back into unsigned integers for image plotting
final = np.uintc(mx)
# Show the final image. Merry Christmas!
fig2 = plt.imshow(final)
plt.show()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Add a homework assignment that might take 10 minutes to complete. Make sure you can work the problem yourself, but you do not need to submit a solution to the problem.