Author: Nicholas Kohls - kohlsn@gmail.com
The matrix inversion lemma is also known as the Sherman-Morrison formula or sometimes the Woodbury's idenity.
The matrix inversion lemma says that the inverse of a rank-k correction of some matrix can be computed by doing a rank-k correction to the inverse of the original matrix. In other words, the lemma gives a method of finding the inverse of an updated matrix without having to directly compute the inverse of the entire updated matrix. This is possible if the original matrix's inverse in known. A common application is when something of low rank added to something of full rank and the inverse needs to be known.
Let B−1=(A+UCV)−1 ,the matrix inversion lemma finds B−1 without computing it directly. If the size of the matrix is a large number, computing B−1 directly is computational heavy. If this is the case finding B−1 using a less computational heavy method is desired. The matrix inverstion lemma finds B−1 using the known A−1. The matrix inversion lemma proves that the inverse of the updated Matrix does not need to recomputed directly.
The matrix inversion lemma can be used to provide an update to the inverse ot the Grammian matrix in a least-squares problem, to produce a simple version of what is known as the recursive least-squares (RLS) adaptive filter (explored in the next section).
The matrix inversion leema is applied in the Kalman filter and recursive least squares methods, to replace the parametric solution, requiring inversion of a state vector sized matrix, with a condition equations based solution. In case of the Kalman filter this matrix has the dimensions of the vector of observations, i.e., as small as 1 in case only one new observation is processed at a time. This significantly speeds up the often real time calculations of the filter.
Essentailly this lemma is used in many engineering practices where input is constnaly being fed and the inverse of the new data set needs to be computed. This is common in control systems.
Matrix Inversion Lemma (General Form):
The Woodbury matrix identity gives the inverse of an n×n square matrix A modified by a perturbation term UCV
Let B=A+UCV where A is n×n, U is n×k, C is k×k, and V is k×n
Matrix Inversion Lemma: B−1=(A+UCV)−1=A−1−A−1U(C−1+VA−1U)−1)VA−1
Direct Proof:
The formula can be proven by checking that times its alleged inverse on the right side of the matrix inversion lemma gives the identity matrix:
Derivation Proof: Two identities are needed to derive the matrix inversion lemma.
Idenity 1:
(I+P)−1=(I+P)−1(I+P−P)=I−(I+P)−1P
Idenity 2:
P+PQP=P(I+QP)=(I+PQ)P(I+PQ)−1P=P(I+QP)−1
Proof:
Use Idenity 1 (A+UCV)−1=(A[I+A−1UCV])−1=[I+A−1UCV]−1A−1=[I−(I+A−1UCV)−1A−1UCV]A−1 =A−1−(I+A−1UCV)−1A−1UCVA−1
Repeatly using Idenity 2 produces the following: (A+UCV)−1=A−1−(I+A−1UCV)−1A−1UCVA−1=A−1−A−1(I+UCVA−1)−1A−1UCVA−1=A−1−A−1U(I+CVA−1U)−1A−1CVA−1=A−1−A−1UC(I+VA−1UC)−1A−1VA−1=A−1−A−1UCV(I+A−1UCV)−1A−1A−1=A−1−A−1UCVA−1(I+UCVA−1)−1
Take one of iterations of the last step to find lemma (A+UCV)−1=A−1−A−1U(I+CVA−1U)−1A−1CVA−1=A−1−A−1U(C−1+VA−1U)−1)VA−1
Rank one updates:
One of the simplest changes that can be performed on a matrix is a so-called rank one update.
If U and V are k×1 column vectors u and v, the perturbation on A is a rank one update.
The reason why the transformation is called rank one is that the rank of the k×k matrix uvH is equal to 1 (because a single vector, u, spans all the columns of uvH).
Matrix inversion lemma for rank one updates:
The matrix inversion lemma with the column vectors u and v is simplified from its general form.
Where B=A+uvH and u and v are k×1 column vectors.
B−1=(A+uvH)−1=A−1−A−1uvHA−11+vHA−1u
Here is some simple python code that shows the concepts of matrix inversion lemma. This code uses the numpy inverse of the update matrix and compares it with matrix inversion lemma version of the updated inverted matrix. The difference should be 0.
# Matrix inversion lemma, rank one update
import numpy as np
def matrix_inversion_lemma(A_Inv,u,v):
v_transpose = np.transpose(v)
uvt = np.dot(u, v_transpose)
numerator = np.dot(np.dot(A_Inv, uvt), A_Inv)
denominator = np.add(1, np.dot(np.dot(v_transpose, A_Inv), u))
B_Inv_lemma = np.subtract(A_Inv, np.divide(numerator, denominator))
return B_Inv_lemma
# Known Values:
A = np.array([[1, 2, 3, 4],[1,-2,6,2],[4,2,1,0],[8,4,3,1]]) #Random values in 4x4
A_Inv = np.linalg.inv(A) # Inverse of A
# Updated Values
u = np.array([1,4,3,1]) #Random values
u = np.expand_dims(u, axis=1)
v = np.array([3,-1,1,2]) #Random values
v = np.expand_dims(v, axis=1)
# Computational Heavy Matrix/ Old Method
B = np.add(A,np.dot(u, np.transpose(v))) # Updated Matrix
B_Inv = np.linalg.inv(B) #Calculate inverse
# Matrix Inversion Lemma Method:
B_Inv_lemma = matrix_inversion_lemma(A_Inv,u,v)
#Find the difference between the Inverse caluclated directly and the inverse found uisng the Matrix inversion lemma
Diff = np.abs(np.subtract(B_Inv,B_Inv_lemma).round(10))
print(Diff)
[[0. 0. 0. 0.] [0. 0. 0. 0.] [0. 0. 0. 0.] [0. 0. 0. 0.]]
A missle is travelling through the air and uses sensors to determine its position. The sensor data is stored in matricies and measures how far the missile has moved. Every nanosecond there is a new reading from the sensor. It is computationally expensive to recompute the whole matrix of the readings for course correction, calibrations, etc. The computer in the missile can not keep up with the pace of the sensor readings if the whole matrix is recomputed everytime.
The position of the missle is a linear system problem where Ax=b. Where x is the position of the missle. If we have a change in Matrix A from a sensor reading we get (A+uvT)x=b
Using the matrix inverstion lemma we find that x=(A+uvT)−1b=A−1b−A−1uvTA−11+vTA−1ub
Lets say that the original matrix A = (24−249−3−2−37)
B=A+uvT
The a3,2 element changed form -3 to -1
This means u and v can be:
u= (002)
v= (010)
Lets say the updated linear system is
(24−249−3−2−17)(x1x2x3)=(2810)
In this example A−1 is already known.
A−1= (6.75−2.750.75−2.751.25−0.250.75−0.250.25)
The code below finds the position vector x without inverting the matrix B−1
import numpy as np
def matrix_inversion_lemma(A_Inv,u,v):
v_transpose = np.transpose(v)
uvt = np.dot(u, v_transpose)
numerator = np.dot(np.dot(A_Inv, uvt), A_Inv)
denominator = np.add(1, np.dot(np.dot(v_transpose, A_Inv), u))
B_Inv_lemma = np.subtract(A_Inv, np.divide(numerator, denominator))
return B_Inv_lemma
# Known Values:
A = np.array([[2,4,-2],[4,9,-3],[-2,-3,7]])
A_Inv = np.array([[6.75,-2.75,0.75],[-2.75,1.25,-0.25],[0.75,-0.25,0.25]])
b = np.array([2,8,10])
b = np.expand_dims(b, axis=1)
# Updated Values
u = np.array([0,0,2])
u = np.expand_dims(u, axis=1)
v = np.array([0,1,0]) #Random values
v = np.expand_dims(v, axis=1)
# Matrix Inversion Lemma Method:
B_Inv_lemma = matrix_inversion_lemma(A_Inv,u,v)
x = np.dot(B_Inv_lemma,b)
print(x)
[[-7.] [ 4.] [ 0.]]
Consider the matrix A = (10−244−3−1−3−2)
Matrix A is updated into matrix B = (20−284−3−2−3−2)
A−1 = (17−6−8−11458−3−4)
Find B−1 using the matrix inversion lemma