import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import scipy as sp
import sympy as sy
sy.init_printing()
np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all" # display multiple results
def round_expr(expr, num_digits):
    return expr.xreplace({n: round(n, num_digits) for n in expr.atoms(sy.Number)})
Matrix addition operations are straightforward:
They are as obvious as they look, so no proofs are provided. The matrix multiplication properties are:
Note that we need to distinguish two kinds of multiplication: the Hadamard product (element-wise multiplication) and the matrix product:
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
A*B # this is Hadamard elementwise product
array([[ 5, 12], [21, 32]])
A@B # this is matrix product
array([[19, 22], [43, 50]])
We show explicitly the matrix multiplication rule:
np.sum(A[0,:]*B[:,0]) # (1, 1)
np.sum(A[1,:]*B[:,0]) # (2, 1)
np.sum(A[0,:]*B[:,1]) # (1, 2)
np.sum(A[1,:]*B[:,1]) # (2, 2)
19
43
22
50
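The four inner products above are exactly the entries of $AB$: entry $(i, j)$ is the dot product of row $i$ of $A$ with column $j$ of $B$. A minimal sketch computing the whole product this way and comparing it against the @ operator:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# compute each entry of the product as a row-column inner product
C = np.zeros((A.shape[0], B.shape[1]), dtype=int)
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        C[i, j] = np.sum(A[i, :] * B[:, j])  # row i of A dot column j of B

C                         # array([[19, 22], [43, 50]])
np.array_equal(C, A @ B)  # True
```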
Let's define all the letters as symbols, since we will be using them repeatedly.
a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z = sy.symbols('a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z', real = True)
A = sy.Matrix([[a, b, c], [d, e, f]])
A + A
A - A
B = sy.Matrix([[g, h, i], [j, k, l]])
A + B
A - B
The matrix multiplication rules can be clearly understood by using symbols.
A = sy.Matrix([[a, b, c], [d, e, f]])
B = sy.Matrix([[g, h, i], [j, k, l], [m, n, o]])
A
B
AB = A*B; AB
Matrix multiplication usually does not commute, i.e. ${AB} \neq {BA}$. For instance, consider $ A$ and $ B$:
A = sy.Matrix([[3, 4], [7, 8]])
B = sy.Matrix([[5, 3], [2, 1]])
A*B
B*A
How do we find a matrix that commutes with $A$?
A = sy.Matrix([[a, b], [c, d]])
B = sy.Matrix([[e, f], [g, h]])
A*B
B*A
To show ${AB} = {BA}$, we need to prove ${AB} - {BA} = 0$
M = A*B - B*A; M
That means
\begin{align}
bg - cf &= 0\\
af - be + bh - df &= 0\\
ce + dg - ag - ch &= 0\\
cf - bg &= 0
\end{align}
If we treat $a, b, c, d$ as coefficients of this homogeneous system in the unknowns $e, f, g, h$, we can extract its coefficient matrix:
A_aug = sy.Matrix([[0, -c, b, 0], [-b, a-d, 0, b], [c, 0, d -a, -c], [0, c, -b, 0]]); A_aug
Perform Gauss-Jordan elimination until the matrix is in reduced row echelon form.
A_aug.rref()
The general solution is
\begin{align}
e - \frac{a-d}{c}g - h &= 0\\
f - \frac{b}{c}g &= 0\\
g &= \text{free}\\
h &= \text{free}
\end{align}
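We can substitute the general solution back symbolically and confirm that any such $B$ commutes with $A$ (a minimal sketch, assuming $c \neq 0$, as the division by $c$ in the solution requires):

```python
import sympy as sy

a, b, c, d, g, h = sy.symbols('a b c d g h', real=True)

A = sy.Matrix([[a, b], [c, d]])
# substitute the general solution: e = (a - d)/c*g + h, f = b/c*g
B = sy.Matrix([[(a - d)/c*g + h, b/c*g],
               [g, h]])

# AB - BA expands to the zero matrix, so B commutes with A for any g, h
M = (A*B - B*A).expand()
M  # Matrix([[0, 0], [0, 0]])
```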
If we set the coefficients $a = 10, b = 12, c = 20, d = 8$, i.e. $ A = \left[\begin{matrix}10 & 12\\20 & 8\end{matrix}\right]$, then the general solution becomes
\begin{align} e - .1g - h &= 0\\ f - .6g &= 0\\ g &= \text{free}\\ h &= \text{free} \end{align}
Then try a particular solution with $g = h = 1$:
\begin{align} e &= 1.1\\ f &= .6\\ g &= 1\\ h &= 1 \end{align}
This gives a matrix that commutes with $A$; we denote it $ C$.
C = sy.Matrix([[1.1, .6], [1, 1]]);C
Now we can see that ${AC}={CA}$.
A = sy.Matrix([[10, 12], [20, 8]])
A*C
C*A
A matrix $A_{n\times m}$ and its transpose are
A = np.array([[1, 2, 3], [4, 5, 6]]); A
A.T # transpose
array([[1, 2, 3], [4, 5, 6]])
array([[1, 4], [2, 5], [3, 6]])
A = sy.Matrix([[1, 2, 3], [4, 5, 6]]); A
A.transpose()
The properties of transpose are
We can show why the last property holds with SymPy: define $A$ and $B$, multiply them, then transpose, i.e. $(AB)^T$
A = sy.Matrix([[a, b], [c, d], [e, f]])
B = sy.Matrix([[g, h, i], [j, k, l]])
AB = A*B
AB_tr = AB.transpose(); AB_tr
Transpose each of them, then multiply, that means $B^TA^T$
B_tr_A_tr = B.transpose()*A.transpose()
B_tr_A_tr
To check whether they are equal, subtract one from the other:
AB_tr - B_tr_A_tr
This is an identity matrix $I_5$
sy.eye(5)
Identity matrix properties:
$$ AI=IA = A $$Let's generate $ I$ and $ A$ and show if it holds
I = np.eye(5); I
array([[1., 0., 0., 0., 0.], [0., 1., 0., 0., 0.], [0., 0., 1., 0., 0.], [0., 0., 0., 1., 0.], [0., 0., 0., 0., 1.]])
A = np.around(np.random.rand(5, 5)*100); A # generate a random matrix
array([[59., 75., 69., 73., 34.], [94., 8., 73., 6., 10.], [73., 23., 55., 27., 67.], [ 1., 36., 25., 68., 75.], [35., 10., 33., 31., 79.]])
Indeed it holds; the difference is the zero matrix:
A@I - I@A
array([[0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.]])
An elementary matrix is a matrix that can be obtained from a single elementary row operation on an identity matrix. Such as:
where $R_1\leftrightarrow R_2$ means exchanging row $1$ and row $2$. We denote the transformed matrix as ${E}$, then left-multiply $ E$ onto a matrix $ A$. Generate $ A$ first:
A = sy.randMatrix(3, percent = 80); A # generate a random matrix with 80% of entries being nonzero
Create an elementary matrix with $R_1\leftrightarrow R_2$
E = sy.Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]]);E
It turns out that left-multiplying $ E$ onto $ A$ switches rows $1$ and $2$ of $ A$ as well.
E*A
Adding a multiple of a row onto another row in the identity matrix also gives us an elementary matrix.
$$ \left[\begin{matrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{matrix}\right]\ \begin{matrix}~\\ ~\\ R_3-7R_1\end{matrix}\qquad\longrightarrow\left[\begin{matrix}1 & 0 & 0\\ 0 & 1 & 0\\ -7 & 0 & 1\end{matrix}\right] $$Let's verify with SymPy.
A = sy.randMatrix(3, percent = 80); A
E = sy.Matrix([[1, 0, 0], [0, 1, 0], [-7, 0, 1]]); E
E*A
We can also show this by explicit row operation on $ A$.
EA = A.copy() # work on a copy so A stays unchanged
EA[2,:]=-7*EA[0,:]+EA[2,:]
EA
In the next section, we will see an important conclusion about elementary matrices: an invertible matrix can be written as a product of a series of elementary matrices.
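As a quick preview, here is a sketch with hand-picked elementary matrices: their product is always invertible, since each factor is, and the inverse is the product of the inverses in reverse order.

```python
import sympy as sy

E1 = sy.Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])   # R1 <-> R2
E2 = sy.Matrix([[1, 0, 0], [0, 3, 0], [0, 0, 1]])   # R2 -> 3*R2
E3 = sy.Matrix([[1, 0, 0], [0, 1, 0], [-7, 0, 1]])  # R3 -> R3 - 7*R1

A = E1 * E2 * E3
A.det()                                    # -3, nonzero, so A is invertible
A.inv() == E3.inv() * E2.inv() * E1.inv()  # True
```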
If ${AB}={BA}=\mathbf{I}$, $ B$ is called the inverse of matrix $ A$, denoted as $ B= A^{-1}$.
NumPy has a convenient function, np.linalg.inv(), for computing inverse matrices. Generate $ A$
A = np.round(10*np.random.randn(5,5)); A
array([[ -1., -8., -6., -2., -16.], [ -1., -2., 8., -13., 4.], [ 1., 11., -13., 2., 3.], [ -1., -13., 6., 8., 9.], [-12., 9., -4., -1., 7.]])
Ainv = np.linalg.inv(A); Ainv
array([[-0.026, 0.013, 0.042, -0.005, -0.078], [-0.048, -0.038, -0.034, -0.064, 0.01 ], [-0.05 , -0.034, -0.096, -0.043, 0.003], [-0.026, -0.08 , -0.04 , -0.002, 0.006], [-0.015, 0.041, 0.055, 0.049, -0.002]])
Verify if they are truly inverse of each other
A@Ainv
array([[ 1., 0., -0., 0., -0.], [-0., 1., 0., -0., 0.], [-0., 0., 1., 0., 0.], [ 0., 0., 0., 1., -0.], [ 0., -0., -0., 0., 1.]])
The -0. entries mean there are more digits after the decimal point, but they are omitted here.
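Rather than eyeballing the -0. entries, we can compare the product with the identity using np.allclose, which tests equality up to floating-point tolerance (here we hard-code the $A$ from the printout above for reproducibility):

```python
import numpy as np

A = np.array([[ -1.,  -8.,  -6.,  -2., -16.],
              [ -1.,  -2.,   8., -13.,   4.],
              [  1.,  11., -13.,   2.,   3.],
              [ -1., -13.,   6.,   8.,   9.],
              [-12.,   9.,  -4.,  -1.,   7.]])
Ainv = np.linalg.inv(A)

np.allclose(A @ Ainv, np.eye(5))  # True
np.allclose(Ainv @ A, np.eye(5))  # True
```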
A convenient way of calculating the inverse is to construct the augmented matrix $[ A\,|\,\mathbf{I}]$, then multiply it by a series of elementary matrices $ E$ (elementary row operations) until $A$ is in reduced row echelon form. If $A$ has full rank, this means $ A \rightarrow \mathbf{I}$, and the $\mathbf{I}$ on the RHS of the augmented matrix is automatically converted into $ A^{-1}$.
We can show this with SymPy's .rref() function on the augmented matrix $[A\,|\,I]$.
AI = np.hstack((A, I)) # stack the matrix A and I horizontally
AI = sy.Matrix(AI); AI
AI_rref = AI.rref(); AI_rref
Extract the RHS block; this is $A^{-1}$.
Ainv = AI_rref[0][:,5:];Ainv # extract the RHS block
At the top of this file we defined a function that rounds floating-point numbers to a given number of digits; this is not strictly necessary, but it makes the output easier to read.
round_expr(Ainv, 4)
We can verify if $AA^{-1}=\mathbf{I}$
A = sy.Matrix(A)
round_expr(A*Ainv, 4)
We got $\mathbf{I}$, which means the RHS block is indeed $A^{-1}$.
Determine the values of $\lambda$ such that the matrix $$A=\left[ \begin{matrix}3 &\lambda &1\cr 2 & -1 & 6\cr 1 & 9 & 4\end{matrix}\right]$$ is not invertible.
Again, we use SymPy to solve the problem.
lamb = sy.symbols('lamda') # SymPy will automatically render into LaTeX greek letters
A = np.array([[3, lamb, 1], [2, -1, 6], [1, 9, 4]])
I = np.eye(3); A
array([[3, lamda, 1], [2, -1, 6], [1, 9, 4]], dtype=object)
Form the augmented matrix.
AI = np.hstack((A, I))
AI = sy.Matrix(AI); AI
AI_rref = AI.rref()
AI_rref
For the matrix $A$ to be invertible, we notice there is one condition to be satisfied (it appears in the denominators): \begin{align} -6\lambda - 465 &\neq 0 \end{align}
Solve for $\lambda$'s.
sy.solvers.solve(-6*lamb-465, lamb)
So this is the value of $\lambda$ that makes the matrix non-invertible. Let's test with the determinant: if $| A|=0$, then the matrix is not invertible. Don't worry about determinants; we will get back to them.
A = np.array([[3, -155/2, 1], [2, -1, 6], [1, 9, 4]])
np.linalg.det(A)
The $| A|$ is $0$ (up to floating-point error).
So we have found the condition: as long as $\lambda \neq -\frac{155}{2}$, the matrix $A$ is invertible.
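Alternatively (a sketch of the determinant approach previewed above), we can compute $|A|$ symbolically and solve $|A| = 0$ directly; it yields the same critical value $\lambda = -\frac{155}{2}$:

```python
import sympy as sy

lamb = sy.symbols('lamda')
A = sy.Matrix([[3, lamb, 1], [2, -1, 6], [1, 9, 4]])

det_A = A.det().expand()
det_A                  # -2*lamda - 155
sy.solve(det_A, lamb)  # [-155/2]
```

Note that $-2\lambda - 155$ is just $-6\lambda - 465$ divided by $3$, so both give the same root.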
The first property is straightforward \begin{align} ABB^{-1}A^{-1}=AIA^{-1}=I=AB(AB)^{-1} \end{align}
The trick for the second property is to show that $$ A^T(A^{-1})^T = I $$ We can use the property of transpose $$ A^T(A^{-1})^T=(A^{-1}A)^T = I^T = I $$
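A quick symbolic check of the second property on a generic $2\times 2$ matrix (a sketch, assuming $ad - bc \neq 0$ so the inverse exists):

```python
import sympy as sy

a, b, c, d = sy.symbols('a b c d')
A = sy.Matrix([[a, b], [c, d]])

# (A^T)^{-1} and (A^{-1})^T agree entry by entry
diff = (A.T.inv() - A.inv().T).applyfunc(sy.simplify)
diff  # Matrix([[0, 0], [0, 0]])
```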
The third property requires showing $$ A^{-1}B = (A^{-1}B)^T $$ Again use the property of transpose, together with the symmetry of $A$ and $B$: $$ (A^{-1}B)^{T}=B^T(A^{-1})^T=B(A^T)^{-1}=BA^{-1} $$ We use the $AB = BA$ condition to proceed: \begin{align} AB&=BA\\ A^{-1}ABA^{-1}&=A^{-1}BAA^{-1}\\ BA^{-1}&=A^{-1}B \end{align} Then plugging into the previous equation, we have $$ (A^{-1}B)^{T}=BA^{-1}=A^{-1}B $$
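A numeric sketch of the third property: here $A$ is a hand-picked symmetric invertible matrix, and $B = A^2 + 2A$ is a symmetric matrix that commutes with $A$ (any polynomial in $A$ does), so $A^{-1}B$ should come out symmetric.

```python
import numpy as np

A = np.array([[2., 1.], [1., 3.]])  # symmetric, det = 5, so invertible
B = A @ A + 2 * A                   # symmetric and commutes with A

np.allclose(A @ B, B @ A)           # True: they commute
M = np.linalg.inv(A) @ B
np.allclose(M, M.T)                 # True: A^{-1}B is symmetric
```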