This notebook will follow section 4.2 Operator norms in Mathematical Methods and Algorithms for Signal Processing by Todd K. Moon.
Before learning about matrix norms, it is beneficial to look at the definition of a linear operator norm. We will use this definition to analyze the geometry of an operator's action and the boundedness of an operator. Then we will look at matrix norms and finish with an example.
There are many ways of defining this norm, and one way to think about it is how much the operator can change the length of the vector it operates on.
Definition. Let $X$ and $Y$ be normed vector spaces, and let $A: X \to Y$ be a linear operator. The $p$ operator norm (also called the $p$ norm or $l_p$ norm) of $A$ is

$$\|A\|_p = \sup_{x \in X,\, x \neq 0} \frac{\|Ax\|_p}{\|x\|_p} = \sup_{x \in X,\, \|x\|_p = 1} \|Ax\|_p$$

where $\|\cdot\|_p$ is the $p$ norm.
A good way to visualize how an operator changes the length of a vector is to look at how the matrix $A$, whose subordinate matrix norm is $\|A\|_p$, transforms the unit circle.

Let $x \in X$ with $\|x\|_p = 1$ and let $A \in \mathbb{R}^{2 \times 2}$. Then $\|A\|_p = \max \|y\|_p$, where $y = Ax \in Y$.
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt

def visualize(A = np.matrix([[2,0],[0,0.5]]), norm_type = 2):
    # Construct x s.t. ||x||_1 = 1 (the unit "diamond")
    if norm_type == 1:
        num_points = 5
        x1 = np.concatenate((np.linspace(0,1,num_points), np.linspace(1,0,num_points),
                             np.linspace(0,-1,num_points), np.linspace(-1,0,num_points)))
        x2 = np.concatenate((np.linspace(-1,0,num_points), np.linspace(0,1,num_points),
                             np.linspace(1,0,num_points), np.linspace(0,-1,num_points)))
        x = np.vstack([x1,x2])
    # Construct x s.t. ||x||_2 = 1 (the unit circle)
    elif norm_type == 2:
        angle = np.linspace(-np.pi, np.pi, 20)
        x = np.vstack([np.sin(angle), np.cos(angle)])
    # Construct x s.t. ||x||_inf = 1 (the unit square)
    else:
        num_points = 5
        x1 = np.concatenate((np.linspace(-1,1,num_points), np.linspace(1,1,num_points),
                             np.linspace(1,-1,num_points), np.linspace(-1,-1,num_points)))
        x2 = np.concatenate((np.linspace(-1,-1,num_points), np.linspace(-1,1,num_points),
                             np.linspace(1,1,num_points), np.linspace(1,-1,num_points)))
        x = np.vstack([x1,x2])
    # Compute y = Ax
    y = A*x
    # Get maximum x and y axis values for plotting
    x_max = max(abs(np.concatenate((x[0,:], np.array(y[0,:]).reshape(-1,)))))
    y_max = max(abs(np.concatenate((x[1,:], np.array(y[1,:]).reshape(-1,)))))
    dim_max = x_max if x_max > y_max else y_max
    dim_max += 0.1
    # Plot the unit ball (left) and its image under A (right)
    plt.figure()
    plt.subplot(131, aspect='equal')
    plt.plot(x[0,:], x[1,:])
    plt.ylim([-dim_max, dim_max])
    plt.xlim([-dim_max, dim_max])
    plt.xlabel('$x_1$')
    plt.ylabel('$x_2$')
    plt.title('x')
    plt.subplot(133, aspect='equal')
    plt.plot(np.array(y[0,:]).reshape(-1,), np.array(y[1,:]).reshape(-1,))
    plt.ylim([-dim_max, dim_max])
    plt.xlim([-dim_max, dim_max])
    plt.xlabel('$y_1$')
    plt.ylabel('$y_2$')
    plt.title('y=Ax')
visualize(norm_type = 2)
print("Matrix Norm", np.linalg.norm(np.matrix([[2,0],[0,0.5]])))
Matrix Norm 2.0615528128088303
By visual inspection of the plots above, we can see how $A$ stretches the elements of $X$ when mapping them to elements of $Y$. Notice that the elements along the $x_1$ axis are stretched by a factor of 2 and the elements along the $x_2$ axis are squeezed by a factor of 0.5. Since the linear operator $A$ stretches the elements of $\{x \in X : \|x\| = 1\}$ by at most a factor of 2, $\|A\| = 2$. It is now apparent that the norm doesn't indicate how the linear operator stretches every element of $X$; rather, it indicates the maximum amount by which it can stretch an element of $X$, i.e.

$$\|Ax\| \leq \|A\|\,\|x\|, \quad \forall x \in X$$

Knowing the most a linear operator can stretch the elements it operates on helps us determine whether a bounded input $x$ will produce a bounded output $y$.
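As a quick numerical illustration of this bound (a minimal sketch using NumPy; the matrix and the random test vectors below are arbitrary choices, not from the book), we can check that $\|Ax\|_2 \leq \|A\|_2 \|x\|_2$ for a few random vectors:

import numpy as np

# Arbitrary example operator (the same A used in the visualization above)
A = np.array([[2.0, 0.0], [0.0, 0.5]])
A_norm = np.linalg.norm(A, 2)            # operator 2-norm (largest singular value)

np.random.seed(0)
for _ in range(5):
    x = np.random.randn(2)
    lhs = np.linalg.norm(np.dot(A, x))   # ||Ax||_2
    rhs = A_norm*np.linalg.norm(x)       # ||A||_2 ||x||_2
    print("||Ax|| = %.4f <= ||A|| ||x|| = %.4f" % (lhs, rhs))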
The norm of an operator can be used to determine whether a linear transformation is bounded, i.e. whether $A$ maps every element of $X$ with finite norm to an element of $Y$ with finite norm.
Definition 4.3 If the norm of a transformation is finite, the transformation is said to be bounded.
Using example 4.2.2 from the book.
Let $A: C^1[0,1] \to C[0,1]$ be the operator

$$Ax = \frac{d}{dt}x$$

The function $x(t) = \sin(\omega_0 t) \in C^1[0,1]$ has uniform norm 1 for any value of $\omega_0$, but

$$\|Ax\| = \max_{t \in [0,1]} \omega_0 |\cos \omega_0 t|$$

can be made arbitrarily large by choosing $\omega_0$ arbitrarily large. Thus the operator $A$ is not bounded.
from __future__ import print_function
from ipywidgets import *
import time

fig, ax = plt.subplots()

# Create x(t)
dt = 0.001
t = np.arange(0.0, 1.0, dt)
w_0 = 10
x = np.sin(w_0*t)

# Create Ax(t)
y = w_0*np.cos(w_0*t)

l, = ax.plot(t, y, lw=2, color='red')
plt.ion()

def update(w_0 = 10):
    l.set_ydata(w_0*np.cos(w_0*t))
    ax.axis(ymax=w_0, ymin=-w_0)
    fig.canvas.draw()

interact(update, w_0 = widgets.IntSlider(min=10, max=1000, step=10, value=10))
By moving the slider in the figure above, you can see that as $\omega_0 \to \infty$, $\|Ax\| \to \infty$ while $\|x\|$ stays bounded by 1. Thus the norm of the operator is not finite, even though every input $x$ is. The example at the end will show the importance of boundedness.
Suppose you have a system described by $h(t)$ that amplifies an audio input $x \in X$. Unfortunately, clipping that is not modeled by the system occurs whenever the output $\hat{y}$ exceeds a certain threshold $y_m$, i.e.

$$\hat{y} = h(t) * x$$
$$y = \operatorname{sat}(\hat{y}, y_m)$$

Whenever clipping occurs, the signal is degraded. To prevent clipping, we want to design $h(t)$ so that clipping never occurs, and verify that $\|Ax\| \leq y_m \ \forall x \in X$. This could be very difficult to verify by brute force. Fortunately, there is a simpler way.
Suppose you knew the largest norm among all the elements of $X$ and you knew $h(t)$; then you could deduce

$$\|h(t) * x\|_p \leq \|h(t)\|_p\,\|x_m\|_p \quad \forall x \in X$$

where $\|x_m\| \geq \|x\| \ \forall x \in X$. This idea uses the following properties.

Definition. Let $X$ and $Y$ be $l_p$ or $L_p$ spaces, and let $A$ and $B$ be linear operators. The $p$ norm has the property that for all $x \in X$,

$$\|Ax\|_p \leq \|A\|_p\,\|x\|_p$$

Also, the $p$ norm satisfies the submultiplicative property

$$\|AB\|_p \leq \|A\|_p\,\|B\|_p$$

From the previous equations we can deduce

$$\|ABx\|_p \leq \|A\|_p\,\|B\|_p\,\|x\|_p$$
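As a small numerical check of these properties (a sketch with arbitrary random matrices, not an example from the book), using NumPy's matrix 2-norm:

import numpy as np

np.random.seed(0)
A = np.random.randn(3, 3)
B = np.random.randn(3, 3)
x = np.random.randn(3)

# ||Ax|| <= ||A|| ||x||  (consistency with the vector norm)
print(np.linalg.norm(np.dot(A, x)) <= np.linalg.norm(A, 2)*np.linalg.norm(x))
# ||AB|| <= ||A|| ||B||  (submultiplicative property)
print(np.linalg.norm(np.dot(A, B), 2) <= np.linalg.norm(A, 2)*np.linalg.norm(B, 2))
# ||ABx|| <= ||A|| ||B|| ||x||
print(np.linalg.norm(np.dot(np.dot(A, B), x)) <= np.linalg.norm(A, 2)*np.linalg.norm(B, 2)*np.linalg.norm(x))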
So far the discussion has applied to any linear operator. The rest of this notebook specializes to the matrix $p=1$, $p=2$, and $p=\infty$ norms and the Frobenius norm.

We will start by creating a basic Python matrix class and add to it as the different matrix norms are introduced.
class Matrix:
    def __init__(self, A=np.eye(3)):
        self.A = A
Let $A \in \mathbb{C}^{n \times m}$, $y \in \mathbb{C}^n$, and $x \in \mathbb{C}^m$. The infinity norm is defined as

$$\|A\|_\infty = \max_{\|x\|_\infty = 1} \|Ax\|_\infty = \max_i \sum_j |a_{ij}|$$

i.e. the maximum absolute row sum.
class Matrix(Matrix):
    def pInfNorm(self):
        dim = self.A.shape
        pInf_max = 0
        # Find the maximum absolute row sum
        for i in range(dim[0]):
            r_sum = np.sum(abs(self.A[i,:]))
            if r_sum > pInf_max:
                pInf_max = r_sum
        return pInf_max
M = np.matrix([[4,2],[0,5]])
m = Matrix(M)
print("Infinity Norm:", m.pInfNorm())
visualize(m.A, norm_type=3)
Infinity Norm: 6
By inspecting the figures above, you can see that the infinity norm measures the largest magnitude any coordinate of $y = Ax$ reaches as $x$ ranges over the unit square (the unit ball of the $\infty$ norm).
The $p=1$ norm is defined as

$$\|A\|_1 = \max_{\|x\|_1 = 1} \|Ax\|_1 = \max_j \sum_i |a_{ij}|$$

i.e. the maximum absolute column sum.
class Matrix(Matrix):
    def pOneNorm(self):
        dim = self.A.shape
        pOne_max = 0
        # Find the maximum absolute column sum
        for i in range(dim[1]):
            c_sum = np.sum(abs(self.A[:,i]))
            if c_sum > pOne_max:
                pOne_max = c_sum
        return pOne_max
M = np.matrix([[4,2],[0,5]])
m = Matrix(M)
print("norm p=1:", m.pOneNorm())
visualize(m.A, norm_type=1)
norm p=1: 7
The $p=2$ norm is defined as

$$\|A\|_2 = \sqrt{\rho(A^H A)}$$

where

$$\rho(B) = \max_i |\lambda_i(B)|$$

is the spectral radius; i.e. $\|A\|_2$ is the square root of the largest eigenvalue of $A^H A$.

Note: the book derives this norm using optimization and eigenvalue results that have not yet been covered, so the derivation is skipped here.
class Matrix(Matrix):
    def pTwoNorm(self):
        # Compose A^H A
        B = self.A.getH()*self.A
        # Find the eigenvalues
        w, v = np.linalg.eig(B)
        # Take the square root of the largest eigenvalue
        rho = np.sqrt(w.max())
        return rho
M = np.matrix([[4,2],[0,5]])
m = Matrix(M)
print("norm p=2:", m.pTwoNorm())
visualize(m.A, norm_type=2)
norm p=2: 5.727806217396338
The figures above show that $A$ maps the unit circle to an ellipse; the longest semi-axis of that ellipse has length $\|A\|_2$ and points along the eigenvector of $AA^H$ (equivalently, the left singular vector of $A$) associated with the largest eigenvalue.
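Equivalently, $\|A\|_2$ is the largest singular value of $A$, since the singular values are the square roots of the eigenvalues of $A^H A$. As a quick sanity check of pTwoNorm (using NumPy's SVD; this cross-check is an addition, not from the book):

# The largest singular value of M should match m.pTwoNorm()
U, s, Vh = np.linalg.svd(np.array([[4, 2], [0, 5]]))
print("Largest singular value:", s.max())
print("pTwoNorm:              ", m.pTwoNorm())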
The Frobenius Norm is not a p norm. This norm is often used in matrix analysis since it is relatively easy to compute. It is defined as
$$\|A\|_F = \sqrt{\operatorname{tr}(A^H A)} = \sqrt{\sum_{i,j} |a_{ij}|^2}$$

i.e. the square root of the sum of the squared magnitudes of all the elements of $A$.
class Matrix(Matrix):
    def frobNorm(self):
        # Square the magnitude of each element of A (abs handles complex entries)
        A_2 = np.square(np.abs(self.A))
        # Sum all of the squared elements
        f_sum = np.sum(A_2)
        # Take the square root of the sum
        f_norm = np.sqrt(f_sum)
        return f_norm
M = np.matrix([[4,2],[0,5]])
m = Matrix(M)
print("norm p=2:", m.frobNorm())
norm p=2: 6.708203932499369
The Frobenius norm is often used to measure how similar two matrices $A$ and $B$ are, via $\|A - B\|_F$.
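For example (a small illustrative sketch with arbitrary matrices), a slight perturbation of $A$ yields a small $\|A - B\|_F$, which we can also cross-check against NumPy's built-in Frobenius norm:

A = np.matrix([[4, 2], [0, 5]])
B = A + 0.1*np.matrix([[1, -1], [2, 0]])   # a slightly perturbed copy of A

print("||A - B||_F:", Matrix(A - B).frobNorm())
print("numpy check:", np.linalg.norm(A - B, 'fro'))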
The following relationships exist between the norms of an $m \times n$ matrix $A$:

$$\|A\|_2 \leq \|A\|_F \leq \sqrt{n}\,\|A\|_2$$
$$\max_{i,j} |a_{ij}| \leq \|A\|_2 \leq \sqrt{mn}\,\max_{i,j} |a_{ij}|$$
$$\frac{1}{\sqrt{n}}\,\|A\|_\infty \leq \|A\|_2 \leq \sqrt{m}\,\|A\|_\infty$$
$$\frac{1}{\sqrt{m}}\,\|A\|_1 \leq \|A\|_2 \leq \sqrt{n}\,\|A\|_1$$
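A quick numerical spot-check of these bounds (a sketch on an arbitrary random matrix; each line should print True):

np.random.seed(1)
m_dim, n_dim = 3, 4
A = np.random.randn(m_dim, n_dim)

norm_2 = np.linalg.norm(A, 2)
norm_F = np.linalg.norm(A, 'fro')
norm_1 = np.linalg.norm(A, 1)         # max absolute column sum
norm_inf = np.linalg.norm(A, np.inf)  # max absolute row sum
a_max = np.abs(A).max()

print(norm_2 <= norm_F <= np.sqrt(n_dim)*norm_2)
print(a_max <= norm_2 <= np.sqrt(m_dim*n_dim)*a_max)
print(norm_inf/np.sqrt(n_dim) <= norm_2 <= np.sqrt(m_dim)*norm_inf)
print(norm_1/np.sqrt(m_dim) <= norm_2 <= np.sqrt(n_dim)*norm_1)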
Consider a second-order mass-spring system

$$m\ddot{z} + b\dot{z} + kz = F$$

where $m$ is the mass, $b$ the damping coefficient, $k$ the spring constant, $F$ the applied force, and $z$ the displacement of the mass from equilibrium.
The state space representation of the system can be expressed as
$$\dot{x} = Ax + Bu$$

where $A$ represents the internal dynamics of the system, $B$ the external (input) dynamics, $x$ the position and velocity of the mass, and $u$ the input. If we ignore the external dynamics, the state-space representation reduces to

$$\dot{x} = Ax$$

where

$$A = \begin{bmatrix} 0 & 1 \\ -k/m & -b/m \end{bmatrix}$$

The solution to this system is

$$x(t) = e^{At} x_0$$

We can analyze the norm of $A$ to get a sense of how quickly the perturbed system returns to equilibrium: the larger $\|A\|$ is, the faster the system responds. In essence, $\|A\|$ bounds how fast the state can change, since $\|\dot{x}\| = \|Ax\| \leq \|A\|\,\|x\|$. You can interact with the graph below to see the effect of the matrix norm $\|A\|$ on the system's response.
from scipy.linalg import expm

fig2, ax2 = plt.subplots()
k = 10
b = 5

# Create time
dt = 0.2
t_end = 10
t = np.arange(0.0, t_end, dt)

# Initial state of the system
x_0 = np.matrix([[10],[0]])

# Calculate the dynamics
def calcDynamics(time, A):
    # State dynamics
    z = []
    z_dt = []
    # Calculate the state at each time step
    for t in time:
        x_t = expm(A*t)*x_0
        z.append(x_t[0,0].item())
        z_dt.append(x_t[1,0].item())
    return z, z_dt

l_dynamics, = ax2.plot(t, t, lw=2, color='blue')

def updateDynamics(m = 1):
    if (m < 1e-3):
        m = 1e-3
    # Internal system dynamics
    A = np.matrix([[0,1],[-k/m,-b/m]])
    A_norm = np.linalg.norm(A)
    z, z_dt = calcDynamics(t, A)
    l_dynamics.set_ydata(z)
    ax2.axis(ymax=max(z)+1, ymin=min(z)-1)
    fig2.canvas.draw()
    plt.title("Matrix Norm %1.3f" % A_norm)

interact(updateDynamics, m = widgets.FloatSlider(min=1e-3, max=20, step=1e-2, value=1))