Lecture 3.2: Scalar product, matrix norms, unitary matrices

Matrices and norms

How to measure distances between matrices?

A trivial answer is that there is no big differences between matrices and vectors, and here comes the Frobenius norm of the matrix: $$ \Vert A \Vert_F = \Big(\sum_{i=1}^n \sum_{j=1}^m |a_{ij}|^2\Big)^{1/2} $$

Matrix norms

$\Vert \cdot \Vert$ is called a matrix norm if it is a vector norm on the linear space of $n \times m$ matrices, and it also is consistent with the matrix-by-matrix product, i.e.

$$\Vert A B \Vert \leq \Vert A \Vert \Vert B \Vert$$

The multiplicative property is needed in many places, for example in the estimates for the error of solution of linear systems (we will cover this subject later).

Can you think of some matrix norms, and is Frobenius norm a matrix norm?

Operator norms

The most important class of the norms is the class of operator norms. Mathematically, they are defined as

$$ \Vert A \Vert_* = \sup_{x \ne 0} \frac{\Vert A x \Vert_*}{\Vert x \Vert_*}, $$

where $\Vert \cdot \Vert_*$ is a vector norm.

Matrix p-norms

It is not diffcult to show that operator norm is a matrix norm. Among all operator norms $p$-norms are used, where $p$-norm is used as the vector norm. Among all $p$-norms three norms are the most common ones:

  • $p = 2, \quad$ spectral norm, denoted by $\Vert A \Vert_2$.
  • $p = \infty, \quad \Vert A \Vert_{\infty} = \max_i \sum_j |A_{ij}|$.
  • $p = 1, \quad \Vert A \Vert_{1} = \max_j \sum_i |A_{ij}|$.

Spectral norm

Spectral norm, $\Vert A \Vert_2$ is undoubtedly the most used matrix norm. It can not be computed directly from the entries using a simple formula, like the Euclidean norm, however, there are efficient algorithm to compute it. It is directly related to the singular value decomposition (SVD) of the matrix. It holds

$$ \Vert A \Vert_2 = \sigma_1(A) $$

where $\sigma_1(A)$ is the largest singular value of the matrix $A$. We will soon learn all about this. Meanwhile, we can already compute the norm in Python.

In [32]:
import numpy as np
n = 100
a = np.random.randn(n, n) #Random n x n matrix
s1 = np.linalg.norm(a, 2) #Spectral
s2 = np.linalg.norm(a, 'fro') #Frobenius
s3 = np.linalg.norm(a, 1) #1-norm
s4 = np.linalg.norm(a, np.inf) #It was trick to find the infinity
print 'Spectral:', s1, 'Frobenius:', s2, '1-norm', s3, 'infinity', s4
Spectral: 19.4195674332 Frobenius: 99.5002249595 1-norm 94.7398844201 infinity 96.8093380047

Scalar product

If norm is a measure of distances, then the scalar product takes angle into account.

The scalar product is defined as $$ (x, y) = \sum_{i=1}^n \overline{x}_i y_i, $$ where $\overline{x}$ denotes the complex conjugate of $x$. The Euclidean norm is then

$$ \Vert x \Vert^2 = (x, x), $$

or it is said the the norm is induced by scalar product.

Remark. For the angle between two vectors is defined as $$ \cos \phi = \frac{(x, y)}{\Vert x \Vert \Vert y \Vert} $$

An important property of the scalar product is the Cauchy-Bunyakovski inequality: $$ |(x, y)| \leq \Vert x \Vert \Vert y \Vert, $$ and thus the angle between two vectors is defined properly.

The scalar product can be written as a matrix-by-matrix product
$$ (x, y) = x^* y, $$ where $^*$ is a conjugate transpose of the matrix:
$$ B = A^*, \quad B_{ij} = \overline{A_{ji}}. $$

Norm conservation

For stability it is really important that the error does not grow. Suppose that you approximately get a vector,

$$ \Vert x - \widehat{x} \Vert \leq \varepsilon. $$

Let final result is (some) linear transformation of $x$:
$$ y = Ux, \quad \widehat{y} = U \widehat{x}. $$ If we want to estimate a difference between $\widehat{y}$ and $y$:

$$ \Vert y - \widehat{y} \Vert = \Vert U ( x - \widehat{x}) \Vert \leq \Vert U \Vert \varepsilon. $$

Matrices, preserving the norm

The question is for which kind of matrices the norm of the vector will not change.

For the euclidean norm this produces a very important class of matrices: unitary (or orthogonal in the real case) matrices.

Unitary (orthogonal) matrices

Let $U$ be an $n \times r$ matrix, and $\Vert U z \Vert = \Vert z \Vert$ for all $z$. This can happen if and only if

$$ U^* U = I, $$

where $I$ is an identity matrix

Indeed, $$\Vert Uz \Vert^2 = (Uz, Uz) = (Uz)^* Uz = z^* (U^* U) z = z^* z,$$

which can also hold if $U^* U = I$.

Unitary matrices

In the real case, when $U^* = U^{\top}$, the matrix is called orthogonal.

Are there many unitary matrices? First of all, a product of two unitary matrices is a unitary matrix:

$$(UV)^* UV = V^* (U^* U) V = V^* V = I,$$

thus if we give some non-trivial examples of unitary matrices, we will be able to get any unitary transformation.

Examples of unitary matrices

There are two important classes of unitary matrices, using those we can make any unitary matrix

  1. Householder matrices
  2. Givens (Jacobi) matrices

Householder matrices

Householder matrix is the matrix of the form
$$H = I - 2 vv^*,$$ where $u$ is an $n \times 1$ matrix and $v^* v = 1$. Can you show that $H$ is unitary? It is also a reflection.
A simple proof: $H^* H = (I - 2 vv^*)(I - 2 v v^*) = I$

Givens (Jacobi) matrix

A Givens matrix is a matrix

$$ A = \begin{bmatrix} \cos \alpha & \sin \alpha \\ -\sin \alpha & \cos \alpha \end{bmatrix}, $$

which is a rotation. For a general case, we select two $(i, j)$ planes and rotate only in those:

$$ x'_i = \cos \alpha x_i + \sin \alpha x_j, \quad x'_j = -\sin \alpha x_i + \cos\alpha x_j, $$

with all other $x_i$ remain unchanged.

Summary on unitary matrices

  • Unitary matrices preserve the norm
  • There are two "basic" classes of unitary matrices, Householder and Givens.
  • Every unitary matrix can be represented as a product of those.

Take home message

  • Matrix multiplication, idea of blocking, memory hiearchy
  • Scalar product, unitary matrices, basic classes of unitary matrices

Next week

  • TA week
  • You got PSet 1
  • Think of course projects (i.e. oil&gas, power networks, social networks)
  • There is a course on Mathematics of the Internet going now at Skoltech, you are welcome to visit.
In [14]:
from IPython.core.display import HTML
def css_styling():
    styles = open("./styles/custom.css", "r").read()
    return HTML(styles)
In [ ]: