%matplotlib inline
import numpy as np;
import matplotlib
import matplotlib.pyplot as plt
$\LaTeX \text{ commands here} \newcommand{\R}{\mathbb{R}} \newcommand{\im}{\text{im}\,} \newcommand{\norm}[1]{||#1||} \newcommand{\inner}[1]{\langle #1 \rangle} \newcommand{\span}{\mathrm{span}} \newcommand{\proj}{\mathrm{proj}} \newcommand{\OPT}{\mathrm{OPT}} \newcommand{\grad}{\nabla} \newcommand{\eps}{\varepsilon} $
Given $A \in \R^{m \times n}$ and $b \in \R^{m}$, want to compute $x \in \R^{n}$ such that $Ax = b$.
For now, we will focus on square matrices $A \in \R^{m \times m}$, which we also assume to be invertible.
You already know of a few algorithms for solving linear systems...
Part A: Set $x_k = \frac{b_k}{a_{kk}}$. Requires $O(m)$ operations.
Part B: Use "back substitution", requiring $O(m^2)$ operations. Starting with $x_m = b_m / a_{mm}$, the next equation from the bottom gives $$ a_{m-1,m-1} x_{m-1} + a_{m-1,m} x_m = b_{m-1} \\ \implies x_{m-1} = (b_{m-1} - a_{m-1,m} x_m) / a_{m-1,m-1} $$ Continuing in this fashion, we find $$ x_j = \frac{1}{a_{jj}} \left( b_j - \sum_{k=j+1}^m x_k a_{jk} \right) $$
The idea is to approximate $A^{-1}$ by splitting $A = M - N$ into two parts,
For example, splitting $A = D + L + U$ into the diagonal, strictly-upper-triangular, and strictly-lower-triangular parts respectively,
Let $x^* \in \R^m$ be the exact solution to $A x^* = b$. Then $$ \begin{align} Ax^* = (M-N)x^* &= b \\ Mx^* &= Nx^* + b \\ x^* &= M^{-1}(Nx^* + b) \end{align} $$
Therefore, $x^*$ is a fixed point of $\Phi(x) = M^{-1}(Nx + b)$. We'll try to use fixed point iteration to approximate $x^*$:
$$ x_{n+1} = \Phi(x_n) = M^{-1}(Nx + b) $$Want to solve: $Ax = b$. Fixed point iteration: split $A = M - N$, where $M$ is easy-to-invert, $N$ is aribtrary. The fixed point iteration tries to find approximate $x^*$ via update: $$ x_{t+1} = \Phi(x_t) = M^{-1}(Nx_t + b) $$
Recall that the spectral radius of a matrix $M$ is $\rho(M) := \max\{|\lambda_1(M)|, \ldots, |\lambda_n(M)|\}$.
(Part A): Show that if $\rho(M^{-1} N) > 1$, then the fixed point scheme above diverges for some initial $x_0$.
Hint: think of a good vector to start with...
Let $x_0 = v$ be an eigenvector for $M^{-1}N$ whose eigenvalue $\lambda$'s absolute value is larger than 1. If we assume that $b = 0$, then fixed point iteration is going to generate the point $x_t = (M^{-1}N)^t x_0 = \lambda^t x_0$. The former clearly diverges!
Want to solve: $Ax = b$. Fixed point iteration: split $A = M - N$, where $M$ is easy-to-invert, $N$ is aribtrary. The fixed point iteration tries to find approximate $x^*$ via update: $$ x_{t+1} = \Phi(x_t) = M^{-1}(Nx_t + b) $$
Recall that the spectral radius of a matrix $M$ is $\rho(M) := \max\{|\lambda_1(M)|, \ldots, |\lambda_n(M)|\}$.
(Part B): Show that if $\rho(M^{-1} N) < 1$, the fixed point scheme above converges to the solution $x$ for any initial $x_0$.
Hint: Let's say $x^*$ satisfies the linear system, $Ax^* = b$. You want to look at how the error terms $\text{err}_t := x_t - x^*$ evolve after each iteration. Can you show $\|\text{err}_t\|$ is shrinking from round to round?
If you consider the error term $\text{err}_t := x_t - x^*$, then it is easy to check that $$\text{err}_{t+1} = (M^{-1}N) \text{err}_t$$ Now let us see what happens to the norm of $\|\text{err}_{t+1}\|$: $$\|\text{err}_{t+1}\| = \|(M^{-1}N) \text{err}_t\| = \|\text{err}_t\| \|(M^{-1}N) \frac{\text{err}_t}{\|\text{err}_t\|}\| \leq \|\text{err}_t\| \max_{\|z\| = 1} \|M^{-1}N z\|$$ However, $\max_{\|z\| = 1} \|M^{-1}N z\|$ is indeed the spectral radius $\rho(M^{-1}N)$, hence we have shown that $$ \|\text{err}_{t+1}\| \leq \|\text{err}_t\| \rho(M^{-1}N), $$ which implies that the error term is always shrinking at a geometric rate!
A matrix $A \in \R^{m \times m}$ is strictly diagonally dominant if within every row, the diagonal entry is larger in absolute value than than the sum of the absolute values of the other terms,
$$ |a_{ii}| > \sum_{j \neq i} |a_{ij}| $$Hint: Use the Gershgorin Circle Theorem, which states that every eigenvalue of $A$ lies within at least one of the discs $D(a_{ii}, R_i) \subset \mathbb{C}$ of radius $R_i=\sum_{j\neq i} |a_{ij}|$ centered at $a_{ii}$ in the complex plane.
Solve the system $Au = 0$ where $A$ has the form:
def poisson_matrix(n):
# tridiagonal matrix
result = -2 * np.eye(n);
result += np.diag(np.ones(n-1), -1);
result += np.diag(np.ones(n-1), +1);
return result;
poisson_matrix(5)
array([[-2., 1., 0., 0., 0.], [ 1., -2., 1., 0., 0.], [ 0., 1., -2., 1., 0.], [ 0., 0., 1., -2., 1.], [ 0., 0., 0., 1., -2.]])
def jacobi(A, b, x0, max_iter=100):
x = np.copy(x0);
for k in range(max_iter):
x = # FILL IN!!
return x;
File "<ipython-input-170-3419913a410a>", line 5 x = # FILL IN!! ^ SyntaxError: invalid syntax
n = 100
A = poisson_matrix(n);
# sinusoidal error
k = 2;
x = np.linspace(0, 1, n);
u0 = np.sin(2*np.pi*x * k)
plt.plot(x,u0)
[<matplotlib.lines.Line2D at 0x7ffbb31ff080>]
u_jacobi = jacobi(A, np.zeros(n), u0, max_iter=10)
plt.plot(np.arange(n), u0, label="initial")
plt.plot(np.arange(n), u_jacobi, label="jacobi")
plt.legend()
<matplotlib.legend.Legend at 0x7ffbb323c358>