We've discussed three iterative methods:
For Bisection, we saw that $p_n = p + O(1/2^n)$. For fixed-point iteration, we have $p_n = p + O(K^n)$, where $K < 1$ is an upper bound on the absolute value of the derivative of the iteration function. And for Newton's method, $p_n = p + O(K^n)$ for every $0 < K < 1$, i.e. faster than any linear rate.
We introduce some definitions to classify these methods.
Suppose $\{p_n\}_{n=0}^\infty$ is a sequence that converges to $p$ with $p_n \neq p$ for all $n$. If positive constants $\lambda$ and $\alpha$ exist such that
$$ \lim_{n \to \infty} \frac{|p_{n+1}-p|}{|p_{n}-p|^\alpha} = \lambda,$$then $\{p_n\}_{n=0}^\infty$ converges to $p$ of order $\alpha$, with asymptotic error constant $\lambda$.
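For illustration (not part of the text), the order $\alpha$ can be estimated from consecutive errors $e_n = |p_n - p|$ via $\alpha \approx \log(e_{n+1}/e_n)/\log(e_n/e_{n-1})$. A minimal sketch, assuming the example iteration $p_{n+1} = \cos(p_n)$:

```python
import math

# Estimate the order of convergence from consecutive errors e_n = |p_n - p|
# via  alpha ~ log(e_{n+1}/e_n) / log(e_n/e_{n-1}).
# Assumed example: fixed-point iteration p_{n+1} = cos(p_n),
# whose fixed point is p ~ 0.7390851332151607.

p = 0.7390851332151607
x = 1.0
errors = []
for _ in range(15):
    x = math.cos(x)
    errors.append(abs(x - p))

for e0, e1, e2 in zip(errors, errors[1:], errors[2:]):
    alpha = math.log(e2 / e1) / math.log(e1 / e0)

print(f"estimated order: {alpha:.3f}")   # tends to 1: linear convergence
```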
Two cases are of particular interest: if $\alpha = 1$ (and $\lambda < 1$), the sequence is said to converge linearly, and if $\alpha = 2$, quadratically.
Assume $p_n \to p$ linearly with constant $\lambda = .5$, and assume $q_n \to p$ quadratically with the same constant. Then
$$ \lim_{n \to \infty} \frac{|p_{n+1} - p|}{|p_{n}-p|} = .5, \quad \lim_{n \to \infty} \frac{|q_{n+1} - p|}{|q_{n}-p|^2} = .5.$$Let's assume that $|p_0 -p|, ~|q_0-p| < 1$. Then, for large $n$,
$$ \frac{|p_{n+1} - p|}{|p_{n}-p|} \approx .5, \quad \frac{|q_{n+1} - p|}{|q_{n}-p|^2} \approx .5,$$so that
$$|p_{n+1} - p|\approx (.5){|p_{n}-p|}, \quad |q_{n+1} - p|\approx (.5){|q_{n}-p|^2}.$$Iterating these relations gives $|p_n - p| \approx (.5)^n|p_0-p|$, while $|q_n - p| \approx (.5)^{2^n-1}|q_0-p|^{2^n}$: a linearly convergent sequence roughly halves its error at each step, while a quadratically convergent one squares it.

Let $g \in C[a,b]$ be such that $g(x) \in [a,b]$ for all $x \in [a,b]$. Suppose, in addition, that $g'$ exists on $(a,b)$ and there exists $0 \leq K < 1$ such that
$$ |g'(x)| \leq K \quad \text{for all } x \in [a,b]. $$If $p$ is the unique fixed point and $g'(p) \neq 0$, then for any $p_0 \in [a,b]$ the sequence $p_n = g(p_{n-1})$ converges to $p$ only linearly, with asymptotic error constant
$$ \lambda = |g'(p)|$$(if $p_n \neq p$ for all $n$).
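This can be checked numerically. As a sketch, take $g(x) = \cos(x)$ (an assumed example), whose fixed point is $p \approx 0.739085$; the error ratio should approach $|g'(p)| = \sin(p) \approx 0.6736$:

```python
import math

# Check that |p_{n+1}-p| / |p_n-p| -> |g'(p)| for g(x) = cos(x),
# an assumed example with fixed point p ~ 0.7390851332151607
# and g'(p) = -sin(p).

p = 0.7390851332151607
x = 0.5
ratios = []
for _ in range(25):
    x_new = math.cos(x)
    ratios.append(abs(x_new - p) / abs(x - p))
    x = x_new

print(f"observed ratio   : {ratios[-1]:.6f}")
print(f"|g'(p)| = sin(p) : {math.sin(p):.6f}")
```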
Based on the previous theorem for convergence of fixed-point iteration, we know that $p_{n} \to p$. It then follows that
$$ \lim_{n \to \infty} \frac{g(p_n)- g(p)}{p_n - p} = g'(p).$$Since $p_{n+1} = g(p_n)$ and $p = g(p)$, this says that $\lim_{n \to \infty} |p_{n+1}-p|/|p_n-p| = |g'(p)| \neq 0$, which is exactly linear convergence ($\alpha = 1$) with $\lambda = |g'(p)|$. It is immediately clear that something special happens when $g'(p) = 0$ at a fixed point $p$. The next theorem describes this behavior.
Let $p = (a+b)/2 \in (a,b)$ be a fixed point of $g \in C^2[a,b]$. Suppose that $g'(p) = 0$ and $|g'(x)| \leq K < 1$ for $x \in [a,b]$. For $p_0 \in [a,b]$, the sequence $p_n = g(p_{n-1})$ converges at least quadratically to $p$ with asymptotic error constant $\lambda = |g''(p)|/2$.
Let $x \in [a,b]$. By the Mean-Value Theorem, for some $\xi$ between $x$ and $p$
$$ g(x) = g(p) + g'(\xi)(x-p) = p + g'(\xi)(x-p).$$Then, using that $|g'(\xi)| \leq K < 1$ and $|x-p| \leq (b-a)/2$, $$g(x) = p + g'(\xi)(x-p) \leq p + |g'(\xi)||x-p| \leq (b+a)/2 + (b-a)/2 = b,\\ g(x) = p + g'(\xi)(x-p) \geq p - |g'(\xi)||x-p| \geq (b+a)/2 - (b-a)/2 = a.$$
So, $g(x) \in [a,b]$ for $x \in [a,b]$. We now know that $p_n \to p$, the unique fixed point.
By Taylor's theorem, for some $\xi$ between $p$ and $x$ (so, in particular, $\xi \in [a,b]$),
$$ g(x) = g(p) + g'(p)(x-p) + \frac{g''(\xi)}{2}(x-p)^2 = p + \frac{g''(\xi)}{2}(x-p)^2,$$using $g(p) = p$ and $g'(p) = 0$. Setting $x = p_{n}$, so that $g(p_n) = p_{n+1}$, we find, for some $\xi_n$ between $p$ and $p_n$,
$$\frac{p_{n+1}-p}{(p_n-p)^2} = \frac{g''(\xi_n)}{2}.$$Then $\xi_n \to p$ as $n \to \infty$ (it is closer to $p$ than $p_n$ is) and so
$$\lim_{n \to \infty} \frac{|p_{n+1}-p|}{|p_n-p|^2} = \lim_{n \to \infty} \frac{|g''(\xi_n)|}{2} = \frac{|g''(p)|}{2}.$$This condition is satisfied by Newton's method with $g(x) = x - f(x)/f'(x)$, because then
$$g'(x) = \frac{f(x)f''(x)}{[f'(x)]^2},$$which is zero at the fixed point $p$, since $f(p) = 0$ (provided $f'(p) \neq 0$).
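A minimal sketch of this quadratic behavior, assuming the test problem $f(x) = x^2 - 2$ with root $p = \sqrt{2}$; here $g(x) = x/2 + 1/x$, and the ratio $|p_{n+1}-p|/|p_n-p|^2$ should approach $|g''(p)|/2 = 1/(2\sqrt{2}) \approx 0.353553$:

```python
import math

# Newton's method on f(x) = x**2 - 2 (an assumed test problem).
# The ratio |p_{n+1} - p| / |p_n - p|**2 should approach
# |g''(p)|/2 = 1/(2*sqrt(2)) ~ 0.353553 at the root p = sqrt(2).

p = math.sqrt(2)
x = 2.0
ratios = []
for _ in range(4):
    e_old = abs(x - p)
    x = x - (x**2 - 2) / (2 * x)          # Newton step for f(x) = x^2 - 2
    ratios.append(abs(x - p) / e_old**2)

print(ratios[-1])   # close to 0.353553
```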
A function $f(x)$ has a zero of multiplicity $m$ (or of order $m$) at $x = p$ if for $x \neq p$ we can write $f(x) = (x-p)^m q(x)$, where $\lim_{x \to p} q(x) \neq 0$.
The following can be proved using Taylor's Theorem:
A function $f \in C^m[a,b]$ has a zero of multiplicity $m$ at $p \in (a,b)$ if and only if
$$ 0 = f(p) = f'(p) = \cdots = f^{(m-1)}(p), \quad \text{but} \quad f^{(m)}(p) \neq 0.$$Suppose the solution to $f(x)= 0$ is $p$, but $f'(p) = 0$. Solving this equation is a problem for Newton's method if we use the iteration function,
$$ g(x) = x - f(x)/f'(x),$$because it will force the computer to divide by small numbers, magnifying the effect of round-off error.
We also cannot guarantee that $g'(p) = f(p)f''(p)/[f'(p)]^2$ is zero: at $p$ this expression is the indeterminate form $0/0$. So we cannot guarantee quadratic convergence (by Theorem 2).
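To see the slowdown concretely, here is a sketch running plain Newton on the assumed example $f(x) = (x-1)^2$, which has a zero of multiplicity $m = 2$ at $p = 1$. The error ratio settles at $1 - 1/m = 1/2$: only linear convergence.

```python
# Plain Newton on f(x) = (x - 1)**2, a root of multiplicity m = 2
# at p = 1 (a hypothetical test case).  The error ratio
# |p_{n+1} - p| / |p_n - p| settles at 1 - 1/m = 0.5:
# merely linear convergence.

x = 2.0
p = 1.0
for _ in range(8):
    e_old = abs(x - p)
    x = x - (x - 1)**2 / (2 * (x - 1))   # Newton step
    ratio = abs(x - p) / e_old
    print(f"error ratio = {ratio:.6f}")
```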
One way around this problem is to use a different functional iteration scheme.
In particular, we will construct a new function $\mu(x)$ out of $f(x)$ that has a simple zero at $x=p$, i.e. $\mu'(p) \neq 0$, and then apply Newton to $\mu(x)$.
More generally, let $f(x)$ have a zero of multiplicity $m$ at $x = p$. Let's take the problematic $f/f'$ and turn it to our advantage by considering the function
$$\mu(x) = \frac{f(x)}{f'(x)} = \frac{(x-p)^m q(x)}{m(x-p)^{m-1} q(x) + (x-p)^m q'(x)} \\ = (x-p) \frac{q(x)}{m q(x) + (x-p) q'(x)}.$$It follows that $\lim_{x \to p} \mu(x)/(x-p) = 1/m \neq 0$; that is, $\mu(x)$ has a simple zero at $x=p$, which is equivalent to saying that $\mu'(p) \neq 0$ (by Theorem 3).
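A quick numerical check of this limit, assuming the example $f(x) = (x-1)^3$ (so $m = 3$):

```python
# Check that mu(x) = f(x)/f'(x) satisfies mu(x)/(x - p) -> 1/m,
# assuming the example f(x) = (x - 1)**3, i.e. m = 3 and p = 1.

def mu(x):
    f = (x - 1)**3
    fp = 3 * (x - 1)**2          # f'(x)
    return f / fp

for h in [0.1, 0.01, 0.001]:
    r = mu(1 + h) / h            # should approach 1/3
    print(f"h = {h}: mu(x)/(x-p) = {r:.6f}")
```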
And so, to find a zero of $f(x)$ when $f$ has a zero of higher multiplicity we can apply Newton's method to $\mu(x)$, i.e. choose $g(x)$ to be
$$g(x) = x - \frac{\mu(x)}{\mu'(x)} = x - \frac{f(x)f'(x)}{[f'(x)]^2 - f(x)f''(x)}.$$This restores quadratic convergence at a multiple root, at the cost of computing $f''$ and of possible round-off trouble: near $p$ both terms in the denominator $[f'(x)]^2 - f(x)f''(x)$ are small, so cancellation can occur.
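As a sketch, assume the test problem $f(x) = (e^x - 1)^2$, which has a double root at $p = 0$; the modified iteration recovers fast convergence:

```python
import math

# Modified Newton applied to f(x) = (exp(x) - 1)**2, which has a
# double root at p = 0 (a hypothetical test problem).  Plain Newton
# is only linear here; iterating on mu = f/f' is quadratic.

def step(x):
    f = (math.exp(x) - 1)**2
    fp = 2 * (math.exp(x) - 1) * math.exp(x)         # f'(x)
    fpp = 2 * math.exp(x) * (2 * math.exp(x) - 1)    # f''(x)
    return x - f * fp / (fp**2 - f * fpp)            # modified Newton step

x = 0.5
for _ in range(5):
    x = step(x)
    print(f"error = {abs(x):.3e}")
```

In a robust implementation one would add a stopping criterion before the denominator underflows at the root.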