Lecture 19: Error Analysis

The last few lectures have studied stability: the sensitivity of a mathematical problem to perturbations in the input. We now turn our attention to the actual error in numerical algorithms. We model this as follows: given a problem $f : Y \rightarrow Z$ between two normed vector spaces, consider an algorithm as another problem $\tilde f : Y \rightarrow Z$ that we hope satisfies $f(x) \approx \tilde f(x)$.

There are two types of errors we can look at: forward and backward error. If these errors are "small" we say that the algorithm is forward stable or backward stable. The definition of "small" depends on the context.

Forward error

The absolute forward error is defined as

$$\|\tilde f(x) - f(x)\|_Z$$

while the relative forward error is

$$\|\tilde f(x) - f(x)\|_Z \over \|f(x)\|_Z$$

Backward error

Suppose that there exists a $\Delta x$ so that $\tilde f(x) = f(x+\Delta x)$. Then the absolute backward error is defined as

$$\|\Delta x\|_Y$$

and the relative backward error is defined as

$$\|\Delta x\|_Y \over \|x\|_Y.$$

Warning: Backward error may not always be defined: for example, if $f(x) = 1$ and $\tilde f(x) = 0$ we have $f(x+\Delta x) = 1 \neq 0$ for all $\Delta x$. In this case, we can define the backward error as $\infty$.
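As a rough illustration of these definitions, the following sketch measures both errors for a hypothetical problem that is not part of the lecture: $f(x) = \sqrt{x}$, with the algorithm $\tilde f$ taken to be the square root evaluated in Float32. The names `f_exact` and `f_alg` are made up for this example, and a BigFloat computation stands in for the exact value of $f$. For the square root, $\tilde f(x) = f(x+\Delta x)$ holds with $\Delta x = \tilde f(x)^2 - x$.

```julia
# Hypothetical illustration: f is the square root, f_alg evaluates it in Float32.
f_exact(x) = sqrt(big(x))        # BigFloat reference, standing in for the true f
f_alg(x) = sqrt(Float32(x))      # the algorithm under study

x = 2.0
fx = f_exact(x)
fx_alg = big(f_alg(x))

# Forward error: distance between the computed and the true output
abs_forward = abs(fx_alg - fx)
rel_forward = abs_forward / abs(fx)

# Backward error: f_alg(x) = sqrt(x + dx) holds for dx = f_alg(x)^2 - x
dx = fx_alg^2 - big(x)
abs_backward = abs(dx)
rel_backward = abs_backward / abs(x)

println("relative forward error:  ", Float64(rel_forward))
println("relative backward error: ", Float64(rel_backward))
```

In this particular case the relative backward error comes out at roughly twice the relative forward error, since squaring doubles a small relative perturbation.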

Floating point error analysis

We now consider the forward and backward error in algorithms arising from floating point arithmetic.

Here ${\rm fl}(x)$ denotes the operator of rounding a number to floating point. This is always exact up to the last bit in the significand. If there are $p$ bits used to represent the significand, define machine epsilon as

$$\epsilon_m \triangleq 2^{-p}.$$

Since we round to nearest bit, we have the property that

$${\rm fl}(x) = x (1+ \delta_x)$$

where $|\delta_x| \leq {\epsilon_m \over 2}$.

We can verify this numerically. eps(Float32) and eps(Float64) give machine epsilon for Float32 and Float64 respectively. For Float32 we have $p=23$, so this means it returns $2^{-23}$:

In [24]:
eps(Float32),2.0f0^(-23)
Out[24]:
(1.1920929f-7,1.1920929f-7)

We have ${\rm fl}(x) = x (1+ \delta_x)$, which implies that $$|{x - {\rm fl}(x)}| = |x||\delta_x| \leq |x| {\epsilon_m \over 2}.$$

We confirm this for a simple example:

In [13]:
x=1/3
x̃=Float32(x)          # fl(x): x rounded to Float32
εm=eps(Float32)        # machine epsilon for Float32

abs(x-x̃) ≤ abs(x)*εm/2
Out[13]:
true
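The same check can be repeated over several inputs. The sketch below is only an illustration: the sample values in `xs` are arbitrary choices, Float64 numbers stand in for real numbers, and $\delta_x$ is itself computed in Float64, so it carries a small rounding error of its own.

```julia
eps32 = eps(Float32)
xs = [1/3, 0.1, 2/3, Float64(pi), 1.0e10, 1.0e-10]   # arbitrary sample inputs

# δ_x = (fl(x) - x)/x for each sample, evaluated in Float64
deltas = [(Float64(Float32(x)) - x) / x for x in xs]

for (x, d) in zip(xs, deltas)
    println(x, "  δ_x = ", d)
end

# Every δ_x should satisfy |δ_x| ≤ ε_m/2
all_within_bound = all(abs.(deltas) .<= eps32 / 2)
println(all_within_bound)
```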

We can see where the bound comes from in the case $x=1$. The number $1+\epsilon_m/2$ lies exactly halfway between one and the next Float32, and the tie rounds back to one:

In [25]:
Float32(1.0+2.0^(-24))
Out[25]:
1.0f0

Any perturbation above this, no matter how small, rounds up:

In [26]:
Float32(1.0+2.0^(-24)+2.0^(-30))
Out[26]:
1.0000001f0
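Exact halfway cases are worth a second look. The sketch below assumes the default IEEE 754 rounding mode (round to nearest, ties to even), under which a tie is resolved towards the neighbour whose last significand bit is zero. The variable names are chosen only for this example.

```julia
tie_low  = Float32(1.0 + 2.0^(-24))       # halfway between 1 and 1 + ε_m
tie_high = Float32(1.0 + 3 * 2.0^(-24))   # halfway between 1 + ε_m and 1 + 2ε_m

# The first tie rounds down to 1, the second rounds up to 1 + 2ε_m:
# in both cases the result has a zero in the last bit of the significand.
println(tie_low == 1.0f0)
println(tie_high == 1.0f0 + 2 * eps(Float32))
```

In both cases the error equals $\epsilon_m/2$ in absolute terms, which is the largest error that rounding to nearest can produce for numbers between 1 and 2.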

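Thus in the worst case the rounded value differs from $x$ in the last bit of the significand, with an error of at most half the spacing between neighbouring floating point numbers: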

In [27]:
bitstring(Float32(1.0+2.0^(-24)+2.0^(-25)))
Out[27]:
"00111111100000000000000000000001"
In [28]:
bitstring(Float32(1.0))
Out[28]:
"00111111100000000000000000000000"

Example 1: error analysis for rounding a number

We now consider the backward and forward error of rounding. Define $f(x) = x$ and $\tilde f(x) = {\rm fl}(x)$, where $f,\tilde f : \mathbb R \rightarrow \mathbb R$ with the absolute value norm attached, which we denote as $|\cdot | \rightarrow |\cdot |$.

Forward error:

$${|f(x) - \tilde f(x)| \over |f(x)|} = {|x -x(1+\delta_x) | \over |x|} = {|\delta_x|} \leq {\epsilon_m \over 2}.$$

Backward error: Since $\tilde f(x) = x(1+\delta_x) = f(x+x \delta_x)=f(x+\Delta x)$ for $\Delta x = x \delta_x$, we have the (relative) backward error

$${| \Delta x| \over | x|}= |\delta_x| \leq {\epsilon_m \over 2}$$

Example 2: error analysis for adding two floating point numbers

Assume $x$ and $y$ are floating point numbers. Consider the problem $f(x,y) = x+y$ calculated via the algorithm $\tilde f(x,y) = x \oplus y = {\rm fl}(x+y)= (x+y)(1+\delta_{x+y})$ where $f,\tilde f : \mathbb R^2 \rightarrow \mathbb R$ with norms $\| \cdot \|_\infty \rightarrow |\cdot |$.

Forward error:

$${|f(x,y) - \tilde f(x,y)| \over |f(x,y)|} = {|x+y -(x+y)(1+\delta_{x+y}) | \over |x+y|} = {|\delta_{x+y}|} \leq {\epsilon_m \over 2}.$$

Backward error: Since $\tilde f(x,y) = f(x+ x\delta_{x+y},y + y\delta_{x+y}) = f(x+\Delta x,y+\Delta y)$ with $\Delta x = x\delta_{x+y}$ and $\Delta y = y\delta_{x+y}$, we have the backward error

$${\| \begin{pmatrix}\Delta x\cr\Delta y\end{pmatrix} \|_\infty \over \| \begin{pmatrix} x\cr y\end{pmatrix} \|_\infty} = |\delta_{x+y}|\leq {\epsilon_m \over 2}.$$
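A small numerical sketch of this example follows. The inputs are arbitrary Float32 values chosen for illustration, and a BigFloat sum serves as the reference for $x+y$ (for inputs like these it is exact). The perturbations `dx` and `dy` are built as $x\delta_{x+y}$ and $y\delta_{x+y}$.

```julia
x, y = 1.1f0, 2.3f-3            # Float32 inputs, so there is no input rounding
computed = x + y                # Float32 addition, i.e. x ⊕ y
exact = big(x) + big(y)         # BigFloat reference for x + y

# δ_{x+y} defined by x ⊕ y = (x + y)(1 + δ_{x+y})
delta = (big(computed) - exact) / exact
rel_forward = abs(delta)

# Backward perturbations: Δx = x δ_{x+y}, Δy = y δ_{x+y}
dx, dy = big(x) * delta, big(y) * delta
rel_backward = max(abs(dx), abs(dy)) / max(abs(x), abs(y))

println(Float64(rel_forward), " ", Float64(rel_backward))
```

The two printed numbers should agree, since both errors equal $|\delta_{x+y}|$.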

Example 3: error analysis for adding two real numbers

Assume $x$ and $y$ are general real numbers. Again consider the problem $f(x,y) = x+y$, but now calculated via the algorithm

$$\tilde f(x,y) = {\rm fl}(x) \oplus {\rm fl}(y) = x(1+\delta_x) \oplus y(1+\delta_y) = (x(1+\delta_x) + y(1+\delta_y))(1+\delta_z) = x+y+x(\delta_x +\delta_z+\delta_x\delta_z) + y(\delta_y +\delta_z+\delta_y\delta_z)$$

where $z = x(1+\delta_x) + y(1+\delta_y)$. As before, $f,\tilde f : \mathbb R^2 \rightarrow \mathbb R$ with norms $\| \cdot \|_\infty \rightarrow |\cdot |$.

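Backward error: From the expansion above, $\tilde f(x,y) = f(x+\Delta x,y+\Delta y)$ with $\Delta x = x(\delta_x +\delta_z+\delta_x\delta_z)$ and $\Delta y = y(\delta_y +\delta_z+\delta_y\delta_z)$. Since $|\delta_x|, |\delta_y|, |\delta_z| \leq {\epsilon_m \over 2}$, each factor is bounded by $\epsilon_m + {\epsilon_m^2 \over 4}$, and we have the backward error

$${\| \begin{pmatrix}\Delta x\cr\Delta y\end{pmatrix} \|_\infty \over \| \begin{pmatrix} x\cr y\end{pmatrix} \|_\infty} \leq \epsilon_m + {\epsilon_m^2 \over 4}.$$

Forward error:

$${|f(x,y) - \tilde f(x,y)| \over |f(x,y)|} = {|x(\delta_x +\delta_z+\delta_x\delta_z) + y(\delta_y +\delta_z+\delta_y\delta_z)| \over |x+y|} \leq {|x| + |y| \over |x+y|}\left(\epsilon_m + {\epsilon_m^2 \over 4}\right).$$

Unlike in Example 2, this bound is large when $|x+y|$ is small relative to $|x|+|y|$, that is, when the addition involves cancellation.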
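To see how rounding the inputs interacts with cancellation, here is a numerical sketch of this example. It makes several assumptions: Float64 values stand in for general real numbers, the inputs are chosen so that $x+y$ is much smaller than $|x|+|y|$, and BigFloat arithmetic is the reference. The perturbations `dx` and `dy` are built so that $(x+\Delta x)+(y+\Delta y)$ reproduces the computed result, and `amplification` denotes the factor $(|x|+|y|)/|x+y|$.

```julia
x, y = 1/3, -0.3333                       # Float64 stand-ins for real numbers
flx, fly = Float32(x), Float32(y)         # fl(x), fl(y)
computed = flx + fly                      # fl(x) ⊕ fl(y)

z = big(flx) + big(fly)                   # exact sum of the rounded inputs
delta_z = (big(computed) - z) / z         # rounding error of the addition itself
dx = big(flx) * (1 + delta_z) - big(x)    # Δx with x + Δx = fl(x)(1 + δ_z)
dy = big(fly) * (1 + delta_z) - big(y)    # Δy with y + Δy = fl(y)(1 + δ_z)

rel_backward = max(abs(dx), abs(dy)) / max(abs(x), abs(y))

exact = big(x) + big(y)
rel_forward = abs(big(computed) - exact) / abs(exact)
amplification = (abs(x) + abs(y)) / abs(x + y)

println("relative backward error: ", Float64(rel_backward))
println("relative forward error:  ", Float64(rel_forward))
println("(|x|+|y|)/|x+y|:         ", amplification)
```

With these inputs the backward error stays below $\epsilon_m$, while the forward error is several orders of magnitude larger because the rounding errors in ${\rm fl}(x)$ and ${\rm fl}(y)$ survive the cancellation.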