for arbitrary real constants $a$ and $b$. The unknown is a number $x$. All other algebraic equations, e.g., $x^2 + ax + b = 0$, are nonlinear. The typical feature of a nonlinear algebraic equation is that the unknown appears in products with itself, like $x^2$, or in functions that are infinite sums of products, like $e^x = 1 + x + \frac{1}{2}x^2 + \frac{1}{3!}x^3 + \cdots$.
We know how to solve a linear algebraic equation, $x = -b/a$, but there are no general closed formulas for finding the exact solutions of nonlinear algebraic equations, except for very special cases (quadratic equations constitute a primary example). A nonlinear algebraic equation may have no solution, one solution, or many solutions. The tools for solving nonlinear algebraic equations are iterative methods, where we construct a sequence of linear equations, which we know how to solve, and hope that the solutions of the linear equations converge to a solution of the nonlinear equation we want to solve. Typical methods for nonlinear algebraic equations are Newton's method, the Bisection method, and the Secant method.
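To make the idea of iterative methods concrete, here is a minimal sketch of Newton's method for a scalar nonlinear algebraic equation (the method is discussed in detail later; the example equation and all names here are ours):

import math

def newton(F, dFdx, x, eps=1e-10, max_iter=100):
    """Solve F(x) = 0 by Newton's method: at each step, solve the
    linear (tangent) equation F(x) + F'(x)(x_new - x) = 0."""
    for k in range(max_iter):
        if abs(F(x)) < eps:
            return x, k
        x = x - F(x)/dFdx(x)  # solution of the linearized equation
    return x, max_iter

# Example: solve the nonlinear equation x**2 - 9 = 0 (exact root: 3)
x, k = newton(F=lambda x: x**2 - 9, dFdx=lambda x: 2*x, x=1000.0)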
The unknown in a differential equation is a function and not a number. In a linear differential equation, all terms involving the unknown function are linear in the unknown function or its derivatives. Linear here means that the unknown function, or a derivative of it, is multiplied by a number or a known function. All other differential equations are nonlinear.
The easiest way to see if an equation is nonlinear is to spot nonlinear terms where the unknown function or its derivatives are multiplied by each other. For example, in
$$u'(t) = -a(t)u(t) + b(t),$$
the terms involving the unknown function $u$ are linear: $u'$ contains the derivative of the unknown function multiplied by unity, and $au$ contains the unknown function multiplied by a known function. However,
$$u'(t) = u(t)(1 - u(t)) = u(t) - u(t)^2$$
is nonlinear because of the term $-u^2$, where the unknown function is multiplied by itself. Also
$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = 0$$
is nonlinear because of the term $uu_x$, where the unknown function appears in a product with its own derivative. (Note here that we use different notations for derivatives: $u'$ or $du/dt$ for a function $u(t)$ of one variable; $\partial u/\partial t$ or $u_t$ for a function of more than one variable.)
Another example of a nonlinear equation is
because $\sin(u)$ contains products of $u$, which becomes clear if we expand the function in a Taylor series:
$$\sin(u) = u - \frac{1}{3!}u^3 + \cdots$$
Mathematical proof of linearity.
To really prove mathematically that some differential equation in an unknown $u$ is linear, show for each term $T(u)$ that with $u = au_1 + bu_2$ for constants $a$ and $b$,
$$T(au_1 + bu_2) = aT(u_1) + bT(u_2).$$
For example, the term $T(u) = (\sin^2 t)u'(t)$ is linear because
$$T(au_1 + bu_2) = (\sin^2 t)(au_1'(t) + bu_2'(t)) = aT(u_1) + bT(u_2).$$
However, $T(u) = \sin u$ is nonlinear because
$$T(au_1 + bu_2) = \sin(au_1 + bu_2) \neq a\sin u_1 + b\sin u_2 .$$
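The check can be automated with sympy; a small sketch (the variable names are ours):

import sympy as sym

t, a, b = sym.symbols('t a b')
u1 = sym.Function('u1')(t)
u2 = sym.Function('u2')(t)
u = a*u1 + b*u2

T = lambda v: sym.sin(t)**2*sym.diff(v, t)       # linear term
S = lambda v: sym.sin(v)                         # nonlinear term
print(sym.simplify(T(u) - (a*T(u1) + b*T(u2))))  # 0: T is linear
print(sym.simplify(S(u) - (a*S(u1) + b*S(u2))))  # not 0: S is nonlinear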
A series of forthcoming examples will explain how to tackle nonlinear differential equations with various techniques. We start with the (scaled) logistic equation as model problem:
$$u'(t) = u(t)(1 - u(t)), \qquad u(0) = I. \qquad (1)$$
This is a nonlinear ordinary differential equation (ODE) which will be solved by different strategies in the following. Depending on the chosen time discretization of (1), the mathematical problem to be solved at every time level will either be a linear algebraic equation or a nonlinear algebraic equation. In the former case, the time discretization method transforms the nonlinear ODE into linear subproblems at each time level, and the solution is straightforward to find since linear algebraic equations are easy to solve. However, when the time discretization leads to nonlinear algebraic equations, we cannot (except in very rare cases) solve these without turning to approximate, iterative solution methods.
The next subsections introduce various methods for solving nonlinear differential equations, using (1) as model. We shall go through the following set of cases:
explicit time discretization methods (with no need to solve nonlinear algebraic equations)
implicit Backward Euler time discretization, leading to nonlinear algebraic equations solved by
an exact analytical technique
Picard iteration based on manual linearization
a single Picard step
Newton's method
implicit Crank-Nicolson time discretization and linearization via a geometric mean formula
Thereafter, we compare the performance of the various approaches. Despite the simplicity of (1), the conclusions reveal typical features of the various methods in much more complicated nonlinear PDE problems.
Time discretization methods are divided into explicit and implicit methods. Explicit methods lead to a closed-form formula for finding new values of the unknowns, while implicit methods give a linear or nonlinear system of equations that couples (all) the unknowns at a new time level. Here we shall demonstrate that explicit methods may constitute an efficient way to deal with nonlinear differential equations.
The Forward Euler method is an explicit method. When applied to (1), sampled at $t = t_n$, it results in
$$\frac{u^{n+1} - u^n}{\Delta t} = u^n(1 - u^n),$$
which is a linear algebraic equation for the unknown value $u^{n+1}$ that we can easily solve:
$$u^{n+1} = u^n + \Delta t\, u^n(1 - u^n).$$
The nonlinearity in the original equation poses in this case no difficulty in the discrete algebraic equation. Any other explicit scheme in time will also give only linear algebraic equations to solve. For example, a typical 2nd-order Runge-Kutta method for (1) leads to the following formulas:
$$u^* = u^n + \Delta t\, u^n(1 - u^n),$$
$$u^{n+1} = u^n + \Delta t\,\frac{1}{2}\bigl(u^n(1 - u^n) + u^*(1 - u^*)\bigr).$$
The first step is linear in the unknown $u^*$. Then $u^*$ is known in the next step, which is linear in the unknown $u^{n+1}$. For this scheme we have an explicit formula for the unknown, and the scheme is therefore called an explicit scheme.
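As a concrete illustration, here is a minimal sketch of the Forward Euler scheme for the logistic equation (the function name is ours; the logistic.py program referenced later contains the implementations used for the experiments):

import numpy as np

def FE_logistic(u0, dt, Nt):
    """Forward Euler for u' = u(1-u): each step is an explicit,
    linear update; no algebraic equation must be solved."""
    u = np.zeros(Nt+1)
    u[0] = u0
    for n in range(Nt):
        u[n+1] = u[n] + dt*u[n]*(1 - u[n])
    return u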
Switching to a Backward Euler scheme for (1),
$$\frac{u^n - u^{n-1}}{\Delta t} = u^n(1 - u^n), \qquad (2)$$
results in a nonlinear algebraic equation for the unknown value $u^n$. The equation is of quadratic type:
$$\Delta t\,(u^n)^2 + (1 - \Delta t)u^n - u^{n-1} = 0,$$
and may be solved exactly by the well-known formula for such equations. Before we do so, however, we will introduce a shorter, and often cleaner, notation for nonlinear algebraic equations at a given time level. The notation is inspired by the natural notation (i.e., variable names) used in a program, especially in more advanced partial differential equation problems. The unknown in the algebraic equation is denoted by $u$, while $u^{(1)}$ is the value of the unknown at the previous time level (in general, $u^{(\ell)}$ is the value of the unknown $\ell$ levels back in time). The notation will be used frequently in later sections. What is meant by $u$ should be evident from the context: $u$ may be 1) the exact solution of the ODE/PDE problem, 2) the numerical approximation to the exact solution, or 3) the unknown solution at a certain time level.
The quadratic equation for the unknown $u^n$ in (2) can, with the new notation, be written
$$\Delta t\, u^2 + (1 - \Delta t)u - u^{(1)} = 0. \qquad (3)$$
The solution is readily found to be
$$u = \frac{1}{2\Delta t}\Bigl(-(1 - \Delta t) \pm \sqrt{(1 - \Delta t)^2 + 4\Delta t\, u^{(1)}}\Bigr). \qquad (4)$$
Now we encounter a fundamental challenge with nonlinear algebraic equations: the equation may have more than one solution. How do we pick the right solution? This is in general a hard problem. In the present simple case, however, we can analyze the roots mathematically and provide an answer. The idea is to expand the roots in a series in $\Delta t$ and truncate after the linear term, since the Backward Euler scheme will introduce an error proportional to $\Delta t$ anyway. Using sympy we find the following Taylor series expansions of the roots:
import sympy as sym

dt, u_1, u = sym.symbols('dt u_1 u')
r1, r2 = sym.solve(dt*u**2 + (1-dt)*u - u_1, u)  # find the two roots
print(r1)
print(r2)
print(r1.series(dt, 0, 2))  # 2 terms in dt, around dt=0
print(r2.series(dt, 0, 2))
We see that the r1 root, corresponding to a minus sign in front of the square root in (4), behaves as $1/\Delta t$ and will therefore blow up as $\Delta t \to 0$! Since we know that $u$ takes on finite values (in fact, $u \le 1$), only the r2 root is of relevance in this case: as $\Delta t \to 0$, $u \to u^{(1)}$, which is the expected result.
For those who are not well experienced with approximating mathematical formulas by series expansion, an alternative method of investigation is simply to compute the limits of the two roots as $\Delta t \to 0$ and see if a limit is unreasonable:
print(r1.limit(dt, 0))
print(r2.limit(dt, 0))
When the time integration of an ODE results in a nonlinear algebraic equation, we must normally find its solution by defining a sequence of linear equations and hope that the solutions of these linear equations converge to the desired solution of the nonlinear algebraic equation. Usually, this means solving the linear equation repeatedly in an iterative fashion. Alternatively, the nonlinear equation can sometimes be approximated by one linear equation, and consequently there is no need for iteration.
Constructing a linear equation from a nonlinear one requires linearization of each nonlinear term. This can be done manually as in Picard iteration, or fully algorithmically as in Newton's method. Examples will best illustrate how to linearize nonlinear problems.
Let us write (3) in a more compact form
$$F(u) = au^2 + bu + c = 0,$$
with $a = \Delta t$, $b = 1 - \Delta t$, and $c = -u^{(1)}$. Let $u^-$ be an available approximation of the unknown $u$. Then we can linearize the term $u^2$ simply by writing $u^- u$. The resulting equation, $\hat F(u) = 0$, is now linear and hence easy to solve:
$$\hat F(u) = a u^- u + bu + c = 0 \quad\Rightarrow\quad u = -\frac{c}{a u^- + b}.$$
Since the equation $\hat F = 0$ is only approximate, the solution $u$ does not equal the exact solution $u_e$ of the exact equation $F(u_e) = 0$, but we can hope that $u$ is closer to $u_e$ than $u^-$ is, and hence it makes sense to repeat the procedure, i.e., set $u^- = u$ and solve $\hat F(u) = 0$ again. There is no guarantee that $u$ is closer to $u_e$ than $u^-$, but this approach has proven to be effective in a wide range of applications.
The idea of turning a nonlinear equation into a linear one by using an approximation u− of u in the nonlinear terms is a widely used approach that goes under many names: fixed-point iteration, the method of successive substitutions, nonlinear Richardson iteration, and Picard iteration. We will stick to the latter name.
Picard iteration for solving the nonlinear equation arising from the Backward Euler discretization of the logistic equation can be written as
$$u = -\frac{c}{a u^- + b}, \qquad u^- \leftarrow u.$$
The $\leftarrow$ symbol means assignment (we set $u^-$ equal to the value of $u$). The iteration is started with the value of the unknown at the previous time level: $u^- = u^{(1)}$.
Some prefer an explicit iteration counter as superscript in the mathematical notation. Let $u^k$ be the computed approximation to the solution in iteration $k$. In iteration $k+1$ we want to solve
$$a u^k u^{k+1} + b u^{k+1} + c = 0 \quad\Rightarrow\quad u^{k+1} = -\frac{c}{a u^k + b}, \qquad k = 0, 1, \ldots$$
Since we need to perform the iteration at every time level, the time level counter is often also included:
$$a u^{n,k} u^{n,k+1} + b u^{n,k+1} - u^{n-1} = 0 \quad\Rightarrow\quad u^{n,k+1} = \frac{u^{n-1}}{1 + \Delta t\,(u^{n,k} - 1)}, \qquad k = 0, 1, \ldots,$$
with the start value $u^{n,0} = u^{n-1}$ and the final converged value $u^n = u^{n,k}$ for sufficiently large $k$.
However, we will normally apply a mathematical notation in our final formulas that is as close as possible to what we aim to write in a computer code, and then it becomes natural to use $u$ and $u^-$ instead of $u^{k+1}$ and $u^k$ or $u^{n,k+1}$ and $u^{n,k}$.
The iteration method can typically be terminated when the change in the solution is smaller than a tolerance $\epsilon_u$:
$$|u - u^-| \le \epsilon_u,$$
or when the residual in the equation is sufficiently small ($< \epsilon_r$),
$$|F(u)| = |au^2 + bu + c| < \epsilon_r.$$
Instead of iterating until a stopping criterion is fulfilled, one may iterate a specific number of times. Just one Picard iteration is popular, as this corresponds to the intuitive idea of approximating a nonlinear term like $(u^n)^2$ by $u^{n-1}u^n$. This follows from the linearization $u^- u$ and the initial choice $u^- = u^{n-1}$ at time level $t_n$. In other words, a single Picard iteration corresponds to using the solution at the previous time level to linearize nonlinear terms. The resulting discretization becomes (using proper values for $a$, $b$, and $c$)
$$\frac{u^n - u^{n-1}}{\Delta t} = u^n(1 - u^{n-1}), \qquad (5)$$
which is a linear algebraic equation in the unknown $u^n$, making it easy to solve for $u^n$ without any need for any alternative notation.
We shall later refer to the strategy of taking one Picard step, or equivalently, linearizing terms with use of the solution at the previous time step, as the Picard1 method. It is a widely used approach in science and technology, but with some limitations if $\Delta t$ is not sufficiently small (as will be illustrated later).
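As a concrete illustration, a minimal sketch of the Picard1 update for the logistic equation, obtained by solving (5) for $u^n$ (the function name is ours; the logistic.py program shown later realizes Picard1 by limiting the Picard loop to one iteration):

import numpy as np

def BE_logistic_Picard1(u0, dt, Nt):
    """Backward Euler for u' = u(1-u), with (u^n)^2 linearized as
    u^{n-1}*u^n (a single Picard iteration per time level)."""
    u = np.zeros(Nt+1)
    u[0] = u0
    for n in range(1, Nt+1):
        # Solve dt*u[n-1]*u[n] + (1-dt)*u[n] - u[n-1] = 0 for u[n]
        u[n] = u[n-1]/(1 + dt*(u[n-1] - 1))
    return u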
Notice.
Equation (5) does not correspond to a "pure" finite difference method, where the equation is sampled at a point and derivatives are replaced by differences (because the $u^{n-1}$ term on the right-hand side must then be $u^n$). The best interpretation of the scheme (5) is a Backward Euler difference combined with a single (perhaps insufficient) Picard iteration at each time level, with the value at the previous time level as start for the Picard iteration.
We consider now a Crank-Nicolson discretization of (1). This means that the time derivative is approximated by a centered difference, written out as
$$\frac{u^{n+1} - u^n}{\Delta t} = u^{n+\frac{1}{2}} - (u^{n+\frac{1}{2}})^2. \qquad (6)$$
The first term $u^{n+\frac{1}{2}}$ is normally approximated by an arithmetic mean,
$$u^{n+\frac{1}{2}} \approx \frac{1}{2}(u^n + u^{n+1}),$$
such that the scheme involves the unknown function only at the time levels where we actually compute it. The same arithmetic mean applied to the second term gives
$$(u^{n+\frac{1}{2}})^2 \approx \frac{1}{4}(u^n + u^{n+1})^2,$$
which is nonlinear in the unknown $u^{n+1}$. However, using a geometric mean for $(u^{n+\frac{1}{2}})^2$ is a way of linearizing the nonlinear term in (6):
$$(u^{n+\frac{1}{2}})^2 \approx u^n u^{n+1}.$$
Using an arithmetic mean on the linear $u^{n+\frac{1}{2}}$ term in (6) and a geometric mean for the second term results in a linearized equation for the unknown $u^{n+1}$:
$$\frac{u^{n+1} - u^n}{\Delta t} = \frac{1}{2}(u^n + u^{n+1}) - u^n u^{n+1},$$
which can readily be solved:
$$u^{n+1} = \frac{1 + \frac{1}{2}\Delta t}{1 + \Delta t\, u^n - \frac{1}{2}\Delta t}\, u^n.$$
This scheme can be coded directly, and since there is no nonlinear algebraic equation to iterate over, we skip the simplified notation with $u$ for $u^{n+1}$ and $u^{(1)}$ for $u^n$. The technique of using a geometric mean is an example of transforming a nonlinear algebraic equation to a linear one, without any need for iterations.
The geometric mean approximation is often very effective for linearizing quadratic nonlinearities. Both the arithmetic and geometric mean approximations have truncation errors of order $\Delta t^2$ and are therefore compatible with the truncation error $\mathcal{O}(\Delta t^2)$ of the centered difference approximation for $u'$ in the Crank-Nicolson method.
Applying the operator notation for the means and finite differences, the linearized Crank-Nicolson scheme for the logistic equation can be compactly expressed as
Remark.
If we use an arithmetic instead of a geometric mean for the nonlinear term in (6), we end up with a nonlinear term $(u^{n+1})^2$. This term can be linearized as $u^- u^{n+1}$ in a Picard iteration approach, and in particular as $u^n u^{n+1}$ in a Picard1 iteration approach. The latter gives a scheme almost identical to the one arising from a geometric mean (the difference in $u^{n+1}$ being $\frac{1}{4}\Delta t\, u^n(u^{n+1} - u^n) \approx \frac{1}{4}\Delta t^2\, u'u$, i.e., a difference of size $\Delta t^2$).
The Backward Euler scheme (2) for the logistic equation leads to a nonlinear algebraic equation (3). Now we write any nonlinear algebraic equation in the general and compact form
$$F(u) = 0.$$
Newton's method linearizes this equation by approximating $F(u)$ by its Taylor series expansion around a computed value $u^-$ and keeping only the linear part:
$$F(u) \approx \hat F(u) = F(u^-) + F'(u^-)(u - u^-).$$
The linear equation $\hat F(u) = 0$ has the solution
$$u = u^- - \frac{F(u^-)}{F'(u^-)}.$$
Expressed with an iteration index in the unknown, Newton's method takes on the more familiar mathematical form
$$u^{k+1} = u^k - \frac{F(u^k)}{F'(u^k)}, \qquad k = 0, 1, \ldots$$
When the method converges, it can be shown that the error in iteration $k+1$ of Newton's method is proportional to the square of the error in iteration $k$, a result referred to as quadratic convergence. This means that for small errors the method converges very fast, and in particular much faster than Picard iteration and other iteration methods. (The proof of this result is found in most textbooks on numerical analysis.) However, the quadratic convergence appears only if $u^k$ is sufficiently close to the solution. Further away from the solution the method can easily converge very slowly or diverge. The reader is encouraged to do Exercise nonlin:exer:Newton:problems1 to get a better understanding of the behavior of the method.
Application of Newton's method to the logistic equation discretized by the Backward Euler method is straightforward, as we have
$$F(u) = au^2 + bu + c, \qquad a = \Delta t,\quad b = 1 - \Delta t,\quad c = -u^{(1)},$$
and then
$$F'(u) = 2au + b.$$
The iteration method becomes
$$u = u^- - \frac{a(u^-)^2 + bu^- + c}{2au^- + b}, \qquad u^- \leftarrow u. \qquad (7)$$
At each time level, we start the iteration by setting $u^- = u^{(1)}$. Stopping criteria as listed for the Picard iteration can be used also for Newton's method.
An alternative mathematical form, where we write out $a$, $b$, and $c$, and use a time level counter $n$ and an iteration counter $k$, takes the form
$$u^{n,k+1} = u^{n,k} - \frac{\Delta t\,(u^{n,k})^2 + (1 - \Delta t)u^{n,k} - u^{n-1}}{2\Delta t\, u^{n,k} + 1 - \Delta t}, \qquad u^{n,0} = u^{n-1}, \qquad (8)$$
for $k = 0, 1, \ldots$. A program implementation is much closer to (7) than to (8), but the latter is better aligned with the established mathematical notation used in the literature.
One iteration in Newton's method or Picard iteration consists of solving a linear problem $\hat F(u) = 0$. Sometimes convergence problems arise because the new solution $u$ of $\hat F(u) = 0$ is "too far away" from the previously computed solution $u^-$. A remedy is to introduce a relaxation, meaning that we first solve $\hat F(u^*) = 0$ for a suggested value $u^*$, and then we take $u$ as a weighted mean of what we had, $u^-$, and what our linearized equation $\hat F = 0$ suggests, $u^*$:
$$u = \omega u^* + (1 - \omega)u^-.$$
The parameter $\omega$ is known as a relaxation parameter, and a choice $\omega < 1$ may prevent divergent iterations.
Relaxation in Newton's method can be directly incorporated in the basic iteration formula:
$$u = u^- - \omega\,\frac{F(u^-)}{F'(u^-)}.$$
The program logistic.py contains implementations of all the methods described above. Below is an extract of the file showing how the Picard and Newton methods are implemented for a Backward Euler discretization of the logistic equation.
import numpy as np

def BE_logistic(u0, dt, Nt, choice='Picard',
                eps_r=1E-3, omega=1, max_iter=1000):
    if choice == 'Picard1':
        # Picard1 is just Picard limited to a single iteration
        choice = 'Picard'
        max_iter = 1

    u = np.zeros(Nt+1)
    iterations = []
    u[0] = u0
    for n in range(1, Nt+1):
        a = dt
        b = 1 - dt
        c = -u[n-1]

        if choice == 'Picard':

            def F(u):
                # Residual of the nonlinear equation at this time level
                return a*u**2 + b*u + c

            u_ = u[n-1]  # start value: solution at previous time level
            k = 0
            while abs(F(u_)) > eps_r and k < max_iter:
                # Relaxed Picard update: u = -c/(a*u_ + b)
                u_ = omega*(-c/(a*u_ + b)) + (1-omega)*u_
                k += 1
            u[n] = u_
            iterations.append(k)

        elif choice == 'Newton':

            def F(u):
                return a*u**2 + b*u + c

            def dF(u):
                return 2*a*u + b

            u_ = u[n-1]
            k = 0
            while abs(F(u_)) > eps_r and k < max_iter:
                u_ = u_ - F(u_)/dF(u_)  # Newton update
                k += 1
            u[n] = u_
            iterations.append(k)
    return u, iterations
The Crank-Nicolson method utilizing a linearization based on the geometric mean gives a simpler algorithm:
def CN_logistic(u0, dt, Nt):
    u = np.zeros(Nt+1)
    u[0] = u0
    for n in range(0, Nt):
        # Geometric mean linearization gives an explicit formula
        u[n+1] = (1 + 0.5*dt)/(1 + dt*u[n] - 0.5*dt)*u[n]
    return u
We may run experiments with the model problem (1) and the different strategies for dealing with nonlinearities as described above. For a quite coarse time resolution, $\Delta t = 0.9$, use of a tolerance $\epsilon_r = 0.05$ in the stopping criterion introduces an iteration error, especially in the Picard iterations, that is visibly much larger than the time discretization error due to a large $\Delta t$. This is illustrated by comparing the upper two plots in the figure below. The plot to the right has a stricter tolerance, $\epsilon_r = 10^{-3}$, which makes all the curves involving Backward Euler, with iterative solution by either Picard or Newton iterations, lie on top of each other (and no changes can be observed visually by reducing $\epsilon_r$ further). The reason why Newton's method does much better than Picard iteration in the upper left plot is that Newton's method with one step comes far below the $\epsilon_r$ tolerance, while the Picard iteration needs on average 7 iterations to bring the residual down to $\epsilon_r = 10^{-1}$, which gives insufficient accuracy in the solution of the nonlinear equation. It is obvious that the Picard1 method gives significant errors in addition to the time discretization error, unless the time step is as small as in the lower right plot.
The BE exact curve corresponds to using the exact solution of the quadratic equation at each time level, so this curve is only affected by the Backward Euler time discretization. The CN gm curve corresponds to the theoretically more accurate Crank-Nicolson discretization, combined with a geometric mean for linearization. This curve appears visually more accurate in all the plots, especially if we take the plot in the lower right, with a small $\Delta t$ and an appropriately small $\epsilon_r$ value, as the reference curve.
When it comes to the need for iterations, the second figure below displays the number of iterations required at each time level for Newton's method and Picard iteration. The smaller $\Delta t$ is, the better the starting value for the iteration, and the faster the convergence. With $\Delta t = 0.9$, Picard iteration requires on average 32 iterations per time step for the stricter convergence criterion, but this number is dramatically reduced as $\Delta t$ is reduced.
However, introducing relaxation with $\omega = 0.8$ immediately reduces the average of 32 to 7, indicating that for the large $\Delta t = 0.9$, Picard iteration takes too large steps. An approximately optimal value for $\omega$ in this case is 0.5, which results in an average of only 2 iterations! An even more dramatic impact of $\omega$ appears when $\Delta t = 1$: Picard iteration does not converge in 1000 iterations, but $\omega = 0.5$ again brings the average number of iterations down to 2.
Figure: Impact of solution strategy and time step length on the solution.
Figure: Comparison of the number of iterations at various time levels for Picard and Newton iteration.
Let us see how the various methods in the previous sections can be applied to the more generic model
$$u' = f(u, t),$$
where $f$ is a nonlinear function of $u$.
Explicit ODE methods, like the Forward Euler scheme, various Runge-Kutta methods, and Adams-Bashforth methods, all evaluate $f$ at time levels where $u$ is already computed, so nonlinearities in $f$ do not pose any difficulties.
Approximating $u'$ by a backward difference leads to a Backward Euler scheme, which can be written as
$$F(u^n) = u^n - \Delta t\, f(u^n, t_n) - u^{n-1} = 0,$$
or alternatively
$$F(u) = u - \Delta t\, f(u, t_n) - u^{(1)} = 0.$$
A simple Picard iteration, not knowing anything about the nonlinear structure of $f$, must approximate $f(u, t_n)$ by $f(u^-, t_n)$:
$$\hat F(u) = u - \Delta t\, f(u^-, t_n) - u^{(1)}.$$
The iteration starts with $u^- = u^{(1)}$ and proceeds with repeating
$$u^* = \Delta t\, f(u^-, t_n) + u^{(1)}, \qquad u = \omega u^* + (1 - \omega)u^-, \qquad u^- \leftarrow u,$$
until a stopping criterion is fulfilled.
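A minimal sketch of this Picard iteration for a user-supplied f (function and parameter names are ours):

import numpy as np

def BE_Picard(f, u0, dt, Nt, eps_u=1e-6, max_iter=100):
    """Backward Euler for u' = f(u, t): at each time level, solve
    u = dt*f(u, t_n) + u_prev by evaluating f at the last iterate."""
    u = np.zeros(Nt+1)
    u[0] = u0
    for n in range(1, Nt+1):
        t_n = n*dt
        u_ = u[n-1]                         # start: previous time level
        for k in range(max_iter):
            u_new = u[n-1] + dt*f(u_, t_n)  # explicit treatment of f
            if abs(u_new - u_) < eps_u:     # change-in-solution criterion
                u_ = u_new
                break
            u_ = u_new
        u[n] = u_
    return u

# Example: u' = -u**3, u(0) = 1
u = BE_Picard(f=lambda u, t: -u**3, u0=1.0, dt=0.4, Nt=10)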
Explicit vs implicit treatment of nonlinear terms.
Evaluating $f$ for a known $u^-$ is referred to as explicit treatment of $f$, while if $f(u,t)$ has some structure, say $f(u,t) = u^3$, parts of $f$ can involve the unknown $u$, as in the manual linearization $(u^-)^2 u$, and then the treatment of $f$ is "more implicit" and "less explicit". This terminology is inspired by time discretization of $u' = f(u,t)$, where evaluating $f$ for known $u$ values gives explicit formulas for the unknown and hence explicit schemes, while treating $f$, or parts of $f$, implicitly, meaning that equations must be solved in terms of the unknown, makes $f$ contribute to the unknown terms in the equation at the new time level.
Explicit treatment of $f$ usually means stricter conditions on $\Delta t$ to achieve stability of time discretization schemes. The same applies to iteration techniques for nonlinear algebraic equations: the "less" we linearize $f$ (i.e., the more we keep of $u$ in the original formula), the faster the convergence may be.
We may say that $f(u,t) = u^3$ is treated explicitly if we evaluate $f$ as $(u^-)^3$, semi or partially implicitly if we linearize as $(u^-)^2 u$, and fully implicitly if we represent $f$ by $u^3$. (Of course, the fully implicit representation will require further linearization, but with $f(u,t) = u^2$ a fully implicit treatment is possible if the resulting quadratic equation is solved with a formula.)
For the ODE $u' = -u^3$ with $f(u,t) = -u^3$ and coarse time resolution $\Delta t = 0.4$, Picard iteration with $(u^-)^2 u$ requires 8 iterations with $\epsilon_r = 10^{-3}$ for the first time step, while $(u^-)^3$ leads to 22 iterations. After about 10 time steps both approaches are down to about 2 iterations per time step, but this example shows the potential of treating $f$ more implicitly.
A trick to treat $f$ implicitly in Picard iteration is to evaluate it as $f(u^-, t)u/u^-$. For a polynomial $f$, $f(u,t) = u^m$, this corresponds to $(u^-)^m u/u^- = (u^-)^{m-1}u$. Sometimes this more implicit treatment has no effect, as with $f(u,t) = \exp(-u)$ and $f(u,t) = \ln(1 + u)$, but with $f(u,t) = \sin(2(u + 1))$, the $f(u^-, t)u/u^-$ trick leads to 7, 9, and 11 iterations during the first three steps, while $f(u^-, t)$ demands 17, 21, and 20 iterations. (Experiments can be done with the code ODE_Picard_tricks.py.)
Newton's method applied to a Backward Euler discretization of $u' = f(u,t)$ requires the computation of the derivative
$$F'(u) = 1 - \Delta t\,\frac{\partial f}{\partial u}(u, t_n).$$
Starting with the solution at the previous time level, $u^- = u^{(1)}$, we can just use the standard formula
$$u = u^- - \omega\,\frac{F(u^-)}{F'(u^-)} = u^- - \omega\,\frac{u^- - \Delta t\, f(u^-, t_n) - u^{(1)}}{1 - \Delta t\,\frac{\partial}{\partial u}f(u^-, t_n)}.$$
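A sketch of the corresponding Newton iteration at a single time level, assuming the user supplies both f and its derivative df/du (names are ours):

def BE_Newton_step(f, dfdu, u_prev, dt, t_n, eps_r=1e-6, max_iter=100):
    """Newton iteration for the Backward Euler equation
    F(u) = u - dt*f(u, t_n) - u_prev = 0."""
    u_ = u_prev                      # start: previous time level
    for k in range(max_iter):
        F = u_ - dt*f(u_, t_n) - u_prev
        if abs(F) < eps_r:           # residual criterion
            break
        dF = 1 - dt*dfdu(u_, t_n)    # F'(u) = 1 - dt*df/du
        u_ = u_ - F/dF               # Newton update
    return u_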
The standard Crank-Nicolson scheme with arithmetic mean approximation of $f$ takes the form
$$\frac{u^{n+1} - u^n}{\Delta t} = \frac{1}{2}\bigl(f(u^{n+1}, t_{n+1}) + f(u^n, t_n)\bigr).$$
We can write the scheme as a nonlinear algebraic equation
$$F(u) = u - u^{(1)} - \frac{1}{2}\Delta t\, f(u, t_{n+1}) - \frac{1}{2}\Delta t\, f(u^{(1)}, t_n) = 0.$$
A Picard iteration scheme must in general employ the linearization
$$\hat F(u) = u - u^{(1)} - \frac{1}{2}\Delta t\, f(u^-, t_{n+1}) - \frac{1}{2}\Delta t\, f(u^{(1)}, t_n).$$
We may write a system of ODEs
$$u_0'(t) = f_0(u_0, \ldots, u_N, t),\quad u_1'(t) = f_1(u_0, \ldots, u_N, t),\quad \ldots,\quad u_N'(t) = f_N(u_0, \ldots, u_N, t),$$
as
$$u' = f(u, t),$$
if we interpret $u$ as a vector, $u = (u_0(t), u_1(t), \ldots, u_N(t))$, and $f$ as a vector function with components $(f_0(u,t), f_1(u,t), \ldots, f_N(u,t))$.
Most solution methods for scalar ODEs, including the Forward and Backward Euler schemes and the Crank-Nicolson method, generalize in a straightforward way to systems of ODEs simply by using vector arithmetics instead of scalar arithmetics, which corresponds to applying the scalar scheme to each component of the system. For example, here is a backward difference scheme applied to each component,
$$\frac{u_i^n - u_i^{n-1}}{\Delta t} = f_i(u^n, t_n), \qquad i = 0, \ldots, N,$$
which can be written more compactly in vector form as
$$\frac{u^n - u^{n-1}}{\Delta t} = f(u^n, t_n).$$
This is a system of algebraic equations,
$$u^n - \Delta t\, f(u^n, t_n) - u^{n-1} = 0,$$
or written out
$$u_0^n - \Delta t\, f_0(u^n, t_n) - u_0^{n-1} = 0,\quad \ldots,\quad u_N^n - \Delta t\, f_N(u^n, t_n) - u_N^{n-1} = 0.$$
We shall address the $2\times 2$ ODE system for oscillations of a pendulum subject to gravity and air drag. The system can be written as
$$\dot\omega = -\sin\theta - \beta\,\omega|\omega|, \qquad \dot\theta = \omega,$$
where $\beta$ is a dimensionless parameter (this is the scaled, dimensionless version of the original, physical model). The unknown components of the system are the angle $\theta(t)$ and the angular velocity $\omega(t)$. We introduce $u_0 = \omega$ and $u_1 = \theta$, which leads to
$$u_0' = f_0(u, t) = -\sin u_1 - \beta\, u_0|u_0|, \qquad u_1' = f_1(u, t) = u_0.$$
A Crank-Nicolson scheme reads
$$\frac{u_0^{n+1} - u_0^n}{\Delta t} = -\sin u_1^{n+\frac{1}{2}} - \beta\, u_0^{n+\frac{1}{2}}\bigl|u_0^{n+\frac{1}{2}}\bigr| \approx -\sin\Bigl(\frac{1}{2}(u_1^{n+1} + u_1^n)\Bigr) - \beta\,\frac{1}{4}(u_0^{n+1} + u_0^n)\bigl|u_0^{n+1} + u_0^n\bigr|,$$
$$\frac{u_1^{n+1} - u_1^n}{\Delta t} = u_0^{n+\frac{1}{2}} \approx \frac{1}{2}(u_0^{n+1} + u_0^n).$$
This is a coupled system of two nonlinear algebraic equations in two unknowns, $u_0^{n+1}$ and $u_1^{n+1}$.
Using the notation $u_0$ and $u_1$ for the unknowns $u_0^{n+1}$ and $u_1^{n+1}$ in this system, writing $u_0^{(1)}$ and $u_1^{(1)}$ for the previous values $u_0^n$ and $u_1^n$, multiplying by $\Delta t$, and moving the terms to the left-hand sides gives
$$u_0 - u_0^{(1)} + \Delta t\,\sin\Bigl(\frac{1}{2}(u_1 + u_1^{(1)})\Bigr) + \frac{1}{4}\Delta t\,\beta\,(u_0 + u_0^{(1)})\bigl|u_0 + u_0^{(1)}\bigr| = 0, \qquad (18)$$
$$u_1 - u_1^{(1)} - \frac{1}{2}\Delta t\,(u_0 + u_0^{(1)}) = 0. \qquad (19)$$
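For a nonlinear system like this, a generic solver can be applied at each time level. A small sketch using scipy.optimize.fsolve on the Crank-Nicolson equations above (parameter values and names are ours, for illustration only):

import numpy as np
from scipy.optimize import fsolve

beta, dt = 0.9, 0.1   # hypothetical parameter values

def F(u_new, u_prev):
    """Residuals (18)-(19) of the Crank-Nicolson scheme for the pendulum."""
    u0, u1 = u_new                # angular velocity and angle, new level
    u0p, u1p = u_prev             # values at the previous time level
    w = 0.5*(u0 + u0p)            # arithmetic mean of angular velocity
    th = 0.5*(u1 + u1p)           # arithmetic mean of angle
    return [u0 - u0p + dt*(np.sin(th) + beta*w*abs(w)),
            u1 - u1p - dt*w]

u_prev = np.array([0.0, np.pi/4])            # at rest, initial angle pi/4
u_new = fsolve(F, u_prev, args=(u_prev,))    # solve the 2x2 nonlinear system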
Obviously, we have a need for solving systems of nonlinear algebraic equations, which is the topic of the next section.
Implicit time discretization methods for a system of ODEs, or a PDE, lead to systems of nonlinear algebraic equations, written compactly as
$$F(u) = 0,$$
where $u$ is a vector of unknowns, $u = (u_0, \ldots, u_N)$, and $F$ is a vector function: $F = (F_0, \ldots, F_N)$. The system at the end of the section Systems of ODEs fits this notation with $N = 2$: $F_0(u)$ is given by the left-hand side of (18), while $F_1(u)$ is the left-hand side of (19).
Sometimes the equation system has a special structure because of the underlying problem, e.g.,
$$A(u)u = b(u),$$
with $A(u)$ as an $(N+1)\times(N+1)$ matrix function of $u$ and $b$ as a vector function: $b = (b_0, \ldots, b_N)$.
We shall next explain how Picard iteration and Newton's method can be applied to systems like F(u)=0 and A(u)u=b(u). The exposition has a focus on ideas and practical computations. More theoretical considerations, including quite general results on convergence properties of these methods, can be found in Kelley [Kelley_1995].
We cannot apply Picard iteration to nonlinear equations unless there is some special structure. For the commonly arising case $A(u)u = b(u)$ we can linearize the product $A(u)u$ to $A(u^-)u$ and $b(u)$ as $b(u^-)$. That is, we use the most recently computed approximation in $A$ and $b$ to arrive at a linear system for $u$:
$$A(u^-)u = b(u^-).$$
A relaxed iteration takes the form
$$A(u^-)u^* = b(u^-), \qquad u = \omega u^* + (1 - \omega)u^-.$$
In other words, we solve a system of nonlinear algebraic equations as a sequence of linear systems.
Algorithm for relaxed Picard iteration.
Given $A(u)u = b(u)$ and an initial guess $u^-$, iterate until convergence:
solve $A(u^-)u^* = b(u^-)$ with respect to $u^*$
$u = \omega u^* + (1 - \omega)u^-$
$u^- \leftarrow u$
"Until convergence" means that the iteration is stopped when the change in the unknown, $\|u - u^-\|$, or the residual, $\|A(u)u - b(u)\|$, is sufficiently small; see the section Stopping criteria for more details.
The natural starting point for Newton's method is the general nonlinear vector equation $F(u) = 0$. As for a scalar equation, the idea is to approximate $F$ around a known value $u^-$ by a linear function $\hat F$, calculated from the first two terms of a Taylor expansion of $F$. In the multi-variate case these two terms become
$$F(u^-) + J(u^-)\cdot(u - u^-),$$
where $J$ is the Jacobian of $F$, defined by
$$J_{i,j} = \frac{\partial F_i}{\partial u_j}.$$
So, the original nonlinear system is approximated by
$$\hat F(u) = F(u^-) + J(u^-)\cdot(u - u^-) = 0,$$
which is linear in $u$ and can be solved in a two-step procedure: first solve $J\delta u = -F(u^-)$ with respect to the vector $\delta u$, and then update $u = u^- + \delta u$. A relaxation parameter can easily be incorporated:
$$u = u^- + \omega\,\delta u.$$
Algorithm for Newton's method.
Given $F(u) = 0$ and an initial guess $u^-$, iterate until convergence:
solve $J\delta u = -F(u^-)$ with respect to $\delta u$
$u = u^- + \omega\,\delta u$
$u^- \leftarrow u$
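A corresponding sketch of Newton's method in code, with user-supplied functions for F and its Jacobian J (names are ours):

import numpy as np

def newton_system(F, J, u0, eps_r=1e-6, omega=1.0, max_iter=100):
    """Newton's method for the vector equation F(u) = 0."""
    u_ = np.asarray(u0, dtype=float)
    for k in range(max_iter):
        Fu = F(u_)
        if np.linalg.norm(Fu) <= eps_r:      # residual criterion
            return u_, k
        delta = np.linalg.solve(J(u_), -Fu)  # solve J*du = -F(u_)
        u_ = u_ + omega*delta                # relaxed update
    return u_, max_iter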
For the special system with structure $A(u)u = b(u)$,
$$F_i = \sum_k A_{i,k}(u)u_k - b_i(u),$$
one gets
$$J_{i,j} = \frac{\partial F_i}{\partial u_j} = \sum_k \frac{\partial A_{i,k}}{\partial u_j}u_k + A_{i,j} - \frac{\partial b_i}{\partial u_j}.$$
We realize that the Jacobian needed in Newton's method consists of $A(u^-)$, as in the Picard iteration, plus two additional terms arising from the differentiation. Using the notation $A'(u)$ for $\partial A/\partial u$ (a quantity with three indices: $\partial A_{i,k}/\partial u_j$), and $b'(u)$ for $\partial b/\partial u$ (a quantity with two indices: $\partial b_i/\partial u_j$), we can write the linear system to be solved as
$$(A + A'u - b')\delta u = -Au + b,$$
or
$$\bigl(A(u^-) + A'(u^-)u^- - b'(u^-)\bigr)\delta u = -A(u^-)u^- + b(u^-).$$
Rearranging the terms demonstrates the difference from the system solved in each Picard iteration:
$$A(u^-)\delta u + \gamma\bigl(A'(u^-)u^- - b'(u^-)\bigr)\delta u = -A(u^-)u^- + b(u^-).$$
Here we have inserted a parameter $\gamma$ such that $\gamma = 0$ gives the Picard system and $\gamma = 1$ gives the Newton system. Such a parameter can be handy in software to easily switch between the methods.
Combined algorithm for Picard and Newton iteration.
Given $A(u)$, $b(u)$, and an initial guess $u^-$, iterate until convergence:
solve $\bigl(A(u^-) + \gamma(A'(u^-)u^- - b'(u^-))\bigr)\delta u = -A(u^-)u^- + b(u^-)$ with respect to $\delta u$
$u = u^- + \omega\,\delta u$
$u^- \leftarrow u$
$\gamma = 1$ gives Newton's method, while $\gamma = 0$ corresponds to Picard iteration.
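In code, the γ switch may look as follows; A_prime_u and b_prime are hypothetical user-supplied functions returning the matrices $\sum_k (\partial A_{i,k}/\partial u_j)u_k$ and $\partial b_i/\partial u_j$ (a sketch under these assumptions):

import numpy as np

def picard_newton(A, b, A_prime_u, b_prime, u0,
                  gamma=1.0, omega=1.0, eps_r=1e-6, max_iter=100):
    """Combined iteration for A(u)u = b(u): gamma=0 gives Picard,
    gamma=1 gives Newton."""
    u_ = np.asarray(u0, dtype=float)
    for k in range(max_iter):
        F = A(u_) @ u_ - b(u_)
        if np.linalg.norm(F) <= eps_r:
            return u_, k
        J = A(u_) + gamma*(A_prime_u(u_) - b_prime(u_))
        u_ = u_ + omega*np.linalg.solve(J, -F)
    return u_, max_iter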
Let $\|\cdot\|$ be the standard Euclidean vector norm. Four termination criteria are much in use:
Absolute change in solution: $\|u - u^-\| \le \epsilon_u$
Relative change in solution: $\|u - u^-\| \le \epsilon_u \|u_0\|$, where $u_0$ denotes the start value of $u^-$ in the iteration
Absolute residual: $\|F(u)\| \le \epsilon_r$
Relative residual: $\|F(u)\| \le \epsilon_r \|F(u_0)\|$
To prevent divergent iterations from running forever, one terminates the iterations when the current number of iterations $k$ exceeds a maximum value $k_{\max}$.
For stationary problems, the relative criteria are most used, since they are not sensitive to the characteristic size of $u$, which may depend on the underlying mesh and its resolution. Nevertheless, the relative criteria can be misleading when the initial start value for the iteration is very close to the solution, since an unnecessary reduction in the error measure is then enforced. For time-dependent problems, if the time step is small, the previous solution may be a quite good guess for the unknown, and in such cases the absolute criteria work better. It is common to combine the absolute and relative measures of the size of the residual, as in
$$\|F(u)\| \le \epsilon_{rr}\|F(u_0)\| + \epsilon_{ra},$$
where $\epsilon_{rr}$ is the tolerance in the relative criterion and $\epsilon_{ra}$ is the tolerance in the absolute criterion. With a very good initial guess for the iteration (typically the solution of a differential equation at the previous time level), the term $\|F(u_0)\|$ is small and $\epsilon_{ra}$ is the dominating tolerance. Otherwise, $\epsilon_{rr}\|F(u_0)\|$ dominates and the criterion is effectively a relative one.
With the change in solution as criterion we can formulate a combined absolute and relative measure of the change in the solution:
$$\|\delta u\| \le \epsilon_{ur}\|u_0\| + \epsilon_{ua}.$$
The ultimate termination criterion, combining the residual and the change in solution with a test on the maximum number of iterations, can be expressed as
$$\|F(u)\| \le \epsilon_{rr}\|F(u_0)\| + \epsilon_{ra} \quad\text{or}\quad \|\delta u\| \le \epsilon_{ur}\|u_0\| + \epsilon_{ua} \quad\text{or}\quad k > k_{\max}.$$
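A sketch of such a combined test in code (the tolerance names mirror the text; the function itself is ours):

import numpy as np

def terminate(F_u, du, F_u0, u0, k, k_max=100,
              eps_rr=1e-4, eps_ra=1e-8, eps_ur=1e-4, eps_ua=1e-8):
    """Combined criterion: relative/absolute residual, relative/absolute
    change in solution, or too many iterations."""
    residual_small = np.linalg.norm(F_u) <= eps_rr*np.linalg.norm(F_u0) + eps_ra
    change_small = np.linalg.norm(du) <= eps_ur*np.linalg.norm(u0) + eps_ua
    return residual_small or change_small or k > k_max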
In the present section, our aim is to discretize this problem in time and then present techniques for linearizing the time-discrete PDE problem "at the PDE level" such that we transform the nonlinear stationary PDE problem at each time level into a sequence of linear PDE problems, which can be solved using any method for linear PDEs. This strategy avoids the solution of systems of nonlinear algebraic equations. In the section 1D stationary nonlinear differential equations we shall take the opposite (and more common) approach: discretize the nonlinear problem in time and space first, and then solve the resulting nonlinear algebraic equations at each time level by the methods of the section "Systems of nonlinear algebraic equations" (for nonlinear ODEs). Very often, the two approaches are mathematically identical, so there is no preference from a computational efficiency point of view. The details of the ideas sketched above will hopefully become clear through the forthcoming examples.
The nonlinearities in the PDE are trivial to deal with if we choose an explicit time integration method for (1), such as the Forward Euler method, written out as
$$\frac{u^{n+1} - u^n}{\Delta t} = \nabla\cdot(\alpha(u^n)\nabla u^n) + f(u^n),$$
which is a linear equation in the unknown $u^{n+1}$ with solution
$$u^{n+1} = u^n + \Delta t\,\nabla\cdot(\alpha(u^n)\nabla u^n) + \Delta t\, f(u^n).$$
A Backward Euler discretization, on the other hand, couples the nonlinear terms to the new time level. Written out,
$$\frac{u^n - u^{n-1}}{\Delta t} = \nabla\cdot(\alpha(u^n)\nabla u^n) + f(u^n). \qquad (4)$$
This is a nonlinear PDE for the unknown function $u^n(x)$. Such a PDE can be viewed as a time-independent PDE where $u^{n-1}(x)$ is a known function.
We introduce a Picard iteration with $k$ as iteration counter. A typical linearization of the $\nabla\cdot(\alpha(u^n)\nabla u^n)$ term in iteration $k+1$ is to use the previously computed $u^{n,k}$ approximation in the diffusion coefficient: $\alpha(u^{n,k})$. The nonlinear source term is treated similarly: $f(u^{n,k})$. The unknown function $u^{n,k+1}$ then fulfills the linear PDE
$$\frac{u^{n,k+1} - u^{n-1}}{\Delta t} = \nabla\cdot(\alpha(u^{n,k})\nabla u^{n,k+1}) + f(u^{n,k}). \qquad (5)$$
The initial guess for the Picard iteration at this time level can be taken as the solution at the previous time level: $u^{n,0} = u^{n-1}$.
We can alternatively apply the implementation-friendly notation where $u$ corresponds to the unknown we want to solve for, i.e., $u^{n,k+1}$ above, and $u^-$ is the most recently computed value, $u^{n,k}$ above. Moreover, $u^{(1)}$ denotes the unknown function at the previous time level, $u^{n-1}$ above. The PDE to be solved in a Picard iteration then looks like
$$\frac{u - u^{(1)}}{\Delta t} = \nabla\cdot(\alpha(u^-)\nabla u) + f(u^-).$$
At the beginning of the iteration we start with the value from the previous time level, $u^- = u^{(1)}$, and after each iteration, $u^-$ is updated to $u$.
Remark on notation.
The previous derivations of the numerical scheme for time discretizations of PDEs have, strictly speaking, a somewhat sloppy notation, but it is much used and convenient to read. A more precise notation must distinguish clearly between the exact solution of the PDE problem, here denoted $u_e(x,t)$, and the exact solution of the spatial problem, arising after time discretization at each time level, where (4) is an example. The latter is here represented as $u^n(x)$ and is an approximation to $u_e(x, t_n)$. Then we have another approximation $u^{n,k}(x)$ to $u^n(x)$ when solving the nonlinear PDE problem for $u^n$ by iteration methods, as in (5).
In our notation, $u$ is a synonym for $u^{n,k+1}$ and $u^{(1)}$ is a synonym for $u^{n-1}$, inspired by what are natural variable names in a code. We will usually state the PDE problem in terms of $u$ and quickly redefine the symbol $u$ to mean the numerical approximation, while $u_e$ is not explicitly introduced unless we need to talk about the exact solution and the approximate solution at the same time.
At time level n, we have to solve the stationary PDE (4). In the previous section, we saw how this can be done with Picard iterations. Another alternative is to apply the idea of Newton's method in a clever way. Normally, Newton's method is defined for systems of algebraic equations, but the idea of the method can be applied at the PDE level too.
Let $u^{n,k}$ be an approximation to the unknown $u^n$. We seek a better approximation on the form
$$u^n = u^{n,k} + \delta u.$$
We can Taylor expand $\alpha(u^{n,k} + \delta u)$ and $f(u^{n,k} + \delta u)$:
$$\alpha(u^{n,k} + \delta u) \approx \alpha(u^{n,k}) + \alpha'(u^{n,k})\,\delta u, \qquad f(u^{n,k} + \delta u) \approx f(u^{n,k}) + f'(u^{n,k})\,\delta u,$$
where $\alpha'$ and $f'$ denote derivatives with respect to $u$.
Inserting the linear approximations of α and f in (8) results in
The term $\alpha'(u^{n,k})\,\delta u\,\nabla\delta u$ is of order $\delta u^2$ and is therefore omitted, since we expect the correction $\delta u$ to be small ($\delta u \gg \delta u^2$). Reorganizing the equation gives a PDE for $\delta u$ that we can write in short form as
$$\delta F(\delta u; u^{n,k}) = -F(u^{n,k}),$$
where
$$F(u^{n,k}) = \frac{u^{n,k} - u^{n-1}}{\Delta t} - \nabla\cdot(\alpha(u^{n,k})\nabla u^{n,k}) - f(u^{n,k}),$$
$$\delta F(\delta u; u^{n,k}) = \frac{1}{\Delta t}\delta u - \nabla\cdot(\alpha(u^{n,k})\nabla\delta u) - \nabla\cdot(\alpha'(u^{n,k})\,\delta u\,\nabla u^{n,k}) - f'(u^{n,k})\,\delta u.$$
Note that $\delta F$ is a linear function of $\delta u$, and $F$ contains only terms that are known, such that the PDE for $\delta u$ is indeed linear.
Observations.
The notational form $\delta F = -F$ resembles the Newton system $J\delta u = -F$ for systems of algebraic equations, with $\delta F$ playing the role of $J\delta u$. The unknown vector in a linear system of algebraic equations enters the system as a linear operator in terms of a matrix-vector product ($J\delta u$), while at the PDE level we have a linear differential operator instead ($\delta F$).
We can rewrite the PDE for $\delta u$ in a slightly different way too if we define $u^{n,k} + \delta u$ as $u^{n,k+1}$:
$$\frac{u^{n,k+1} - u^{n-1}}{\Delta t} = \nabla\cdot(\alpha(u^{n,k})\nabla u^{n,k+1}) + f(u^{n,k})$$
$$\qquad\qquad + \nabla\cdot(\alpha'(u^{n,k})\,\delta u\,\nabla u^{n,k}) + f'(u^{n,k})\,\delta u.$$
Note that the first line is the same PDE as arises in the Picard iteration, while the remaining terms arise from the differentiations that are an inherent ingredient in Newton's method.
For coding we want to introduce $u$ for $u^n$, $u^-$ for $u^{n,k}$, and $u^{(1)}$ for $u^{n-1}$. The formulas for $F$ and $\delta F$ are then more clearly written as
$$F(u^-) = \frac{u^- - u^{(1)}}{\Delta t} - \nabla\cdot(\alpha(u^-)\nabla u^-) - f(u^-),$$
$$\delta F(\delta u; u^-) = \frac{1}{\Delta t}\delta u - \nabla\cdot(\alpha(u^-)\nabla\delta u) - \nabla\cdot(\alpha'(u^-)\,\delta u\,\nabla u^-) - f'(u^-)\,\delta u.$$
The form that orders the PDE as the Picard iteration terms plus the Newton method's derivative terms becomes
$$\frac{u - u^{(1)}}{\Delta t} = \nabla\cdot(\alpha(u^-)\nabla u) + f(u^-) + \gamma\bigl(\nabla\cdot(\alpha'(u^-)(u - u^-)\nabla u^-) + f'(u^-)(u - u^-)\bigr).$$
The Picard and full Newton versions correspond to $\gamma = 0$ and $\gamma = 1$, respectively.
Some may prefer to derive the linearized PDE for $\delta u$ using the more compact notation. We start with inserting $u^n = u^- + \delta u$ to get
$$\frac{u^- + \delta u - u^{(1)}}{\Delta t} = \nabla\cdot(\alpha(u^- + \delta u)\nabla(u^- + \delta u)) + f(u^- + \delta u).$$
Taylor expanding,
$$\alpha(u^- + \delta u) \approx \alpha(u^-) + \alpha'(u^-)\,\delta u, \qquad f(u^- + \delta u) \approx f(u^-) + f'(u^-)\,\delta u,$$
and inserting these expressions gives a less cluttered PDE for $\delta u$:
$$\frac{\delta u}{\Delta t} - \nabla\cdot(\alpha(u^-)\nabla\delta u) - \nabla\cdot(\alpha'(u^-)\,\delta u\,\nabla u^-) - f'(u^-)\,\delta u = -\frac{u^- - u^{(1)}}{\Delta t} + \nabla\cdot(\alpha(u^-)\nabla u^-) + f(u^-).$$
The standard technique is to apply an arithmetic average for quantities defined between two mesh points, e.g.,
$$u^{n+\frac{1}{2}} \approx \frac{1}{2}(u^n + u^{n+1}).$$
However, with nonlinear terms we have many choices of formulating an arithmetic mean, e.g.,
$$[f(u)]^{n+\frac{1}{2}} \approx f\Bigl(\frac{1}{2}(u^n + u^{n+1})\Bigr) \quad\text{or}\quad [f(u)]^{n+\frac{1}{2}} \approx \frac{1}{2}\bigl(f(u^n) + f(u^{n+1})\bigr).$$
A big question is whether there are significant differences in accuracy between taking the products of arithmetic means or taking the arithmetic mean of products. Exercise nonlin:exer:products:arith:mean investigates this question, and the answer is that the approximation is $\mathcal{O}(\Delta t^2)$ in both cases.
The section Linearization at the differential equation level presented methods for linearizing time-discrete PDEs directly prior to discretization in space. We can alternatively carry out the discretization in space of the time-discrete nonlinear PDE problem and get a system of nonlinear algebraic equations, which can be solved by Picard iteration or Newton's method as presented in the section "Systems of nonlinear algebraic equations" (for nonlinear ODEs). This latter approach will now be described in detail.
We shall work with the 1D problem
$$-(\alpha(u)u')' + au = f(u),\quad x \in (0, L), \qquad \alpha(u(0))u'(0) = C,\quad u(L) = D. \qquad (21)$$
The problem (21) arises from the stationary limit of a diffusion equation,
$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\Bigl(\alpha(u)\frac{\partial u}{\partial x}\Bigr) + f(u), \qquad (23)$$
as $t \to \infty$ and $\partial u/\partial t \to 0$, or from a Backward Euler time discretization of (23). Introducing $u(x)$ for $u^n(x)$, $u^{(1)}$ for $u^{n-1}$, and defining $f(u)$ in (21) to be $f(u)$ in (23) plus $u^{n-1}/\Delta t$, gives (21) with $a = 1/\Delta t$.
The nonlinearity in the differential equation (21) poses no more difficulty than a variable coefficient, as in the term $(\alpha(x)u')'$. We can therefore use a standard finite difference approach to discretizing the Laplace term with a variable coefficient. Writing this out for a uniform mesh with points $x_i = i\Delta x$, $i = 0, \ldots, N_x$, leads to
$$-\frac{1}{\Delta x^2}\Bigl(\alpha_{i+\frac{1}{2}}(u_{i+1} - u_i) - \alpha_{i-\frac{1}{2}}(u_i - u_{i-1})\Bigr) + au_i = f(u_i). \qquad (24)$$
This equation is valid at all the mesh points $i = 0, 1, \ldots, N_x - 1$. At $i = N_x$ we have the Dirichlet condition $u_i = D$. The only difference from the case with $(\alpha(x)u')'$ and $f(x)$ is that now $\alpha$ and $f$ are functions of $u$ and not only of $x$: $(\alpha(u(x))u')'$ and $f(u(x))$.
The quantity $\alpha_{i+\frac{1}{2}}$, evaluated between two mesh points, needs a comment. Since $\alpha$ depends on $u$ and $u$ is only known at the mesh points, we need to express $\alpha_{i+\frac{1}{2}}$ in terms of $u_i$ and $u_{i+1}$. For this purpose we use an arithmetic mean, although a harmonic mean is also common in this context if $\alpha$ features large jumps. There are two choices of arithmetic means:
$$\alpha_{i+\frac{1}{2}} \approx \alpha\Bigl(\frac{1}{2}(u_i + u_{i+1})\Bigr) \qquad\text{or}\qquad \alpha_{i+\frac{1}{2}} \approx \frac{1}{2}\bigl(\alpha(u_i) + \alpha(u_{i+1})\bigr).$$
Equation (24) with the latter approximation then looks like
$$-\frac{1}{2\Delta x^2}\Bigl(\bigl(\alpha(u_i) + \alpha(u_{i+1})\bigr)(u_{i+1} - u_i) - \bigl(\alpha(u_{i-1}) + \alpha(u_i)\bigr)(u_i - u_{i-1})\Bigr) + au_i = f(u_i), \qquad (27)$$
or written more compactly in operator notation,
$$[-D_x\,\overline{\alpha(u)}^x D_x u + au = f(u)]_i.$$
At mesh point $i = 0$ we have the boundary condition $\alpha(u)u' = C$, which is discretized by a centered difference, meaning
$$\alpha(u_0)\frac{u_1 - u_{-1}}{2\Delta x} = C. \qquad (28)$$
The fictitious value $u_{-1}$ can be eliminated with the aid of (27) for $i = 0$. Formally, (27) should be solved with respect to $u_{i-1}$, and that value (for $i = 0$) should be inserted in (28), but it is algebraically much easier to do it the other way around. Alternatively, one can use a ghost cell $[-\Delta x, 0]$ and update the $u_{-1}$ value in the ghost cell according to (28) after every Picard or Newton iteration. Such an approach means that we use a known $u_{-1}$ value in (27) from the previous iteration.
The nonlinear algebraic equations (27) are of the form $A(u)u = b(u)$ with
$$A_{i,i} = \frac{1}{2\Delta x^2}\bigl(\alpha(u_{i-1}) + 2\alpha(u_i) + \alpha(u_{i+1})\bigr) + a,$$
$$A_{i,i-1} = -\frac{1}{2\Delta x^2}\bigl(\alpha(u_{i-1}) + \alpha(u_i)\bigr),$$
$$A_{i,i+1} = -\frac{1}{2\Delta x^2}\bigl(\alpha(u_i) + \alpha(u_{i+1})\bigr),$$
$$b_i = f(u_i).$$
The matrix $A(u)$ is tridiagonal: $A_{i,j} = 0$ for $j > i + 1$ and $j < i - 1$.
The above expressions are valid for internal mesh points $1 \le i \le N_x - 1$. For $i = 0$ we need to express $u_{i-1} = u_{-1}$ in terms of $u_1$ using (28):
$$u_{-1} = u_1 - \frac{2\Delta x}{\alpha(u_0)}C. \qquad (29)$$
This value must be inserted in $A_{0,0}$. The expression for $A_{i,i+1}$ applies for $i = 0$, and $A_{i,i-1}$ does not enter the system when $i = 0$.
Regarding the last equation, its form depends on whether we include the Dirichlet condition $u(L) = D$, meaning $u_{N_x} = D$, in the nonlinear algebraic equation system or not. Suppose we choose $(u_0, u_1, \ldots, u_{N_x-1})$ as unknowns, later referred to as the system without the Dirichlet condition. The last equation then corresponds to $i = N_x - 1$. It involves the boundary value $u_{N_x}$, which is substituted by $D$. If the unknown vector includes the boundary value, $(u_0, u_1, \ldots, u_{N_x})$, later referred to as the system including the Dirichlet condition, the equation for $i = N_x - 1$ involves $u_{N_x}$ as an unknown, and the final equation becomes $u_{N_x} = D$, corresponding to $A_{i,i} = 1$ and $b_i = D$ for $i = N_x$.
The obvious Picard iteration scheme is to use previously computed values of $u_i$ in $A(u)$ and $b(u)$, as described in more detail in the section nonlin:systems:alg. With the notation $u^-$ for the most recently computed value of $u$, we have the system $F(u) \approx \hat F(u) = A(u^-)u - b(u^-)$, with $F = (F_0, F_1, \ldots, F_m)$ and $u = (u_0, u_1, \ldots, u_m)$. The index $m$ is $N_x$ if the system includes the Dirichlet condition as a separate equation and $N_x - 1$ otherwise. The matrix $A(u^-)$ is tridiagonal, so the solution procedure is to fill a tridiagonal matrix data structure and the right-hand side vector with the right numbers and call a Gaussian elimination routine for tridiagonal linear systems.
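A sketch of one Picard iteration in code, assembling and solving the tridiagonal system; for simplicity this sketch assumes Dirichlet conditions at both ends (the text's flux condition at x=0 requires the u_{-1} elimination (29) instead), and alpha and f are assumed to be vectorized Python functions:

import numpy as np
import scipy.sparse
import scipy.sparse.linalg

def picard_step(u_, alpha, f, a, dx, D0, DL):
    """One Picard iteration for -(alpha(u)u')' + a*u = f(u) on a
    uniform mesh, with alpha and f evaluated at the last iterate u_."""
    Nx = len(u_) - 1
    am = 0.5*(alpha(u_[:-1]) + alpha(u_[1:]))  # alpha_{i+1/2}, arithmetic mean
    main = np.zeros(Nx+1); lower = np.zeros(Nx); upper = np.zeros(Nx)
    rhs = np.zeros(Nx+1)
    for i in range(1, Nx):                     # interior equations (27)
        lower[i-1] = -am[i-1]/dx**2
        upper[i]   = -am[i]/dx**2
        main[i]    = (am[i-1] + am[i])/dx**2 + a
        rhs[i]     = f(u_[i])
    main[0] = main[Nx] = 1.0                   # Dirichlet rows
    rhs[0], rhs[Nx] = D0, DL
    A = scipy.sparse.diags([lower, main, upper], [-1, 0, 1], format='csr')
    return scipy.sparse.linalg.spsolve(A, rhs)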
It helps on the understanding of the details to write out all the mathematics in a specific case with a small mesh, say just two cells ($N_x = 2$). We use $u^-_i$ for the $i$-th component in $u^-$.
The starting point is the basic expressions for the nonlinear equations at mesh points $i = 0$ and $i = 1$:
$$A_{0,-1}u_{-1} + A_{0,0}u_0 + A_{0,1}u_1 = b_0, \qquad (30)$$
$$A_{1,0}u_0 + A_{1,1}u_1 + A_{1,2}u_2 = b_1. \qquad (31)$$
Equation (30) written out reads
We must then replace $u_{-1}$ by (29). With Picard iteration we get
where
We must now move the $u_2$ term to the right-hand side and replace all occurrences of $u_2$ by $D$:
The two equations can be written as a 2×2 system:
where
The system with the Dirichlet condition becomes
with
Other entries are as in the 2×2 system.
The Jacobian must be derived in order to use Newton's method. Here it means that we need to differentiate $F(u) = A(u)u - b(u)$ with respect to the unknown parameters $u_0, u_1, \ldots, u_m$ ($m = N_x$ or $m = N_x - 1$, depending on whether the Dirichlet condition is included in the nonlinear system $F(u) = 0$ or not). Nonlinear equation number $i$ of (27) has the structure
$$F_i = A_{i,i-1}(u)u_{i-1} + A_{i,i}(u)u_i + A_{i,i+1}(u)u_{i+1} - b_i(u).$$
Computing the Jacobian requires careful differentiation. For example,
The complete Jacobian becomes
The explicit expression for nonlinear equation number $i$, $F_i(u_0, u_1, \ldots)$, arises from moving the $f(u_i)$ term in (27) to the left-hand side:
At the boundary point $i = 0$, $u_{-1}$ must be replaced using the formula (29). When the Dirichlet condition at $i = N_x$ is not a part of the equation system, the last equation $F_m = 0$ for $m = N_x - 1$ involves the quantity $u_{N_x}$, which must be replaced by $D$. If $u_{N_x}$ is treated as an unknown in the system, the last equation $F_m = 0$ has $m = N_x$ and reads
$$F_{N_x}(u_0, \ldots, u_{N_x}) = u_{N_x} - D = 0.$$
Similar replacements of $u_{-1}$ and $u_{N_x}$ must be done in the Jacobian for the first and last row. When $u_{N_x}$ is included as an unknown, the last row in the Jacobian must help implement the condition $\delta u_{N_x} = 0$, since we assume that $u$ contains the right Dirichlet value at the beginning of the iteration ($u_{N_x} = D$), and then the Newton update should be zero for $i = N_x$, i.e., $\delta u_{N_x} = 0$. This also forces the right-hand side to be $b_i = 0$ for $i = N_x$.
We have seen, and can see from the present example, that the linear system in Newton's method contains all the terms present in the system that arises in the Picard iteration method. The extra terms in Newton's method can be multiplied by a factor such that it is easy to program one linear system and set this factor to 0 or 1 to generate the Picard or Newton system.