Toy example for Data Assimilation

The aim of data assimilation is to reduce the uncertainty in a model with the help of observations. In this notebook the Kalman Filter and the 4D-Var method are compared, first with a linear model and then with the classical Lorenz 1963 model.

In [1]:
using DataAssim
using DIVAnd
using LinearAlgebra
using PyPlot
using Random
using Test

State vector

$\newcommand{\e}[1]{{\bf \color{red}{#1}}}$ $\newcommand{\tindex}{k}$ $\renewcommand{\vec}[1] {{\mathbf #1}}$ $\renewcommand{\mat}[1] {{\mathbf #1}}$

The state vector $\vec x_\tindex$ containing all prognostic variables at time $t_\tindex$ (time of the $\tindex$-th time step).

Model

A model needs to define the following operations:

  • a forward step: ℳ(t,x,η)
  • tangent linear model: tgl(ℳ,t,x,dx)
  • adjoint model: adj(ℳ,t,x,dx2)

η is random noise for a stochastic model, or zero for a deterministic model.
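
These operations must be mutually consistent. Writing $\mat M$ for the Jacobian of $\cal M$ at $\vec x$, the tangent linear model should match a centered finite-difference approximation of the non-linear model, and the adjoint should satisfy the dot-product test:

$$ \frac{\cal M(t, \vec x + \epsilon \, \delta \vec x) - \cal M(t, \vec x - \epsilon \, \delta \vec x)}{2 \epsilon} \approx \mat M \, \delta \vec x $$

$$ \delta \vec x_2^T \left( \mat M \, \delta \vec x \right) = \left( \mat M^T \delta \vec x_2 \right)^T \delta \vec x \quad \text{for all } \delta \vec x, \delta \vec x_2 $$

The check function below verifies exactly these two identities (up to the stated tolerances) for a given model.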

In [2]:
function check(ℳ::AbstractModel,n,t = 0,ϵ = 1e-5)
    dx = randn(n)
    x = randn(n)
    dx2 = randn(n)

    @test (ℳ(t,x + ϵ*dx) - ℳ(t,x - ϵ*dx)) / (2*ϵ) ≈ tgl(ℳ,t,x,dx) atol=10*ϵ^2
    @test dx2 ⋅ tgl(ℳ,t,x,dx) ≈ adj(ℳ,t,x,dx2) ⋅ dx   atol=1e-7

    dX = randn(n,3)
    MdX = tgl(ℳ,t,x,dX)
    @test tgl(ℳ,t,x,dX[:,1]) ≈ MdX[:,1]
end
Out[2]:
check (generic function with 3 methods)

A simple linear model can be defined from a matrix M using ModelMatrix(M):

$$ \vec x_{\tindex+1} = \mat M \; \vec x_\tindex $$
In [3]:
ℳ = ModelMatrix(2*I)

x = randn(4)
@test ℳ(0,x) ≈ 2*x
@test tgl(ℳ,0,0,x) ≈ 2*x
@test adj(ℳ,0,0,x) ≈ 2*x
check(ℳ,4)
Out[3]:
Test Passed

Or using an operator $\cal M$, potentially non-linear:

$$ \vec x_{\tindex+1} = \cal M(t_\tindex, \vec x_\tindex) $$

In this case, the tangent linear and adjoint models have to be provided too (or derived using automatic differentiation tools).

In [4]:
?ModelFun
search: ModelFun

Out[4]:
ℳ = ModelFun(nonlinear_forecast,tangent_linear_model,adjoint_model)

Model defined by the functions nonlinear_forecast,tangent_linear_model and adjoint_model.

In [5]:
ℳ = ModelFun((t,x,η) -> 2*x,(t,x,dx) -> 2*dx,(t,x,dx) -> 2*dx)

x = randn(4)
@test ℳ(0,x) ≈ 2*x
@test tgl(ℳ,0,0,x) ≈ 2*x
@test adj(ℳ,0,0,x) ≈ 2*x
check(ℳ,4)
Out[5]:
Test Passed
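
The forecast function above is still linear. A genuinely non-linear model can be defined the same way; the sketch below (not executed in this notebook, and assuming the same ModelFun/tgl/adj/check interface as above) uses an elementwise quadratic, whose Jacobian diag(x) is symmetric, so the tangent linear and adjoint models coincide:

```julia
# Non-linear model: ℳ(x) = x.^2 / 2 (elementwise)
# Its Jacobian at x is the diagonal matrix diag(x),
# which is symmetric, hence tgl and adj are identical.
ℳnl = ModelFun(
    (t,x,η)  -> x.^2 ./ 2,   # non-linear forecast
    (t,x,dx) -> x .* dx,     # tangent linear model: diag(x) * dx
    (t,x,dx) -> x .* dx)     # adjoint model: diag(x)' * dx

check(ℳnl,4)
```

For a quadratic model the centered finite difference in check is exact, so the tangent linear test passes with no discretization error.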

Kalman Filter
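
For reference, the Kalman Filter alternates a forecast step and, at observation times, an analysis step (standard notation; $\mat Q$ and $\mat R$ are the model and observation error covariances):

$$ \vec x^f_{\tindex+1} = \mat M \, \vec x^a_\tindex, \qquad \mat P^f_{\tindex+1} = \mat M \mat P^a_\tindex \mat M^T + \mat Q $$

$$ \mat K = \mat P^f \mat H^T \left( \mat H \mat P^f \mat H^T + \mat R \right)^{-1} $$

$$ \vec x^a = \vec x^f + \mat K \left( \vec y^o - \mat H \vec x^f \right), \qquad \mat P^a = \left( \mat I - \mat K \mat H \right) \mat P^f $$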

In [6]:
# state size x
n = 2;

# number of observation per time instance
m = 1;

# observation operator
H = [1 0];
𝓗 = ModelMatrix(H)

# initial condition
xi = [1; 1];

# error covariance of the initial condition
Pi = Matrix(I,n,n)

# error covariance of the observations
R = 0.1 * Matrix(I,m,m)
Out[6]:
1×1 Array{Float64,2}:
 0.1
In [7]:
ℳ = ModelMatrix([1 0.1; -0.1 1])

nmax = 100;
no = 10:5:nmax
yo = randn(m,length(no))


xai, = fourDVar(xi,Pi,ℳ,yo,R,H,nmax,no);
Q = zeros(size(Pi))
xa, = FreeRun(ℳ,xai,Q,H,nmax,no)

xa3, = KalmanFilter(xi,Pi,ℳ,Q,yo,R,H,nmax,no);
# for a linear model both methods should give the same final analysis
@test xa[:,end] ≈ xa3[:,end]  atol=1e-5
time = 1:nmax+1

plot(time,xa3[1,:],label="KF")
plot(time,xa[1,:],label="4DVar")
plot(time[no],yo[1,:],"*";label="observations")
legend()
grid("on")
In [8]:
size(xa)
Out[8]:
(2, 101)
In [9]:
?fourDVar
search: fourDVar

Out[9]:
x,J = fourDVar(
        xi,Pi,ℳ,yo,R,H,nmax,no;
        innerloops = 10,
        outerloops = 2,
        tol = 1e-5)

Incremental 4D-Var with the model and nmax time-steps starting at the initial condition xi and error covariance Pi with the specified numbers of inner and outer loops. Observations yo (and error covariance R) at the time steps given in no are assimilated with the observation operator H.
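
4D-Var obtains the analysis by minimizing, over the initial condition $\vec x_0$, the strong-constraint cost function (up to a constant factor of 1/2, depending on convention):

$$ J(\vec x_0) = (\vec x_0 - \vec x^i)^T \mat P_i^{-1} (\vec x_0 - \vec x^i) + \sum_{\tindex \in no} (\vec y^o_\tindex - \mat H \vec x_\tindex)^T \mat R^{-1} (\vec y^o_\tindex - \mat H \vec x_\tindex) $$

subject to the model constraint $\vec x_{\tindex+1} = \cal M(t_\tindex, \vec x_\tindex)$. The gradient of $J$ is computed with the adjoint model, which is why adj must be provided.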

Lorenz-63 Model

In [10]:
?Lorenz63Model
search: Lorenz63Model

Out[10]:
ℳ = Lorenz63Model(dt,σ=10.,β = 8/3.,ρ = 28.)

Lorenz, 1963 model[1] integrated with a 2nd order Runge-Kutta scheme.

[1] https://doi.org/10.1175/1520-0469(1963)020%3C0130:DNF%3E2.0.CO;2
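
The model integrates the classical Lorenz (1963) system with parameters $\sigma$, $\rho$ and $\beta$:

$$ \frac{dx}{dt} = \sigma (y - x), \qquad \frac{dy}{dt} = x (\rho - z) - y, \qquad \frac{dz}{dt} = x y - \beta z $$

With the default parameters ($\sigma = 10$, $\rho = 28$, $\beta = 8/3$) the system is chaotic, which makes it a standard test case for data assimilation methods.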

In [11]:
Random.seed!(12343)
ℳ = Lorenz63Model(0.01)

x = zeros(3,10000)
x[:,1] = randn(3)

for k = 1:size(x,2)-1
    x[:,k+1] = ℳ(k,x[:,k])
end

plot(x[1,:])
grid("on");
In [12]:
ℳ = Lorenz63Model(0.05)

nmax = 20;
no = 5:nmax;
n = 3;
sigma=10;
beta = 8/3;
rho = 28;
dt = 0.02;

xit = [5.; 0.; 0.];
H = [1 0 0];
m = size(H,1)
Pi = Matrix(3*I,n,n)
Q = zeros(n,n);
R = ones(m,m)
Out[12]:
1×1 Array{Float64,2}:
 1.0
In [13]:
function LorenzTest(nmax,method)
    no = 5:nmax
    xt,xfree,xa,yt,yo,diag_ = TwinExperiment(ℳ,xit,Pi,Q,R,H,nmax,no,method)

    time = 1:nmax+1
    plot(time,xt[1,:],"b",label = "true")
    plot(time,xfree[1,:],"r",label = "free")
    plot(time[no],yo[1,:],".",label = "observation")
    plot(time,xa[1,:],"g", label = "assim")
    legend()
    title("$method applied to the Lorenz-63 model")
    grid("on")
end

nmax = 50;
method = "KF";
LorenzTest(nmax,method);
In [14]:
nmax = 50;
method = "4DVar";
LorenzTest(nmax,method);

Note that for longer integration times, the 4D-Var method can no longer track the observations: the minimization algorithm gets trapped in a local minimum of the cost function.

In [15]:
nmax = 150;
method = "KF";
LorenzTest(nmax,method);
In [16]:
nmax = 150;
method = "4DVar";
LorenzTest(nmax,method);