This lecture will cover the following topics:
# Imports
import warnings
warnings.filterwarnings("ignore")
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import cv2
np.random.seed(0) # for reproducibility
State
Observation model
Noisy measurements
Example:
The tradeoff between the influence of the model and the measurements is determined by noise.
KF is a recursive algorithm:
The Kalman filter approach is based on three fundamental assumptions:
Each variable at the current time is a linear function of the variables at previous times.
The noise values are not correlated in time.
📝 Note: White noise contains all the different frequencies blended together at equal intensity.
This is similar to how white light contains all the colors of the rainbow combined.
At any point in time, the probability density of the noise is a Gaussian.
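These assumptions can be checked numerically. The sketch below (with arbitrary sample size and seed) draws Gaussian white noise and verifies that samples at different times are essentially uncorrelated:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(loc=0.0, scale=1.0, size=100_000)  # Gaussian white noise

# Correlation between the sequence and a one-step-shifted copy of itself:
# for white noise this lag-1 autocorrelation should be close to zero.
lag1 = np.corrcoef(w[:-1], w[1:])[0, 1]
print(f"lag-1 autocorrelation: {lag1:.4f}")
```

The mean is close to zero and the lag-1 autocorrelation is close to zero, consistent with the "not correlated in time" assumption.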
Assume a car has a certain initial position $x_0=0$ and initial velocity $\dot{x}_0=60 km/h$.
If the speed is constant, we have:
$$x_{t} = 1\cdot x_{t-1} + \delta t \cdot \dot{x}_{t-1}$$
$$\dot{x}_{t} = 0 \cdot x_{t-1} + 1 \cdot \dot{x}_{t-1}$$
In matrix form:
$$ \boldsymbol{x}_{t} = \begin{bmatrix} x_{t} \\ \dot{x}_{t} \end{bmatrix} = \begin{bmatrix} 1 & \delta t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_{t-1} \\ \dot{x}_{t-1} \end{bmatrix} = \mathbf{A} \boldsymbol{x}_{t-1} $$
Where will the car be after 1 minute?
If we rely only on our simple system $\boldsymbol{x}_{t+1} = \mathbf{A} \boldsymbol{x}_{t}$, without even considering the noise, the car will keep moving at $60\,km/h$ and will be $1\,km$ away.
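This prediction can be verified with a quick numerical sketch (units converted so that $60\,km/h = 1\,km/min$ and $\delta t = 1$ minute):

```python
import numpy as np

dt = 1.0                      # time step: 1 minute
A = np.array([[1.0, dt],
              [0.0, 1.0]])    # state transition matrix

x = np.array([0.0, 1.0])      # initial state: position 0 km, velocity 1 km/min (60 km/h)
x_next = A @ x                # predicted state after one step
print(x_next)                 # position 1 km, velocity unchanged
```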
Let's say our car has a GPS.
In summary, we have two different sources of information:
How do we combine them to get the best possible estimate of our system variables?
The KF is an iterative procedure that consists of two steps:
These two steps are used to update two quantities:
Before looking at the iterative procedure let's introduce this second quantity.
Covariance estimate
There are two sources of error in the estimate of the state of our system.
Where:
Each type of error is associated with a covariance matrix, which reflects the amount of uncertainty in the state estimates.
State estimate update:
$$\hat{\boldsymbol{x}}_{t}^{-} = \mathbf{A} \hat{\boldsymbol{x}}_{t-1}$$
Covariance estimate update:
$$ \begin{aligned} \mathbf{P}_{t}^{-} &= \mathbb{E}[e^{-}_t e^{-T}_t] \\ &= \mathbb{E}[(\boldsymbol{x}_{t} - \hat{\boldsymbol{x}}^{-}_{t})(\boldsymbol{x}_{t} - \hat{\boldsymbol{x}}^{-}_{t})^T] \\ &= \mathbb{E}[(\mathbf{A} \boldsymbol{x}_{t-1} + \boldsymbol{w} - \mathbf{A} \hat{\boldsymbol{x}}_{t-1})(\mathbf{A} \boldsymbol{x}_{t-1} + \boldsymbol{w} - \mathbf{A} \hat{\boldsymbol{x}}_{t-1})^T] \\ &= \mathbb{E}[(\mathbf{A}e_{t-1})(e_{t-1}^T \mathbf{A}^T)] + \mathbb{E}[\boldsymbol{w}\boldsymbol{w}^T]\\ &= \mathbf{A}\mathbb{E}[e_{t-1}e_{t-1}^T ]\mathbf{A}^T + \mathbf{Q} \\ &= \mathbf{A}\mathbf{P}_{t-1}\mathbf{A}^T + \mathbf{Q} \end{aligned} $$
In summary, the Predict step consists of two equations to update:
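The two Predict equations map directly to code. A minimal sketch for the car model, with assumed values for $\mathbf{P}_{t-1}$ and $\mathbf{Q}$:

```python
import numpy as np

dt = 1.0
A = np.array([[1.0, dt],
              [0.0, 1.0]])            # state transition matrix

x_prev = np.array([0.0, 1.0])         # previous state estimate
P_prev = np.eye(2) * 0.1              # previous error covariance (assumed value)
Q = np.eye(2) * 1e-3                  # process noise covariance (assumed value)

# State estimate update: x_t^- = A x_{t-1}
x_minus = A @ x_prev
# Covariance estimate update: P_t^- = A P_{t-1} A^T + Q
P_minus = A @ P_prev @ A.T + Q

print(x_minus)
print(P_minus)
```

Note that the prediction propagates the uncertainty through the model and then adds $\mathbf{Q}$, so $\mathbf{P}^{-}_t$ is always at least as large as the process noise.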
The measurement update equations are responsible for the feedback, i.e. for incorporating a new measurement into the prior estimate to obtain an improved posterior estimate.
The posterior estimate $\hat{\boldsymbol{x}}_{t}$ is a linear combination of:
The matrix $\mathbf{K} \in \mathbb{R}^{N \times M}$ is chosen to be the gain that minimizes the posterior error covariance.
Derivation (sketch):
The previous derivation gives us the following result:
$$\mathbf{K}_t = \mathbf{P}^{-}_t \mathbf{H}^T (\mathbf{H} \mathbf{P}^{-}_t \mathbf{H}^T + \mathbf{R})^{-1}= \frac{\mathbf{P}^{-}_t \mathbf{H}^T}{\mathbf{H} \mathbf{P}^{-}_t \mathbf{H}^T + \mathbf{R}}$$
Back to our car example, we have:
$$\mathbf{K}_t = \frac{\begin{bmatrix} p^{-}_{xx} & p^{-}_{\dot x x} \\ p^{-}_{\dot x x} & p^{-}_{\dot x \dot x} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}}{\begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} p^{-}_{xx} & p^{-}_{\dot x x} \\ p^{-}_{\dot x x} & p^{-}_{\dot x \dot x} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} r_{xx} \end{bmatrix}} = \frac{\begin{bmatrix} p^{-}_{xx} \\ p^{-}_{x \dot x} \end{bmatrix}}{p^{-}_{xx} + r_{xx}}$$
Once again, the index $t$ has been dropped from the elements of $\mathbf{P}^{-}_t$ for readability.
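The gain formula can be evaluated numerically for the car example, where $\mathbf{H} = [1 \;\; 0]$ because only the position is measured. A sketch with assumed values for $\mathbf{P}^{-}$ and $\mathbf{R}$:

```python
import numpy as np

H = np.array([[1.0, 0.0]])             # we measure only the position
P_minus = np.array([[0.2, 0.1],
                    [0.1, 0.1]])       # prior covariance (assumed values)
R = np.array([[0.05]])                 # measurement noise covariance (assumed value)

S = H @ P_minus @ H.T + R              # innovation covariance: p_xx + r_xx
K = P_minus @ H.T @ np.linalg.inv(S)   # Kalman gain, shape (2, 1)

print(K)  # first column of P_minus divided by (p_xx + r_xx)
```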
The update equations for the position and speed become:
Let's consider the two extreme cases for the values of $\mathbf{K}_t$
The measurement error covariance $\mathbf{R}$ approaches zero, i.e., there is no noise in the measurements.
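The effect of shrinking or growing $\mathbf{R}$ on the gain can be seen with a scalar sketch (the variance values are arbitrary):

```python
p_xx = 0.2  # prior position variance (assumed value)

# Scalar gain for a position-only measurement: K = p_xx / (p_xx + r_xx)
gains = {}
for r_xx in [1e-8, 0.2, 100.0]:
    gains[r_xx] = p_xx / (p_xx + r_xx)
    print(f"r_xx = {r_xx:g} -> K = {gains[r_xx]:.4f}")
# As R -> 0 the gain approaches 1 and the update trusts the measurement completely;
# as R grows, the gain shrinks and the update trusts the model prediction instead.
```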
Summary of the Correct step:
The whole process is summarized below.
In general, both the measurement noise covariance $\mathbf{R}$ and the process noise covariance $\mathbf{Q}$ are unknown.
This is the workflow of the algorithm we are going to implement:
For one-dimensional examples, there are a couple of simplifications:
Therefore, only normal multiplication (as opposed to matrix multiplication) is involved.
mu = 124.5 # Actual position
R = 0.1 # Actual variance of the measurements (R); the standard deviation is sqrt(R)
# Generate measurements
n_measurements = 1000 # Change the number of points to see how the convergence changes
Z = np.random.normal(mu, np.sqrt(R), size=n_measurements)
plt.plot(Z,'k+',label='measurements $z_t$',alpha=0.2)
plt.title('Noisy position measurements')
plt.xlabel('Time')
plt.ylabel('$z_t$')
plt.tight_layout();
# Estimated covariances
Q_est = 1e-4
R_est = 2e-2
def kalman_1d(x, P, measurement, R_est, Q_est):
    # Prediction
    x_pred = x
    P_pred = P + Q_est
    # Update
    K = P_pred / (P_pred + R_est)
    x_est = x_pred + K * (measurement - x_pred)
    P_est = (1 - K) * P_pred
    return x_est, P_est
# initial guesses
x = 123 # Use an integer (imagine the initial guess is determined with a meter stick)
P = 0.04 # error covariance P
KF_estimate=[] # To store the position estimate at each time point
KF_error=[] # To store estimated error at each time point
for z in Z:
    x, P = kalman_1d(x, P, z, R_est, Q_est)
    KF_estimate.append(x)
    KF_error.append(P)
def plot_1d_comparison(measurements_made, estimate, true_value, axis):
    axis.plot(measurements_made, 'k+', label='measurements', alpha=0.3)
    axis.plot(estimate, '-', label='KF estimate')
    if not isinstance(true_value, (list, tuple, np.ndarray)):
        # plot line for a constant value
        axis.axhline(true_value, color='r', label='true value', alpha=0.5)
    else:
        # for a list, tuple or array, plot the points
        axis.plot(true_value, color='r', label='true value', alpha=0.5)
    axis.legend(loc='lower right')
    axis.set_title('Estimated position vs. time step')
    axis.set_xlabel('Time')
    axis.set_ylabel('$x_t$')
def plot_1d_error(estimated_error, lower_limit, upper_limit, axis):
    # lower_limit and upper_limit are the lower and upper limits of the vertical axis
    axis.plot(estimated_error, label='KF estimate for $P$')
    axis.legend(loc='upper right')
    axis.set_title('Estimated error vs. time step')
    axis.set_xlabel('Time')
    axis.set_ylabel('$P_t$')
    plt.setp(axis, 'ylim', [lower_limit, upper_limit])
fig, axes = plt.subplots(1,2, figsize=(12, 5))
plot_1d_comparison(Z, KF_estimate, mu, axes[0])
plot_1d_error(KF_error, 0, 0.015, axes[1])
plt.tight_layout();
Question: How much does the KF estimate fluctuate around the true value?
Answer: The fluctuations around the true value are approximately the size of the standard deviation of the estimate, which is $\sqrt{P_t}$.
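This claim can be checked numerically. The sketch below re-implements the same 1D filter with the parameters used above (seed and sample count are arbitrary) and compares the spread of the estimate after convergence with $\sqrt{P}$:

```python
import numpy as np

def kalman_1d(x, P, z, R_est, Q_est):
    # Predict
    P_pred = P + Q_est
    # Update
    K = P_pred / (P_pred + R_est)
    return x + K * (z - x), (1 - K) * P_pred

rng = np.random.default_rng(0)
mu, R = 124.5, 0.1                     # true position and true measurement variance
Z = rng.normal(mu, np.sqrt(R), 5000)   # noisy measurements

x, P = 123.0, 0.04                     # initial guesses
Q_est, R_est = 1e-4, 2e-2              # estimated covariances
estimates = []
for z in Z:
    x, P = kalman_1d(x, P, z, R_est, Q_est)
    estimates.append(x)

spread = np.std(np.array(estimates[1000:]) - mu)  # fluctuation after convergence
print(f"observed spread: {spread:.4f}, sqrt(P): {np.sqrt(P):.4f}")
```

The observed spread is the same order of magnitude as $\sqrt{P}$; it does not match exactly here because the assumed covariances `Q_est` and `R_est` differ from the true ones.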
# initial parameters
v0 = 0.3
x0 = 0.0
R = 4.0
# generate noisy measurements
n_measurements = 1000
Zv = np.zeros(n_measurements) # velocity measurements
Zx = np.zeros(n_measurements) # position measurements
for t in range(0, n_measurements-1):
    Zv[t] = np.random.normal(v0, np.sqrt(R))
    Zx[t+1] = Zx[t] + Zv[t] * 1  # delta_t = 1
# generate true positions
Xt = np.zeros(n_measurements)
for t in range(0, n_measurements):
    Xt[t] = x0 + v0*t
plt.plot(Zx,'k+',label='measurements $z_t$',alpha=0.2)
plt.title('Noisy position measurements')
plt.xlabel('Time')
plt.ylabel('$z_t$')
plt.tight_layout();
# initial guesses and estimates
x = 0
P = 0.5
Q_est = 4
R_est = 4
KF_estimate = [] # To store the position estimate at each time point
KF_error = [] # To store estimated error at each time point
# Kalman filter
for z in Zx:
    x, P = kalman_1d(x, P, z, R_est, Q_est)
    KF_estimate.append(x)
    KF_error.append(P)
fig, axes = plt.subplots(1,2, figsize=(12, 5))
plot_1d_comparison(Zx, KF_estimate, Xt, axes[0])
plot_1d_error(KF_error, min(KF_error)-0.1, max(KF_error)+0.1, axes[1])
plt.tight_layout();
Question: Neither the measurements nor the KF estimate are close to the true value. Why?
Answer: The measurements are not close to the true value because the measurement error variance $R$ is large compared to the velocity $v_0$.
Our system is defined by the following quantities:
- The state transition matrix $\mathbf{A}$ (transitionMatrix in OpenCV).
- The observation matrix $\mathbf{H}$ (measurementMatrix in OpenCV).
- The process noise covariance $\mathbf{Q}$ (processNoiseCov in OpenCV).
- The measurement noise covariance $\mathbf{R}$ (measurementNoiseCov in OpenCV).

For simplicity, we use diagonal covariances in this example.
The constructor has the following syntax:
KalmanFilter(state_size, measurements_size, control_size)
where:
- state_size is the dimension of the state vector $\boldsymbol{x}$, which is 4 in our case.
- measurements_size is the dimension of the measurement vector $\boldsymbol{z}$, which is 2 in our case.
- control_size is the dimension of the control vector $\boldsymbol{u}$, which is 0 since we do not use the optional control (not discussed in this lecture).

kalman = cv2.KalmanFilter(4,2,0) # 4 states, 2 measurements, 0 control vector
q = 1 # the variance in the model
r = 20 # the variance in the measurement
dtime = 1 # size of time step
kalman.measurementMatrix = np.array([[1,0,0,0],
[0,1,0,0]],np.float32) # H
kalman.transitionMatrix = np.array([[1,0,dtime,0],
[0,1,0,dtime],
[0,0,1,0],
[0,0,0,1]],np.float32) # A
kalman.processNoiseCov = np.array([[1,0,0,0],
[0,1,0,0],
[0,0,1,0],
[0,0,0,1]],np.float32) * q # Q
kalman.measurementNoiseCov = np.array([[1,0],
[0,1]],np.float32) * r # R
KF_estimate_xy = [] # To store the position estimate at each time point
xy_motion = pd.read_csv('https://zenodo.org/records/10951538/files/kf_ts1.csv?download=1',
header = None).values.astype('float32')
Sample trajectories can be generated with the interactive tool below:
from IPython.display import IFrame
IFrame('https://guoguibing.github.io/librec/datagen.html', width=900, height=800)
print(len(xy_motion))
print(xy_motion[0:10]) # Print first 10 items for clarity, but can look at all 205 or plot them
205 [[2.35 1.45] [3.55 1.8 ] [3. 2.9 ] [2.65 4.2 ] [2.95 5.15] [3.7 5.4 ] [4.8 5.55] [5.3 5.25] [5.7 5.25] [5.8 5.85]]
for i in xy_motion:
    pred = kalman.predict()                    # predicts the new state using the model
    kalman.correct(i)                          # updates the estimated state with the measurement
    KF_estimate_xy.append((pred[0], pred[1]))  # store the estimated position
# Quick check: estimate has same length as measurement data
print(len(KF_estimate_xy))
205
Now plot the data and the KF estimate.
x_est, y_est = zip(*KF_estimate_xy)
x_true, y_true = zip(*xy_motion)
plt.scatter(x_est, y_est, marker= '.', label = 'KF estimate', alpha = 0.5)
plt.scatter(x_true, y_true,marker= '.', label = 'true value', alpha = 0.5)
plt.legend(loc = 'lower center')
plt.title('2D position')
plt.xlabel('x coordinate')
plt.ylabel('y coordinate')
plt.tight_layout();
What we covered in this lecture:
This exercise refers to Example 1.
Run the filter with an estimated process variance Q_est that is larger than the estimated measurement variance R_est.
This exercise refers to Example 2.
In the case we examined above, the KF estimate was close to the measurements and both were different from the true value. Change the parameters of the algorithm until you find some combinations that achieve the following:
Discuss your findings.
This exercise refers to Example 3.
In the original example, we used both Position and Velocity (PV model) in the state vector, i.e., $\boldsymbol{x} = \begin{bmatrix} x & y & \dot x & \dot y \end{bmatrix}^T$.
What happens if we use only the Position (P model) to describe the state? After all, our measurements only provide position. Do we really need to include the velocity?
💡 Hint: what is the size of the state vector in this case? What are the dimensions of the matrices that characterize the system?