Kalman Filter: Filtering Algorithm
The Kalman filter is by far the most important algorithm in robotics, and it is hard to imagine running any robot without the Kalman filter or some variant of it. Our goal is to compute the best estimate of the state after multiple observations, i.e., $P(x_k | y_1, \ldots, y_k)$. Think of it as a problem where the state $x_k$ is the position of a car (which could be stationary or moving) and the observations $y_k$ give us noisy estimates of this position. Using filtering, we want to find the probability of the car being at a location at time $k$ given all previous observations.
Consider a linear dynamical system with linear observations, \begin{align*} x_{k+1} &= A x_k + B u_k + \epsilon \tag{1.1}\\ y_{k} &= C x_k + \upsilon \tag{1.2} \end{align*} where $u_k$ denotes the control input at time step $k$ and noise vectors $\epsilon \sim N(0,R)$ and $\upsilon \sim N(0,Q)$. We assume that dynamics noise and observation noise are “white”, i.e., uncorrelated in time.
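To make the model concrete, here is a minimal simulation sketch of Equations 1.1 and 1.2 in Python. The specific matrices below (a 1D car with state $[\text{position}, \text{velocity}]$, observed through its position) are illustrative assumptions, not values from the text.

```python
import numpy as np

# Hypothetical example system: a 1D car with state [position, velocity],
# driven by a velocity control input and observed through its position.
rng = np.random.default_rng(0)

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # dynamics matrix
B = np.array([[0.0],
              [1.0]])        # control matrix
C = np.array([[1.0, 0.0]])   # observation matrix (position only)
R = 0.01 * np.eye(2)         # dynamics noise covariance (eps ~ N(0, R))
Q = np.array([[0.25]])       # observation noise covariance (nu ~ N(0, Q))

def step(x, u):
    """Advance the true state by Eq. 1.1 and observe it via Eq. 1.2."""
    x_next = A @ x + B @ u + rng.multivariate_normal(np.zeros(2), R)
    y_next = C @ x_next + rng.multivariate_normal(np.zeros(1), Q)
    return x_next, y_next
```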
Next, we assume that the initial distribution of $x_0$ is Gaussian. Since the dynamics and observation models are linear and the noise is Gaussian, our estimate of the state at time $k$ is also Gaussian, denoted by, \begin{equation*} \hat{x}_{k|k} \sim P(x_k|y_1, \ldots, y_k) = N(\mu_{k|k}, \Sigma_{k|k}) \end{equation*} The term $\hat{x}_{k|k}$ denotes the estimate of the state at time step $k$ given $k$ observations.
Step 1: Propagating the dynamics by one timestep
Suppose we had this estimate $\hat{x}_{k|k}$. We can write the linear dynamical system equation (Equation 1.1) as, \begin{equation*} \hat{x}_{k+1|k} = A \hat{x}_{k|k} + B u_k + \epsilon \end{equation*} The term $\hat{x}_{k+1|k}$ denotes the estimate of the state at time step $k+1$ given $k$ observations. This can also be seen as using the prediction problem to compute the estimate of the state at time $k+1$ before the next observation arrives, i.e., $P(x_{k+1} | y_1, \ldots, y_k)$. Since $\epsilon$ has zero mean and is independent of $\hat{x}_{k|k}$, the mean and covariance of this estimate are given by, \begin{equation*} \mu_{k+1|k} = E[\hat{x}_{k+1|k}] = E[A \hat{x}_{k|k} + B u_k + \epsilon] = A \mu_{k|k} + B u_k \tag{2} \end{equation*} \begin{equation*} \Sigma_{k+1|k} = \textrm{cov}(\hat{x}_{k+1|k}) = \textrm{cov}(A \hat{x}_{k|k} + B u_k + \epsilon) = A \Sigma_{k|k} A^T + R \tag{3} \end{equation*}
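In code, Step 1 is one line each for the mean and the covariance. A minimal sketch, reusing the matrices $A$, $B$, $R$ assumed in the simulation above:

```python
def kf_predict(mu, Sigma, u):
    """Step 1: propagate the estimate through the dynamics by one timestep."""
    mu_pred = A @ mu + B @ u             # Equation 2
    Sigma_pred = A @ Sigma @ A.T + R     # Equation 3: noise inflates uncertainty
    return mu_pred, Sigma_pred
```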
Step 2: Incorporating the observation
The dynamics propagation step gives us the state of the system at time step $k+1$ that we believe is true after $k$ observations. We are going to use our development of the Kalman gain (from the previous post) to incorporate the new observation $y_{k+1}$ in order to update the estimate to get $P(x_{k+1} | y_1, \ldots, y_{k+1})$. The linear observation model (Equation 1.2) can be written as, \begin{equation*} y_{k+1} = C x_{k+1} + \upsilon \end{equation*} Since we are combining the old estimate $\hat{x}_{k+1|k}$ and the new observation $y_{k+1}$ linearly, we can compute the Kalman gain $K_{k+1}$ (refer to Equation 4 of Part 1) as, \begin{equation*} K_{k+1} = \Sigma_{k+1|k} C^T (C \Sigma_{k+1|k} C^T + Q)^{-1} \tag{4} \end{equation*} The updated estimator can be written as (refer to Equation 2.2 of Part 1), \begin{equation*} \hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1} (y_{k+1} - C\hat{x}_{k+1|k}) \end{equation*} The new estimate is the previous estimate plus the Kalman gain times the innovation term. The mean of this new estimate is given by, \begin{align*} \mu_{k+1|k+1} &= E [\hat{x}_{k+1|k+1}] = E[\hat{x}_{k+1|k} + K_{k+1} (y_{k+1} - C\hat{x}_{k+1|k})] \\ &= \mu_{k+1|k} + K_{k+1} (y_{k+1} - C \mu_{k+1|k}) \tag{5} \end{align*} And the covariance can be written as (refer to Equation 3 and Equation 5 of Part 1), \begin{align*} \Sigma_{k+1|k+1} &= (I - K_{k+1}C) \Sigma_{k+1|k} (I - K_{k+1}C)^T + K_{k+1} Q K_{k+1}^T \tag{6.1} \\ &= (I - K_{k+1}C) \Sigma_{k+1|k} \tag{6.2} \end{align*} where Equation 6.2 follows from Equation 6.1 after substituting the optimal gain from Equation 4. The new estimate of the state is, \begin{equation*} \hat{x}_{k+1|k+1} \sim P(x_{k+1}|y_1, \ldots, y_{k+1}) = N(\mu_{k+1|k+1}, \Sigma_{k+1|k+1}) \end{equation*}
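Step 2 is equally short in code. A minimal sketch of the update, using the $C$ and $Q$ assumed above and the Equation 6.2 form of the covariance (the Joseph form in Equation 6.1 is numerically safer but longer):

```python
def kf_update(mu_pred, Sigma_pred, y):
    """Step 2: incorporate the new observation y."""
    S = C @ Sigma_pred @ C.T + Q                  # innovation covariance
    K = Sigma_pred @ C.T @ np.linalg.inv(S)       # Kalman gain, Equation 4
    innovation = y - C @ mu_pred                  # measurement surprise
    mu_new = mu_pred + K @ innovation             # Equation 5
    Sigma_new = (np.eye(len(mu_pred)) - K @ C) @ Sigma_pred  # Equation 6.2
    return mu_new, Sigma_new
```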
Assuming that our initial estimate of the state $\hat{x}_{0|0}$ has a known distribution $N(\mu_{0|0}, \Sigma_{0|0})$, we can repeatedly run Step 1 and Step 2 to obtain estimates of all the states, i.e., the Kalman filter carries out recursive updates to compute the best estimate given all past observations. The KF is special because it is the optimal linear filter: for linear dynamical systems with Gaussian noise, its estimates have the smallest mean squared error with respect to the true state. But if we have a nonlinear dynamical system, or linear dynamics/observations with non-Gaussian noise, there are other filters that can give a smaller error than the KF, which we will talk about in the next blog post!
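Putting the pieces together, the full filter just alternates the two steps as each observation arrives. A sketch with an assumed prior and zero control input:

```python
mu, Sigma = np.zeros(2), np.eye(2)        # assumed prior N(mu_{0|0}, Sigma_{0|0})
x = rng.multivariate_normal(mu, Sigma)    # true initial state

for k in range(50):
    u = np.array([0.0])                   # no control input in this example
    x, y = step(x, u)                     # simulate the system one timestep
    mu, Sigma = kf_predict(mu, Sigma, u)  # Step 1: predict
    mu, Sigma = kf_update(mu, Sigma, y)   # Step 2: update with y_{k+1}

print("estimated state:", mu, "\ntrue state:     ", x)
```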