Kalman Filter: Deriving Kalman Gain
State estimation is a central problem in robotics, wherein a probability distribution over the possible states of a robot is computed from sensor readings (or observations). Let's look at the most basic linear state estimation problem, where $X \in \mathbb{R}^{d}$ denotes the true state of a system. We would like to build an estimator for this state (which represents our belief about the state) denoted by $\hat{X}$, which is a random variable. We would like the estimator to be unbiased, i.e., \begin{equation*} E[\hat{X}] = X \end{equation*} which expresses the idea that if we were to measure the state of the system many times, say using many sensors or multiple observations from the same sensor, the resulting estimator is correct on average. The covariance of this estimator is denoted by $\Sigma_{\hat{X}}$. Note that a covariance matrix is symmetric and positive semi-definite.
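As a quick numerical illustration (a minimal NumPy sketch; the true state, noise level, and number of trials below are arbitrary choices), averaging many unbiased noisy estimates of a fixed state recovers the state on average:

```python
import numpy as np

rng = np.random.default_rng(0)

d = 2
X_true = np.array([1.0, -2.0])  # true (fixed) state, chosen arbitrarily

# Each estimate is the true state corrupted by zero-mean noise,
# so E[X_hat] = X_true (the estimator is unbiased).
n_trials = 100_000
estimates = X_true + rng.normal(scale=0.5, size=(n_trials, d))

print(estimates.mean(axis=0))           # approaches X_true as n_trials grows
print(np.cov(estimates, rowvar=False))  # sample covariance Sigma_X_hat
```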
Let's imagine that we have a sensor that gives us observations of the state as, \begin{equation*} Y = C X + \upsilon \end{equation*} where $Y \in \mathbb{R}^{p}$ is a linear function of the true state $X \in \mathbb{R}^{d}$, with sensor intrinsic matrix $C \in \mathbb{R}^{p \times d}$. This observation is not precise, and we model the sensor as having zero-mean Gaussian noise $\upsilon \sim \mathcal{N}(0, Q)$ with covariance $Q \in \mathbb{R}^{p \times p}$.
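A minimal simulation of this observation model (the dimensions and the matrices $C$ and $Q$ below are placeholder values, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

d, p = 2, 1                      # state and observation dimensions
X = np.array([1.0, -2.0])        # true state (unknown in practice)
C = np.array([[1.0, 0.5]])       # sensor intrinsic matrix, p x d
Q = np.array([[0.09]])           # sensor noise covariance, p x p

# One noisy observation Y = C X + v, with v ~ N(0, Q)
v = rng.multivariate_normal(np.zeros(p), Q)
Y = C @ X + v
print(Y)
```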
If we have an existing estimator $\hat{X'}$, we can combine it linearly with the independent observation $Y$ to obtain a better estimate (in the sense of reducing the variance) of the state. \begin{equation*} \hat{X} = K' \hat{X'} + K Y \tag{1} \end{equation*} Since the overall estimator should also be unbiased, \begin{align*} E[\hat{X}] &= E[K' \hat{X'} + K Y] \\ &= K' E[\hat{X'}] + K E[Y] \\ &= K' X + K E[CX + \upsilon] \\ &= (K' + KC)X \end{align*} and for this to equal $X$ for every true state $X$, we need \begin{equation*} K' + KC = I_{d \times d} \end{equation*} Substituting $K' = I - KC$ into Equation 1, we get \begin{align*} \hat{X} &= (I - KC) \hat{X'} + K Y \tag{2.1} \\ &= \hat{X'} + K(Y - C\hat{X'}) \tag{2.2} \end{align*} The term $(Y - C\hat{X'})$ is called the "innovation". Since $\hat{X'}$ and $Y$ are independent, the covariance of $\hat{X}$ is given by, \begin{align*} \Sigma_{\hat{X}} &= \textrm{cov}((I - KC) \hat{X'} + K Y) \\ &= (I - KC) \Sigma_{\hat{X'}} (I - KC)^T + K \,\textrm{cov}(Y)\, K^T \\ &= (I - KC) \Sigma_{\hat{X'}} (I - KC)^T + K \,\textrm{cov}(CX + \upsilon)\, K^T \\ &= (I - KC) \Sigma_{\hat{X'}} (I - KC)^T + K Q K^T \tag{3} \end{align*} Note that $X$ is the true state of the system and therefore its covariance is zero, so $\textrm{cov}(CX + \upsilon) = Q$. In order to achieve a better estimate, we minimize the total variance (which is the trace of this covariance matrix), i.e., differentiate with respect to $K$ and set the derivative to zero. We will use the following identity for the partial derivative of a matrix product, valid when $B$ is symmetric, \begin{equation*} \frac{\partial}{\partial{A}} \textrm{tr}(ABA^T) = 2AB \end{equation*} Applying this to Equation 3 (both $\Sigma_{\hat{X'}}$ and $Q$ are symmetric), we get \begin{align*} \frac{\partial}{\partial{K}} \textrm{tr}(\Sigma_{\hat{X}}) = -2(I - KC) \Sigma_{\hat{X'}} C^T + 2KQ &= 0 \\ - \Sigma_{\hat{X'}} C^T + KC \Sigma_{\hat{X'}} C^T + KQ &= 0 \end{align*} \begin{equation*} \therefore K = \Sigma_{\hat{X'}} C^T (C \Sigma_{\hat{X'}} C^T + Q)^{-1} \tag{4} \end{equation*} The matrix $K \in \mathbb{R}^{d \times p}$ is called the "Kalman gain" after Rudolf Kálmán, who developed this method in 1960.
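Putting Equations 2.2, 3, and 4 together gives the measurement update. Below is a minimal NumPy sketch (the prior estimate, its covariance, and the sensor matrices are placeholder values):

```python
import numpy as np

def kalman_update(x_prior, P_prior, y, C, Q):
    """One measurement update.

    x_prior : prior estimate X_hat' of shape (d,)
    P_prior : prior covariance Sigma_X_hat' of shape (d, d)
    y       : observation of shape (p,)
    C       : sensor intrinsic matrix of shape (p, d)
    Q       : sensor noise covariance of shape (p, p)
    """
    S = C @ P_prior @ C.T + Q                 # innovation covariance
    K = P_prior @ C.T @ np.linalg.inv(S)      # Kalman gain, Equation 4
    x_post = x_prior + K @ (y - C @ x_prior)  # Equation 2.2
    I = np.eye(len(x_prior))
    # Equation 3 (Joseph form): valid for any gain, stays symmetric PSD
    P_post = (I - K @ C) @ P_prior @ (I - K @ C).T + K @ Q @ K.T
    return x_post, P_post

# Placeholder values for illustration
x_prior = np.array([0.8, -1.7])
P_prior = np.diag([1.0, 1.0])
C = np.array([[1.0, 0.5]])
Q = np.array([[0.09]])
y = np.array([0.1])

x_post, P_post = kalman_update(x_prior, P_prior, y, C, Q)
print(x_post)
print(P_post)
```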
Appendix
Given, \begin{equation*} K = \Sigma_{\hat{X'}} C^T (C \Sigma_{\hat{X'}} C^T + Q)^{-1} \end{equation*} let $S = C \Sigma_{\hat{X'}} C^T + Q$ denote the innovation covariance, so that $K S = \Sigma_{\hat{X'}} C^T$. The covariance $\Sigma_{\hat{X}}$ (from Equation 3) can further be simplified as, \begin{align*} \Sigma_{\hat{X}} &= (I - KC) \Sigma_{\hat{X'}} (I - KC)^T + K Q K^T \\ &= \Sigma_{\hat{X'}} - KC \Sigma_{\hat{X'}} - \Sigma_{\hat{X'}} C^T K^T + K (C \Sigma_{\hat{X'}} C^T + Q) K^T \\ &= \Sigma_{\hat{X'}} - KC \Sigma_{\hat{X'}} - \Sigma_{\hat{X'}} C^T K^T + K S K^T \\ &= \Sigma_{\hat{X'}} - KC \Sigma_{\hat{X'}} - \Sigma_{\hat{X'}} C^T K^T + \Sigma_{\hat{X'}} C^T K^T \\ &= (I - KC) \Sigma_{\hat{X'}} \tag{5} \end{align*} where the second line expands the products using the symmetry of $\Sigma_{\hat{X'}}$, and the fourth line uses $K S = \Sigma_{\hat{X'}} C^T$.
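A quick numerical sanity check of Equation 5 (reusing the placeholder matrices from the sketch above): with the optimal gain from Equation 4, the longer Joseph form of Equation 3 collapses to $(I - KC) \Sigma_{\hat{X'}}$:

```python
import numpy as np

P = np.diag([1.0, 1.0])           # Sigma_X_hat'
C = np.array([[1.0, 0.5]])
Q = np.array([[0.09]])

S = C @ P @ C.T + Q               # innovation covariance
K = P @ C.T @ np.linalg.inv(S)    # optimal gain, Equation 4
I = np.eye(2)

joseph = (I - K @ C) @ P @ (I - K @ C).T + K @ Q @ K.T  # Equation 3
short  = (I - K @ C) @ P                                # Equation 5
print(np.allclose(joseph, short))  # True (up to floating point)
```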