Optimal Fusion of Two Sensors#

This demo makes inverse-variance fusion concrete. Two scalar sensors, a precise radar altimeter and a noisier barometric altimeter, both measure the same true altitude. The demo shows the three resulting probability densities (radar, baro, fused) on a common axis, with a variance-vs-weight curve underneath. The variance curve is the main teaching point: drag the fusion weight \(w_R\) and the dot walks along a parabola whose minimum sits exactly at the inverse-variance optimum. Move away from that minimum in either direction and the fused \(\sigma\) visibly inflates.

The math#

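Two unbiased, independent measurements \(z_R\) and \(z_B\) with variances \(\sigma_R^2\) and \(\sigma_B^2\) can be combined linearly, with weight \(w_R\) on the radar and \(1 - w_R\) on the baro so that the result stays unbiased: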

\[ \hat{x} = w_R\, z_R + (1 - w_R)\, z_B, \qquad \mathrm{Var}(\hat{x}) = w_R^2\, \sigma_R^2 + (1 - w_R)^2\, \sigma_B^2. \]

Minimizing \(\mathrm{Var}(\hat{x})\) with respect to \(w_R\) gives the inverse-variance weights

\[ w_R^{\text{opt}} = \frac{1/\sigma_R^2}{1/\sigma_R^2 + 1/\sigma_B^2}, \qquad \sigma_{\text{opt}}^2 = \left( \frac{1}{\sigma_R^2} + \frac{1}{\sigma_B^2} \right)^{-1}. \]
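The minimization step is short enough to spell out. Setting the derivative of the variance with respect to \(w_R\) to zero,

\[ \frac{d}{dw_R}\,\mathrm{Var}(\hat{x}) = 2 w_R\, \sigma_R^2 - 2 (1 - w_R)\, \sigma_B^2 = 0 \quad\Longrightarrow\quad w_R^{\text{opt}} = \frac{\sigma_B^2}{\sigma_R^2 + \sigma_B^2}, \]

which is the inverse-variance form after dividing numerator and denominator by \(\sigma_R^2 \sigma_B^2\). The second derivative, \(2(\sigma_R^2 + \sigma_B^2) > 0\), confirms that this is a minimum, and substituting \(w_R^{\text{opt}}\) back into the variance gives \(\sigma_{\text{opt}}^2 = \sigma_R^2 \sigma_B^2 / (\sigma_R^2 + \sigma_B^2)\).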

Equivalently, in information form (\(I = 1/\sigma^2\)), \(I_{\text{opt}} = I_R + I_B\). Independent sensors add information: with optimal weights you cannot lose information by fusing in a second sensor, only by refusing to use it.
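The formulas in this section translate almost line for line into code. The following is a minimal sketch in Python rather than the demo's own MATLAB source; the function names `optimal_weight`, `fused_sigma`, and `fuse` are hypothetical and chosen only for illustration.

```python
import math


def optimal_weight(sigma_r: float, sigma_b: float) -> float:
    """Inverse-variance weight on the radar measurement."""
    info_r = 1.0 / sigma_r**2  # information carried by the radar
    info_b = 1.0 / sigma_b**2  # information carried by the baro
    return info_r / (info_r + info_b)


def fused_sigma(w_r: float, sigma_r: float, sigma_b: float) -> float:
    """Std. dev. of w_r*z_R + (1 - w_r)*z_B for independent sensors."""
    var = w_r**2 * sigma_r**2 + (1.0 - w_r) ** 2 * sigma_b**2
    return math.sqrt(var)


def fuse(z_r: float, z_b: float, sigma_r: float, sigma_b: float):
    """Return the optimally fused estimate and its standard deviation."""
    w_r = optimal_weight(sigma_r, sigma_b)
    estimate = w_r * z_r + (1.0 - w_r) * z_b
    return estimate, fused_sigma(w_r, sigma_r, sigma_b)


if __name__ == "__main__":
    # Hypothetical readings in metres; sigmas of 2 m (radar) and 10 m (baro).
    estimate, sigma = fuse(1003.0, 995.0, sigma_r=2.0, sigma_b=10.0)
    print(f"fused estimate {estimate:.2f} m, sigma {sigma:.3f} m")
```

Passing any other `w_r` to `fused_sigma` traces out the variance-vs-weight curve, so the same three functions are enough to reproduce the parabola numerically.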

Interactive demo#

Walkthrough#

The demo opens at the canonical Block 3 scenario: \(\sigma_R = 2\) m, \(\sigma_B = 10\) m, no biases, fusion weight at the optimum \(w_R = 25/26 \approx 0.962\). Try the following:

Read the headline metrics. \(\sigma_{\text{fused}} \approx 1.961\) m at the optimum, slightly tighter than the radar alone (\(\sigma_R = 2\) m). The information ratio \(I_R / I_B = 25 : 1\) is the reason: the radar carries 25× the information of the baro. The optimum weight is exactly the radar’s information share, \(25/(25+1)\).
Drag \(w_R\) down to 0.5. The fused PDF (green) widens dramatically; the dot on the variance-vs-weight curve walks up the parabola; \(\sigma_{\text{fused}}\) jumps to \(\sqrt{26} \approx 5.1\) m, far worse than the radar alone (2 m), though still better than the baro alone (10 m). The “Equal weights” chip is a fast way to land here. Wrong weights actively destroy precision.
Click “Snap to optimal.” The fused PDF tightens back to its minimum width; the dot returns to the parabola’s minimum.
Slide \(\sigma_B\) up to its slider max. As the barometer gets noisier, the optimum weight pushes further toward 1.0 (radar) and the optimal \(\sigma_{\text{fused}}\) approaches \(\sigma_R\). In the limit \(\sigma_B \to \infty\), the fused estimate is just the radar — a useless sensor contributes no information.
Slide \(\sigma_R\) and \(\sigma_B\) to be equal, say both at 5 m. The optimum is now \(w_R = 0.5\) (the simple average is optimal in this case) and \(\sigma_{\text{fused}} = 5/\sqrt{2} \approx 3.54\) m. The variance-vs-weight parabola is now symmetric about \(w_R = 0.5\).
Slide bias_R up to +5 m. The radar PDF shifts right; at the optimal weight the fused PDF inherits a weighted bias of \(w_R \cdot 5 \approx 4.81\) m. The “Fused estimate vs truth” stat turns warning-color. Optimal fusion minimizes variance only, not bias. If your sensors disagree systematically, fusion will not save you; that is the entire motivation for fault detection in Block 8.
Toggle “overlay Monte Carlo histograms.” 10,000 samples from each sensor and the fused estimate are drawn behind the smooth Gaussian PDFs. They should agree closely with the theoretical curves; “Reseed” cycles to a new realization.
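The Monte Carlo overlay can be approximated offline. The sketch below is an assumption-laden stand-in for the demo, not its actual implementation: it uses Python's standard library, a hypothetical true altitude of 1000 m, a fixed seed, and a hypothetical helper named `simulate`. It draws 10,000 samples per sensor for three of the walkthrough settings and compares the sample spread with the theoretical \(\sigma_{\text{fused}}\).

```python
import math
import random
import statistics

TRUE_ALT = 1000.0            # hypothetical true altitude, metres
SIGMA_R, SIGMA_B = 2.0, 10.0  # radar and baro standard deviations, metres
N = 10_000                    # samples per run, as in the demo overlay


def simulate(w_r, bias_r=0.0, seed=0):
    """Return (mean error, sample std) of the fused estimate over N draws."""
    rng = random.Random(seed)
    errors = []
    for _ in range(N):
        z_r = TRUE_ALT + bias_r + rng.gauss(0.0, SIGMA_R)
        z_b = TRUE_ALT + rng.gauss(0.0, SIGMA_B)
        errors.append(w_r * z_r + (1.0 - w_r) * z_b - TRUE_ALT)
    return statistics.fmean(errors), statistics.stdev(errors)


if __name__ == "__main__":
    w_opt = (1 / SIGMA_R**2) / (1 / SIGMA_R**2 + 1 / SIGMA_B**2)
    cases = [
        ("optimal weight", w_opt, 0.0),
        ("equal weights", 0.5, 0.0),
        ("optimal, radar bias +5 m", w_opt, 5.0),
    ]
    for label, w_r, bias_r in cases:
        mean_err, std = simulate(w_r, bias_r)
        theory = math.sqrt(w_r**2 * SIGMA_R**2 + (1 - w_r) ** 2 * SIGMA_B**2)
        print(f"{label:26s} mean error {mean_err:+.2f} m, "
              f"sigma {std:.2f} m (theory {theory:.2f} m)")
```

The sample values should land close to the walkthrough figures: a spread near 1.96 m at the optimum, near 5.10 m at equal weights, and a mean error near 4.81 m once the radar bias is applied. Changing `seed` plays the role of the “Reseed” button.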

Key observations#

The variance-vs-weight parabola is the fundamental picture. A minimum exists; that minimum is the inverse-variance weighting; moving in either direction makes things strictly worse. This same picture is the mechanism behind the Kalman gain in Block 4, which trades off the prior variance \(P_k^-\) against the measurement variance \(R_k\) to land on an optimum every step.
You cannot lose information by adding an independent sensor. \(I_{\text{opt}} = I_R + I_B \geq I_R\) for any \(I_B \geq 0\). The optimally fused \(\sigma\) is always less than or equal to the smaller of the two input \(\sigma\)s; with non-optimal weights this guarantee does not hold. Even a sensor with \(\sigma_B = 50\) m still buys you a small but real precision improvement when fused optimally: paired with the 2 m radar, it brings the fused \(\sigma\) from 2.000 m down to about 1.998 m.
Information is what’s additive, not variance. \(\sigma_{\text{opt}}^2 = (I_R + I_B)^{-1}\), not \(\sigma_R^2 + \sigma_B^2\). This is why information form (Kalman filter cousin: the information filter) is the natural framework when you have many sensors to fuse.
Bias propagates through fusion, variance shrinks. The fused bias is the weighted average of the sensor biases, \(w_R\, b_R + (1 - w_R)\, b_B\); the fused variance is the weighted-quadratic combination \(w_R^2\, \sigma_R^2 + (1 - w_R)^2\, \sigma_B^2\). The estimator only minimizes the variance of the error (its second central moment), not its mean (the first moment).
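The observations above generalize from two sensors to any number of independent ones. The following sketch assumes independent, Gaussian-like errors with known \(\sigma\) values; the function names are hypothetical. It shows information adding, weights as information shares, and bias passing through as a weighted average.

```python
import math


def fuse_information(sigmas):
    """Optimal fused sigma for independent sensors: information adds."""
    total_info = sum(1.0 / s**2 for s in sigmas)
    return 1.0 / math.sqrt(total_info)


def information_weights(sigmas):
    """Each sensor's optimal weight is its share of the total information."""
    infos = [1.0 / s**2 for s in sigmas]
    total = sum(infos)
    return [info / total for info in infos]


def fused_bias(sigmas, biases):
    """Bias of the optimally fused estimate: weighted average of sensor biases."""
    weights = information_weights(sigmas)
    return sum(w * b for w, b in zip(weights, biases))


if __name__ == "__main__":
    print(fuse_information([2.0]))            # radar alone
    print(fuse_information([2.0, 10.0]))      # radar + baro
    print(fuse_information([2.0, 50.0]))      # radar + very noisy sensor
    print(fused_bias([2.0, 10.0], [5.0, 0.0]))  # +5 m radar bias
```

Because the fused information is a plain sum, adding a sensor is a constant-time update to a running total, which is one practical reason the information form scales well to many sensors.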

From fusion to the Kalman filter#

The two-sensor fusion in this demo is the same operation a Kalman filter performs at every measurement update, with one of the “sensors” replaced by the prior estimate \(\hat{x}_k^-\) propagated forward from the last step. For a scalar state measured directly, the Kalman gain \(K_k = P_k^- / (P_k^- + R_k)\) is just the inverse-variance weight on the measurement, \((1/R_k) / (1/P_k^- + 1/R_k)\). Block 4 derives this rigorously and adds the covariance update; the fusion math here is the building block.
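The equivalence can be checked numerically. The sketch below assumes a scalar state that is measured directly (measurement matrix equal to 1) and uses the standard scalar posterior variance \((1 - K_k) P_k^-\); the function names are hypothetical, and Block 4's own notation may differ.

```python
def kalman_update(x_prior, p_prior, z, r):
    """Scalar Kalman measurement update for a directly observed state."""
    k = p_prior / (p_prior + r)            # Kalman gain: weight on the measurement
    x_post = x_prior + k * (z - x_prior)   # same as (1 - k) * x_prior + k * z
    p_post = (1.0 - k) * p_prior           # posterior variance
    return x_post, p_post, k


def inverse_variance_fuse(x_a, var_a, x_b, var_b):
    """Optimal fusion of two independent estimates in information form."""
    info = 1.0 / var_a + 1.0 / var_b
    x = (x_a / var_a + x_b / var_b) / info
    return x, 1.0 / info


if __name__ == "__main__":
    # The prior plays the role of the baro (variance 10^2),
    # the measurement plays the role of the radar (variance 2^2).
    x_kf, p_kf, k = kalman_update(1000.0, 100.0, 1004.0, 4.0)
    x_iv, p_iv = inverse_variance_fuse(1000.0, 100.0, 1004.0, 4.0)
    print(k, x_kf, p_kf)
    print(x_iv, p_iv)
```

Both routes return the same estimate and variance, and the gain comes out equal to the demo's optimal radar weight of 25/26, since the prior and measurement variances here match \(\sigma_B^2\) and \(\sigma_R^2\).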

Source#

MATLAB · code/OptimalFusionDemo.m↓
