Exercise set

Data Assimilation and Digital Twin Exercises

Worked mathematical engineering exercises for data assimilation and digital twins covering residuals, sensor fusion, Kalman updates, uncertainty propagation, sampling, validation metrics, and drift.

These exercises practise data assimilation and digital twin calculations for engineering systems. The focus is not on building a large software platform but on the small numerical checks that decide whether a connected model should be trusted, down-weighted, recalibrated, or rejected for a decision.

Assume scalar examples unless an exercise states otherwise. Real digital twins may require multivariable estimation, nonlinear models, delayed measurements, missing data handling, correlated uncertainty, cybersecurity review, model governance, and independent validation.

How to Use These Exercises

For each problem, state:

  1. the engineering quantity being estimated;
  2. the measurement boundary and data-quality assumptions;
  3. the model prediction before the measurement update;
  4. the uncertainty attached to measurement and model;
  5. the decision that would change if the estimate changes.

The most common mistake is treating a digital twin as credible because it is connected to live data. A twin is credible only when residuals, uncertainty, validation range, data lineage, and decision limits are visible.

Use the exercises as model-governance gates: accept a measurement update, down-weight a sensor, flag a residual, widen an uncertainty interval, trigger drift review, reject a recommendation outside the validation envelope, or require engineering review before recalibration changes the model baseline.

Exercise 1: Normalized Residual

A thermal digital twin predicts a bearing temperature of 82^\circ\text{C}. The measured temperature is 88^\circ\text{C}. The expected one-sigma residual uncertainty from sensor accuracy and model error is 2.5^\circ\text{C}.

Find the residual and normalized residual.

Solution

Residual:

r=y-\hat{y}
r=88-82=6^\circ\text{C}

Normalized residual:

\displaystyle z_r=\frac{r}{\sigma_r}
\displaystyle z_r=\frac{6}{2.5}=2.4

Engineering Comment

A normalized residual of 2.4 does not prove a fault by itself, but it is large enough to justify review. The engineer should check sensor calibration, timestamp alignment, operating mode, load change, cooling condition, and whether the uncertainty budget is realistic.
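
The residual check above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name `normalized_residual` and the review limit of 2.0 are assumptions made for the example and are not specified in the exercise.

```python
def normalized_residual(measured, predicted, sigma):
    """Return (residual, normalized residual) for one scalar comparison."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    residual = measured - predicted
    return residual, residual / sigma


# Values from Exercise 1 (degrees Celsius).
r, z = normalized_residual(measured=88.0, predicted=82.0, sigma=2.5)

# Hypothetical review limit; a real limit belongs in the validation plan.
REVIEW_LIMIT = 2.0
needs_review = abs(z) > REVIEW_LIMIT

print(f"residual = {r:.1f} degC, z = {z:.2f}, review = {needs_review}")
```

With the assumed limit of 2.0, the normalized residual of 2.4 is flagged for review, which matches the engineering comment.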

Exercise 2: Weighted Sensor Fusion

Two independent sensors estimate the same pressure. Sensor A reports 102\ \text{kPa} with standard uncertainty 1.5\ \text{kPa}. Sensor B reports 98\ \text{kPa} with standard uncertainty 3.0\ \text{kPa}.

Find the inverse-variance weighted estimate and its standard uncertainty.

Solution

Weights:

\displaystyle w_i=\frac{1}{\sigma_i^2}
\displaystyle w_A=\frac{1}{1.5^2}=0.444
\displaystyle w_B=\frac{1}{3.0^2}=0.111

Weighted estimate:

\displaystyle \hat{x}=\frac{w_A x_A+w_B x_B}{w_A+w_B}
\displaystyle \hat{x}=\frac{0.444(102)+0.111(98)}{0.444+0.111}=101.2\ \text{kPa}

Combined standard uncertainty:

\displaystyle \sigma_{\hat{x}}=\sqrt{\frac{1}{w_A+w_B}}
\displaystyle \sigma_{\hat{x}}=\sqrt{\frac{1}{0.555}}=1.34\ \text{kPa}

Engineering Comment

The estimate is closer to Sensor A because Sensor A has lower uncertainty. This calculation assumes independent, unbiased sensors. If both sensors share a calibration bias, pipe location error, or filtering delay, inverse-variance weighting can create false confidence.
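
The inverse-variance fusion above can be checked with a short Python sketch. The helper name `fuse_inverse_variance` is hypothetical, and the code assumes independent, unbiased sensors, exactly as the hand calculation does.

```python
import math


def fuse_inverse_variance(values, sigmas):
    """Fuse independent, unbiased estimates by inverse-variance weighting.

    Returns (fused estimate, fused standard uncertainty).
    """
    weights = [1.0 / s**2 for s in sigmas]
    total = sum(weights)
    estimate = sum(w * v for w, v in zip(weights, values)) / total
    return estimate, math.sqrt(1.0 / total)


# Sensor A: 102 kPa with 1.5 kPa; Sensor B: 98 kPa with 3.0 kPa.
x_hat, sigma_hat = fuse_inverse_variance([102.0, 98.0], [1.5, 3.0])
print(f"fused estimate = {x_hat:.1f} kPa, uncertainty = {sigma_hat:.2f} kPa")
```

The fused uncertainty is smaller than either sensor's uncertainty only because independence is assumed. A shared bias would not be reduced by this calculation.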

Exercise 3: Scalar Kalman Update

A simple estimator predicts a tank level of \hat{x}^- = 50\ \text{cm} with prediction variance P^- = 4\ \text{cm}^2. A level sensor measures z = 56\ \text{cm} with measurement variance R = 9\ \text{cm}^2.

Compute the Kalman gain, updated estimate, and updated variance.

Solution

Kalman gain for a scalar direct measurement:

\displaystyle K=\frac{P^-}{P^-+R}
\displaystyle K=\frac{4}{4+9}=0.308

Updated estimate:

\hat{x}^+=\hat{x}^-+K(z-\hat{x}^-)
\hat{x}^+=50+0.308(56-50)=51.85\ \text{cm}

Updated variance:

P^+=(1-K)P^-
P^+=(1-0.308)(4)=2.77\ \text{cm}^2

Engineering Comment

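The update moves toward the measurement but covers only about 31 percent of the 6\ \text{cm} difference, because the measurement variance (9\ \text{cm}^2) is larger than the prediction variance (4\ \text{cm}^2). If the prediction model is overconfident, meaning P^- is set too small, the gain shrinks further and the update may underreact to real changes.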
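
The scalar update can be sketched as follows. This is a minimal example for a direct measurement of the state, as in the exercise; the function name `kalman_update` and its argument names are assumptions chosen for illustration.

```python
def kalman_update(x_prior, p_prior, z, r):
    """Scalar Kalman measurement update for a direct measurement.

    x_prior, p_prior: predicted estimate and its variance
    z, r: measurement and its variance
    Returns (gain, updated estimate, updated variance).
    """
    gain = p_prior / (p_prior + r)
    x_post = x_prior + gain * (z - x_prior)
    p_post = (1.0 - gain) * p_prior
    return gain, x_post, p_post


# Tank level example: prediction 50 cm (variance 4), measurement 56 cm (variance 9).
K, x_post, p_post = kalman_update(x_prior=50.0, p_prior=4.0, z=56.0, r=9.0)
print(f"K = {K:.3f}, estimate = {x_post:.2f} cm, variance = {p_post:.2f} cm^2")

# Sensitivity check: an overconfident prediction (variance 1 instead of 4)
# lowers the gain, so the same measurement moves the estimate less.
K_over, x_over, _ = kalman_update(x_prior=50.0, p_prior=1.0, z=56.0, r=9.0)
print(f"overconfident case: K = {K_over:.3f}, estimate = {x_over:.2f} cm")
```

The second call uses a made-up prediction variance of 1 cm^2 purely to show how an overconfident model reduces the gain.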

Exercise 4: Root-Mean-Square Error and Bias

A digital twin forecast is tested against five independent validation points. Forecast errors are:

2,\ -1,\ 3,\ -2,\ 0

in engineering units. Compute the mean error and root-mean-square error.

Solution

Mean error:

\displaystyle \bar e=\frac{2-1+3-2+0}{5}=0.4

Root-mean-square error:

\displaystyle RMSE=\sqrt{\frac{2^2+(-1)^2+3^2+(-2)^2+0^2}{5}}
\displaystyle RMSE=\sqrt{\frac{18}{5}}=1.90

Engineering Comment

The mean error shows a small positive bias, while the RMSE shows the typical error magnitude. A validation report should not rely on a single aggregate number. It should also check operating mode, load range, outliers, time horizon, and whether errors grow near decision limits.
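
A short Python sketch can reproduce both metrics and also separate the bias from the scatter about the bias. The split uses the identity RMSE^2 = bias^2 + spread^2, where the spread is the population standard deviation of the errors. The variable names are illustrative.

```python
import math

# Forecast errors from Exercise 4, in engineering units.
errors = [2, -1, 3, -2, 0]
n = len(errors)

mean_error = sum(errors) / n                      # bias
rmse = math.sqrt(sum(e**2 for e in errors) / n)   # overall error magnitude

# Scatter of the errors about their mean (population form, divide by n).
spread = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / n)

print(f"mean error = {mean_error:.2f}")
print(f"RMSE       = {rmse:.2f}")
print(f"spread     = {spread:.2f}")
```

Here almost all of the RMSE comes from scatter rather than bias, so removing the 0.4 offset alone would barely change the typical error.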

Exercise 5: Prediction-Interval Coverage

A digital twin reports 90 percent prediction intervals for a pressure forecast. In an independent validation set of 40 samples, 34 measured values fall inside the reported intervals.

Estimate empirical coverage and compare it with the nominal 90 percent target.

Solution

Empirical coverage:

\displaystyle C=\frac{34}{40}=0.85=85\%

Coverage shortfall:

90\%-85\%=5\ \text{percentage points}

Engineering Comment

The intervals are under-covering in this validation set. They may be too narrow, the validation conditions may be outside the model range, or the uncertainty budget may be missing an error source. A twin that under-reports uncertainty can encourage unsafe decisions.
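
The coverage check can be sketched in Python together with a rough sense of sampling variability. The standard-error step is an addition to the exercise: it assumes the 40 validation samples are independent, so that the number of hits follows a binomial distribution. The function name `coverage_check` is hypothetical.

```python
import math


def coverage_check(n_inside, n_total, nominal):
    """Return (empirical coverage, shortfall, shortfall in standard errors).

    Assumes independent samples, so the hit count is binomial.
    """
    coverage = n_inside / n_total
    shortfall = nominal - coverage
    # Standard error of the coverage fraction if the nominal level were true.
    std_error = math.sqrt(nominal * (1.0 - nominal) / n_total)
    return coverage, shortfall, shortfall / std_error


coverage, shortfall, n_se = coverage_check(n_inside=34, n_total=40, nominal=0.90)
print(f"coverage = {coverage:.0%}, shortfall = {shortfall * 100:.0f} points")
print(f"shortfall is about {n_se:.1f} binomial standard errors")
```

Under that assumption the 5-point shortfall is roughly one standard error, so 40 samples give a warning rather than conclusive evidence. A larger independent validation set would tighten the comparison.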

Exercise 6: Sampling Check

A vibration feature relevant to a digital twin can contain frequencies up to 0.18\ \text{Hz}. A sensor records one sample every 2\ \text{s}.

Check whether the sampling rate satisfies the Nyquist criterion for this feature.

Solution

Sampling rate:

\displaystyle f_s=\frac{1}{2}=0.5\ \text{Hz}

Nyquist frequency:

\displaystyle f_N=\frac{f_s}{2}=0.25\ \text{Hz}

The highest feature frequency is:

0.18\ \text{Hz}<0.25\ \text{Hz}

The sampling rate satisfies the basic Nyquist condition for this feature.

Engineering Comment

Passing the Nyquist check is not enough for a robust monitoring design. Engineers should also consider anti-alias filtering, timestamp jitter, phase delay, transient events, sensor mounting, missing samples, and whether the feature remains visible under realistic noise.
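
The sampling check can be written as a small Python helper. The function name `nyquist_check` is an assumption for this sketch, and the samples-per-cycle figure is an extra diagnostic that is not part of the exercise.

```python
def nyquist_check(sample_period_s, f_max_hz):
    """Return (sampling rate, Nyquist frequency, pass flag, samples per cycle)."""
    fs = 1.0 / sample_period_s
    f_nyquist = fs / 2.0
    satisfied = f_max_hz < f_nyquist
    samples_per_cycle = fs / f_max_hz
    return fs, f_nyquist, satisfied, samples_per_cycle


# One sample every 2 s, feature content up to 0.18 Hz.
fs, f_nyq, ok, spc = nyquist_check(sample_period_s=2.0, f_max_hz=0.18)
print(f"fs = {fs} Hz, Nyquist = {f_nyq} Hz, satisfied = {ok}")
print(f"samples per cycle at the highest frequency = {spc:.2f}")
```

Fewer than three samples per cycle at the highest frequency of interest passes the formal criterion but leaves little margin, which supports the caution in the engineering comment.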

Exercise 7: Uncertainty Propagation for a Heat-Balance Mismatch

A heat exchanger twin compares hot-side and cold-side heat duties. The hot-side duty is 165\ \text{kW} with standard uncertainty 5\ \text{kW}. The cold-side duty is 154\ \text{kW} with standard uncertainty 4\ \text{kW}. Assume independent uncertainties.

Find the heat-balance mismatch and its standard uncertainty.

Solution

Mismatch:

\Delta Q=Q_h-Q_c
\Delta Q=165-154=11\ \text{kW}

For independent uncertainty:

\sigma_{\Delta Q}=\sqrt{\sigma_h^2+\sigma_c^2}
\sigma_{\Delta Q}=\sqrt{5^2+4^2}=6.4\ \text{kW}

Normalized mismatch:

\displaystyle z=\frac{11}{6.4}=1.72

Engineering Comment

The mismatch is not negligible, but it is less than two standard uncertainties. The engineer should treat it as a warning rather than immediate proof of fouling. Repeated mismatches under similar conditions would be stronger evidence than one isolated value.
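
The propagation step can be sketched in Python as below. The helper name `difference_with_uncertainty` is hypothetical, and the root-sum-square rule applies only under the independence assumption stated in the exercise.

```python
import math


def difference_with_uncertainty(a, sigma_a, b, sigma_b):
    """Return (a - b, its standard uncertainty, normalized difference).

    Assumes the two uncertainties are independent.
    """
    diff = a - b
    sigma = math.hypot(sigma_a, sigma_b)  # sqrt(sigma_a^2 + sigma_b^2)
    return diff, sigma, diff / sigma


# Hot side 165 kW (5 kW), cold side 154 kW (4 kW).
dq, sigma_dq, z = difference_with_uncertainty(165.0, 5.0, 154.0, 4.0)
print(f"mismatch = {dq:.0f} kW, uncertainty = {sigma_dq:.1f} kW, z = {z:.2f}")
```

If the two duties shared an error source, such as a common flow meter, a covariance term would be needed and this simple form would no longer apply.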

Exercise 8: Drift Trigger from Rolling Bias

A pump digital twin has a validation rule: if the rolling mean residual over a stable operating window exceeds 4\ \text{kPa} in magnitude, the model must be reviewed. The last six residuals are:

3,\ 5,\ 4,\ 6,\ 5,\ 4\ \text{kPa}

Compute the rolling mean residual and decide whether review is triggered.

Solution

Rolling mean:

\displaystyle \bar r=\frac{3+5+4+6+5+4}{6}
\displaystyle \bar r=\frac{27}{6}=4.5\ \text{kPa}

The rule threshold is 4\ \text{kPa}. Since:

4.5>4

model review is triggered.

Engineering Comment

The response should not be automatic recalibration. The correct first action is to review sensor drift, maintenance changes, operating mode, pump wear, fluid properties, and data pipeline changes. Recalibrating too quickly can hide a real equipment change.
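
The drift rule can be sketched as a small Python function. The name `rolling_bias_trigger` and the `window` argument are assumptions for illustration. The function only raises a review flag and deliberately does not recalibrate anything.

```python
def rolling_bias_trigger(residuals, threshold, window):
    """Return (rolling mean of the last `window` residuals, review flag)."""
    if len(residuals) < window:
        raise ValueError("not enough residuals for the requested window")
    recent = residuals[-window:]
    mean = sum(recent) / window
    return mean, abs(mean) > threshold


# Last six residuals from Exercise 8, in kPa, with a 4 kPa review threshold.
residuals_kpa = [3, 5, 4, 6, 5, 4]
mean_r, review = rolling_bias_trigger(residuals_kpa, threshold=4.0, window=6)
print(f"rolling mean = {mean_r:.1f} kPa, review triggered = {review}")
```

Using the absolute value of the mean matches the rule's wording "in magnitude", so a persistent negative bias would trigger the same review.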

Review Checklist

Before accepting a data-assimilation or digital-twin calculation, check:

  • whether the residual is interpreted with a stated uncertainty;
  • whether sensor fusion assumes independence and unbiased measurements;
  • whether filter gains reflect realistic process and measurement uncertainty;
  • whether validation data are independent from calibration data;
  • whether prediction intervals have empirical coverage;
  • whether sampling and delay are adequate for the physics;
  • whether drift triggers lead to engineering review, not blind model tuning;
  • whether missing data, timestamp shifts, shared calibration bias, and pipeline changes could explain the apparent model error;
  • whether every recommendation records model version, data window, and validity limits;
  • whether any control, maintenance, or operating recommendation is blocked when the uncertainty is larger than the decision margin.

Good data assimilation makes uncertainty operational. A digital twin becomes useful when it tells engineers not only what it estimates, but how much confidence it has, why the estimate changed, and when the evidence is no longer strong enough for the decision.
