Exercise set
Engineering Statistical Inference, Process Capability, and DOE Exercises
Worked engineering-statistics exercises for confidence intervals, sample size, DOE, regression, capability, control charts, guard bands and release.
These exercises practise engineering statistics as release evidence for measurements, experiments and processes. They cover confidence intervals, sample size, design-change comparison, factorial effects, regression margin, model drift, Monte Carlo precision, process capability, control charts, guard bands, binomial defect bounds, statistical power, tolerance bounds, gage variation, subgrouping, false rejection, field-sample representativeness and release gates.
The focus is narrower than reliability life data. Here the central question is whether measured samples, experiments and process evidence support a design, process or acceptance decision without hiding uncertainty, bias or sampling weakness.
How to Use These Exercises
For each calculation, define:
- the measured characteristic, unit, lot, run, operator or scenario boundary;
- the statistical model and what engineering claim it supports;
- the consequence of false acceptance and false rejection;
- the required confidence, coverage, power or guard band;
- the release action when the bound is weaker than the point estimate.
The common mistake is reporting a sample average, p-value or capability index without connecting it to the release decision and sampling boundary.
Release Evidence Notes
Sampling evidence should preserve representativeness. A large sample can be weak when it excludes the high-temperature condition, worst operator, marginal supplier lot or field configuration that controls release.
Confidence evidence should state what is bounded. A confidence interval for a mean is not a tolerance bound for future units, and a process capability index is not a reliability claim.
Process evidence should separate common-cause control from specification compliance. A stable process can be off target; an on-spec sample can come from an unstable process.
Guard-band evidence should account for measurement uncertainty and decision risk. Tightening the acceptance limit can reduce false accepts while increasing false rejects.
Engineering Boundary Notes
These exercises use simplified statistical formulas. Real engineering release may require distribution checks, nonparametric methods, mixed-effects models, randomization, independence review, measurement-system analysis, outlier policy, missing-data treatment and subject-matter judgement.
A statistical pass does not override a physical failure mode. If test data pass statistically but failures cluster by mechanism, the mechanism controls the release decision.
Scenario Map
| Scenario | Exercises | Primary calculation | Engineering decision |
|---|---|---|---|
| Estimation and comparison | 1-6, 16 | confidence intervals, sample size, two-sample comparison, DOE, regression and power | Decide whether measured evidence supports a change or estimate. |
| Process and measurement release | 7-12, 14-15 | model drift, Monte Carlo precision, capability, control charts, guard bands, defect bounds and gage contribution | Decide whether production or inspection can release. |
| Population coverage | 13, 17-18 | tolerance bounds, representativeness and weighted release gates | Decide whether future units are covered by the evidence. |
Exercise 1: Confidence Interval for Sensor Bias
A calibration check measures sensor bias on:
$$n = 16$$
units. The sample mean bias is:
and sample standard deviation is:
Use $t = 2.13$ for a two-sided 95% confidence interval.
Solution
Standard error:
$$SE = \frac{s}{\sqrt{n}} = \frac{s}{4}$$
Half-width:
$$h = t \cdot SE = 2.13 \cdot SE$$
Confidence interval:
$$\bar{x} \pm h$$
Engineering Comment
The interval estimates mean bias, not worst-case individual error. A guard band or tolerance bound may still be needed for acceptance decisions.
Plausibility Check
With sixteen samples the standard error is one quarter of the sample standard deviation ($\sqrt{16} = 4$), so a half-width near 0.15 mm is plausible.
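The interval arithmetic can be sketched in Python. This is a minimal illustration: `n = 16` and `t = 2.13` follow the exercise, while the sample mean (0.20 mm) and sample standard deviation (0.28 mm) are hypothetical placeholders chosen only so that the half-width lands near 0.15 mm.

```python
import math

def mean_confidence_interval(mean, s, n, t):
    """Two-sided t interval for a mean: mean +/- t * s / sqrt(n)."""
    se = s / math.sqrt(n)   # standard error of the mean
    half = t * se           # half-width of the interval
    return mean - half, mean + half, half

# Hypothetical inputs (not from the exercise): mean = 0.20 mm, s = 0.28 mm.
low, high, half = mean_confidence_interval(mean=0.20, s=0.28, n=16, t=2.13)
print(f"half-width = {half:.3f} mm, interval = ({low:.3f}, {high:.3f}) mm")
```

The function bounds the mean bias only; it says nothing about the spread of individual units.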
Exercise 2: Sample Size for a Mean Estimate
An engineer wants mean force estimated within:
at approximately 95% confidence. Historical standard deviation is:
Use:
Solution
Sample size:
Round up:
Engineering Comment
The sample should represent lots, operators and environmental states. A large sample from one easy condition is not strong evidence for release.
Plausibility Check
The desired error is much smaller than the standard deviation, so dozens of samples are expected.
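A short Python sketch of the sample-size calculation, assuming the usual normal-approximation rule in which the half-width `z * sigma / sqrt(n)` must not exceed the allowed error. The multiplier `z = 1.96` and the inputs (sigma = 12 N, allowed error = 3 N) are assumptions for illustration, not values from the exercise.

```python
import math

def sample_size_for_mean(sigma, margin, z=1.96):
    """Smallest n with z * sigma / sqrt(n) <= margin (normal approximation)."""
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical inputs: historical sigma = 12 N, allowed error = 3 N.
n_required = sample_size_for_mean(sigma=12.0, margin=3.0)
print(n_required)
```

Rounding up rather than to the nearest integer keeps the achieved half-width at or below the target.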
Exercise 3: Two-Test Comparison for a Design Change
Old and revised parts are tested. The mean strength values are:
The standard error of the difference is:
Use $t = 2.0$ for a simplified 95% interval. Calculate the interval for the difference.
Solution
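Difference:
$$\Delta = \bar{x}_{\text{revised}} - \bar{x}_{\text{old}}$$
Half-width:
$$h = t \cdot SE_\Delta = 2.0 \cdot SE_\Delta$$
Interval:
$$\Delta \pm h$$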
Engineering Comment
The interval excludes zero, so the change appears beneficial under this test. Release still needs a failure-mode review and a check that the tested parts are representative.
Plausibility Check
The observed difference is about three standard errors, so a positive interval is reasonable.
Exercise 4: Two-Factor DOE Main Effects
A $2^2$ screening experiment measures yield for factors A and B:
| A | B | Yield |
|---|---|---|
| low | low | 82 |
| high | low | 88 |
| low | high | 85 |
| high | high | 95 |
Calculate main effects for A and B.
Solution
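Mean at high A:
$$\bar{y}_{A,\text{high}} = \frac{88 + 95}{2} = 91.5$$
Mean at low A:
$$\bar{y}_{A,\text{low}} = \frac{82 + 85}{2} = 83.5$$
Effect of A:
$$E_A = 91.5 - 83.5 = 8.0$$
Effect of B:
$$E_B = \frac{85 + 95}{2} - \frac{82 + 88}{2} = 90.0 - 85.0 = 5.0$$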
Engineering Comment
Factor A has the larger main effect, but interaction and replication should be checked before final process changes.
Plausibility Check
The best run is high-high, and high A improves both low and high B cases, so a positive A effect is expected.
Exercise 5: Interaction Effect in a Screening DOE
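Use the DOE data from Exercise 4. Calculate the interaction effect:
$$AB = \frac{(y_{\text{low,low}} + y_{\text{high,high}}) - (y_{\text{high,low}} + y_{\text{low,high}})}{2}$$
Solution
Substitute:
$$AB = \frac{(82 + 95) - (88 + 85)}{2} = \frac{177 - 173}{2} = 2$$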
Engineering Comment
The interaction is smaller than the main effects but not zero. Confirmation runs should check whether the high-high condition is stable and repeatable.
Plausibility Check
The high-high yield is slightly better than the main effects alone would predict, so a small positive interaction is plausible.
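The main effects and the interaction for the Exercise 4 table can be checked with one contrast-based helper. This is a sketch that codes low as -1 and high as +1; the helper name `effect` is illustrative.

```python
# Yields from the Exercise 4 table, keyed by (A level, B level); -1 = low, +1 = high.
yields = {(-1, -1): 82, (1, -1): 88, (-1, 1): 85, (1, 1): 95}

def effect(contrast):
    """Mean response where the contrast is +1 minus mean response where it is -1."""
    plus = [y for run, y in yields.items() if contrast(run) > 0]
    minus = [y for run, y in yields.items() if contrast(run) < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)

effect_a = effect(lambda run: run[0])            # main effect of A
effect_b = effect(lambda run: run[1])            # main effect of B
effect_ab = effect(lambda run: run[0] * run[1])  # A x B interaction
print(effect_a, effect_b, effect_ab)  # 8.0 5.0 2.0
```

With a single unreplicated run per cell there is no error estimate, so these effects cannot yet be compared against noise.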
Exercise 6: Regression Prediction with Engineering Margin
A regression predicts temperature rise:
at the release load. Prediction standard error is:
Use a guarded prediction:
The limit is:
Check release.
Solution
Guarded prediction:
Margin:
$$T_{\text{limit}} - T_{\text{guarded}} = 8 - 7 = 1\ ^\circ\mathrm{C}$$
The release passes with a narrow guarded margin.
Engineering Comment
Regression evidence should include residual checks and the range of data used to fit the model. Extrapolation can invalidate the margin.
Plausibility Check
The nominal value is 8 °C below the limit, and the guard consumes 7 °C of that margin.
Exercise 7: Residual Z-Score for Model Drift
A digital model predicts:
The measured value is:
The residual standard deviation from validation is:
Calculate the residual z-score and compare it with a drift trigger of $|z| > 2.5$.
Solution
Residual:
$$r = y_{\text{measured}} - y_{\text{predicted}}$$
Z-score:
$$z = \frac{r}{\sigma_r}$$
The drift trigger is exceeded.
Engineering Comment
Model drift should trigger an engineering review of sensors, operating state, configuration and physics assumptions, not only a statistical alert.
Plausibility Check
The residual is almost three times the validation standard deviation, so a trigger is expected.
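The drift check reduces to a few lines of Python. In this sketch only the 2.5 trigger comes from the exercise; the predicted value, measured value and residual standard deviation are hypothetical, picked so that the residual is a little under three validation standard deviations.

```python
def drift_check(predicted, measured, sigma_residual, trigger=2.5):
    """Return the residual z-score and whether |z| exceeds the drift trigger."""
    z = (measured - predicted) / sigma_residual
    return z, abs(z) > trigger

# Hypothetical values (not from the exercise).
z, drifted = drift_check(predicted=120.0, measured=128.4, sigma_residual=3.0)
print(f"z = {z:.2f}, drift trigger exceeded: {drifted}")
```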
Exercise 8: Monte Carlo Failure Probability Precision
A Monte Carlo simulation has:
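runs and observes:
$$k = 140$$
failures. Estimate failure probability and approximate standard error, where $N$ is the number of runs:
$$\hat{p} = \frac{k}{N}, \qquad SE = \sqrt{\frac{\hat{p}(1 - \hat{p})}{N}}$$
Solution
Failure probability:
$$\hat{p} = \frac{140}{N}$$
Standard error:
$$SE = \sqrt{\frac{\hat{p}(1 - \hat{p})}{N}}$$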
Engineering Comment
Rare-event simulations need enough failures to make the estimate stable. Input distributions and dependence assumptions may dominate the numerical standard error.
Plausibility Check
One hundred forty failures is enough for a small standard error but not enough to ignore modelling assumptions.
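A minimal Python sketch of the estimate and its binomial standard error. The 140 failures follow the exercise; the run count of 200,000 is a hypothetical placeholder.

```python
import math

def failure_probability(failures, runs):
    """Point estimate and binomial standard error of a Monte Carlo failure probability."""
    p = failures / runs
    se = math.sqrt(p * (1.0 - p) / runs)
    return p, se

# 140 failures as in the exercise; 200_000 runs is an assumed value.
p_hat, se = failure_probability(failures=140, runs=200_000)
relative_se = se / p_hat
print(f"p = {p_hat:.2e}, SE = {se:.2e}, relative SE = {relative_se:.1%}")
```

For rare events the relative standard error is close to one over the square root of the failure count, which is why the number of observed failures matters more than the number of runs.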
Exercise 9: Process Capability and Off-Center Mean
A process has specification:
Measured mean is:
and standard deviation is:
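Calculate $C_p$ and $C_{pk}$.
Solution
Specification width:
$$\mathrm{USL} - \mathrm{LSL}$$
Capability:
$$C_p = \frac{\mathrm{USL} - \mathrm{LSL}}{6\sigma}$$
Upper-side capability:
$$C_{pu} = \frac{\mathrm{USL} - \bar{x}}{3\sigma}$$
Lower-side capability:
$$C_{pl} = \frac{\bar{x} - \mathrm{LSL}}{3\sigma}$$
Therefore:
$$C_{pk} = \min(C_{pu}, C_{pl})$$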
Engineering Comment
The process has good spread capability but is shifted toward the upper limit. Centering can improve release margin without reducing variation.
Plausibility Check
$C_p$ is larger than $C_{pk}$ because the mean is off center.
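The capability indices can be sketched as follows. The specification limits, mean and standard deviation are hypothetical; they describe a process shifted toward the upper limit, as in the exercise, but are not the exercise's values.

```python
def capability(lsl, usl, mean, sigma):
    """Return Cp, Cpu, Cpl and Cpk for a two-sided specification."""
    cp = (usl - lsl) / (6.0 * sigma)
    cpu = (usl - mean) / (3.0 * sigma)   # upper-side capability
    cpl = (mean - lsl) / (3.0 * sigma)   # lower-side capability
    return cp, cpu, cpl, min(cpu, cpl)

# Hypothetical process: spec 9.70 to 10.30 mm, mean 10.10 mm, sigma 0.05 mm.
cp, cpu, cpl, cpk = capability(lsl=9.70, usl=10.30, mean=10.10, sigma=0.05)
print(f"Cp = {cp:.2f}, Cpu = {cpu:.2f}, Cpl = {cpl:.2f}, Cpk = {cpk:.2f}")
```

Because the smaller one-sided index sets the result, the gap between the first and last values returned measures how much margin centering alone could recover.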
Exercise 10: X-Bar Control Chart Signal
A process has historical mean:
and standard deviation:
Subgroup size is:
Calculate three-sigma control limits for subgroup means. A new subgroup mean is 52.1. Check the signal.
Solution
Standard error:
$$SE = \frac{\sigma}{\sqrt{n}}$$
Control limits:
$$\mu_0 \pm 3\,SE, \quad \text{with upper limit } 51.8$$
The subgroup mean 52.1 exceeds 51.8, so it signals.
Engineering Comment
A control-chart signal is a process stability issue. It should not be averaged away just because individual units are still inside specification.
Plausibility Check
The subgroup mean is slightly above the upper control limit, so the signal is marginal but real.
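A Python sketch of the three-sigma limits for subgroup means. The historical mean (50.0), standard deviation (1.2) and subgroup size (4) are hypothetical values chosen to reproduce the stated upper limit of 51.8; the new subgroup mean of 52.1 follows the exercise.

```python
import math

def xbar_limits(mean, sigma, subgroup_size):
    """Three-sigma control limits for subgroup means."""
    se = sigma / math.sqrt(subgroup_size)
    return mean - 3.0 * se, mean + 3.0 * se

# Hypothetical history: mean 50.0, sigma 1.2, subgroup size 4.
lcl, ucl = xbar_limits(mean=50.0, sigma=1.2, subgroup_size=4)
new_subgroup_mean = 52.1
signal = not (lcl <= new_subgroup_mean <= ucl)
print(f"LCL = {lcl:.1f}, UCL = {ucl:.1f}, signal = {signal}")
```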
Exercise 11: Guard-Band Acceptance with Measurement Uncertainty
A dimension must not exceed:
Expanded measurement uncertainty is:
The acceptance rule is:
A measured part has:
Check acceptance.
Solution
Guarded value:
$$x + U$$
Since:
$$x + U > x_{\max}$$
the part is not accepted under the guard-band rule.
Engineering Comment
Guard bands reduce false acceptance risk but can increase false rejects. The rule should match product risk and measurement capability.
Plausibility Check
The nominal measurement is only 0.05 mm below the limit, smaller than the uncertainty.
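The decision rule can be sketched in Python, assuming the acceptance rule has the common form "measured value plus expanded uncertainty must not exceed the upper limit". The limit (12.00 mm), uncertainty (0.08 mm) and measurement (11.95 mm) are hypothetical; they only preserve the exercise's situation of a reading 0.05 mm below the limit with a larger uncertainty.

```python
def accept_with_guard_band(measured, expanded_uncertainty, upper_limit):
    """Accept only if measured + U stays at or below the upper limit (assumed rule)."""
    guarded = measured + expanded_uncertainty
    return guarded, guarded <= upper_limit

# Hypothetical values: limit 12.00 mm, U = 0.08 mm, measured 11.95 mm.
guarded, accepted = accept_with_guard_band(11.95, 0.08, 12.00)
print(f"guarded value = {guarded:.2f} mm, accepted = {accepted}")
```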
Exercise 12: Binomial Defect-Rate Upper Bound
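A supplier lot sample tests:
$$n = 120$$
parts and finds:
$$d = 2$$
defects. Use the simple point estimate and an approximate one-sided upper bound, where $z$ is the one-sided normal quantile for the chosen confidence:
$$\hat{p} = \frac{d}{n}, \qquad p_U = \hat{p} + z\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$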
Solution
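Point defect rate:
$$\hat{p} = \frac{2}{120} \approx 0.0167$$
Standard error:
$$SE = \sqrt{\frac{0.0167 \times 0.9833}{120}} \approx 0.0117$$
Upper bound:
$$p_U = \hat{p} + z \cdot SE$$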
Therefore:
Engineering Comment
The point defect rate looks low, but the upper bound is wider because only 120 parts were sampled and defects were observed.
Plausibility Check
Two defects in 120 is about 1.7%, and a conservative bound roughly doubles it.
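A sketch of the normal-approximation bound in Python. The counts (2 defects in 120 parts) follow the exercise; the multiplier `z = 1.645`, a one-sided 95% normal quantile, is an assumption because the exercise's factor is not shown.

```python
import math

def defect_rate_upper_bound(defects, n, z=1.645):
    """Point estimate, standard error and approximate one-sided upper bound."""
    p = defects / n
    se = math.sqrt(p * (1.0 - p) / n)
    return p, se, p + z * se

# z = 1.645 is an assumed one-sided 95% multiplier.
p_hat, se, p_upper = defect_rate_upper_bound(defects=2, n=120)
print(f"p = {p_hat:.4f}, SE = {se:.4f}, upper bound = {p_upper:.4f}")
```

With only two observed defects the normal approximation is rough; an exact binomial bound would be the more defensible choice for a release decision.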
Exercise 13: One-Sided Tolerance Bound for Release
A sample has:
For the selected confidence and coverage, use one-sided factor:
Upper tolerance bound is:
The release limit is 100 N. Check release.
Solution
Tolerance bound:
$$\bar{x} + k s$$
Margin:
$$100\ \text{N} - (\bar{x} + k s)$$
The release passes.
Engineering Comment
A tolerance bound addresses population coverage. It is wider than a confidence interval for the mean because future units must be covered.
Plausibility Check
The mean is far enough below the limit that adding a few standard deviations still passes.
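The bound-versus-limit check can be sketched as below. The 100 N release limit follows the exercise; the sample mean (82 N), sample standard deviation (5 N) and one-sided factor (k = 2.5) are hypothetical. In practice k is looked up for the chosen confidence, coverage and sample size.

```python
def upper_tolerance_bound(mean, s, k):
    """One-sided upper tolerance bound: mean + k * s."""
    return mean + k * s

# Hypothetical sample statistics and factor.
release_limit = 100.0  # N, from the exercise
bound = upper_tolerance_bound(mean=82.0, s=5.0, k=2.5)
margin = release_limit - bound
print(f"bound = {bound:.1f} N, margin = {margin:.1f} N, pass = {margin >= 0}")
```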
Exercise 14: Measurement Variation and Apparent Capability
A measured process standard deviation is:
$$\sigma_{\text{observed}} = 0.060\ \text{mm}$$
Gage R&R standard deviation is:
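Estimate process-only standard deviation:
$$\sigma_{\text{process}} = \sqrt{\sigma_{\text{observed}}^2 - \sigma_{\text{gage}}^2}$$
Solution
Process standard deviation:
$$\sigma_{\text{process}} = \sqrt{0.060^2 - \sigma_{\text{gage}}^2}\ \text{mm}$$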
Engineering Comment
Measurement error inflates observed spread. Correcting the estimate helps diagnosis, but inspection decisions still need the actual measurement uncertainty.
Plausibility Check
Removing measurement variation should reduce the standard deviation from 0.060 mm to a smaller value.
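Subtracting variances rather than standard deviations is the step most often done wrong, so a short Python sketch may help. The observed 0.060 mm follows the exercise; the gage standard deviation of 0.030 mm is a hypothetical value, and the calculation assumes measurement error is independent of the process.

```python
import math

def process_only_sigma(sigma_observed, sigma_gage):
    """Remove gage variation in quadrature, assuming independent error sources."""
    if sigma_gage >= sigma_observed:
        raise ValueError("gage variation must be smaller than observed variation")
    return math.sqrt(sigma_observed**2 - sigma_gage**2)

# 0.060 mm observed (exercise); 0.030 mm gage is an assumed value.
sigma_process = process_only_sigma(0.060, 0.030)
print(f"process-only sigma = {sigma_process:.4f} mm")
```

Under these assumed numbers a gage contributing half of the observed standard deviation removes only about 13% of it.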
Exercise 15: False-Rejection Rate from Guard Band
Assume true part values are normally distributed with:
Specification limit is 25.00 mm, but the guarded acceptance limit is:
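Estimate the probability a conforming part exceeds the guarded limit using:
$$z = \frac{x_{\text{guard}} - \mu}{\sigma}$$
and $P(Z > 1.6) = 0.0548$.
Solution
Z-score:
$$z = 1.6$$
False-rejection probability under this simplified model:
$$P(Z > 1.6) = 0.0548 \approx 5.5\%$$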
Engineering Comment
Guard bands reduce false acceptance risk but may reject good product. The economic and safety trade-off should be explicit.
Plausibility Check
The guarded limit is 1.6 standard deviations above the process mean, so a few percent tail probability is plausible.
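The tail probability can be computed without a table using the complementary error function. In this sketch the process mean (24.80 mm), standard deviation (0.10 mm) and guarded limit (24.96 mm) are hypothetical values chosen to give the exercise's z of 1.6 while keeping the guarded limit below the 25.00 mm specification.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def false_rejection_probability(mean, sigma, guarded_limit):
    """Probability that a part exceeds the guarded limit under a normal model."""
    return upper_tail((guarded_limit - mean) / sigma)

# Hypothetical process values giving z = 1.6.
p_reject = false_rejection_probability(mean=24.80, sigma=0.10, guarded_limit=24.96)
print(f"false-rejection probability = {p_reject:.4f}")
```

Like the exercise, this counts every part above the guarded limit; it does not subtract the small fraction that would also exceed the true specification limit.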
Exercise 16: Statistical Power for a Design-Change Test
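A test should detect a true improvement of:
$$\Delta = 4\ \text{units}$$
with standard deviation:
and equal sample sizes:
$$n = 25$$
per group. Approximate signal-to-noise for the difference:
$$Z = \frac{\Delta}{SE_\Delta}$$
Solution
Standard error of difference:
$$SE_\Delta = \sigma\sqrt{\frac{2}{n}} = \sigma\sqrt{\frac{2}{25}}$$
Signal-to-noise:
$$Z = \frac{4}{SE_\Delta}$$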
Engineering Comment
A larger Z supports good power, but final power depends on alpha level, one-sided versus two-sided test and the practical decision threshold.
Plausibility Check
Twenty-five per group reduces noise enough that a 4-unit change is nearly three standard errors.
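To turn the signal-to-noise ratio into an approximate power figure, one option is the normal approximation power = Phi(Z - z_alpha). In this sketch the 4-unit improvement and 25 per group follow the exercise; the standard deviation of 5 units and the two-sided 5% critical value of 1.96 are assumptions.

```python
import math

def approximate_power(delta, sigma, n_per_group, z_alpha=1.96):
    """Normal-approximation power for a two-group comparison with equal n."""
    se_diff = sigma * math.sqrt(2.0 / n_per_group)
    snr = delta / se_diff
    # Phi(snr - z_alpha), written with erfc; the opposite tail is neglected.
    power = 0.5 * math.erfc(-(snr - z_alpha) / math.sqrt(2.0))
    return se_diff, snr, power

# sigma = 5 units is a hypothetical value.
se_diff, snr, power = approximate_power(delta=4.0, sigma=5.0, n_per_group=25)
print(f"SE = {se_diff:.3f}, Z = {snr:.2f}, approximate power = {power:.2f}")
```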
Exercise 17: Field-Sample Representativeness Weight
A validation plan requires coverage of three operating states:
| State | Required share | Sampled share |
|---|---|---|
| normal | 50% | 70% |
| hot | 30% | 20% |
| vibration | 20% | 10% |
Use the minimum sampled-to-required ratio as a representativeness score.
Solution
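Ratios:
$$\text{normal: } \frac{70}{50} = 1.40, \qquad \text{hot: } \frac{20}{30} \approx 0.67, \qquad \text{vibration: } \frac{10}{20} = 0.50$$
The representativeness score is:
$$\min(1.40,\ 0.67,\ 0.50) = 0.50$$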
Engineering Comment
Over-sampling normal cases does not compensate for under-sampling vibration if vibration controls failure or performance.
Plausibility Check
The vibration state has only half of the required coverage, so the score should be 0.50.
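The minimum-ratio score is easy to compute for any number of operating states. The sketch below uses the shares from the table; the variable names are illustrative.

```python
# Required and sampled shares from the exercise table.
required = {"normal": 0.50, "hot": 0.30, "vibration": 0.20}
sampled = {"normal": 0.70, "hot": 0.20, "vibration": 0.10}

# Sampled-to-required ratio per state; the weakest state sets the score.
ratios = {state: sampled[state] / required[state] for state in required}
score = min(ratios.values())
limiting_state = min(ratios, key=ratios.get)
print(f"score = {score:.2f}, limited by {limiting_state}")
```

Reporting the limiting state alongside the score tells the test planner where additional samples are needed.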
Exercise 18: Statistical Release Gate
A statistical release package has five gates:
| Gate | Weight | Result |
|---|---|---|
| sampling representativeness | 0.25 | 0.88 |
| confidence or tolerance bound | 0.25 | 0.94 |
| measurement-system evidence | 0.20 | 0.91 |
| process stability and capability | 0.20 | 0.93 |
| decision-risk documentation | 0.10 | 0.96 |
The weighted release threshold is:
$$S_{\min} = 0.92$$
and sampling representativeness may not be below 0.90. Calculate the decision.
Solution
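Weighted score:
$$0.25(0.88) + 0.25(0.94) + 0.20(0.91) + 0.20(0.93) + 0.10(0.96)$$
The score is:
$$0.220 + 0.235 + 0.182 + 0.186 + 0.096 = 0.919$$
The score of 91.9% falls just short of the 92% threshold, and sampling representativeness (0.88) also fails its 0.90 floor. Release is held.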
Engineering Comment
Statistical strength cannot be rescued by averaging if the sample misses the state that controls the engineering claim.
Plausibility Check
Several gates are good, but sampling representativeness, one of the two most heavily weighted gates, is below its floor and pulls the score just under the threshold.
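A gate with both a weighted threshold and per-gate floors can be sketched as below. Weights, results and the 0.90 sampling floor come from the exercise; the 0.92 threshold is taken from the 92% figure in the solution, and the data layout is illustrative.

```python
# Gate name -> (weight, result), from the exercise table.
gates = {
    "sampling representativeness": (0.25, 0.88),
    "confidence or tolerance bound": (0.25, 0.94),
    "measurement-system evidence": (0.20, 0.91),
    "process stability and capability": (0.20, 0.93),
    "decision-risk documentation": (0.10, 0.96),
}
floors = {"sampling representativeness": 0.90}  # per-gate minimum results
threshold = 0.92                                # weighted release threshold

weighted_score = sum(weight * result for weight, result in gates.values())
floor_failures = [name for name, floor in floors.items() if gates[name][1] < floor]
release = weighted_score >= threshold and not floor_failures
print(f"score = {weighted_score:.3f}, floor failures = {floor_failures}, release = {release}")
```

Checking floors separately from the weighted score keeps a strong average from masking a single failed gate.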
Validation Package Checklist
- Sampling plan names units, lots, operators, environments, time windows and exclusion rules.
- Confidence intervals, tolerance bounds and prediction intervals are not used interchangeably.
- DOE evidence includes factor levels, randomization, replication and interaction review.
- Regression evidence includes residual checks and no unguarded extrapolation.
- Capability and control-chart evidence separate stability from specification compliance.
- Guard bands state false-accept and false-reject consequences.
- Release decisions document the action when new data move the bound.
Common Release Mistakes
- Treating sample mean as population guarantee.
- Reporting a p-value without a practical engineering effect size.
- Using a confidence interval for the mean when the claim is about future units.
- Accepting $C_p$ while ignoring off-center mean and $C_{pk}$.
- Averaging normal-condition samples to hide weak hot or vibration coverage.
- Removing measurement uncertainty from acceptance decisions after using it to diagnose process spread.
- Closing out a release as approved while a guard-band, tolerance-bound or representativeness floor still fails.