Exercise set
Reliability, Availability, Redundancy, and Proof-Test Exercises
Solved reliability exercises for failure rates, MTBF, availability, redundancy, Weibull models, proof-test coverage, common-cause limits and release gates.
These exercises practise reliability evidence for industrial systems: failure-rate estimates, MTBF, mission reliability, availability, series systems, parallel redundancy, common-cause limits, Weibull models, zero-failure evidence, proof-test coverage and release gates.
The goal is to make the reliability claim reproducible. A number is weak if the exposure basis is hidden, failure definitions are mixed, redundant channels are not independent, common support systems dominate, or proof tests do not cover latent failure modes.
Assume simplified screening models unless an exercise states otherwise. Real release work should also check duty cycle, censoring, maintenance resets, environmental severity, configuration control, diagnostic delay, repair quality, common-cause mechanisms and the consequence of each failed function.
Release Evidence Notes
Reliability evidence should name the asset, function, operating mode, exposure basis and failure definition. Calendar months, operating hours, starts, cycles, missions and demand events cannot be mixed without normalization.
Availability evidence should separate inherent availability from operational availability. A result based on high MTBF and low MTTR can still fail in service if logistics delay, spare shortage, permit delay or restart validation lies outside the calculation.
Redundancy evidence should prove independence. Two channels that share power, software, cooling, calibration, maintenance procedure or environment may have much less benefit than a simple parallel formula suggests.
Proof-test evidence should map test cases to latent failure modes and required safety or service functions. Counting test steps is not enough if the steps do not expose the hidden failures that matter.
Engineering Boundary Notes
This page covers reliability modelling, availability screening, redundancy architecture and proof-test release. Maintenance interval decisions belong in the companion maintenance interval and condition-monitoring exercise set. Spare reorder points, stockout risk and repairable pools belong in the companion critical spare-parts exercise set.
The boundary is still operational engineering, not pure statistics. Use the mathematical reliability life-data set when the central question is confidence bounds, censored life data or statistical demonstration theory.
Scenario Map
| Scenario | Exercises | Primary check | Engineering decision |
|---|---|---|---|
| Failure rate and availability | 1-5, 18 | MTBF, mission reliability, MTTR and series availability | Decide whether field evidence supports service use. |
| Redundancy architecture | 6-8, 15-16 | One-out-of-two, two-out-of-three, shared controller, beta-factor limits and standby switching | Decide whether redundancy is real or overstated. |
| Weibull and reliability evidence | 9-11, 17 | Weibull survival, B10 life, zero-failure lower bound and allocation margin | Validate or restrict reliability claims. |
| Proof-test release | 12-14, 18 | Coverage, latent failure exposure and residual dangerous rate | Decide whether release needs more testing or restriction. |
Exercise 1: Failure Rate from Exposure
A fleet accumulates:
and records:
functional failures. Estimate the constant failure rate.
Solution
For a first screening estimate, with $n$ functional failures over accumulated exposure $T$:
$$\hat{\lambda} = \frac{n}{T}$$
Substitute:
Engineering Comment
This estimate is only meaningful for the stated failure definition. Do not combine nuisance alarms, planned shutdowns and loss-of-function failures unless the release question treats them as the same event.
Plausibility Check
Five failures in about eighteen thousand hours give roughly one failure every few thousand hours, so a rate of a few times $10^{-4}\ \text{h}^{-1}$ is plausible.
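The point estimate in Exercise 1 is a one-line calculation. A minimal Python sketch follows, assuming an illustrative exposure of 18,500 fleet hours (a hypothetical figure chosen to agree with the plausibility check, not necessarily the exercise's stated value). The function name is also an assumption.

```python
def failure_rate(failures: int, exposure_hours: float) -> float:
    """Constant-rate point estimate: failures per unit of exposure."""
    if exposure_hours <= 0:
        raise ValueError("exposure must be positive")
    return failures / exposure_hours


# Illustrative inputs (assumed): 5 functional failures in 18,500 fleet hours.
rate = failure_rate(failures=5, exposure_hours=18_500.0)
mtbf_hours = 1.0 / rate  # meaningful only under the constant-rate assumption

print(f"lambda = {rate:.2e} per hour, MTBF = {mtbf_hours:.0f} h")
```

The function cannot check that both inputs use one failure definition and one exposure basis; that remains a review task.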
Exercise 2: MTBF from Failure Rate
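Using the failure rate:
$$\lambda = 2.7\times10^{-4}\ \text{h}^{-1}$$
estimate MTBF.
Solution
For a constant-rate model:
$$\text{MTBF} = \frac{1}{\lambda}$$
Therefore:
$$\text{MTBF} = \frac{1}{2.7\times10^{-4}\ \text{h}^{-1}} \approx 3704\ \text{h}$$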
Engineering Comment
MTBF is an average exposure measure, not a guarantee that an individual unit will survive for that duration. It should be tied to environment and configuration.
Plausibility Check
The reciprocal of $2.7\times10^{-4}$ is a little below 4000, so 3704 hours is consistent.
Exercise 3: Mission Reliability from MTBF
Assume constant failure rate and:
Find reliability for a:
mission.
Solution
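For the exponential model with constant failure rate $\lambda = 1/\text{MTBF}$ and mission time $t$:
$$R(t) = e^{-\lambda t} = e^{-t/\text{MTBF}}$$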
Thus:
Engineering Comment
This is a mission reliability statement, not an availability statement. It says the function survives the mission without failure under the constant-rate assumption.
Plausibility Check
The mission is much shorter than MTBF, so reliability should be high, but not extremely close to one.
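The exponential mission-reliability calculation in Exercise 3 can be sketched in Python. The MTBF of 3704 h and the 200 h mission below are illustrative assumptions, and the function name is hypothetical.

```python
import math


def mission_reliability(mission_hours: float, mtbf_hours: float) -> float:
    """Survival probability over one mission for a constant failure rate."""
    if mtbf_hours <= 0 or mission_hours < 0:
        raise ValueError("MTBF must be positive and mission time non-negative")
    return math.exp(-mission_hours / mtbf_hours)


# Illustrative inputs (assumed): MTBF of 3704 h and a 200 h mission.
r = mission_reliability(mission_hours=200.0, mtbf_hours=3704.0)
print(f"R(mission) = {r:.4f}")
```

A mission equal to the MTBF gives a reliability of about 0.37, which is a quick reminder that MTBF is not a guaranteed life.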
Exercise 4: Availability from MTBF and MTTR
An asset has:
Estimate inherent availability.
Solution
Use:
$$A_i = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}$$
Substitute:
So:
Engineering Comment
This excludes logistics and waiting time. If spares, permits or restart testing dominate downtime, operational availability will be lower.
Plausibility Check
Repair time is tiny compared with MTBF, so availability close to 99.8\% is reasonable.
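The gap between inherent and operational availability can be illustrated with a short Python sketch. The inputs (MTBF 3704 h, MTTR 8 h, 40 h average logistics delay) are assumptions chosen only to be consistent with the plausibility check. The operational form used here, which simply adds delay to repair time, is a simplification.

```python
def inherent_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Uptime fraction counting active repair time only."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def operational_availability(mtbf_hours: float, mttr_hours: float,
                             delay_hours: float) -> float:
    """Uptime fraction when each repair also incurs waiting time
    (spares, permits, restart testing). Simplified screening form."""
    return mtbf_hours / (mtbf_hours + mttr_hours + delay_hours)


# Illustrative inputs (assumed, not the exercise's stated data).
a_inherent = inherent_availability(3704.0, 8.0)
a_operational = operational_availability(3704.0, 8.0, 40.0)
print(f"inherent = {a_inherent:.2%}, operational = {a_operational:.2%}")
```

With these assumed figures the waiting time, not the repair itself, dominates downtime.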
Exercise 5: Series Availability
Three required subsystems have availabilities:
Estimate system availability if all three are required.
Solution
For required series functions:
$$A_{sys} = A_1 A_2 A_3$$
So:
Engineering Comment
High component availability can still produce a weaker system when many required elements are chained. The lowest-availability element usually deserves first review.
Plausibility Check
Multiplying three values below one should lower the result by several percentage points, so about 97.3\% is plausible.
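A minimal Python sketch of the series calculation in Exercise 5 follows. The three subsystem availabilities are hypothetical values chosen so that the product lands near the 97.3% mentioned in the plausibility check.

```python
import math


def series_availability(availabilities: list[float]) -> float:
    """Availability of a chain in which every element is required."""
    return math.prod(availabilities)


# Hypothetical subsystem availabilities (assumed for illustration).
subsystems = [0.995, 0.990, 0.988]
a_system = series_availability(subsystems)
weakest = min(subsystems)

print(f"system = {a_system:.4f}, weakest element = {weakest:.3f}")
```

The system value is always below the weakest element, which is why that element is usually reviewed first.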
Exercise 6: One-Out-of-Two Redundancy
Two independent channels each have mission reliability:
The function succeeds if at least one channel succeeds. Estimate function reliability.
Solution
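The failure probability of one channel with reliability $R$ is:
$$Q = 1 - R$$
Both fail with probability:
$$Q^2 = (1 - R)^2$$
Reliability is:
$$R_{1oo2} = 1 - (1 - R)^2$$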
Engineering Comment
The independence assumption is the main risk. Shared firmware, wiring, calibration or environmental exposure can invalidate the simple result.
Plausibility Check
Each channel is imperfect, but requiring both to fail makes the system failure probability below one percent.
Exercise 7: Two-Out-of-Three Redundancy
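Three independent sensors each have mission reliability:
$$R = 0.94$$
The vote succeeds if at least two sensors work. Estimate reliability.
Solution
The success probability is:
$$R_{2oo3} = 3R^2(1 - R) + R^3 = 3R^2 - 2R^3$$
Substitute:
$$R_{2oo3} = 3(0.94)^2 - 2(0.94)^3 \approx 0.9896$$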
Engineering Comment
Voting improves random failure tolerance, but it can create other risks: common calibration bias, frozen data, voting logic faults and proof-test coverage gaps.
Plausibility Check
The result should be higher than one sensor at 94\% and lower than perfect reliability, so about 99\% is credible.
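The one-out-of-two and two-out-of-three cases in Exercises 6 and 7 are both instances of a k-out-of-n binomial sum. The sketch below assumes independent, identical channels. The 0.92 channel reliability used for the one-out-of-two case is a hypothetical value, while 0.94 per sensor comes from the plausibility check above.

```python
from math import comb


def k_out_of_n(k: int, n: int, r: float) -> float:
    """Probability that at least k of n independent, identical channels work."""
    return sum(comb(n, i) * r**i * (1.0 - r) ** (n - i) for i in range(k, n + 1))


r_1oo2 = k_out_of_n(1, 2, 0.92)  # 0.92 is an assumed channel reliability
r_2oo3 = k_out_of_n(2, 3, 0.94)  # 0.94 per sensor, as in Exercise 7

print(f"1oo2 = {r_1oo2:.4f}, 2oo3 = {r_2oo3:.4f}")
```

The function encodes the independence assumption directly, so it overstates the benefit whenever channels share firmware, calibration or environment.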
Exercise 8: Redundancy with a Shared Controller
Two redundant pumps each have availability:
A shared controller required by both pumps has availability:
Estimate system availability.
Solution
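Availability of at least one pump, with $A_p$ the availability of a single pump, is:
$$A_{pair} = 1 - (1 - A_p)^2$$
With the controller (availability $A_c$) in series:
$$A_{sys} = A_c\,A_{pair}$$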
Engineering Comment
The shared controller caps performance. Redundant field equipment cannot overcome a common required element with weaker availability.
Plausibility Check
The pump pair is nearly 99.9\%, but multiplying by a 98.2\% controller pulls the system near 98.1\%.
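The capping effect of the shared controller can be checked numerically. In this sketch the controller availability of 0.982 comes from the plausibility check, while the single-pump availability of 0.97 is an assumption chosen to give a pair value close to 99.9%.

```python
def parallel_pair(a_unit: float) -> float:
    """Availability of at least one of two independent, identical units."""
    return 1.0 - (1.0 - a_unit) ** 2


def system_availability(a_pump: float, a_controller: float) -> float:
    """Redundant pump pair in series with one shared controller."""
    return a_controller * parallel_pair(a_pump)


a_pair = parallel_pair(0.97)                   # 0.97 is an assumed pump value
a_system = system_availability(0.97, 0.982)    # controller value from the text
a_ceiling = system_availability(1.0, 0.982)    # even perfect pumps cannot exceed this

print(f"pair = {a_pair:.4f}, system = {a_system:.4f}, ceiling = {a_ceiling:.3f}")
```

Improving the pumps moves the system only toward the controller's own availability, never past it.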
Exercise 9: Weibull Mission Reliability
A component has Weibull parameters:
Estimate reliability at:
Solution
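Use the two-parameter Weibull survival function, with shape $\beta$ and scale $\eta$:
$$R(t) = \exp\!\left[-\left(\frac{t}{\eta}\right)^{\beta}\right]$$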
Substitute:
Engineering Comment
Because $\beta>1$, the failure rate increases with age. Release should check whether the planned mission approaches a wear-out region.
Plausibility Check
The time is below half the scale parameter, so reliability above 80\% is reasonable.
Exercise 10: Weibull B10 Life
For the same Weibull model:
find the time by which 10\% have failed.
Solution
B10 life means:
$$F(t_{B10}) = 1 - R(t_{B10}) = 0.10$$
Solve:
$$t_{B10} = \eta\,(-\ln 0.90)^{1/\beta}$$
Substitute:
Engineering Comment
B10 is useful for replacement and warranty screening, but only if the Weibull fit is based on comparable duty and censoring assumptions.
Plausibility Check
B10 should be well below the scale parameter because only 10\% failures are allowed.
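Exercises 9 and 10 use the same two-parameter Weibull model, so one sketch covers both. The shape of 1.8, scale of 12,000 h and evaluation time of 5,000 h are hypothetical values picked only to satisfy the qualitative statements in the plausibility checks. The function names are assumptions as well.

```python
import math


def weibull_reliability(t: float, beta: float, eta: float) -> float:
    """Two-parameter Weibull survival probability at time t."""
    return math.exp(-((t / eta) ** beta))


def weibull_b_life(fraction_failed: float, beta: float, eta: float) -> float:
    """Time by which the given fraction of the population has failed."""
    return eta * (-math.log(1.0 - fraction_failed)) ** (1.0 / beta)


BETA, ETA = 1.8, 12_000.0  # assumed shape and scale (hours)
r_5000 = weibull_reliability(5_000.0, BETA, ETA)
b10 = weibull_b_life(0.10, BETA, ETA)

print(f"R(5000 h) = {r_5000:.3f}, B10 = {b10:.0f} h")
```

Feeding the B10 time back into the survival function should return 0.90, which is a cheap consistency check on any fitted parameters.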
Exercise 11: Zero-Failure Lower Reliability Bound
A test runs:
$$n = 32$$
units for the full mission with zero failures. Use:
$$R_L = (1 - C)^{1/n}$$
with:
$$C = 0.95$$
Estimate the one-sided 95\% lower bound.
Solution
Compute:
$$R_L = (0.05)^{1/32} \approx 0.911$$
Engineering Comment
Zero failures do not prove perfect reliability. The result supports only a lower-bound claim for the tested mission and conditions.
Plausibility Check
Thirty-two clean missions are useful, but not enough to demonstrate 99\% reliability.
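The zero-failure bound, and the sample size it implies for a higher target, can be sketched as follows. The bound uses the standard success-run relation; the function names are hypothetical, and the 99% target is the figure named in the plausibility check.

```python
import math


def zero_failure_lower_bound(units: int, confidence: float) -> float:
    """One-sided lower reliability bound after `units` failure-free missions."""
    return (1.0 - confidence) ** (1.0 / units)


def units_for_target(reliability: float, confidence: float) -> int:
    """Failure-free missions needed to demonstrate `reliability`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


bound = zero_failure_lower_bound(units=32, confidence=0.95)
needed = units_for_target(reliability=0.99, confidence=0.95)

print(f"lower bound = {bound:.3f}, units needed for 0.99 = {needed}")
```

Under this relation, demonstrating 99% at 95% confidence takes roughly ten times the thirty-two missions run here.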
Exercise 12: Proof-Test Coverage
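A latent protective function has:
$$N = 28$$
identified failure cases. The proof-test procedure covers:
$$N_c = 24$$
cases. The release rule requires 90\% coverage. Check release.
Solution
Coverage is:
$$C_{PT} = \frac{N_c}{N} = \frac{24}{28} \approx 0.857 = 85.7\%$$
Since:
$$85.7\% < 90\%$$
the proof-test package fails.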
Engineering Comment
Coverage should be mapped to failure modes and requirements. Missing high-consequence cases cannot be excused by many low-value tests.
Plausibility Check
Four of twenty-eight cases are uncovered, or one seventh, so coverage near 86\% is expected.
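Coverage is easier to audit when it is computed from an explicit mapping of failure cases rather than from counts. The sketch below assumes hypothetical case identifiers `FM-01` to `FM-28`, with the first twenty-four covered, to reproduce the 24-of-28 situation.

```python
def proof_test_coverage(identified: set[str],
                        covered: set[str]) -> tuple[float, list[str]]:
    """Return the coverage fraction and the sorted list of uncovered cases."""
    uncovered = identified - covered
    coverage = len(identified & covered) / len(identified)
    return coverage, sorted(uncovered)


# Hypothetical identifiers: 28 identified cases, 24 exercised by the procedure.
identified = {f"FM-{i:02d}" for i in range(1, 29)}
covered = {f"FM-{i:02d}" for i in range(1, 25)}

coverage, uncovered = proof_test_coverage(identified, covered)
release_ok = coverage >= 0.90

print(f"coverage = {coverage:.1%}, release_ok = {release_ok}, gaps = {uncovered}")
```

Listing the uncovered cases by name supports the next review step, which is checking whether any of them is high consequence.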
Exercise 13: Proof-Test Interval and Average Exposure
A latent dangerous failure rate is:
The proof-test interval is:
Use the simplified low-demand approximation:
$$PFD_{avg} \approx \frac{\lambda_D T}{2}$$
Estimate average probability of failure on demand.
Solution
Substitute:
Engineering Comment
This simplified formula assumes failures are latent, demand is rare and proof tests restore the function. Poor test coverage or repair delay would increase risk.
Plausibility Check
The product $\lambda_D T$ is below one percent, and half of it is about 0.4\%.
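The low-demand approximation can be sketched in a few lines of Python. The dangerous failure rate of 1e-6 per hour and the one-year (8760 h) proof-test interval are illustrative assumptions, chosen so that half the product is about 0.4%.

```python
def pfd_avg(lambda_d_per_hour: float, proof_test_interval_hours: float) -> float:
    """Simplified average probability of failure on demand.

    Assumes latent failures, rare demands and a proof test that fully
    restores the function. Valid only while lambda_D * T is small.
    """
    return lambda_d_per_hour * proof_test_interval_hours / 2.0


# Illustrative inputs (assumed): 1e-6 per hour, tested once a year.
pfd_yearly = pfd_avg(1.0e-6, 8760.0)
pfd_half_yearly = pfd_avg(1.0e-6, 4380.0)

print(f"yearly test: {pfd_yearly:.5f}, half-yearly test: {pfd_half_yearly:.5f}")
```

In this approximation, halving the interval halves the average exposure.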
Exercise 14: Diagnostic Coverage Residual Rate
A diagnostic monitors a failure mode with raw dangerous failure rate:
Estimate the undetected residual rate.
Solution
Residual dangerous undetected rate, for raw rate $\lambda_D$ and diagnostic coverage $DC$, is:
$$\lambda_{DU} = \lambda_D\,(1 - DC)$$
Thus:
Engineering Comment
Coverage claims should be supported by fault-injection, proof-test or field evidence. A diagnostic that detects only easy failures may not control the dominant risk.
Plausibility Check
15\% of the raw rate remains, so the residual rate should be much smaller than the original.
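A minimal sketch of the residual-rate calculation follows. The diagnostic coverage of 0.85 matches the 15% remainder in the plausibility check, while the raw rate of 4e-6 per hour is a hypothetical value.

```python
def residual_undetected_rate(raw_rate_per_hour: float,
                             diagnostic_coverage: float) -> float:
    """Dangerous failure rate that the diagnostic does not detect."""
    if not 0.0 <= diagnostic_coverage <= 1.0:
        raise ValueError("diagnostic coverage must lie between 0 and 1")
    return raw_rate_per_hour * (1.0 - diagnostic_coverage)


# Raw rate is assumed; 0.85 coverage leaves 15% of it undetected.
residual = residual_undetected_rate(4.0e-6, 0.85)
print(f"residual undetected rate = {residual:.2e} per hour")
```

The result is only as good as the coverage figure, which needs its own fault-injection or field evidence.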
Exercise 15: Beta-Factor Common-Cause Limit
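Two channels each have independent dangerous failure probability:
$$P = 0.025$$
A beta-factor estimate assigns:
$$\beta = 0.12$$
as common-cause contribution. Estimate common-cause failure probability contribution:
$$P_{CCF} \approx \beta P$$
Solution
Substitute:
$$P_{CCF} \approx 0.12 \times 0.025 = 0.003$$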
Engineering Comment
Common-cause risk often dominates the theoretical benefit of redundancy. Separation, diversity, independent power and independent testing are engineering controls, not algebraic assumptions.
Plausibility Check
Twelve percent of 2.5\% is 0.3\%, so the result is plausible.
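To see why the common-cause term dominates, compare it with the ideal independent result. The sketch uses the exercise values (2.5% per channel, beta of 12%). The combined one-out-of-two expression is a commonly used simplified beta-factor form and is an assumption here, not part of the exercise.

```python
def common_cause_contribution(beta: float, p_channel: float) -> float:
    """Failure probability attributed to common cause (beta-factor model)."""
    return beta * p_channel


def one_out_of_two_failure(beta: float, p_channel: float) -> float:
    """Simplified 1oo2 failure probability: independent part plus common cause."""
    independent = ((1.0 - beta) * p_channel) ** 2
    return independent + common_cause_contribution(beta, p_channel)


p, beta = 0.025, 0.12
ideal = p ** 2                                # both channels fail independently
ccf = common_cause_contribution(beta, p)      # common-cause term alone
total = one_out_of_two_failure(beta, p)

print(f"ideal = {ideal:.6f}, common cause = {ccf:.6f}, total = {total:.6f}")
```

With these inputs the common-cause term is several times larger than the ideal independent failure probability.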
Exercise 16: Standby Switch Reliability
A standby unit has mission reliability:
The automatic transfer switch has reliability:
Estimate the effective standby success probability.
Solution
Both the standby unit and switch must work:
$$P_{standby} = R_{standby}\,R_{switch}$$
Therefore:
Engineering Comment
Standby redundancy needs switching and detection evidence. A healthy standby asset does not help if transfer logic fails on demand.
Plausibility Check
Multiplying two values below one should reduce the result below both inputs, so 93.1\% is reasonable.
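The standby product can be sketched as below. The reliabilities of 0.95 for the standby unit and 0.98 for the transfer switch are assumptions that reproduce the 93.1% in the plausibility check; other pairs would give the same product.

```python
def standby_success(r_standby: float, r_switch: float) -> float:
    """Probability that the standby path works when called on."""
    return r_standby * r_switch


# Assumed inputs chosen to reproduce roughly 93.1%.
p_standby = standby_success(r_standby=0.95, r_switch=0.98)
p_perfect_switch = standby_success(r_standby=0.95, r_switch=1.0)

print(f"with switch = {p_standby:.3f}, perfect switch = {p_perfect_switch:.3f}")
```

The difference between the two figures is the credit that has to be backed by transfer-switch test evidence.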
Exercise 17: Reliability Allocation Margin
A system target is:
Three required functions have allocated reliabilities:
Check the allocation margin.
Solution
For series-required functions:
$$R_{sys} = R_1 R_2 R_3$$
So:
Margin is the achieved system reliability minus the target:
$$\text{margin} = R_{sys} - R_{target}$$
Engineering Comment
The allocation barely passes. Small modelling errors, common-cause effects or untracked interfaces could consume the margin.
Plausibility Check
Three values near 98\% multiply to a value near 94\%, so the tight pass is credible.
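A tight allocation margin is easy to expose numerically. In this sketch the target of 0.94 and the three allocations of 0.98 are hypothetical values consistent with the plausibility check, not the exercise's stated data.

```python
import math


def allocation_margin(target: float,
                      allocations: list[float]) -> tuple[float, float]:
    """Return the achieved series reliability and its margin over the target."""
    achieved = math.prod(allocations)
    return achieved, achieved - target


# Assumed target and allocations for illustration.
achieved, margin = allocation_margin(0.94, [0.98, 0.98, 0.98])
# Sensitivity: one function slipping by half a point.
_, margin_degraded = allocation_margin(0.94, [0.98, 0.98, 0.975])

print(f"achieved = {achieved:.4f}, margin = {margin:+.4f}, "
      f"degraded margin = {margin_degraded:+.4f}")
```

With these assumed numbers, a half-point slip in one function turns the pass into a fail.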
Exercise 18: Reliability Release Gate
A release package has these results:
| Gate | Requirement | Current result |
|---|---|---|
| mission reliability lower bound | at least 0.90 | 0.911 |
| inherent availability | at least 99.5\% | 99.78\% |
| proof-test coverage | at least 90\% | 85.7\% |
| common-cause action closure | 100\% | 100\% |
Decide whether to release.
Solution
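Check each gate:
- mission reliability lower bound: 0.911 is at least 0.90, so the gate passes;
- inherent availability: 99.78\% is at least 99.5\%, so the gate passes;
- proof-test coverage: 85.7\% is below 90\%, so the gate fails;
- common-cause action closure: 100\% meets the 100\% requirement, so the gate passes.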
The package is not releasable because proof-test coverage fails.
Engineering Comment
Reliability release should not average unrelated gates. A latent failure test gap can block release even when mission reliability and availability look acceptable.
Plausibility Check
One hard gate fails. The correct decision is hold, restrict or add proof-test evidence.
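The release decision can be expressed as a hard-gate check in which any single failure blocks release. The sketch below uses the four gates from the exercise table, with percentages written as fractions. The `Gate` class and the decision labels are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    name: str
    requirement: float  # minimum acceptable value
    result: float       # current value

    @property
    def passed(self) -> bool:
        return self.result >= self.requirement


gates = [
    Gate("mission reliability lower bound", 0.90, 0.911),
    Gate("inherent availability", 0.995, 0.9978),
    Gate("proof-test coverage", 0.90, 0.857),
    Gate("common-cause action closure", 1.00, 1.00),
]

# Every gate is hard: results are never averaged across gates.
failed = [gate.name for gate in gates if not gate.passed]
decision = "release" if not failed else "hold"

print(f"decision = {decision}, failed gates = {failed}")
```

Keeping each gate separate means a strong availability result cannot offset the proof-test gap.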
Validation Package Checklist
A strong reliability, availability and proof-test solution should check:
- whether exposure basis and failure definition are explicit;
- whether MTBF and failure rate use comparable operating data;
- whether mission reliability is separated from availability;
- whether series and parallel formulas match the real functional architecture;
- whether redundancy assumptions include common-cause and switching failures;
- whether Weibull parameters come from comparable life data;
- whether zero-failure evidence is stated as a confidence bound;
- whether proof tests cover latent failure modes, not only procedure steps;
- whether all failed hard gates are resolved before release.
Common Release Mistakes
Common mistakes include:
- treating MTBF as a guaranteed lifetime;
- mixing exposure bases;
- quoting availability while excluding logistics downtime;
- applying independent-channel formulas when channels share support systems;
- assuming standby redundancy without switch evidence;
- using Weibull parameters outside the fitted regime;
- treating zero failures as proof of perfect reliability;
- counting proof-test steps instead of covered failure modes;
- releasing by averaging gates instead of fixing the failed gate.