Medical Device Usability, Critical Task, Use-Error, and Alarm Response Exercises

Solved medical-device usability exercises for critical-task success, use-error reduction, alarm response, false alarms, training and release gates.

Branch: Biomedical Engineering
Content: Exercise set
Updated: Jul 03, 2026
Revision: v1.1.0 · reviewed

These exercises focus on usability validation for medical devices: representative users, critical tasks, use errors, alarm response, false alarm burden, training, task time, residual use-error risk and release gates. They are engineering evidence exercises, not clinical guidance or regulatory advice.

Intended-use claim coverage and clinical/post-market evidence are handled in companion specialist exercise sets.

How to use these exercises

Use the set as a usability release review for representative users and critical tasks. Exercises 1 to 5 check critical-task success, unresolved failures, user-group balance, task-time tails and environment-specific success. Exercises 6 to 11 review use-error reduction, residual RPN, training, retest coverage, label comprehension and workload margin. Exercises 12 to 17 check alarm response, false alarm burden, missed alarms, false-to-true alarm ratio, evidence completion and residual concerns. Exercise 18 combines these usability gates into a release decision.

Before calculating, state the user group, use environment, training condition, critical task, alarm condition, success criterion and residual risk decision. A good average success rate is not enough if a safety-critical task fails for a target user group or if alarm response has a long tail. The engineering comment below each exercise identifies whether the result calls for mitigation, retest, claim narrowing, training change or hold.

Release Evidence Notes

Usability evidence should identify critical tasks, user groups, use environment, observed use errors, close calls, training condition, residual risk and whether mitigations were retested.

Critical-task success should be interpreted task by task. A high overall success rate can hide a safety-critical failure.

Alarm evidence should include detection, comprehension, response time, false alarm burden and workflow interruption.

The evidence package should separate task performance, use-error mitigation and alarm behavior. Task performance asks whether representative users can complete the intended workflow. Use-error mitigation asks whether observed errors were reduced and retested. Alarm behavior asks whether users notice, understand and respond without unacceptable false-alarm burden. A release decision needs all three streams.

Engineering Boundary Notes

These calculations do not replace a full usability engineering process, formative/summative protocol design, human-factors review, clinical workflow analysis or regulatory judgment. They are screening exercises for usability release.

The main boundary is representativeness. User groups, environments, training, lighting, noise, workload and workflow interruptions must match intended use. The second boundary is task criticality: noncritical success cannot compensate for unresolved failures on tasks that protect patient safety or device effectiveness.

Common Release Mistakes

averaging critical and noncritical tasks into one success number;
using trained users to represent novice users without justification;
counting a warning as mitigation without proving the user notices and acts;
ignoring false alarms and alarm fatigue;
closing a use error without retesting the mitigation.

Another common mistake is treating training as a universal mitigation. Training only works if it is available, retained, repeatable and realistic for the intended users. If a mitigation depends on training that users will not actually receive, the use error remains open.

Do not treat alarms as binary signals only. Alarm audibility, visibility, prioritization, false alarm rate, missed alarms, workflow context and response-time tail all affect whether the alarm supports safe use.

Scenario Map

Scenario	Exercises	Primary check	Engineering decision
Critical task performance	1, 2, 3, 4, 5	success rate, failures, group coverage and task time	Decide whether tasks can be released.
Use-error mitigation	6, 7, 8, 9, 10, 11	error reduction, residual risk, training and retest	Decide whether mitigations are effective.
Alarm usability	12, 13, 14, 15, 16, 17	response time, false alarms, missed alarms and evidence completion	Decide whether alarms support safe use.
Release gate	18	all-of usability release	Decide whether usability validation can close.

Exercise 1: Critical-Task Success Rate

A usability validation has $120$ critical-task attempts and $112$ successes. Compute success rate.

Solution

S=\dfrac{112}{120}=93.3\%

Engineering Comment

Task-level failures should be reviewed individually. A critical task may need zero unresolved failures.

Plausibility Check

Eight failures out of one hundred twenty leaves success below ninety-five percent.

Exercise 2: Critical-Task Failure Count Gate

A release rule allows no more than $2$ unresolved critical-task failures. The study has $8$ failures, $5$ mitigated and $3$ unresolved. Does it pass?

Solution

It fails because:

3>2

Engineering Comment

Mitigated failures should still be retested with representative users.

Plausibility Check

Three unresolved failures exceed a two-failure allowance by one.
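The success rate from Exercise 1 and the failure-count gate from Exercise 2 can be sketched together. This is a minimal sketch that assumes the $8$ failures in Exercise 2 are the same $120-112$ failures from Exercise 1; the function name `critical_task_summary` and its argument names are illustrative only.

```python
def critical_task_summary(attempts, successes, mitigated, allowed_unresolved):
    """Return success rate, failure counts and the unresolved-failure gate result."""
    failures = attempts - successes
    unresolved = failures - mitigated
    return {
        "success_rate": successes / attempts,
        "failures": failures,
        "unresolved": unresolved,
        # The gate passes only when unresolved failures do not exceed the allowance.
        "passes_gate": unresolved <= allowed_unresolved,
    }


summary = critical_task_summary(
    attempts=120, successes=112, mitigated=5, allowed_unresolved=2
)
print(f"success rate = {summary['success_rate']:.1%}")
print(f"unresolved = {summary['unresolved']}, passes gate = {summary['passes_gate']}")
```

Keeping the rate and the unresolved count in one record makes it harder to report a high success rate without also showing the failures that remain open.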

Exercise 3: Representative User Group Balance

The study includes $10$ novice users, $14$ trained users and $6$ maintenance users. Target minimum is $8$ users per group. Which groups pass?

Solution

Novice and trained users pass:

10\ge8,\quad 14\ge8

Maintenance users fail:

6<8

Engineering Comment

Maintenance usability can affect cleaning, setup, calibration and service safety.

Plausibility Check

Only the smallest group is below the minimum.
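The group-balance check can be written as a small filter over group sizes. The sketch below uses the counts from this exercise; the names `group_sizes`, `MIN_PER_GROUP` and `groups_below_minimum` are illustrative.

```python
# Participant counts per user group, taken from Exercise 3.
group_sizes = {"novice": 10, "trained": 14, "maintenance": 6}
MIN_PER_GROUP = 8


def groups_below_minimum(sizes, minimum):
    """Return the names of groups with fewer participants than the minimum."""
    return sorted(name for name, n in sizes.items() if n < minimum)


short_groups = groups_below_minimum(group_sizes, MIN_PER_GROUP)
print("groups below minimum:", short_groups)
```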

Exercise 4: Task Time Margin

A critical alarm task must be completed within $90$ seconds. Median observed time is $68$ seconds and the $95$th percentile is $104$ seconds. Which statistic fails the limit?

Solution

Median passes:

68<90

$95$th percentile fails:

104>90

Engineering Comment

Tail performance matters for alarm response and critical tasks.

Plausibility Check

Most users can be fast while a minority still exceed the limit.
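To illustrate how a median and a tail percentile can disagree, the sketch below computes both from raw task times. The twenty times are hypothetical, chosen only so that the statistics match this exercise; they are not study data. The nearest-rank percentile used here is one of several common definitions, so a protocol should state which one it applies.

```python
import math
import statistics

# Hypothetical completion times in seconds (illustrative, not observed data).
task_times_s = [41, 48, 52, 55, 58, 60, 62, 64, 66, 67,
                69, 71, 73, 75, 78, 81, 85, 92, 104, 118]
LIMIT_S = 90


def nearest_rank_percentile(values, percent):
    """Nearest-rank percentile: smallest value with at least percent% of data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(percent * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


median_s = statistics.median(task_times_s)
p95_s = nearest_rank_percentile(task_times_s, 95)

print(f"median = {median_s} s, passes = {median_s < LIMIT_S}")
print(f"p95 = {p95_s} s, passes = {p95_s < LIMIT_S}")
```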

Exercise 5: Use Environment Success Split

Home-use success is $48$ of $52$ tasks. Clinic success is $64$ of $68$ tasks. Compute both success rates.

Solution

Home:

S_H=\dfrac{48}{52}=92.3\%

Clinic:

S_C=\dfrac{64}{68}=94.1\%

Engineering Comment

Environment-specific results can show where labeling, lighting, noise or workflow affects use.

Plausibility Check

Each environment has four failures; home use has fewer attempts, so its success rate is slightly lower.
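A short sketch of the environment split, using the counts from this exercise. It also computes the pooled rate to show how a single combined number hides the difference between environments; the variable names are illustrative.

```python
# (successes, attempts) per use environment, from Exercise 5.
results = {"home": (48, 52), "clinic": (64, 68)}

rates = {env: s / n for env, (s, n) in results.items()}
failures = {env: n - s for env, (s, n) in results.items()}

# Pooled rate across both environments.
total_successes = sum(s for s, _ in results.values())
total_attempts = sum(n for _, n in results.values())
pooled_rate = total_successes / total_attempts

for env in results:
    print(f"{env}: {rates[env]:.1%} success, {failures[env]} failures")
print(f"pooled: {pooled_rate:.1%}")
```

The pooled counts are $112$ of $120$, the same totals as Exercise 1, which is why the split is worth reporting alongside the overall rate.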

Exercise 6: Use-Error Reduction

Before mitigation, $14$ use errors occur in $80$ attempts. After mitigation, $5$ occur in $80$ attempts. Compute relative reduction.

Solution

Initial rate:

r_1=\dfrac{14}{80}=17.5\%

Final rate:

r_2=\dfrac{5}{80}=6.25\%

Reduction:

R=\dfrac{17.5-6.25}{17.5}=64.3\%

Engineering Comment

Reduction is encouraging, but residual errors must be evaluated by severity.

Plausibility Check

Errors fall from fourteen to five over the same number of attempts, so the relative reduction is nine fourteenths, about sixty-four percent.
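The relative reduction can be computed from raw counts so that unequal attempt counts are handled correctly. This is a minimal sketch; `relative_reduction` is an illustrative name.

```python
def relative_reduction(errors_before, n_before, errors_after, n_after):
    """Return (rate before, rate after, relative reduction) for use-error counts."""
    rate_before = errors_before / n_before
    rate_after = errors_after / n_after
    reduction = (rate_before - rate_after) / rate_before
    return rate_before, rate_after, reduction


r1, r2, reduction = relative_reduction(14, 80, 5, 80)
print(f"before = {r1:.2%}, after = {r2:.2%}, relative reduction = {reduction:.1%}")
```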

Exercise 7: Residual Use-Error RPN

A residual setup error has severity $7$, occurrence $3$ and detection $4$. Compute RPN.

Solution

RPN=7(3)(4)=84

Engineering Comment

RPN is a screen. A severe use error may require mitigation even if RPN is moderate.

Plausibility Check

Seven times three is twenty-one, and twenty-one times four is eighty-four.
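The engineering comment above suggests screening on severity as well as on RPN. The sketch below shows one way to do that. Both thresholds (`RPN_LIMIT` and `SEVERITY_REVIEW_FLOOR`) are assumptions made for illustration; the exercise does not specify them, and a real risk procedure would define its own.

```python
# Illustrative screening thresholds; both are assumptions, not values from the exercise.
RPN_LIMIT = 100
SEVERITY_REVIEW_FLOOR = 7


def screen_use_error(severity, occurrence, detection):
    """Return the RPN and whether the item needs mitigation review."""
    rpn = severity * occurrence * detection
    # Flag on either a high RPN or a high severity score alone.
    needs_review = rpn > RPN_LIMIT or severity >= SEVERITY_REVIEW_FLOOR
    return rpn, needs_review


rpn, needs_review = screen_use_error(severity=7, occurrence=3, detection=4)
print(f"RPN = {rpn}, needs review = {needs_review}")
```

Under these assumed thresholds the setup error is flagged by severity even though its RPN is below the limit.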

Exercise 8: Training Effectiveness

Before training, task success is $18$ of $25$. After training, success is $23$ of $25$. Compute improvement in percentage points.

Solution

Before:

S_1=\dfrac{18}{25}=72\%

After:

S_2=\dfrac{23}{25}=92\%

Improvement:

\Delta S=20\ \text{points}

Engineering Comment

Training can be a mitigation only if training is realistic, repeatable and available to intended users.

Plausibility Check

Five additional successes out of twenty-five users equals twenty percentage points.

Exercise 9: Retest Coverage

There are $5$ mitigated use errors. Retesting covers $4$ . Compute retest coverage.

Solution

C=\dfrac{4}{5}=80\%

Engineering Comment

An untested mitigation should remain open unless justified by risk.

Plausibility Check

One missing retest out of five leaves eighty percent coverage.

Exercise 10: Label Comprehension Rate

Twenty-four users interpret a warning label. Twenty-one interpret it correctly. Compute comprehension rate.

Solution

C=\dfrac{21}{24}=87.5\%

Engineering Comment

A warning that is not understood by representative users may not be an effective risk control.

Plausibility Check

Three incorrect interpretations out of twenty-four leave less than ninety percent.

Exercise 11: Workload Score Margin

Acceptable workload score is at most $45$ . Observed mean score is $41$ with uncertainty allowance $6$ . Compute guarded score and margin.

Solution

Guarded score:

W_g=41+6=47

Margin:

M=45-47=-2

Engineering Comment

Nominal workload passes, but guarded workload fails.

Plausibility Check

Adding uncertainty can turn a small apparent margin negative.
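The guarded-margin calculation is the same for any upper-limit metric. A minimal sketch, with `guarded_margin` as an illustrative name:

```python
def guarded_margin(limit, observed_mean, allowance):
    """Return (nominal margin, guarded value, guarded margin) for an upper limit."""
    nominal_margin = limit - observed_mean
    guarded = observed_mean + allowance
    return nominal_margin, guarded, limit - guarded


nominal, guarded, margin = guarded_margin(limit=45, observed_mean=41, allowance=6)
print(f"nominal margin = {nominal}, guarded score = {guarded}, guarded margin = {margin}")
```

A negative guarded margin means the limit is not met once the uncertainty allowance is applied.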

Exercise 12: Alarm Response Time

Required alarm response is $60$ seconds. Mean response is $44$ seconds and the $95$th percentile is $71$ seconds. Does the alarm response pass?

Solution

Mean passes:

44<60

$95$th percentile fails:

71>60

The response gate fails if it uses the $95$th percentile.

Engineering Comment

Alarm response should protect slower but representative users, not only average users.

Plausibility Check

The long-tail response exceeds the requirement.

Exercise 13: False Alarm Burden

A device produces $18$ false alarms over $72$ monitored hours. Compute false alarm rate.

Solution

r=\dfrac{18}{72}=0.25\ \text{false alarms/h}

Engineering Comment

False alarms can create alarm fatigue and reduce response reliability to true alarms.

Plausibility Check

Eighteen alarms over three days is one every four hours.

Exercise 14: Missed Alarm Fraction

During simulation, $2$ of $30$ true alarm events are missed by users. Compute missed alarm fraction.

Solution

f=\dfrac{2}{30}=6.7\%

Engineering Comment

Missed alarms should be linked to audibility, visibility, workload and workflow placement.

Plausibility Check

Two misses out of thirty is less than ten percent.

Exercise 15: Alarm Benefit-Risk Screen

True alarms are $30$ and false alarms are $18$. Compute false-to-true alarm ratio.

Solution

R=\dfrac{18}{30}=0.60

Engineering Comment

A high false-to-true ratio may weaken user trust even if sensitivity is acceptable.

Plausibility Check

False alarms are a little over half of true alarms.
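The three alarm metrics from Exercises 13 to 15 can be reported together. The sketch below uses the counts from those exercises and follows Exercise 15 in treating the false and true alarm counts as belonging to the same evaluation; the variable names are illustrative.

```python
# Counts taken from Exercises 13 to 15.
false_alarms = 18
monitored_hours = 72
true_alarms = 30
missed_true_alarms = 2

false_alarm_rate_per_h = false_alarms / monitored_hours
hours_between_false_alarms = monitored_hours / false_alarms
missed_fraction = missed_true_alarms / true_alarms
false_to_true_ratio = false_alarms / true_alarms

print(f"false alarm rate = {false_alarm_rate_per_h:.2f} per hour")
print(f"mean interval between false alarms = {hours_between_false_alarms:.1f} h")
print(f"missed alarm fraction = {missed_fraction:.1%}")
print(f"false-to-true ratio = {false_to_true_ratio:.2f}")
```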

Exercise 16: Usability Evidence Completion

The release package requires task list, user groups, environment, raw observations, use-error log, mitigation list, retest evidence, training condition, alarm response and residual risk decision. Eight of ten records are complete. Compute completion.

Solution

C=\dfrac{8}{10}=80\%

Engineering Comment

Missing use-error logs, retest evidence or residual risk decisions should block release.

Plausibility Check

Eight of ten is exactly eighty percent.
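A percentage alone does not show which records are missing. The sketch below tracks completion per record and flags gaps in the three records named in the engineering comment. The exercise does not say which two records are incomplete, so marking retest evidence and the residual risk decision as incomplete is a hypothetical choice for illustration.

```python
# Completion status per required record. Which two are incomplete is an assumption.
records = {
    "task list": True,
    "user groups": True,
    "environment": True,
    "raw observations": True,
    "use-error log": True,
    "mitigation list": True,
    "retest evidence": False,       # hypothetical gap
    "training condition": True,
    "alarm response": True,
    "residual risk decision": False,  # hypothetical gap
}

# Records whose absence should block release, per the engineering comment.
BLOCKING = {"use-error log", "retest evidence", "residual risk decision"}

completion = sum(records.values()) / len(records)
blocking_gaps = sorted(name for name in BLOCKING if not records[name])

print(f"completion = {completion:.0%}")
print("blocking gaps:", blocking_gaps)
```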

Exercise 17: Residual Concern Count

There are $6$ residual usability concerns. Four are low risk and two are medium risk. A release rule allows no medium or high residual concern without explicit sign-off. Does it pass automatically?

Solution

No. Medium concerns exist:

N_{medium}=2>0

Engineering Comment

The release package needs explicit sign-off or additional mitigation for medium concerns.

Plausibility Check

Any medium concern blocks an automatic pass when the rule requires explicit sign-off for medium or high concerns.
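The sign-off rule can be sketched as a count over risk rankings. The list below encodes the six concerns from this exercise; the variable names are illustrative.

```python
from collections import Counter

# Risk ranking of the six residual concerns from Exercise 17.
concerns = ["low", "low", "low", "low", "medium", "medium"]

counts = Counter(concerns)
# Counter returns 0 for rankings that do not appear, such as "high" here.
needs_signoff = counts["medium"] + counts["high"] > 0
passes_automatically = not needs_signoff

print(dict(counts))
print("passes automatically:", passes_automatically)
```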

Exercise 18: Usability Release Gate

A release gate requires critical-task success above $95\%$, every user group at or above its minimum size, no more than $2$ unresolved critical-task failures, alarm-response $95$th percentile below $60$ seconds, false alarm rate below $0.2/\text{h}$ and evidence completion above $90\%$. Current values are $93.3\%$, maintenance group below minimum, $3$ unresolved failures, $71$ seconds, $0.25/\text{h}$ and $80\%$. Decide release status.

Solution

Every gate fails:

93.3\%<95\%

6<8\quad\text{(maintenance group, from Exercise 3)}

3>2

71>60

0.25>0.2

80\%<90\%

Release status:

\text{hold}

Engineering Comment

The usability package should hold for user coverage, critical tasks, alarm response, false alarms and evidence completion.

Plausibility Check

Multiple independent usability barriers fail, so release is not defensible.
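An all-of release gate can be sketched as a table of checks in which any single failure produces a hold. The values and thresholds below come from this exercise, except the user-group entry, which uses the maintenance group size and minimum from Exercise 3. The structure and names are illustrative, not a prescribed implementation.

```python
import operator

# (gate name, observed value, required comparison, threshold)
checks = [
    ("critical-task success rate", 0.933, operator.gt, 0.95),
    ("smallest user group size", 6, operator.ge, 8),
    ("unresolved critical-task failures", 3, operator.le, 2),
    ("alarm response p95 (s)", 71, operator.lt, 60),
    ("false alarm rate (per h)", 0.25, operator.lt, 0.2),
    ("evidence completion", 0.80, operator.gt, 0.90),
]

# A gate fails when the observed value does not satisfy its comparison.
failed = [name for name, value, compare, threshold in checks
          if not compare(value, threshold)]

# All-of logic: a single failed gate is enough to hold.
status = "hold" if failed else "accept"

print("status:", status)
for name in failed:
    print("failed:", name)
```

Listing the failed gates by name, rather than returning only the status, tells the reviewer which mitigation or retest each hold calls for.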

Validation Package Checklist

Critical tasks are evaluated separately from noncritical tasks.
User groups and environments match the intended use.
Use-error mitigations are retested and residual concerns are risk-ranked.
Alarm response, missed alarms and false alarm burden are included in release evidence.
Training condition, label comprehension and workload assumptions are documented.
Medium or high residual concerns have explicit sign-off or additional mitigation.
Release status states accept, retest mitigation, revise interface, narrow claim or hold.
