Usability Validation, Use-Error, and Interface Release Exercises

Worked usability exercises for use-error risk, validation confidence, SUS, scenario coverage, interface layout and release gates.

Branch: Industrial and Management Engineering
Content: Exercise set
Updated: Jul 03, 2026
Revision: v1.0.0 · reviewed

These exercises focus on usability validation and interface release evidence: use-error probability, critical-task success, scenario coverage, confidence screens, SUS scoring, target acquisition, control spacing, error recovery and release gates. They are generic industrial and systems-engineering usability exercises, not medical-device-specific clinical validation.

Operator workload, physical ergonomics, handoffs, fatigue, alarm burden and field performance are handled in the companion specialist exercise set. This page stays on whether the interface, workflow and validation evidence control use error before release.

Release Evidence Notes

Usability evidence should name the intended user, task, interface state, operating context, success criterion, critical-error definition, sample boundary and release action. A favorable score is weak evidence if the study omits critical tasks, abnormal scenarios, representative users or a review of residual use errors.

Engineering Boundary Notes

These examples use simplified rates, proportions, scores and confidence screens. Real usability engineering should use protocol design, representative users, realistic tasks, moderator controls, error taxonomy, task observations, residual-risk review and post-release monitoring.

Common Release Mistakes

reporting completion rate without separating critical and noncritical tasks;
treating zero observed use errors as proof of zero risk;
averaging user groups when one intended group is missing;
using SUS or satisfaction scores as a substitute for task success;
checking a screen visually without target size, spacing and mode-confusion evidence;
accepting a mitigation before retesting the changed interaction.

Scenario Map

Scenario	Exercises	Main calculation	Release decision
Use-error risk	1, 4, 12, 16, 17	Expected escapes, RPN change, mode confusion, confirmation error and field-entry error	Add design controls or retest when residual error remains credible.
Validation evidence	2, 3, 5, 6, 7, 13, 14	Success rate, completion gate, confidence bound, user and scenario coverage, recovery and assistance	Release only when evidence covers representative critical tasks.
Interface layout	8, 9, 10, 11, 15	SUS score, movement time, critical-control spacing, label comprehension and alarm message actionability	Redesign controls, wording or layout when interaction evidence fails.
Integrated release	18	Combined release blockers	Hold release when any critical usability gate fails.

Validation Package Checklist

intended user groups and representative participant counts;
critical task list, success rule and critical-error definition;
scenario, environment and operating-mode coverage;
use-error taxonomy, recovery path and residual-risk action;
interface controls, labels, target sizes, spacing and mode visibility;
release decision tied to evidence, not average preference alone.

Exercise 1: Expected Use-Error Escapes

A task is performed $9000$ times per month. The observed use-error probability is $0.004$ and the design detects $70\%$ of errors before harm. Estimate monthly undetected use-error escapes.

Solution

N_e=9000(0.004)(1-0.70)=10.8

Engineering Comment

Even a low per-task probability can produce frequent escapes when exposure is high. The release action should change the interaction itself, not only add user reminders.

Plausibility Check

Four errors per thousand over nine thousand tasks gives thirty-six errors before detection; thirty percent of those escape, which is just under eleven per month.
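The escape estimate above can be sketched in a few lines of Python. The function name `expected_escapes` and its parameter names are illustrative assumptions, not part of the exercise; the arithmetic is the same product used in the solution.

```python
def expected_escapes(tasks_per_month: float, p_error: float,
                     detection_fraction: float) -> float:
    """Expected use errors per month that are not detected before harm.

    Hypothetical helper: multiplies exposure, per-task error probability
    and the fraction of errors the design does NOT detect.
    """
    if not 0.0 <= p_error <= 1.0:
        raise ValueError("p_error must be between 0 and 1")
    if not 0.0 <= detection_fraction <= 1.0:
        raise ValueError("detection_fraction must be between 0 and 1")
    return tasks_per_month * p_error * (1.0 - detection_fraction)


# Values from Exercise 1.
escapes = expected_escapes(9000, 0.004, 0.70)
print(f"{escapes:.1f} escapes per month")  # prints "10.8 escapes per month"
```

Changing `detection_fraction` shows how much a better detection control would have to improve before escapes become rare at this exposure.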

Exercise 2: Critical-Task Success Rate

A critical task is completed successfully by $57$ of $60$ participants. Compute the success rate.

Solution

p=\dfrac{57}{60}=95.0\%

Engineering Comment

The percentage is not the whole decision. The three failures need severity, root cause and mitigation review.

Plausibility Check

Three failures out of sixty is exactly five percent failure.

Exercise 3: Validation Completion Gate

A validation protocol requires at least $94\%$ successful completion across representative attempts. The test has $185$ successful attempts out of $200$. Does it pass?

Solution

C=\dfrac{185}{200}=92.5\%

Since $92.5\%<94\%$, the gate fails.

Engineering Comment

The team should identify the failed scenarios and retest after adding a design control. Averaging with easy tasks is not a valid closure path.

Plausibility Check

Fifteen failures in two hundred attempts is more than the allowed twelve failures for a ninety-four percent gate.
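A minimum-rate gate like this one can be checked on counts rather than rounded percentages, which avoids borderline rounding questions. The sketch below assumes the gate is stated as a whole percent; `min_successes` and `passes_gate` are hypothetical helper names.

```python
def min_successes(attempts: int, gate_percent: int) -> int:
    """Smallest success count that meets a whole-percent minimum gate.

    Uses integer ceiling division so no floating-point rounding is involved.
    """
    return (attempts * gate_percent + 99) // 100


def passes_gate(successes: int, attempts: int, gate_percent: int) -> bool:
    """True when the observed success count meets the gate."""
    return successes >= min_successes(attempts, gate_percent)


# Values from Exercise 3.
needed = min_successes(200, 94)       # 188 successes required
allowed_failures = 200 - needed       # 12 failures allowed
print(needed, allowed_failures, passes_gate(185, 200, 94))  # prints "188 12 False"
```

The same count-based check applies to the other minimum-rate gates in this set, such as scenario coverage and label comprehension.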

Exercise 4: RPN Before and After a Design Control

A use error has severity $8$, occurrence $5$ and detection $6$. A redesign reduces occurrence to $2$ and detection rating to $3$. Compute old and new RPN.

Solution

RPN_{old}=8(5)(6)=240

RPN_{new}=8(2)(3)=48

Engineering Comment

The RPN reduction is meaningful only if the redesigned interaction was verified and validated. Severity remains high, so residual risk still needs review.

Plausibility Check

Occurrence and detection both improve, so RPN should drop sharply.
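The before-and-after comparison can be sketched as follows. The 1 to 10 rating range is an assumption (a common FMEA convention that the exercise does not state), and `rpn` is a hypothetical helper name.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk priority number as the product of three ratings.

    Assumes each rating is on a 1-10 scale; adjust the check if the
    project uses a different scale.
    """
    for name, value in (("severity", severity),
                        ("occurrence", occurrence),
                        ("detection", detection)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} rating must be between 1 and 10")
    return severity * occurrence * detection


# Values from Exercise 4.
rpn_old = rpn(8, 5, 6)
rpn_new = rpn(8, 2, 3)
reduction = (rpn_old - rpn_new) / rpn_old
print(rpn_old, rpn_new, f"{reduction:.0%}")  # prints "240 48 80%"
```

The severity rating is unchanged in both calls, which is why the engineering comment still asks for a residual-risk review despite the 80 percent drop.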

Exercise 5: Zero Critical Errors Confidence Bound

A validation test observes zero critical use errors in $75$ independent attempts. Use the rule of three to estimate a $95\%$ upper bound on critical-error probability.

Solution

p_{upper}\approx\dfrac{3}{75}=0.040

Engineering Comment

Observing zero errors does not prove zero risk. If a four percent upper bound is too high for the task severity, more evidence or redesign is required.

Plausibility Check

More attempts would lower the bound; seventy-five attempts give an upper bound of a few percent.
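The rule of three approximates the exact one-sided bound for zero observed events, which comes from solving $(1-p)^n = 0.05$ for $p$. A minimal sketch comparing the two, assuming independent attempts as the exercise states (function names are illustrative):

```python
def rule_of_three(attempts: int) -> float:
    """Approximate 95% upper bound on event probability after zero events."""
    return 3.0 / attempts


def exact_zero_event_bound(attempts: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound after zero events in independent attempts.

    Solves (1 - p) ** attempts = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / attempts)


# Values from Exercise 5.
approx = rule_of_three(75)
exact = exact_zero_event_bound(75)
print(f"rule of three: {approx:.4f}, exact: {exact:.4f}")
```

For seventy-five attempts the exact bound is slightly below the rule-of-three value, so the approximation is mildly conservative here.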

Exercise 6: Representative User Coverage

A study requires $12$ novice users, $12$ experienced users and $8$ supervisors. It includes $13$, $10$ and $8$, respectively. Which group fails?

Solution

Experienced users fail:

10<12

Novice users and supervisors pass:

13\ge12,\qquad 8\ge8

Engineering Comment

Total participant count cannot compensate for a missing intended user group.

Plausibility Check

Only the experienced-user count is below its planned minimum.
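A per-group check makes it harder for a total headcount to hide a missing group. The sketch below uses plain dictionaries; the group keys and the `coverage_shortfalls` name are assumptions made for illustration.

```python
def coverage_shortfalls(required: dict[str, int],
                        enrolled: dict[str, int]) -> dict[str, int]:
    """Return how many participants each under-covered group is missing.

    A group absent from `enrolled` counts as zero participants.
    """
    return {
        group: minimum - enrolled.get(group, 0)
        for group, minimum in required.items()
        if enrolled.get(group, 0) < minimum
    }


# Values from Exercise 6.
required = {"novice": 12, "experienced": 12, "supervisor": 8}
enrolled = {"novice": 13, "experienced": 10, "supervisor": 8}
print(coverage_shortfalls(required, enrolled))  # prints "{'experienced': 2}"
```

The extra novice participant does not offset the two missing experienced users, because each group is compared only against its own minimum.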

Exercise 7: High-Risk Scenario Coverage

A validation plan identifies $22$ high-risk scenarios. Testing covers $19$. Compute coverage and compare with a $90\%$ gate.

Solution

C=\dfrac{19}{22}=86.4\%

The plan fails the $90\%$ gate.

Engineering Comment

Uncovered scenarios should be tested, justified as out of scope or removed from the claim. They should not disappear into an average score.

Plausibility Check

Three missing scenarios out of twenty-two is more than ten percent.

Exercise 8: SUS Score and Lower Confidence Bound

A study reports mean SUS score $\bar{x}=77.1$, sample standard deviation $s=5.31$ and $n=12$. Use $t=1.80$ for a one-sided screen. Compute the lower confidence bound.

Solution

SE=\dfrac{5.31}{\sqrt{12}}=1.53

LCB=77.1-1.80(1.53)=74.3

Engineering Comment

SUS can support usability evidence, but it cannot replace critical-task success and observed use-error review.

Plausibility Check

The bound is a few points below the mean because the sample is small but variability is moderate.
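The lower-bound screen can be sketched as below. The critical value is passed in rather than computed, matching the exercise, which supplies $t=1.80$; `lower_confidence_bound` is a hypothetical name.

```python
import math


def lower_confidence_bound(mean: float, sample_sd: float, n: int,
                           t_critical: float) -> float:
    """One-sided lower bound on the mean: mean - t * (s / sqrt(n))."""
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = sample_sd / math.sqrt(n)
    return mean - t_critical * standard_error


# Values from Exercise 8.
lcb = lower_confidence_bound(77.1, 5.31, 12, 1.80)
print(f"{lcb:.1f}")  # prints "74.3"
```

Because the standard error scales with $1/\sqrt{n}$, quadrupling the sample size would only halve the distance between the mean and the bound.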

Exercise 9: Touch Target Acquisition Time

A critical touchscreen action uses a target width $W=12\ \text{mm}$ at movement distance $D=180\ \text{mm}$. Use $MT=0.12+0.13\log_2(D/W+1)$. Compute movement time.

Solution

ID=\log_2\left(\dfrac{180}{12}+1\right)=\log_2(16)=4.0

MT=0.12+0.13(4.0)=0.64\ \text{s}

Engineering Comment

Small targets far from the current pointer or finger location increase time and reduce tolerance to gloves, vibration and awkward posture.

Plausibility Check

An index of difficulty of four bits gives a movement time well below one second for one target, but the time adds up over repeated actions and can erode the task-time margin.
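The movement-time model in this exercise can be sketched directly. The default coefficients $0.12$ and $0.13$ are the ones given in the exercise; they are task-specific fitted values, not universal constants, and the function name is illustrative.

```python
import math


def movement_time(distance_mm: float, width_mm: float,
                  a: float = 0.12, b: float = 0.13) -> float:
    """Movement time in seconds from MT = a + b * log2(D / W + 1)."""
    if distance_mm < 0 or width_mm <= 0:
        raise ValueError("distance must be non-negative and width positive")
    index_of_difficulty = math.log2(distance_mm / width_mm + 1.0)
    return a + b * index_of_difficulty


# Values from Exercise 9.
mt = movement_time(180, 12)
print(f"{mt:.2f} s")  # prints "0.64 s"

# Doubling the target width lowers the index of difficulty and the time.
print(movement_time(180, 24) < mt)  # prints "True"
```

Comparing candidate widths this way gives a quick screen before committing layout changes to a retest.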

Exercise 10: Critical-Control Spacing Gate

Two adjacent critical controls are $18\ \text{mm}$ and $20\ \text{mm}$ wide. A gloved-use allowance is $10\ \text{mm}$ and the required neutral gap is $8\ \text{mm}$. Current center spacing is $48\ \text{mm}$. Does it pass?

Solution

Required center spacing is:

S_{req}=\dfrac{18}{2}+\dfrac{20}{2}+10+8=37\ \text{mm}

Since:

48>37

the spacing screen passes.

Engineering Comment

Spacing is not only aesthetics. Critical controls should account for fingers, gloves, vibration, posture and accidental activation risk.

Plausibility Check

The required spacing is the sum of the two half-widths (nineteen millimeters) and the two allowances (eighteen millimeters), and forty-eight millimeters is above it.
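The spacing screen can be sketched as a small function that returns both the requirement and the remaining margin. The rule of adding half-widths, glove allowance and neutral gap is the one used in this exercise; the function and parameter names are assumptions.

```python
def required_center_spacing(width_a_mm: float, width_b_mm: float,
                            glove_allowance_mm: float,
                            neutral_gap_mm: float) -> float:
    """Center-to-center spacing: both half-widths plus the two allowances."""
    return (width_a_mm / 2 + width_b_mm / 2
            + glove_allowance_mm + neutral_gap_mm)


# Values from Exercise 10.
required = required_center_spacing(18, 20, 10, 8)
margin = 48 - required
print(required, margin, margin >= 0)  # prints "37.0 11.0 True"
```

Reporting the margin alongside the pass result shows how much room remains if the glove allowance is later revised upward.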

Exercise 11: Label Comprehension Pass Rate

A label comprehension test has $44$ correct interpretations out of $50$ representative users. The gate is $90\%$. Does it pass?

Solution

C=\dfrac{44}{50}=88.0\%

The label fails the gate.

Engineering Comment

Ambiguous labels create use errors even when the button layout is correct. The label should be rewritten and retested.

Plausibility Check

Six misunderstandings in fifty users is more than one in ten.

Exercise 12: Mode-Confusion Residual Risk

A hidden mode creates $16$ wrong-action events per $10{,}000$ tasks. A mode indicator is expected to reduce events by $75\%$. Estimate residual wrong-action events.

Solution

N_r=16(1-0.75)=4\ \text{events per }10{,}000\text{ tasks}

Engineering Comment

Four residual events may still be unacceptable for severe tasks. A mode indicator alone may need to be backed by interlocks, confirmation or task redesign.

Plausibility Check

A seventy-five percent reduction leaves one quarter of the original events.

Exercise 13: Error Recovery Success

In a simulated-use test, users recover from $27$ of $30$ noncritical errors without assistance. Compute recovery rate.

Solution

R=\dfrac{27}{30}=90.0\%

Engineering Comment

Recovery evidence is useful when errors are expected, but the design should still prevent critical errors where recovery is unlikely or too late.

Plausibility Check

Three unrecovered errors in thirty attempts gives ten percent failure.

Exercise 14: Assistance Request Rate

During validation, $18$ assistance requests occur in $120$ task attempts. The release screen allows at most $10\%$. Does it pass?

Solution

A=\dfrac{18}{120}=15.0\%

The screen fails.

Engineering Comment

Frequent assistance requests indicate that users cannot complete the task independently under the tested conditions.

Plausibility Check

Twelve requests would equal ten percent, so eighteen is clearly above the limit.

Exercise 15: Alarm Message Actionability

An interface review samples $80$ alarm messages. Users identify the correct next action for $66$ messages. Compute actionability rate and compare with an $85\%$ gate.

Solution

A=\dfrac{66}{80}=82.5\%

The gate fails.

Engineering Comment

Alarm wording is part of the interface. If users cannot identify the action, the alarm adds workload without reliable control.

Plausibility Check

Fourteen unclear messages out of eighty is more than fifteen percent.

Exercise 16: Confirmation Dialog False Acceptance

A confirmation dialog is shown $500$ times in a trial. Users accept it incorrectly $7$ times. The false-acceptance gate is $1\%$. Does it pass?

Solution

p=\dfrac{7}{500}=1.4\%

Since $1.4\%>1.0\%$, it fails.

Engineering Comment

Confirmation dialogs often become automatic. A severe action may need differentiated wording, physical separation, delay, interlock or undo path.

Plausibility Check

Five false acceptances would equal one percent; seven is above that.

Exercise 17: Form Field Error Reduction

A redesigned data-entry screen reduces field errors from $36$ errors in $600$ entries to $14$ errors in $600$ entries. Compute relative reduction.

Solution

R=\dfrac{36-14}{36}=61.1\%

Engineering Comment

The reduction supports the redesign, but the remaining errors need field type, severity and detectability review.

Plausibility Check

The error count falls by twenty-two out of thirty-six, a little over sixty percent.
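The solution compares counts directly, which works here because both trials have $600$ entries. A sketch that compares rates instead handles unequal sample sizes too; `relative_reduction` is a hypothetical helper name.

```python
def relative_reduction(errors_before: int, entries_before: int,
                       errors_after: int, entries_after: int) -> float:
    """Relative reduction in error rate between two trials."""
    rate_before = errors_before / entries_before
    rate_after = errors_after / entries_after
    if rate_before == 0:
        raise ValueError("baseline error rate is zero; reduction undefined")
    return (rate_before - rate_after) / rate_before


# Values from Exercise 17.
reduction = relative_reduction(36, 600, 14, 600)
print(f"{reduction:.1%}")  # prints "61.1%"
```

With equal denominators the result matches the count-based formula $(36-14)/36$; with unequal denominators only the rate-based form is meaningful.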

Exercise 18: Usability Release Gate

A release requires four gates: critical-task success at least $95\%$, high-risk scenario coverage at least $90\%$, label comprehension at least $90\%$ and no unresolved severe use-error cause. Results are $96\%$, $92\%$, $88\%$ and no unresolved severe cause. Does it release?

Solution

The label comprehension gate fails:

88\%<90\%

The release fails.

Engineering Comment

Usability release should not average independent gates. A weak label can still drive the next critical use error.

Plausibility Check

Three gates pass, but one explicit gate fails, so the integrated decision is hold.
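The integrated decision can be sketched as a list of independent gates where any single failure blocks release. The `Gate` class and the variable names are illustrative assumptions, not a prescribed release tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """A minimum-threshold gate; observed and minimum are in percent."""
    name: str
    observed: float
    minimum: float

    def passes(self) -> bool:
        return self.observed >= self.minimum


# Values from Exercise 18.
gates = [
    Gate("critical-task success", 96, 95),
    Gate("high-risk scenario coverage", 92, 90),
    Gate("label comprehension", 88, 90),
]
unresolved_severe_causes = 0  # the exercise reports none

# Each gate is judged on its own; results are never averaged.
blockers = [gate.name for gate in gates if not gate.passes()]
if unresolved_severe_causes > 0:
    blockers.append("unresolved severe use-error cause")

release = not blockers
print(release, blockers)  # prints "False ['label comprehension']"
```

Listing the blockers by name keeps the hold decision traceable to the specific gate that needs redesign and retest.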
