Glossary term
Failure Mode
A specific way in which a component, process, or system can fail to perform its intended function.
Definition
A failure mode identifies the observable or functional manner of failure, such as cracking, seizure, leakage, drift, short circuit, loss of calibration, delayed response, data corruption, or unsafe shutdown. It is the starting point for reliability analysis, FMEA, hazard analysis, maintenance planning, and design review.
The same system can have many failure modes, and each may have different causes, effects, detectability, severity, and mitigation. For example, a valve may fail stuck open, stuck closed, leaking, slow to respond, incorrectly positioned, or electrically disconnected.
Engineering role
Failure modes give reliability and safety work a concrete target. Instead of saying that a machine “fails”, engineers identify how it fails and what that failure does to the next level of the system. This supports design changes, redundancy, inspection, alarms, protective interlocks, maintenance tasks, and verification tests.
Failure mode versus cause
A failure mode is not the same as a root cause. “Bearing seized” is a failure mode. Causes might include lubricant starvation, contamination, overload, misalignment, incorrect installation, or overheating. Similarly, “sensor output stuck high” is a failure mode; causes may include a wiring fault, a software error, an internal electronics failure, or contamination.
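One way to keep the distinction visible is to record the mode and its candidate causes in separate fields. The following is a minimal sketch in Python, not a standard schema: the class name `FailureMode`, its fields, and the helper `modes_sharing_cause` are hypothetical names chosen for illustration, and the entries simply restate the bearing and sensor examples above.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMode:
    """Observable manner of failure, kept separate from why it happened."""
    item: str    # component or function that fails
    mode: str    # how the failure shows up
    causes: list[str] = field(default_factory=list)  # candidate root causes


# Entries restate the examples in the text; the structure is illustrative.
bearing = FailureMode(
    item="bearing",
    mode="seized",
    causes=["lubricant starvation", "contamination", "overload",
            "misalignment", "incorrect installation", "overheating"],
)
sensor = FailureMode(
    item="sensor",
    mode="output stuck high",
    causes=["wiring fault", "software error",
            "internal electronics failure", "contamination"],
)


def modes_sharing_cause(modes, cause):
    """Return the failure modes that list the given cause."""
    return [m for m in modes if cause in m.causes]


shared = modes_sharing_cause([bearing, sensor], "contamination")
print([f"{m.item}: {m.mode}" for m in shared])
```

Keeping causes as a list under each mode also makes it easy to see when one cause, here contamination, contributes to several different modes.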
Use in FMEA
In failure mode and effects analysis (FMEA), each function is examined for possible failure modes. The team records effects, causes, controls, severity, occurrence, detectability, and recommended actions. Risk priority numbers (RPNs), conventionally the product of the severity, occurrence, and detection ratings, can help screen issues, but high-severity modes should not be ignored simply because their calculated score is moderate.
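The screening described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed method: it assumes the commonly used 1 to 10 rating scales and the conventional product form of the risk priority number, and the names `FmeaRow`, `needs_action`, both thresholds, and all ratings are invented for the example. Rating scales and action criteria vary between organisations and standards.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    failure_mode: str
    severity: int    # assumed scale: 1 (negligible) to 10 (most severe)
    occurrence: int  # assumed scale: 1 (rare) to 10 (frequent)
    detection: int   # assumed scale: 1 (almost certain to detect) to 10 (unlikely)

    @property
    def rpn(self) -> int:
        """Risk priority number as the product of the three ratings."""
        return self.severity * self.occurrence * self.detection


RPN_THRESHOLD = 100     # hypothetical screening threshold
SEVERITY_THRESHOLD = 9  # hypothetical: review regardless of RPN


def needs_action(row: FmeaRow) -> bool:
    """Flag a row on high RPN, or on high severity alone."""
    return row.rpn >= RPN_THRESHOLD or row.severity >= SEVERITY_THRESHOLD


# Illustrative ratings only; they are not taken from any real analysis.
rows = [
    FmeaRow("valve stuck closed", severity=10, occurrence=2, detection=3),
    FmeaRow("valve slow to respond", severity=5, occurrence=5, detection=5),
    FmeaRow("valve leaking", severity=4, occurrence=3, detection=2),
]

flagged = [r.failure_mode for r in rows if needs_action(r)]
print(flagged)
```

With these made-up ratings, "valve stuck closed" scores an RPN of only 60 yet is still flagged because of its severity, which is the situation the severity override is meant to catch.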
Design considerations
Good failure-mode analysis considers normal operation, startup, shutdown, maintenance, degraded operation, misuse, environmental exposure, manufacturing variation, and interfaces with other systems. It should cover software, hardware, human operation, documentation, inspection, and supply-chain assumptions where they affect function.
Common mistakes
Common mistakes include listing vague modes such as “does not work”, confusing cause with effect, and analysing only component failures while ignoring system interactions. Another error is performing FMEA after design decisions are frozen, when the analysis can only document risk rather than improve the design.