Glossary term

Fault Detection

The process of identifying abnormal conditions or faults from measurements, events, models, or diagnostic evidence.

Definition

Fault detection is the process of identifying abnormal conditions or faults from measurements, events, models, inspections, or diagnostic evidence.

Fault detection is used in electrical systems, controls, mechanical equipment, software, manufacturing, and infrastructure. It may use thresholds, protection logic, signal processing, model-based residuals, statistical monitoring, machine learning, or inspection data. Detection is different from diagnosis: detection identifies that something abnormal is present, while diagnosis seeks the cause and location.

Fault detection answers the first operational question: is there evidence of a fault or abnormal condition that requires attention? Diagnosis, localization, root-cause analysis, and corrective action may follow, but they are separate tasks.

In power systems, fault detection may identify short circuits, ground faults, high-impedance faults, abnormal harmonics, relay or breaker problems, insulation degradation, overheating, or developing equipment defects. In machinery and industrial systems, it may use vibration, temperature, current, pressure, acoustic, image, or process data to detect abnormal operating states.

Detection methods range from deterministic protection thresholds to statistical monitoring, signal processing, model-based residuals, digital-twin comparisons, and machine-learning classifiers. The appropriate method depends on the fault type, sensor quality, required detection time, operating state, required action, and consequence of a missed event.
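As a rough illustration of the model-based residual approach, the sketch below compares measurements with model predictions and raises an alarm only after the residual has exceeded a threshold for several consecutive samples. The function name, the threshold, the persistence count, and the sample data are all hypothetical choices made for this example, not values taken from any standard or product.

```python
from typing import Optional, Sequence


def detect_fault(
    measured: Sequence[float],
    predicted: Sequence[float],
    threshold: float,
    persistence: int = 3,
) -> Optional[int]:
    """Return the index at which an alarm is raised, or None.

    An alarm is raised once |measured - predicted| has exceeded
    `threshold` for `persistence` consecutive samples. The persistence
    requirement is one simple way to ignore isolated spikes.
    """
    run = 0  # length of the current run of out-of-limit residuals
    for i, (m, p) in enumerate(zip(measured, predicted)):
        residual = m - p
        if abs(residual) > threshold:
            run += 1
            if run >= persistence:
                return i
        else:
            run = 0
    return None


# Hypothetical data: the model predicts a constant 10.0.
predicted = [10.0] * 10
measured = [10.1, 9.9, 10.0, 12.5, 10.0, 10.2, 12.1, 12.4, 12.6, 12.3]

alarm_index = detect_fault(measured, predicted, threshold=1.0, persistence=3)
print(alarm_index)
```

With these numbers the isolated spike at index 3 is ignored and the alarm is raised at index 8, the third consecutive out-of-limit sample. A longer persistence setting reduces nuisance alarms at the cost of a longer detection delay, which is the kind of trade-off the method selection has to account for.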

Engineering use

Fault detection becomes engineering evidence only when linked to an action: trip, alarm, inspection, derating, maintenance work order, shutdown, isolation, or further diagnosis. A detector used for protective tripping must be reviewed differently from a detector used for slow condition monitoring. Protective logic needs speed, selectivity, dependability, security, and fail-safe behavior; monitoring logic may tolerate slower detection but needs trend stability and useful prioritization.
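The contrast between protective and monitoring logic can be sketched with two toy detectors: one that acts on the first sample above a pickup value, and one that alarms on a smoothed trend using an exponentially weighted moving average. This is an illustration only; the function names, the pickup and limit values, and the smoothing factor are assumptions for the example and do not represent how any particular relay or monitoring product works.

```python
from typing import Optional, Sequence


def instantaneous_trip(samples: Sequence[float], pickup: float) -> Optional[int]:
    """Protective-style check: act on the first sample above `pickup`."""
    for i, x in enumerate(samples):
        if x > pickup:
            return i
    return None


def ewma_alarm(
    samples: Sequence[float], limit: float, alpha: float = 0.2
) -> Optional[int]:
    """Monitoring-style check: alarm when the smoothed value exceeds `limit`.

    `alpha` is the smoothing factor of the exponentially weighted moving
    average; smaller values respond more slowly but give a steadier trend.
    """
    smoothed = samples[0]
    for i, x in enumerate(samples):
        smoothed = alpha * x + (1 - alpha) * smoothed
        if smoothed > limit:
            return i
    return None


# Hypothetical signals: a single spike and a slow upward drift.
spike = [1.0, 1.0, 5.0, 1.0, 1.0]
drift = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

spike_trip = instantaneous_trip(spike, pickup=3.0)   # reacts immediately
spike_alarm = ewma_alarm(spike, limit=3.0)           # ignores the spike
drift_trip = instantaneous_trip(drift, pickup=3.0)
drift_alarm = ewma_alarm(drift, limit=3.0)           # reacts later, on the trend
print(spike_trip, spike_alarm, drift_trip, drift_alarm)
```

The instantaneous check responds to the spike at once, while the smoothed check does not alarm on it and flags the drift two samples later than the instantaneous check does. Which behavior is acceptable depends on the action the detector drives.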

Performance should be evaluated with the fault cases that matter, not only with average accuracy. Engineers should check sensitivity, specificity, false-trip rate, missed-fault rate, detection delay, sensor failure behavior, nuisance-alarm handling, and whether the detector was validated on independent data from the actual equipment or network state.
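A minimal sketch of such an evaluation is shown below. It assumes each validation case records whether a fault was truly present, whether the detector alarmed, and the delay from fault onset to alarm; the `DetectionCase` structure and the example cases are hypothetical and exist only to show why average accuracy can hide a high missed-fault rate.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DetectionCase:
    fault_present: bool               # ground truth for the case
    alarmed: bool                     # detector output for the case
    delay_s: Optional[float] = None   # onset-to-alarm delay, detected faults only


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when it is undefined."""
    return numerator / denominator if denominator else None


def evaluate(cases: List[DetectionCase]) -> Dict[str, Optional[float]]:
    tp = sum(c.fault_present and c.alarmed for c in cases)
    fn = sum(c.fault_present and not c.alarmed for c in cases)
    fp = sum(not c.fault_present and c.alarmed for c in cases)
    tn = sum(not c.fault_present and not c.alarmed for c in cases)
    delays = [
        c.delay_s
        for c in cases
        if c.fault_present and c.alarmed and c.delay_s is not None
    ]
    return {
        "accuracy": ratio(tp + tn, len(cases)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "missed_fault_rate": ratio(fn, tp + fn),
        "false_alarm_rate": ratio(fp, fp + tn),
        "worst_delay_s": max(delays) if delays else None,
    }


# Hypothetical validation set: 4 fault cases (1 missed), 6 healthy cases (1 false alarm).
cases = (
    [DetectionCase(True, True, d) for d in (0.02, 0.05, 0.30)]
    + [DetectionCase(True, False)]
    + [DetectionCase(False, False) for _ in range(5)]
    + [DetectionCase(False, True)]
)

metrics = evaluate(cases)
for name, value in metrics.items():
    print(name, value)
```

In this made-up set the accuracy is 0.8, yet one fault in four is missed and the slowest detection takes 0.30 s. Whether those figures are acceptable depends on the action the detector drives, which is why the individual rates and the delay are reported alongside accuracy.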

Common mistakes

A common mistake is confusing detection with diagnosis. A detector may indicate that a fault exists without proving the root cause or location. Another mistake is tuning a detector only to reduce nuisance alarms, which raises the probability of missing high-consequence events. A strong fault-detection review states the fault definition, sensor boundary, operating modes, detection threshold, validation dataset, action rule, false-alarm consequence, missed-event consequence, and response ownership.
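The tuning trade-off can be made concrete with a small threshold sweep. The sketch below counts nuisance alarms and missed faults at several thresholds and then weights them by consequence. The detector scores, the thresholds, and the relative costs are invented for illustration; real consequence weights would have to come from the review itself.

```python
from typing import List, Tuple

# Hypothetical detector scores (higher means more abnormal).
healthy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8]
fault_scores = [0.5, 0.7, 0.9, 1.2]

# Assumed relative consequences: one missed fault costs as much as 20 nuisance alarms.
COST_FALSE_ALARM = 1
COST_MISSED_FAULT = 20


def sweep(thresholds: List[float]) -> List[Tuple[float, int, int, int]]:
    """Return (threshold, false_alarms, missed_faults, weighted_cost) per threshold."""
    rows = []
    for t in thresholds:
        false_alarms = sum(s > t for s in healthy_scores)
        missed = sum(s <= t for s in fault_scores)
        cost = false_alarms * COST_FALSE_ALARM + missed * COST_MISSED_FAULT
        rows.append((t, false_alarms, missed, cost))
    return rows


rows = sweep([0.3, 0.6, 0.9])
for row in rows:
    print(row)

# Threshold with the lowest consequence-weighted cost under the assumed weights.
best_threshold = min(rows, key=lambda r: r[3])[0]
print(best_threshold)
```

Raising the threshold from 0.3 to 0.9 removes all three nuisance alarms in this example but misses three of the four faults. Under the assumed weights the lowest threshold has the lowest cost, even though it produces the most nuisance alarms.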
