Project
Maintenance Shutdown Planning and Reliability Risk Review Project
Industrial engineering project for planning a maintenance shutdown with reliability-risk screening, critical path, resource loading, spare readiness, safety gates, restart validation, and post-shutdown evidence.
This project produces a maintenance shutdown planning and reliability risk review package. The deliverable is not a calendar entry. It is an engineering decision file that explains why the shutdown should occur, what failure risk is being reduced, which work must be done inside the outage window, what resources and spares are required, what safety gates control the work, and what restart evidence proves the asset is ready for operation.
The project is written for industrial and management engineering students and early-career engineers. It uses simplified calculations, but the workflow matches real operations: a shutdown must balance production loss, failure risk, spare readiness, resource constraints, lockout requirements, quality gates, commissioning checks, and post-maintenance monitoring.
The central project question is:
Should the operation take a planned maintenance shutdown now, and can the shutdown be executed and restarted with controlled reliability risk?
The correct answer is not just “maintenance is due.” A shutdown is justified when the risk of deferral, the value of restored reliability, and the ability to execute safely are all visible.
Project Objective
Prepare a shutdown planning and reliability risk review for a production line with a deteriorating conveyor drive. The package must include:
- asset boundary and failure mode;
- deferral risk and expected consequence;
- planned work scope and work breakdown;
- critical-path shutdown schedule;
- labor, tooling, spare, and vendor readiness;
- safety and isolation gates;
- quality and restart validation checks;
- residual risk and post-shutdown monitoring plan;
- final decision: proceed, defer, split scope, or hold until prerequisites close.
The final deliverable should be reviewable by operations, maintenance, engineering, safety, quality, and production planning.
Scenario
A packaging line uses a conveyor drive that has shown rising vibration, intermittent overload trips, and oil contamination. The line can still run, but the next planned opportunity for a major maintenance window is 1500 operating hours away. The maintenance team proposes a planned shutdown to replace the gearbox, inspect the coupling, verify motor alignment, test interlocks, and restart under controlled load.
The simplified data are:
| Item | Value |
|---|---|
| current gearbox age | 3000 operating hours |
| next major opportunity if deferred | 1500 additional operating hours |
| Weibull shape factor | $\beta = 2.0$ |
| Weibull scale parameter | $\eta = 4200\ \text{h}$ |
| unplanned repair duration | 8 h |
| production loss during unplanned failure | 12000 currency units/h |
| collateral damage and expediting cost | 60000 currency units |
| planned shutdown duration target | 16 h |
| production loss during planned low-demand window | 2500 currency units/h |
| planned labor, spares, vendor and testing cost | 32000 currency units |
These values are educational examples. A real review must use the owner’s reliability data, maintenance history, production economics, safety procedure, lockout requirements, spare certification, vendor constraints, and restart criteria.
Deliverable Boundary
The shutdown package covers the conveyor drive, coupling, motor alignment, guarding, overload trips, local interlocks, restart trial, and post-maintenance monitoring. It does not cover a full factory turnaround, civil work, electrical protection study, process revalidation, or a formal safety instrumented system proof test unless the site procedure requires those interfaces.
Useful boundary questions are:
- Which asset is being removed from service?
- Which upstream and downstream operations are affected?
- Which failure mode is being reduced?
- Which work must be completed before restart?
- Which resources are shared with other maintenance work?
- Which tests prove that the asset is safe and reliable enough to return to production?
Commentary: shutdown planning starts with a boundary because uncontrolled scope growth is one of the fastest ways to miss the restart window.
Worked Example 1: Reliability Risk of Deferral
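Use a Weibull reliability model for a screening estimate:
$$R(t) = \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right]$$
The conditional probability of failure between current age $t_1$ and deferred age $t_2$, given survival to $t_1$, is:
$$P(t_1 < T \le t_2 \mid T > t_1) = 1 - \frac{R(t_2)}{R(t_1)} = 1 - \exp\left[-\left(\left(\frac{t_2}{\eta}\right)^{\beta} - \left(\frac{t_1}{\eta}\right)^{\beta}\right)\right]$$
The current age is:
$$t_1 = 3000\ \text{h}$$
If maintenance is deferred to the next major opportunity:
$$t_2 = 3000 + 1500 = 4500\ \text{h}$$
Substitute:
$$P = 1 - \exp\left[-\left(\left(\frac{4500}{4200}\right)^{2.0} - \left(\frac{3000}{4200}\right)^{2.0}\right)\right] = 1 - \exp\left[-(1.1480 - 0.5102)\right] = 1 - \exp(-0.6378) \approx 0.471$$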
The screening probability of failure before the next major opportunity is about 47%.
Commentary: this result is not a warranty prediction. It is a decision screen. A high conditional probability means deferral should not be treated as “do nothing”; it carries measurable reliability exposure.
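The screening calculation can be reproduced with a few lines of Python. This is a minimal sketch using only the scenario values; the function and constant names are illustrative assumptions, not part of any site tool.

```python
import math

def weibull_reliability(t, beta, eta):
    """Two-parameter Weibull survival probability R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / eta) ** beta))

def conditional_failure_probability(t1, t2, beta, eta):
    """Probability of failure in (t1, t2], given survival to t1: 1 - R(t2)/R(t1)."""
    return 1.0 - weibull_reliability(t2, beta, eta) / weibull_reliability(t1, beta, eta)

# Scenario values from the data table (educational example values).
BETA = 2.0
ETA_H = 4200.0
CURRENT_AGE_H = 3000.0
DEFERRAL_H = 1500.0

p_fail = conditional_failure_probability(
    CURRENT_AGE_H, CURRENT_AGE_H + DEFERRAL_H, BETA, ETA_H
)
print(f"Conditional failure probability before next opportunity: {p_fail:.1%}")
```

Changing `DEFERRAL_H` shows how quickly the exposure grows with a longer deferral when the shape factor is above 1, which is the wear-out behavior assumed in this example.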
Worked Example 2: Expected Consequence of Deferral
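Unplanned failure cost includes production loss and collateral consequence:
$$C_{\text{unplanned}} = 8\ \text{h} \times 12000 + 60000 = 156000\ \text{currency units}$$
Expected deferral consequence:
$$E[C_{\text{defer}}] = P \times C_{\text{unplanned}} = 0.471 \times 156000 \approx 73476\ \text{currency units}$$
Planned shutdown cost, using the 16 h target duration at the low-demand loss rate plus the planned work cost:
$$C_{\text{planned}} = 16\ \text{h} \times 2500 + 32000 = 72000\ \text{currency units}$$
The planned intervention has a lower expected cost than deferral, although the margin is narrow:
$$C_{\text{planned}} = 72000 < E[C_{\text{defer}}] \approx 73476$$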
Commentary: the planned shutdown is justified in this simplified screen. The conclusion is stronger if the failure has safety, quality, customer, or environmental consequence not fully captured in the cost model.
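The cost comparison can be checked numerically. The sketch below assumes the planned cost is the 16 h target duration multiplied by the low-demand loss rate, plus the planned work cost from the scenario table; variable names are illustrative. Because it carries the unrounded probability, the expected deferral consequence comes out slightly above the figure obtained with a probability rounded to 0.471.

```python
import math

# Scenario values from the data table (educational example values).
BETA, ETA_H = 2.0, 4200.0
T1_H, T2_H = 3000.0, 4500.0
UNPLANNED_REPAIR_H = 8.0
UNPLANNED_LOSS_PER_H = 12000.0
COLLATERAL_COST = 60000.0
PLANNED_DURATION_H = 16.0
PLANNED_LOSS_PER_H = 2500.0
PLANNED_WORK_COST = 32000.0

# Conditional Weibull failure probability between T1 and T2.
p_fail = 1.0 - math.exp(-((T2_H / ETA_H) ** BETA - (T1_H / ETA_H) ** BETA))

unplanned_cost = UNPLANNED_REPAIR_H * UNPLANNED_LOSS_PER_H + COLLATERAL_COST
expected_deferral_cost = p_fail * unplanned_cost
planned_cost = PLANNED_DURATION_H * PLANNED_LOSS_PER_H + PLANNED_WORK_COST

print(f"Unplanned failure cost:       {unplanned_cost:,.0f}")
print(f"Expected deferral consequence: {expected_deferral_cost:,.0f}")
print(f"Planned shutdown cost:         {planned_cost:,.0f}")
print(f"Margin in favor of planning:   {expected_deferral_cost - planned_cost:,.0f}")
```

The margin is sensitive to the planned duration and the loss rate, so a real review would rerun the comparison with the owner's economics before relying on it.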
Worked Example 3: Shutdown Critical Path
Define the shutdown work breakdown:
| Activity | Duration | Immediate predecessor |
|---|---|---|
| A: isolate, lock out, and verify zero energy | 1.5 h | none |
| B: drain, clean, and expose drive | 2.0 h | A |
| C: remove guards and access panels | 1.0 h | A |
| D: replace gearbox | 4.0 h | B, C |
| E: inspect coupling and shaft condition | 1.0 h | B |
| F: align motor and coupling | 1.5 h | D, E |
| G: reinstall guards | 1.0 h | F |
| H: interlock and overload trip test | 1.5 h | G |
| I: dry run | 1.0 h | H |
| J: loaded trial and quality release | 2.0 h | I |
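Critical path duration, along A, B, D, F, G, H, I, J:
$$T_{\text{CP}} = 1.5 + 2.0 + 4.0 + 1.5 + 1.0 + 1.5 + 1.0 + 2.0 = 14.5\ \text{h}$$
If the approved window is 16 h:
$$\text{Float} = 16 - 14.5 = 1.5\ \text{h}$$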
Commentary: the shutdown fits, but it has little float. Scope additions, permit delays, missing tools, failed alignment, or repeated interlock tests can consume the window quickly. The plan should include a restart protection point, not just a finish target.
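A forward pass over the activity network confirms the critical path. This is a minimal sketch with the durations and predecessors from the work breakdown table; the helper names are illustrative, and the backward walk assumes a single finishing activity, which holds for this network.

```python
# Activity network from the work breakdown table: name -> (duration_h, predecessors).
ACTIVITIES = {
    "A": (1.5, []),
    "B": (2.0, ["A"]),
    "C": (1.0, ["A"]),
    "D": (4.0, ["B", "C"]),
    "E": (1.0, ["B"]),
    "F": (1.5, ["D", "E"]),
    "G": (1.0, ["F"]),
    "H": (1.5, ["G"]),
    "I": (1.0, ["H"]),
    "J": (2.0, ["I"]),
}
WINDOW_H = 16.0

def earliest_finish_times(activities):
    """Forward pass: earliest finish = latest predecessor finish + own duration."""
    finish = {}

    def visit(name):
        if name not in finish:
            duration, preds = activities[name]
            start = max((visit(p) for p in preds), default=0.0)
            finish[name] = start + duration
        return finish[name]

    for name in activities:
        visit(name)
    return finish

def critical_path(activities, finish):
    """Walk back from the last-finishing activity via the latest-finishing predecessor."""
    current = max(finish, key=finish.get)
    path = [current]
    while activities[current][1]:
        current = max(activities[current][1], key=finish.get)
        path.append(current)
    return list(reversed(path))

finish = earliest_finish_times(ACTIVITIES)
path = critical_path(ACTIVITIES, finish)
duration_h = max(finish.values())
float_h = WINDOW_H - duration_h
print(" -> ".join(path), f"| {duration_h} h | window float {float_h} h")
```

Adding a scope item as a new entry in `ACTIVITIES` shows immediately whether it lands on the critical path or consumes the window float.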
Worked Example 4: Resource Loading
Estimate core labor demand:
| Work element | Crew | Duration | Labor-hours |
|---|---|---|---|
| isolation and verification | 2 technicians | 1.5 h | 3.0 |
| drain, clean, expose | 2 technicians | 2.0 h | 4.0 |
| gearbox replacement | 3 technicians | 4.0 h | 12.0 |
| coupling inspection | 1 technician | 1.0 h | 1.0 |
| alignment | 2 technicians | 1.5 h | 3.0 |
| guards and interlocks | 2 technicians | 2.5 h | 5.0 |
| dry run and loaded trial | 3 technicians | 3.0 h | 9.0 |
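Total direct labor:
$$3.0 + 4.0 + 12.0 + 1.0 + 3.0 + 5.0 + 9.0 = 37.0\ \text{labor-hours}$$
If the available maintenance crew provides 3 technicians for the 16-hour window:
$$3 \times 16 = 48\ \text{labor-hours}$$
Labor capacity utilization:
$$\frac{37.0}{48} \approx 77.1\%$$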
The labor-hours appear feasible. The project still needs a resource-time check because the gearbox replacement and alignment require the right people at specific times, not only enough total labor-hours.
Commentary: total labor-hours can hide a timing conflict. A shutdown can have enough people overall while still lacking the specialist during the critical task.
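The difference between total labor-hours and time-phased demand can be sketched as follows. The start times are an assumption: they are the earliest starts implied by the Worked Example 3 network, with guard reinstallation and interlock testing merged, and dry run and loaded trial merged, to match the labor table. Guard removal is omitted because the labor table gives no crew size for it.

```python
CREW_AVAILABLE = 3
WINDOW_H = 16.0

# (work element, crew size, assumed earliest start in h, duration in h)
LOADING = [
    ("isolation and verification", 2, 0.0, 1.5),
    ("drain, clean, expose", 2, 1.5, 2.0),
    ("gearbox replacement", 3, 3.5, 4.0),
    ("coupling inspection", 1, 3.5, 1.0),
    ("alignment", 2, 7.5, 1.5),
    ("guards and interlocks", 2, 9.0, 2.5),
    ("dry run and loaded trial", 3, 11.5, 3.0),
]

total_labor_h = sum(crew * dur for _, crew, _, dur in LOADING)
capacity_h = CREW_AVAILABLE * WINDOW_H
utilization = total_labor_h / capacity_h

def crew_demand_at(t):
    """Technicians required at time t, counting every element active at t."""
    return sum(crew for _, crew, start, dur in LOADING if start <= t < start + dur)

# Demand only rises at element start times, so checking those finds the peak.
peak_demand = max(crew_demand_at(start) for _, _, start, _ in LOADING)
overloaded = peak_demand > CREW_AVAILABLE

print(f"Total labor: {total_labor_h} h of {capacity_h} h ({utilization:.1%})")
print(f"Peak demand: {peak_demand} technicians, available: {CREW_AVAILABLE}")
```

Under these assumed start times, coupling inspection overlaps gearbox replacement and the peak demand is four technicians against three available, even though aggregate utilization is well under 100%. The inspection would need to be resequenced or separately staffed, and resequencing it after the gearbox work would use part of the schedule float.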
Worked Example 5: Spare and Tool Readiness Gate
Define a simple readiness score for shutdown prerequisites:
| Requirement | Weight | Cleared? |
|---|---|---|
| replacement gearbox on site and inspected | 5 | 1 |
| coupling kit available | 4 | 1 |
| alignment tool calibrated | 4 | 0 |
| lockout procedure approved | 5 | 1 |
| lifting plan approved | 3 | 1 |
| vendor technician confirmed | 4 | 1 |
| overload trip test sheet approved | 3 | 0 |
| restart quality checks defined | 3 | 1 |
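Weighted readiness, where $w_i$ is the requirement weight and $c_i$ is 1 if the requirement is cleared and 0 otherwise:
$$\text{Readiness} = \frac{\sum_i w_i c_i}{\sum_i w_i}$$
Cleared score:
$$\sum_i w_i c_i = 5 + 4 + 5 + 3 + 4 + 3 = 24$$
Total weight:
$$\sum_i w_i = 5 + 4 + 4 + 5 + 3 + 4 + 3 + 3 = 31$$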
The weighted readiness is 77.4%. More importantly, the alignment tool and trip test sheet are hard gates. The shutdown should not be released until they close.
Commentary: readiness scoring supports the decision, but hard gates control release. A missing calibrated alignment tool can turn a planned shutdown into a long restart delay.
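The separation between a readiness score and a hard-gate release rule can be sketched in Python. The hard-gate flags below mark only the two items this example names as hard gates; a site procedure may classify more items that way. All names are illustrative.

```python
# (requirement, weight, cleared, hard_gate)
REQUIREMENTS = [
    ("replacement gearbox on site and inspected", 5, True, False),
    ("coupling kit available", 4, True, False),
    ("alignment tool calibrated", 4, False, True),
    ("lockout procedure approved", 5, True, False),
    ("lifting plan approved", 3, True, False),
    ("vendor technician confirmed", 4, True, False),
    ("overload trip test sheet approved", 3, False, True),
    ("restart quality checks defined", 3, True, False),
]

cleared_score = sum(w for _, w, cleared, _ in REQUIREMENTS if cleared)
total_weight = sum(w for _, w, _, _ in REQUIREMENTS)
readiness = cleared_score / total_weight

# The score informs the discussion; open hard gates alone block release.
open_hard_gates = [name for name, _, cleared, hard in REQUIREMENTS if hard and not cleared]
release_allowed = not open_hard_gates

print(f"Weighted readiness: {cleared_score}/{total_weight} = {readiness:.1%}")
print("Release allowed:", release_allowed)
for name in open_hard_gates:
    print("Open hard gate:", name)
```

Keeping the release rule independent of the score prevents a high percentage from masking a single blocking prerequisite.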
Worked Example 6: Risk Priority for Restart Failure
The main restart failure mode is misalignment causing vibration and early bearing damage.
Use the risk priority number:
$$\text{RPN} = S \times O \times D$$
Before controls:
- severity $S = 8$ because restart failure stops the line and can damage the replacement gearbox;
- occurrence $O = 4$ because field alignment is a plausible error;
- detection $D = 5$ because misalignment may not be obvious until load and temperature stabilize.
$$\text{RPN}_{\text{before}} = 8 \times 4 \times 5 = 160$$
Controls include calibrated laser alignment, independent signoff, baseline vibration measurement, loaded trial, and 24-hour post-restart temperature and vibration trend review. Assume occurrence reduces to 2 and detection improves to 2:
$$\text{RPN}_{\text{after}} = 8 \times 2 \times 2 = 32$$
Commentary: the RPN shows why restart validation belongs in the shutdown plan. The job is not complete when the replacement part is bolted in; it is complete when the asset runs with acceptable evidence.
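The before and after scores can be computed with a small helper. This sketch assumes the common 1 to 10 rating scale for severity, occurrence, and detection; the scale is not stated in the example, so the range check is an assumption.

```python
def rpn(severity, occurrence, detection):
    """Risk priority number S x O x D, assuming each rating is on a 1-10 scale."""
    for label, score in (("severity", severity),
                         ("occurrence", occurrence),
                         ("detection", detection)):
        if not 1 <= score <= 10:
            raise ValueError(f"{label} rating {score} is outside the assumed 1-10 scale")
    return severity * occurrence * detection

rpn_before = rpn(severity=8, occurrence=4, detection=5)
rpn_after = rpn(severity=8, occurrence=2, detection=2)
reduction = 1.0 - rpn_after / rpn_before

print(f"RPN before controls: {rpn_before}")
print(f"RPN after controls:  {rpn_after}")
print(f"Reduction:           {reduction:.0%}")
```

Severity is unchanged in both cases: the controls lower the chance of misalignment and improve its detection, but they do not change what a restart failure would cost if it occurred.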
Safety and Isolation Gates
The shutdown release requires these gates:
- approved lockout and isolation boundary;
- zero-energy verification before guard removal;
- stored-energy and gravity-load controls;
- lifting plan and exclusion zone;
- oil spill and waste handling controls;
- interlock test before loaded trial;
- authorized restart owner;
- rollback plan if vibration, temperature, current, or quality checks fail.
Commentary: safety gates must be planned as technical work, not treated as paperwork. If isolation verification is late, the critical path changes. If interlocks fail, restart is not authorized.
Restart Validation
The restart package should include:
- no-load run for abnormal noise, vibration, oil leak, guard contact, and motor current;
- loaded run at normal line speed;
- gearbox temperature trend after 30, 60, and 120 minutes;
- vibration baseline at drive-end and non-drive-end bearings;
- overload trip and interlock test record;
- first-piece quality approval for product after restart;
- operator signoff that alarms, guards, access, and standard work are usable;
- maintenance closeout record with spares used, deviations, and follow-up actions.
If any restart criterion fails, the line should not return to normal production until the failure is dispositioned.
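The all-criteria-must-pass rule can be expressed as a simple check. The pass/fail values below are hypothetical and serve only to show the logic; the criterion names abbreviate the list above.

```python
# Hypothetical restart results for illustration only; True means the criterion passed.
restart_results = {
    "no-load run": True,
    "loaded run at normal line speed": True,
    "gearbox temperature trend at 30, 60, and 120 minutes": True,
    "vibration baseline at both bearings": False,
    "overload trip and interlock test record": True,
    "first-piece quality approval": True,
    "operator signoff": True,
    "maintenance closeout record": True,
}

def restart_status(results):
    """Return the release status and the list of criteria that still need disposition."""
    failed = [name for name, passed in results.items() if not passed]
    status = "release to normal production" if not failed else "hold for disposition"
    return status, failed

status, failed = restart_status(restart_results)
print(status)
for name in failed:
    print("Failed criterion:", name)
```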
Final Decision
For the worked data, the recommendation is:
Proceed with the planned shutdown only after the alignment tool calibration and overload trip test sheet are closed. Do not defer to the next major opportunity.
The basis is:
- conditional Weibull screening gives about 47% failure probability before the next major opportunity;
- expected deferral consequence is about 73476 currency units;
- planned intervention cost is 72000 currency units (16 h at 2500 currency units/h plus 32000 currency units for labor, spares, vendor, and testing);
- critical path is 14.5 h against a 16 h window;
- labor capacity is feasible, but specialist timing and hard gates must be controlled;
- restart risk falls from RPN 160 to 32 with alignment, testing, and post-restart monitoring controls.
This is a conditional proceed decision, not a blanket approval. The shutdown is technically justified, but release depends on readiness gates.
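The decision logic can be sketched as a screening function that returns one of the four outcomes named in the project objective. The ordering of the checks is an assumption made for illustration, and the cost test is a simplification: a real review would also weigh safety, quality, and customer consequences that the cost model does not capture.

```python
def shutdown_decision(expected_deferral_cost, planned_cost,
                      critical_path_h, window_h, open_hard_gates):
    """Screening rule with an assumed check order; not a site procedure."""
    if planned_cost >= expected_deferral_cost:
        return "defer"
    if critical_path_h > window_h:
        return "split scope"
    if open_hard_gates:
        return "hold until prerequisites close"
    return "proceed"

# Inputs taken from the worked examples and the scenario table.
inputs = dict(
    expected_deferral_cost=0.471 * 156000,
    planned_cost=16 * 2500 + 32000,
    critical_path_h=14.5,
    window_h=16.0,
)

decision_now = shutdown_decision(
    open_hard_gates=["alignment tool calibrated", "overload trip test sheet approved"],
    **inputs,
)
decision_after_gates_close = shutdown_decision(open_hard_gates=[], **inputs)

print("With open gates:  ", decision_now)
print("After gates close:", decision_after_gates_close)
```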
Deliverable Checklist
The shutdown package is complete only when it includes:
- asset boundary and failure mode;
- deferral risk calculation and consequence model;
- work breakdown and critical path;
- resource loading and specialist availability;
- spare, tool, vendor, permit, and procedure readiness;
- safety isolation gates;
- restart validation criteria;
- rollback plan and production communication;
- post-shutdown monitoring plan;
- closeout evidence and lessons learned.
If any item is missing, the shutdown should remain in planning rather than moving to release.
Common Mistakes
Avoid these mistakes:
- justifying a shutdown only by calendar age;
- ignoring conditional failure risk between now and the next opportunity;
- calculating total labor-hours but missing specialist timing;
- adding scope after the critical path is fixed;
- treating spares as ready before inspection and compatibility checks;
- restarting after mechanical work without loaded validation;
- closing the work order before post-restart evidence is reviewed;
- reporting availability improvement without defining failure mode and exposure basis.
The project succeeds when the shutdown is treated as an engineered reliability intervention with evidence before, during, and after the outage.