Cooling Water Loss During Exothermic Reactor Startup Case Study
Chemical engineering case study on cooling-water loss during exothermic reactor startup, covering heat generation, safe hold time, alarms, interlocks, operator response, validation evidence, and restart decision.
This case study follows a cooling-water loss during startup of an exothermic liquid-phase reactor. The scenario is realistic but is not tied to any one incident. It is useful because the hazard develops through ordinary engineering details: feed addition begins, heat generation rises, cooling flow is not actually available, a bypassed safeguard is not restored, and the operator has less time than the procedure assumes.
The purpose is to show how a chemical-process safety decision should connect heat balance, safe hold time, alarm timing, interlock status, field evidence, and restart authorization.
Technical Context
Exothermic reactors need heat removal at the same time that reaction rate, feed rate, temperature, mixing, and composition are changing. Startup can be more hazardous than normal operation because the plant is moving through transient states, instruments may have been bypassed for maintenance, utilities may not be fully lined up, and operators may be following a sequence rather than watching one steady operating point.
A simplified thermal balance during an upset is:
$$ m C_p \frac{dT}{dt} = \dot{Q}_{gen} - \dot{Q}_{rem} $$
where $T$ is reactor temperature, $m$ is reacting mass, $C_p$ is average heat capacity, $\dot{Q}_{gen}$ is heat generation, and $\dot{Q}_{rem}$ is heat removal. The equation is simple, but the safety decision depends on how quickly alarms, interlocks, and operator actions act relative to the temperature rise.
Scenario
A batch reactor is being restarted after jacket maintenance. The startup procedure allows feed addition after the reactor reaches the initial temperature and the operator confirms cooling-water availability.
| Parameter | Value |
|---|---|
| Reacting mass | $6200\ \text{kg}$ |
| Average heat capacity | $3.4\ \text{kJ/(kg K)}$ |
| Initial reactor temperature | $70^\circ\text{C}$ |
| High-temperature alarm | $82^\circ\text{C}$ |
| High-high trip setpoint | $90^\circ\text{C}$ |
| Decomposition concern threshold | $100^\circ\text{C}$ |
| Nominal heat generation during feed ramp | $520\ \text{kW}$ |
| Residual heat removal after cooling-water loss | $120\ \text{kW}$ |
| Temperature sensor effective dead time | $2.5\ \text{min}$ |
| Operator diagnosis and action time | $4.0\ \text{min}$ |
| Feed isolation valve closure time | $1.5\ \text{min}$ |
During startup, reactor temperature rises faster than expected. The cooling-water low-flow interlock had been bypassed during maintenance and was not restored before feed addition. The temperature alarm remains active, but the response now relies on operator diagnosis and manual action.
Event Sequence
- Maintenance clears the jacket for startup but leaves a cooling-flow interlock bypass active.
- The startup checklist confirms valve lineup but does not require an interlock proof or live cooling-flow challenge.
- Feed addition begins at the normal ramp rate.
- Reactor temperature starts rising; jacket outlet temperature does not rise as expected because cooling flow is low.
- The operator receives a high-temperature alarm and checks trends.
- Feed is stopped and emergency cooling lineup is restored before the high-high trip setpoint is reached.
- The unit is held for engineering review before restart.
No vessel rupture or release occurs. The near miss is still serious because the credited safeguard was not available at the moment it was needed.
Heat-Balance Screening
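Thermal capacitance of the reacting mass:
$$ m C_p = 6200\ \text{kg} \times 3.4\ \text{kJ/(kg K)} = 21080\ \text{kJ/K} $$
Net heat accumulation after cooling-water loss:
$$ \dot{Q}_{net} = \dot{Q}_{gen} - \dot{Q}_{rem} = 520\ \text{kW} - 120\ \text{kW} = 400\ \text{kW} $$
Because:
$$ 1\ \text{kW} = 1\ \text{kJ/s} $$
temperature rise rate is:
$$ \frac{dT}{dt} = \frac{\dot{Q}_{net}}{m C_p} = \frac{400\ \text{kJ/s}}{21080\ \text{kJ/K}} \approx 0.0190\ \text{K/s} $$
Convert to kelvin per minute:
$$ \frac{dT}{dt} \approx 0.0190\ \text{K/s} \times 60\ \text{s/min} \approx 1.14\ \text{K/min} $$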
Engineering Interpretation
At the nominal heat-generation rate, the reactor temperature rises by more than one kelvin per minute after cooling is lost. That is slow enough for a working protection system, but not slow enough to treat the event casually.
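The screening rate can be reproduced with a few lines of Python. This is a minimal sketch of the lumped heat balance using the scenario table values; the function name `temperature_rise_rate_k_per_min` is illustrative and not part of any plant tool.

```python
def temperature_rise_rate_k_per_min(mass_kg, cp_kj_per_kg_k, q_gen_kw, q_rem_kw):
    """Lumped heat-balance screening rate in K/min (no kinetics, no losses)."""
    thermal_capacitance_kj_per_k = mass_kg * cp_kj_per_kg_k
    net_heat_kw = q_gen_kw - q_rem_kw  # 1 kW = 1 kJ/s
    rate_k_per_s = net_heat_kw / thermal_capacitance_kj_per_k
    return rate_k_per_s * 60.0  # convert K/s to K/min


# Scenario table values: 6200 kg, 3.4 kJ/(kg K), 520 kW generated, 120 kW removed.
rate = temperature_rise_rate_k_per_min(6200.0, 3.4, 520.0, 120.0)
print(f"Screening temperature rise rate: {rate:.2f} K/min")
```

The sketch treats heat generation as constant. A real exothermic system usually accelerates with temperature, so the result is a screening figure rather than a bound.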
Time Available Before Trip
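Temperature margin from the initial condition to high-temperature alarm:
$$ \Delta T_{alarm} = 82^\circ\text{C} - 70^\circ\text{C} = 12\ \text{K} $$
Time to alarm under the screening rate:
$$ t_{alarm} = \frac{12\ \text{K}}{1.14\ \text{K/min}} \approx 10.5\ \text{min} $$
Temperature margin from the initial condition to high-high trip:
$$ \Delta T_{trip} = 90^\circ\text{C} - 70^\circ\text{C} = 20\ \text{K} $$
Time to trip:
$$ t_{trip} = \frac{20\ \text{K}}{1.14\ \text{K/min}} \approx 17.6\ \text{min} $$
Response chain after the temperature measurement begins to show the event (sensor dead time, operator diagnosis and action, valve closure):
$$ t_{resp} = 2.5\ \text{min} + 4.0\ \text{min} + 1.5\ \text{min} = 8.0\ \text{min} $$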
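Nominal time margin from alarm to completed feed isolation, taken here as the screening time to alarm less the 8.0-minute response chain (sensor dead time, operator action, valve closure):
$$ t_{margin} = 10.5\ \text{min} - 8.0\ \text{min} \approx 2.5\ \text{min} $$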
Engineering Interpretation
The manual response has only a small margin. The process did not run away because the operator acted correctly and the heat-generation rate stayed near the nominal estimate. A few minutes of diagnostic delay, a higher feed concentration, a slower valve, or a less visible trend could have consumed the margin.
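The timing figures in this section can be checked with a short script. This is a sketch under stated assumptions: a constant screening rate from the scenario table, and a margin defined as the screening time to alarm minus the summed response chain, which is how the figures are read here. The helper name `minutes_to_reach` is hypothetical.

```python
def minutes_to_reach(target_c, start_c, rate_k_per_min):
    """Minutes for temperature to climb from start_c to target_c at a constant rate."""
    return (target_c - start_c) / rate_k_per_min


# Screening rate from the scenario table: (520 - 120) kW over 6200 kg * 3.4 kJ/(kg K).
RATE_K_PER_MIN = (520.0 - 120.0) / (6200.0 * 3.4) * 60.0

t_alarm = minutes_to_reach(82.0, 70.0, RATE_K_PER_MIN)  # initial -> high alarm
t_trip = minutes_to_reach(90.0, 70.0, RATE_K_PER_MIN)   # initial -> high-high trip

# Sensor dead time + operator diagnosis and action + feed valve closure.
response_chain = 2.5 + 4.0 + 1.5

# Assumed margin definition: screening time to alarm minus the response chain.
margin = t_alarm - response_chain

print(f"time to alarm:  {t_alarm:.1f} min")
print(f"time to trip:   {t_trip:.1f} min")
print(f"response chain: {response_chain:.1f} min")
print(f"nominal margin: {margin:.1f} min")
```

Keeping the three response terms separate makes it easy to see which one a proposed change, such as a faster valve or an earlier alarm, would actually shorten.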
Conservative Heat-Release Case
The engineering team checks a plausible conservative case:
- heat generation is $10\%$ higher because feed concentration is high;
- residual heat removal is $80\ \text{kW}$ because the cooling path is more restricted than assumed.
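Conservative heat generation:
$$ \dot{Q}_{gen,cons} = 1.10 \times 520\ \text{kW} = 572\ \text{kW} $$
Conservative net heat:
$$ \dot{Q}_{net,cons} = 572\ \text{kW} - 80\ \text{kW} = 492\ \text{kW} $$
Temperature rise rate:
$$ \frac{dT}{dt} = \frac{492\ \text{kJ/s}}{21080\ \text{kJ/K}} \approx 0.0233\ \text{K/s} $$
Convert:
$$ \frac{dT}{dt} \approx 0.0233\ \text{K/s} \times 60\ \text{s/min} \approx 1.40\ \text{K/min} $$
Time to alarm:
$$ t_{alarm,cons} = \frac{12\ \text{K}}{1.40\ \text{K/min}} \approx 8.57\ \text{min} $$
Time margin against the same response chain:
$$ t_{margin,cons} = 8.57\ \text{min} - 8.0\ \text{min} \approx 0.57\ \text{min} $$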
Engineering Interpretation
The conservative case nearly consumes the available response time. That means the alarm cannot be credited as a robust independent safeguard for this startup unless response time is shortened, the alarm is moved earlier, heat generation is limited, or an automatic interlock is restored and proof-tested.
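To see how quickly the margin erodes, the nominal and conservative cases can be run through one function. This is a sketch, not a validated tool: it assumes constant heat rates, the scenario table values, and the same margin definition (screening time to alarm minus the 8.0-minute response chain). The names `alarm_time_margin_min` and `break_even_net_kw` are illustrative.

```python
MASS_KG = 6200.0
CP_KJ_PER_KG_K = 3.4
ALARM_RISE_K = 82.0 - 70.0            # initial temperature to high alarm
RESPONSE_CHAIN_MIN = 2.5 + 4.0 + 1.5  # dead time + operator + valve closure


def alarm_time_margin_min(q_gen_kw, q_rem_kw):
    """Screening time to alarm minus the response chain, in minutes."""
    net_kw = q_gen_kw - q_rem_kw
    if net_kw <= 0:
        raise ValueError("no net heat accumulation; temperature does not rise")
    rate_k_per_min = net_kw / (MASS_KG * CP_KJ_PER_KG_K) * 60.0
    return ALARM_RISE_K / rate_k_per_min - RESPONSE_CHAIN_MIN


nominal_margin = alarm_time_margin_min(520.0, 120.0)
conservative_margin = alarm_time_margin_min(1.10 * 520.0, 80.0)

# Net heat at which the time to alarm exactly equals the response chain.
break_even_net_kw = ALARM_RISE_K * MASS_KG * CP_KJ_PER_KG_K / (RESPONSE_CHAIN_MIN * 60.0)

print(f"nominal margin:      {nominal_margin:.2f} min")
print(f"conservative margin: {conservative_margin:.2f} min")
print(f"break-even net heat: {break_even_net_kw:.0f} kW")
```

Under these assumptions the margin reaches zero at a net heat accumulation of about 527 kW, only modestly above the 492 kW conservative case.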
Failure Analysis
The incident has several contributing failure modes:
| Failure mode | Technical effect |
|---|---|
| Cooling-flow interlock bypass left active | automatic feed prevention was unavailable |
| Startup checklist did not require live cooling-flow proof | utility availability was assumed rather than demonstrated |
| Temperature alarm was the remaining safeguard | response depended on operator diagnosis and timing |
| Feed ramp was normal despite abnormal safeguard state | heat generation rose before cooling evidence was secure |
| Maintenance handover did not flag bypass restoration | control-room and field state were not aligned |
The root problem is not only cooling-water flow. It is safeguard governance during a transient operating mode.
Evidence Required Before Restart
Restart should require evidence, not reassurance.
| Evidence | Acceptance purpose |
|---|---|
| cooling-water flow transmitter proof test | confirms the trip input works |
| interlock bypass removed and independently checked | restores automatic protection |
| emergency cooling valve stroke test | confirms response path |
| reactor temperature sensor calibration and dead-time check | validates alarm timing |
| feed isolation valve closure test | validates response time |
| startup checklist revision | prevents recurrence during lineup |
| operator trend review | confirms event recognition and response sequence |
| management-of-change record | documents altered startup protection basis |
The review should also confirm whether the high-temperature alarm setpoint is early enough for the fastest credible heat-release case.
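One way to make "evidence, not reassurance" concrete is to treat the evidence table as a gate that reports what is still missing. The sketch below is hypothetical: the item strings simply mirror the table rows, and `missing_evidence` is an illustrative helper, not an existing plant system or library call.

```python
REQUIRED_EVIDENCE = (
    "cooling-water flow transmitter proof test",
    "interlock bypass removed and independently checked",
    "emergency cooling valve stroke test",
    "reactor temperature sensor calibration and dead-time check",
    "feed isolation valve closure test",
    "startup checklist revision",
    "operator trend review",
    "management-of-change record",
)


def missing_evidence(completed):
    """Return required evidence items not yet recorded as complete, in table order."""
    done = set(completed)
    return [item for item in REQUIRED_EVIDENCE if item not in done]


def restart_evidence_complete(completed):
    """True only when every required evidence item is present."""
    return not missing_evidence(completed)


# Example: only two items have been closed out so far.
completed_so_far = [
    "feed isolation valve closure test",
    "operator trend review",
]
for item in missing_evidence(completed_so_far):
    print("outstanding:", item)
```

Returning the outstanding items, rather than a bare pass or fail, keeps the restart discussion focused on specific evidence gaps.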
Restart Decision
The engineering decision is:
Do not restart the reactor until the cooling-flow interlock is restored and proof-tested, startup requires live cooling-flow verification before feed addition, the high-temperature alarm has adequate response margin for the conservative heat-release case, and bypass governance is corrected.
If production pressure requires a temporary startup limit, it should be written as an engineered restriction:
- lower initial feed ramp;
- verified cooling-water flow permissive before feed;
- no active bypasses on credited safeguards;
- operator stationed on reactor temperature, jacket outlet temperature, and cooling-flow trends;
- automatic feed isolation available and tested;
- engineering approval required for any deviation.
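The restriction list above can be sketched as a feed permissive check. Everything in this example is an assumption for illustration: the `StartupState` fields, the `max_ramp_fraction` default of 0.5, and the blocker wording are hypothetical, and real permissive logic belongs in the safety system under management of change, not in a script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartupState:
    feed_ramp_fraction: float        # requested ramp as a fraction of the normal ramp
    cooling_flow_verified: bool      # live cooling-water flow proven before feed
    active_credited_bypasses: int    # bypasses active on credited safeguards
    operator_on_trends: bool         # operator stationed on temperature and flow trends
    feed_isolation_tested: bool      # automatic feed isolation available and tested


def feed_permissive_blockers(state, max_ramp_fraction=0.5):
    """List the reasons feed may not start; an empty list means the permissive is met.

    max_ramp_fraction is a placeholder limit, not a value from the case study.
    """
    blockers = []
    if state.feed_ramp_fraction > max_ramp_fraction:
        blockers.append("feed ramp above temporary limit")
    if not state.cooling_flow_verified:
        blockers.append("cooling-water flow not verified")
    if state.active_credited_bypasses > 0:
        blockers.append("bypass active on credited safeguard")
    if not state.operator_on_trends:
        blockers.append("operator not stationed on trends")
    if not state.feed_isolation_tested:
        blockers.append("feed isolation not tested")
    return blockers


# Example: the state at the time of the near miss, as described in the scenario.
near_miss = StartupState(1.0, False, 1, True, True)
print(feed_permissive_blockers(near_miss))
```

Any non-empty result would be routed to engineering for approval rather than overridden at the console, in line with the last item in the list.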
Transferable Lessons
- A utility confirmation is not the same as a live functional proof.
- Manual alarm response is weak when the calculated time margin is measured in minutes.
- Interlock bypasses are process-safety states, not paperwork details.
- Startup and shutdown need their own hazard review because the plant is not at steady state.
- A heat-balance calculation becomes operational only when it is tied to sensor dead time, valve closure time, operator action, and proof-test evidence.