Control Valve Stiction Limit Cycle Case Study

Automation and control engineering case study on diagnosing a control-valve stiction limit cycle using process trends, valve travel, controller output, deadband, tuning checks, maintenance action, and validation evidence.

A control loop can oscillate even when the controller tuning is reasonable. One common cause is control-valve stiction: the valve does not move smoothly when the controller output changes, then breaks free and jumps. The controller keeps integrating error while the valve is stuck, the process variable drifts, the valve suddenly moves, and the loop repeats.

This case study follows a flow-control loop feeding a batch reactor. Operators report a slow cycling flow trend after a control valve actuator service. The first suggestion is to “detune the PID loop.” The engineering question is whether the loop is poorly tuned or whether the actuator and valve assembly have become the limiting element.

The purpose is to show how to diagnose a limit cycle from controller output, process variable, valve-travel feedback, disturbance checks, and a short maintenance test before changing tuning in a way that hides the real fault.

Case Context

The loop controls solvent feed to a reactor. The manipulated element is a pneumatic control valve with a smart positioner and travel feedback. The process is not safety-instrumented by this loop, but poor feed-ratio control can create off-spec batches and unnecessary operator intervention.

| Item | Value or observation |
| --- | --- |
| Controlled variable | solvent flow |
| Flow setpoint | 120\ \text{m}^3/\text{h} |
| Normal controller mode | automatic PI control |
| Trend sampling interval | 5\ \text{s} |
| Observed flow range | 108 to 132\ \text{m}^3/\text{h} |
| Observed cycle period | 7.5\ \text{min} |
| Controller output range during cycle | 42\% to 56\% |
| Valve travel before jump | held near 47\% |
| Valve travel after jump | about 55\% |
| Pump discharge pressure variation | less than \pm 0.03\ \text{bar} |
| Upstream tank level variation | less than \pm 2\% |
| Existing PI tuning | K_c=1.2, T_i=90\ \text{s} |

The observed waveform is not a smooth sinusoid. The flow drifts slowly while the controller output ramps, then the valve travel jumps and the flow crosses the setpoint rapidly. That stick-slip shape is the first diagnostic clue.

What a Tuning Problem Would Look Like

A lightly damped tuning problem usually shows a more continuous relation between controller output and process variable. The actuator moves when commanded, but the closed-loop dynamics have too much effective gain, too much phase lag, or too little damping. In that case, retuning should change the oscillation amplitude and damping in a predictable way.

A stiction problem looks different:

  1. controller output changes continuously;
  2. valve travel does not move for part of the output change;
  3. valve travel then moves abruptly;
  4. the process variable changes after the valve breaks free;
  5. slowing the controller mostly changes the cycle period, not the underlying stick-slip behaviour.

The diagnostic objective is therefore not “make the trend calmer.” It is to identify whether the actuator obeys the controller command closely enough for feedback control.
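The stick-slip signature can be screened for directly in trend data. The following is a minimal sketch, not a validated stiction detector: it flags spans where valve travel stays inside a tolerance band while controller output keeps moving. The function name and both thresholds (`travel_tol`, `min_output_move`) are assumptions chosen for illustration, and the intermediate sample values are made up; only the endpoints match the representative cycle discussed in this case.

```python
def find_stuck_spans(output, travel, travel_tol=0.5, min_output_move=2.0):
    """Return (start_index, end_index, output_move) for spans where the valve holds still.

    output, travel: equal-length sequences in percent on the same time base.
    travel_tol: travel change (%) still treated as "not moving" (assumed value).
    min_output_move: output change (%) needed before a span is reported (assumed value).
    """
    if len(output) != len(travel):
        raise ValueError("output and travel must have the same length")
    spans = []
    start = 0
    for i in range(1, len(travel) + 1):
        # A span ends when travel leaves the tolerance band or the data runs out.
        moved = i < len(travel) and abs(travel[i] - travel[start]) > travel_tol
        if moved or i == len(travel):
            output_move = abs(output[i - 1] - output[start])
            if output_move >= min_output_move:
                spans.append((start, i - 1, output_move))
            start = i
    return spans


# Illustrative samples: output ramps while travel holds near 47 %, then jumps.
controller_output = [47.0, 49.0, 51.0, 53.0, 55.5, 55.8, 55.0]
valve_travel = [47.1, 47.1, 47.2, 47.2, 47.3, 54.9, 54.9]
print(find_stuck_spans(controller_output, valve_travel))
```

A healthy valve should produce no reported spans at these thresholds, because travel follows output within the tolerance band.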

Apparent Stiction from Trend Data

Use the controller output and valve-travel trend to estimate the output change required before the valve moves.

During one representative cycle:

| Event | Controller output | Valve travel | Flow |
| --- | --- | --- | --- |
| start of ramp | 47.0\% | 47.1\% | 108\ \text{m}^3/\text{h} |
| just before breakaway | 55.5\% | 47.3\% | 111\ \text{m}^3/\text{h} |
| after breakaway | 55.8\% | 54.9\% | 130\ \text{m}^3/\text{h} |

The apparent breakaway command span is:

\Delta u_{breakaway}=55.5\%-47.0\%=8.5\%\ \text{of controller output span}

The valve travel changed only:

\Delta x_{stuck}=47.3\%-47.1\%=0.2\%\ \text{of travel}

while the controller output moved by 8.5\%. That is not normal positioning error for a control valve intended for continuous modulation.

The flow jump after breakaway is:

\Delta Q_{jump}=130-111=19\ \text{m}^3/\text{h}

The implied installed gain during the jump is:

\displaystyle K_{installed}=\frac{\Delta Q_{jump}}{\Delta x_{travel}}=\frac{19}{54.9-47.3}=2.5\ \frac{\text{m}^3/\text{h}}{\% \text{travel}}

Using that installed gain, an 8.5\% delayed movement would correspond to:

\Delta Q_{expected}=2.5(8.5)=21.3\ \text{m}^3/\text{h}

That is close to the observed process swing. The trend is therefore consistent with a valve-positioning fault, not just an aggressive controller.
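The arithmetic in this section can be reproduced in a few lines, which makes it easy to repeat for other cycles. This is a sketch using the representative-cycle values quoted in this section; the function name and dictionary keys are illustrative, not part of any plant tool.

```python
def breakaway_summary(u_start, u_break, x_start, x_break, x_after, q_break, q_after):
    """Compute breakaway span, stuck travel, flow jump, installed gain, expected swing.

    u_*: controller output (%), x_*: valve travel (%), q_*: flow (m^3/h).
    """
    breakaway_span = u_break - u_start                # output change before the valve moves
    stuck_travel = x_break - x_start                  # travel change while "stuck"
    flow_jump = q_after - q_break                     # flow change across the jump
    installed_gain = flow_jump / (x_after - x_break)  # (m^3/h) per % travel
    expected_swing = installed_gain * breakaway_span  # flow implied by the delayed movement
    return {
        "breakaway_span_pct": breakaway_span,
        "stuck_travel_pct": stuck_travel,
        "flow_jump": flow_jump,
        "installed_gain": installed_gain,
        "expected_swing": expected_swing,
    }


summary = breakaway_summary(47.0, 55.5, 47.1, 47.3, 54.9, 111.0, 130.0)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

The unrounded expected swing is 21.25 m³/h, which the text rounds to 21.3.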

Feed-Ratio Consequence

The solvent flow is paired with a second feed held near:

Q_B=96\ \text{m}^3/\text{h}

The target ratio is:

\displaystyle R_{target}=\frac{120}{96}=1.25

During the low and high parts of the cycle:

\displaystyle R_{low}=\frac{108}{96}=1.13
\displaystyle R_{high}=\frac{132}{96}=1.38

If the batch recipe allows 1.25\pm0.05, the acceptable range is 1.20 to 1.30. The cycling loop violates the ratio requirement at both extremes. The average flow may look acceptable over a long batch, but the process experiences repeated composition excursions.

This is why averaging the trend is a weak argument. A sticky valve can meet a totalized quantity while still failing a dynamic quality requirement.
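The ratio check is easy to apply to any flow value rather than only the two extremes. The sketch below assumes the second feed stays fixed at 96 m³/h and uses the 1.25 ± 0.05 band stated in this section; the function names are illustrative.

```python
def ratio_check(flows, paired_flow=96.0, target=1.25, tol=0.05):
    """Return (flow, ratio, within_band) for each solvent flow value in m^3/h."""
    results = []
    for q in flows:
        ratio = q / paired_flow
        # Small epsilon keeps values exactly on the band edge from failing on rounding.
        within = abs(ratio - target) <= tol + 1e-9
        results.append((q, round(ratio, 3), within))
    return results


def allowed_flow_band(paired_flow=96.0, target=1.25, tol=0.05):
    """Solvent flow window (m^3/h) that keeps the ratio inside the recipe band."""
    return (paired_flow * (target - tol), paired_flow * (target + tol))


for flow, ratio, ok in ratio_check([108.0, 120.0, 132.0]):
    print(f"flow {flow:.0f} m^3/h -> ratio {ratio:.3f} -> {'ok' if ok else 'out of band'}")
print("allowed flow window:", allowed_flow_band())
```

Under these assumptions the allowed solvent flow window is roughly 115.2 to 124.8 m³/h, so a swing of 108 to 132 m³/h leaves the band on both sides.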

Disturbance and Measurement Checks

Before blaming the valve, the team checks whether another disturbance is driving the loop:

| Check | Result | Interpretation |
| --- | --- | --- |
| Pump discharge pressure | stable within \pm 0.03\ \text{bar} | no strong pump cycling source |
| Upstream level | stable within \pm 2\% | no repeating static-head disturbance |
| Flow transmitter zero and span | within calibration tolerance | measurement bias is not the main cause |
| Flow signal noise | less than 0.8\ \text{m}^3/\text{h} rms | noise is much smaller than the 24\ \text{m}^3/\text{h} peak-to-peak cycle |
| Valve position feedback | flat while controller output ramps | actuator/positioner evidence supports stiction |
| Manual small output steps | no movement below about 7\% to 9\% change | deadband is repeatable |

The manual output-step result is important because it separates closed-loop symptoms from open-loop actuator behaviour. An oscillation caused by tuning stops when the loop is placed in manual, because the feedback path is open. Stiction remains visible as poor valve response to small commanded changes.

Tuning Trial

The team performs a controlled tuning trial after confirming that the operating condition is safe. Integral time is increased from 90\ \text{s} to 240\ \text{s} while proportional gain is held constant.

The cycle period changes from about 7.5\ \text{min} to about 15\ \text{min}, but the flow still moves in ramps and jumps with nearly the same peak-to-peak amplitude.

This result is consistent with stiction. Slower integral action reduces how quickly the controller output ramps far enough to break the valve free. It does not remove the mechanical deadband or friction. Detuning may reduce nuisance alarms in the short term, but it also makes the loop slower and leaves the actuator fault in service.
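A rough scaling argument shows why the period stretches while the amplitude does not. This is a sketch under simplifying assumptions, not a model of this loop: suppose the error e stays roughly constant while the valve is stuck, is expressed in percent of transmitter span (the span is not given in this case), and only integral action moves the output. Then the output ramp rate and the time spent stuck are approximately:

\displaystyle \frac{du}{dt}\approx\frac{K_c}{T_i}\,e \qquad\Rightarrow\qquad t_{stuck}\approx\frac{T_i\,\Delta u_{breakaway}}{K_c\,|e|}

The stuck time scales with T_i, while \Delta u_{breakaway} is set by the valve and does not depend on tuning. Under these assumptions, raising T_i from 90\ \text{s} to 240\ \text{s} would stretch the stuck phase by about:

\displaystyle \frac{240}{90}\approx 2.7

The observed period roughly doubled, which is the same direction and order of magnitude. Exact agreement is not expected, because the error is not constant during the ramp and the jump phase does not scale with T_i.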

Engineering Decision

The issue should not be closed out by retuning alone. The engineering decision is:

  1. keep the loop in automatic only under a temporary operating limit if quality risk is acceptable;
  2. schedule valve and positioner inspection before routine production use;
  3. inspect packing load, actuator linkage, positioner calibration, air supply, I/P converter behaviour, and stem travel;
  4. verify whether dither is enabled and appropriate for the valve;
  5. repeat the manual small-step test after maintenance;
  6. restore PI tuning only after valve response is acceptable.

The root cause is treated as an actuator reliability and maintenance problem, not as a control-algorithm problem.

RPN Screen

A simple risk-priority-number screen helps document the decision:

RPN=S \times O \times D

Before maintenance:

| Factor | Value | Rationale |
| --- | --- | --- |
| Severity S | 7 | Feed-ratio cycling can create off-spec batches and operator intervention. |
| Occurrence O | 5 | The limit cycle appears repeatedly in normal automatic operation. |
| Detection D | 4 | The trend is visible, but the root cause can be mistaken for tuning. |

Initial risk priority number:

RPN_{initial}=7(5)(4)=140

After positioner service, packing adjustment, air-supply filter replacement, and validation:

| Factor | Value | Rationale |
| --- | --- | --- |
| Severity S | 7 | The process consequence is unchanged if the fault returns. |
| Occurrence O | 2 | Small-step response and production trend show much lower recurrence likelihood. |
| Detection D | 2 | Travel feedback and acceptance checks make recurrence easier to identify. |

Contained risk priority number:

RPN_{contained}=7(2)(2)=28

The lower RPN is not a claim that all control risk is removed. It records that this specific actuator failure mode has been reduced and made easier to detect.
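The RPN screen is simple enough to script so that the before and after scores are recorded the same way each time. The sketch below assumes the common 1 to 10 scoring scale for each factor; the case itself does not state the scale, so adjust the bounds to the site's FMEA procedure.

```python
def rpn(severity, occurrence, detection, scale_max=10):
    """Risk priority number RPN = S x O x D, with a range check on each factor.

    scale_max=10 is an assumed scoring scale, not taken from the case data.
    """
    for name, value in (("severity", severity),
                        ("occurrence", occurrence),
                        ("detection", detection)):
        if not 1 <= value <= scale_max:
            raise ValueError(f"{name} must be between 1 and {scale_max}, got {value}")
    return severity * occurrence * detection


rpn_initial = rpn(7, 5, 4)
rpn_contained = rpn(7, 2, 2)
print(f"initial RPN: {rpn_initial}, contained RPN: {rpn_contained}")
```

Severity is deliberately left at 7 in both calls: maintenance changes how often the fault occurs and how easily it is detected, not what happens to the batch if it returns.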

Post-Intervention Validation

After maintenance, the acceptance test uses the same operating point and sample interval as the diagnostic trend.

| Metric | Before | After | Acceptance interpretation |
| --- | --- | --- | --- |
| Breakaway command span | 8.5\% | 0.7\% | acceptable for this service |
| Flow peak-to-peak cycle | 24\ \text{m}^3/\text{h} | 4\ \text{m}^3/\text{h} | within recipe ratio band |
| Valve travel response to 1\% output step | intermittent | repeatable | stiction reduced |
| Automatic-mode cycle period | 7.5\ \text{min} | no sustained cycle | limit cycle removed |
| Operator interventions per batch | 6 to 10 | 0 to 1 | practical improvement |

For one hour of steady operation, approximate the integral absolute error using the average absolute error. A triangular cycle with peak error 12\ \text{m}^3/\text{h} has mean absolute error about half the peak:

IAE_{before}\approx 6(60)=360\ (\text{m}^3/\text{h})\cdot\text{min}

After maintenance, the peak error is about 2\ \text{m}^3/\text{h}:

IAE_{after}\approx 1(60)=60\ (\text{m}^3/\text{h})\cdot\text{min}

The reduction is:

\displaystyle \frac{360-60}{360}(100\%)=83\%

This is a useful engineering metric because it connects trend improvement to controlled-variable performance, not only to visual smoothness.
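The hand estimate can be cross-checked by summing absolute error over sampled data. The sketch below generates an idealized triangular error at the 5 s trend interval and 7.5 min period from this case. The triangular shape is the same simplification used in the estimate, and applying it to the post-maintenance residual (which has no sustained cycle) is an assumption made only to reproduce the comparison. With real data, pass the logged error samples to `iae_minutes` instead.

```python
def triangular_error(k, peak, samples_per_cycle):
    """Idealized triangular error (m^3/h) at sample index k, swinging between +peak and -peak."""
    phase = (k % samples_per_cycle) / samples_per_cycle
    return peak * (4.0 * abs(phase - 0.5) - 1.0)


def iae_minutes(errors, dt_s):
    """Integral absolute error in (m^3/h)*min, using a rectangle sum over the samples."""
    return sum(abs(e) for e in errors) * dt_s / 60.0


DT_S = 5                                   # trend sampling interval, s
SAMPLES_PER_CYCLE = int(7.5 * 60 / DT_S)   # 90 samples per 7.5 min cycle
N_SAMPLES = 3600 // DT_S                   # one hour of data

before = [triangular_error(k, 12.0, SAMPLES_PER_CYCLE) for k in range(N_SAMPLES)]
after = [triangular_error(k, 2.0, SAMPLES_PER_CYCLE) for k in range(N_SAMPLES)]

iae_before = iae_minutes(before, DT_S)
iae_after = iae_minutes(after, DT_S)
reduction_pct = 100.0 * (iae_before - iae_after) / iae_before
print(f"IAE before: {iae_before:.1f}, after: {iae_after:.1f}, reduction: {reduction_pct:.1f} %")
```

The sampled sums land close to the hand values of 360 and 60 (m³/h)·min; the small difference comes from discretizing the waveform.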

Validation Evidence

A defensible closeout package should include:

| Evidence item | Why it matters |
| --- | --- |
| Before and after trends | Shows process variable, setpoint, controller output, and valve travel on the same time base. |
| Manual small-step test | Confirms actuator response independent of closed-loop tuning. |
| Positioner diagnostics | Documents air supply, travel calibration, friction, and any alarms. |
| Tuning record | Prevents hidden tuning changes from being confused with mechanical repair. |
| Recipe-ratio check | Shows whether the process requirement is actually met. |
| Maintenance work order | Links the observed control symptom to the physical correction. |
| Recurrence trigger | Defines when the issue must be reopened. |

The recurrence trigger should be quantitative. For example: reopen the action if valve breakaway exceeds 2\% command span, if flow oscillation exceeds 8\ \text{m}^3/\text{h} peak to peak for more than three cycles, or if operators must place the loop in manual during normal production.
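A quantitative trigger can be written as a small rule check, which removes ambiguity about when to reopen the action. The sketch below encodes the example thresholds from the preceding paragraph; the function and argument names are hypothetical, and how each input is measured (for example, how cycles are counted) is left to the site procedure.

```python
def reopen_reasons(breakaway_pct, flow_pp, cycles_over_limit, manual_in_production):
    """Return the list of recurrence triggers that have fired (empty list means none).

    breakaway_pct: measured breakaway command span, % of output.
    flow_pp: flow oscillation, m^3/h peak to peak.
    cycles_over_limit: consecutive cycles observed at that amplitude.
    manual_in_production: True if operators had to place the loop in manual.
    """
    reasons = []
    if breakaway_pct > 2.0:
        reasons.append("breakaway span above 2 % of command")
    if flow_pp > 8.0 and cycles_over_limit > 3:
        reasons.append("flow oscillation above 8 m^3/h peak to peak for more than three cycles")
    if manual_in_production:
        reasons.append("loop placed in manual during normal production")
    return reasons


# Post-maintenance values from the acceptance test: no trigger fires.
print(reopen_reasons(0.7, 4.0, 0, False))
# Pre-maintenance behaviour, with an assumed cycle count of 5: every trigger fires.
print(reopen_reasons(8.5, 24.0, 5, True))
```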

Engineering Lessons

The first lesson is that oscillation is a symptom, not a diagnosis. It can come from poor tuning, excessive dead time, sensor noise, disturbances, saturation, or actuator nonlinearity.

The second lesson is that controller output and valve travel must be trended together. A flow trace alone cannot distinguish poor PI tuning from a valve that ignores small command changes.

The third lesson is that detuning can hide stiction. A slower loop may look calmer while quality, responsiveness, and maintenance condition remain unacceptable.

The final lesson is that validation should use the same variables that supported the diagnosis. If the decision was based on controller output, valve travel, flow, and ratio error, the closeout should show those same variables after correction.
