Digital Receiver Clock Recovery Jitter Case Study

Telecommunications case study on a digital receiver that passed power and SNR checks but failed because of clock-recovery jitter. It covers sampling aperture loss, EVM degradation, FEC growth and timing diagnostics.

This case study follows a digital receiver that passed received-power and SNR checks but produced bursts of corrected errors and occasional packet loss. The channel was not too weak. The dominant failure was excessive recovered-clock jitter, which moved the sampling instant close to symbol transitions and closed the receiver timing aperture.

The case teaches a practical telecommunications lesson: a link can have enough signal energy and still fail because the receiver samples that energy at the wrong time.

Case Summary

| Item | Engineering relevance |
| --- | --- |
| Service | fixed digital wireless link carrying operational data and voice coordination |
| Waveform | 64-QAM screening mode at 40\ \text{Msymbol/s} |
| Initial symptom | FEC corrections and short packet-loss bursts under warm equipment conditions |
| Misleading evidence | received power and average SNR looked acceptable |
| Hidden weakness | recovered-clock jitter and deterministic wander exceeded the sampling aperture budget |
| Corrective action | clock reference cleanup, timing-recovery loop retuning and release thresholds for jitter/FEC |

The central engineering question was:

Is the link failing because the channel has insufficient SNR, or because the receiver timing recovery is not stable enough for the selected modulation?

The evidence pointed to timing recovery.

Baseline Data

Use the following simplified values.

| Parameter | Value |
| --- | --- |
| Symbol rate | 40\ \text{Msymbol/s} |
| Symbol period | 25\ \text{ns} |
| Selected mode during the incident | 64-QAM, rate 3/4 |
| Receiver input signal | -62.5\ \text{dBm} |
| Receiver noise bandwidth | 40\ \text{MHz} |
| Receiver noise figure | 6.0\ \text{dB} |
| Required detector SNR including mode margin | 26.0\ \text{dB} |
| Allowed timing aperture error | 5\% of one symbol period |
| Initial random timing jitter, RMS | 0.42\ \text{ns} |
| Initial deterministic timing wander, peak-to-peak | 1.20\ \text{ns} |
| Initial static timing bias | 0.25\ \text{ns} |
| Corrected random timing jitter, RMS | 0.16\ \text{ns} |
| Corrected deterministic timing wander, peak-to-peak | 0.35\ \text{ns} |
| Corrected static timing bias | 0.05\ \text{ns} |

The case uses a simple timing budget. A real receiver review should also include vendor timing specifications, clock phase-noise measurements, equalizer status, loop bandwidth, acquisition behavior, temperature, firmware version, oscillator reference, and the exact test pattern.

Step 1: Prove That SNR Is Not the Primary Limit

Receiver noise power is:

N_{dBm}=-174+10\log_{10}(B_{Hz})+NF

For:

B=40\times10^6\ \text{Hz}

the bandwidth term is:

10\log_{10}(40\times10^6)=76.02\ \text{dB}

With:

NF=6.0\ \text{dB}

the noise floor is:

N=-174+76.02+6.0=-91.98\ \text{dBm}

Available SNR:

SNR=P_r-N
SNR=-62.5-(-91.98)=29.48\ \text{dB}

Margin above the required detector SNR:

M_{SNR}=29.48-26.0=3.48\ \text{dB}

Engineering Comment

The SNR screen passes with about 3.5\ \text{dB} margin. That is not excessive, but it is enough to make a pure weak-signal diagnosis incomplete. If the team only checked received power and SNR, it would miss the timing failure.
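The Step 1 arithmetic can be reproduced with a short script. This is a minimal sketch: the function names are illustrative and not taken from any receiver toolchain, and the -174 dBm/Hz thermal density is the room-temperature approximation already used in the noise formula above.

```python
import math

# Thermal noise density at roughly 290 K, as used in the noise formula of Step 1.
THERMAL_NOISE_DENSITY_DBM_HZ = -174.0


def noise_floor_dbm(bandwidth_hz: float, noise_figure_db: float) -> float:
    """Receiver noise power in dBm for a given noise bandwidth and noise figure."""
    return THERMAL_NOISE_DENSITY_DBM_HZ + 10.0 * math.log10(bandwidth_hz) + noise_figure_db


def snr_and_margin_db(rx_power_dbm, bandwidth_hz, noise_figure_db, required_snr_db):
    """Return (available SNR, margin above the required SNR), both in dB."""
    snr = rx_power_dbm - noise_floor_dbm(bandwidth_hz, noise_figure_db)
    return snr, snr - required_snr_db


# Baseline values from the case: -62.5 dBm input, 40 MHz, NF 6.0 dB, 26.0 dB required.
snr, margin = snr_and_margin_db(-62.5, 40e6, 6.0, 26.0)
print(f"SNR = {snr:.2f} dB, margin = {margin:.2f} dB")
```

With the baseline values this reproduces the 29.48 dB SNR and 3.48 dB margin derived above.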

Step 2: Build the Sampling Aperture Budget

The symbol period is:

\displaystyle T_s=\frac{1}{R_s}
\displaystyle T_s=\frac{1}{40\times10^6}=25\ \text{ns}

Allowed timing aperture error:

T_{allow}=0.05T_s
T_{allow}=0.05(25)=1.25\ \text{ns}

Use a conservative timing estimate:

\displaystyle T_{err}=3\sigma_j+\frac{T_{wander,pp}}{2}+T_{bias}

where \sigma_j is RMS random jitter, T_{wander,pp} is deterministic peak-to-peak wander and T_{bias} is static timing offset.

Initial timing error:

\displaystyle T_{err,initial}=3(0.42)+\frac{1.20}{2}+0.25
T_{err,initial}=1.26+0.60+0.25=2.11\ \text{ns}

Compare with the allowed aperture:

2.11\ \text{ns}>1.25\ \text{ns}

Engineering Comment

The timing budget fails even though SNR passes. The receiver is trying to decide symbols with a sampling instant that can move too close to transitions. Higher-order QAM is sensitive to this because timing error becomes amplitude and phase error after filtering and equalization.
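The aperture budget in this step can be written as two small functions. This is a sketch of the screening estimate only, with illustrative function names; the 3-sigma multiplier and the 5% aperture fraction are the values chosen in this case, exposed here as parameters.

```python
def allowed_aperture_ns(symbol_rate_hz: float, fraction: float = 0.05) -> float:
    """Allowed timing error in ns as a fraction of one symbol period."""
    symbol_period_ns = 1e9 / symbol_rate_hz
    return fraction * symbol_period_ns


def timing_error_ns(sigma_rms_ns: float, wander_pp_ns: float, bias_ns: float,
                    k_sigma: float = 3.0) -> float:
    """Screening estimate: k*sigma random jitter + half the p-p wander + static bias."""
    return k_sigma * sigma_rms_ns + wander_pp_ns / 2.0 + bias_ns


t_allow = allowed_aperture_ns(40e6)                 # 5% of 25 ns
t_err_initial = timing_error_ns(0.42, 1.20, 0.25)   # initial measured values
initial_ok = t_err_initial <= t_allow

print(f"allowed = {t_allow:.2f} ns, initial error = {t_err_initial:.2f} ns, pass = {initial_ok}")
```

Keeping `k_sigma` and `fraction` as arguments makes it easy to see how sensitive the pass/fail result is to the screening assumptions.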

Step 3: Compare With Corrected Timing

After correcting the reference-clock configuration and retuning the timing-recovery loop, measured jitter values are:

\sigma_j=0.16\ \text{ns}
T_{wander,pp}=0.35\ \text{ns}
T_{bias}=0.05\ \text{ns}

Corrected timing error:

\displaystyle T_{err,corr}=3(0.16)+\frac{0.35}{2}+0.05
T_{err,corr}=0.48+0.175+0.05=0.705\ \text{ns}

Timing margin:

M_T=T_{allow}-T_{err,corr}
M_T=1.25-0.705=0.545\ \text{ns}

Engineering Comment

The corrected receiver has about 0.55\ \text{ns} of timing margin by this screening method. That result does not validate every possible traffic, temperature and interference case, but it shows that the main timing fault was removed.
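The same budget can be rearranged to ask how much RMS jitter is tolerable for a given wander and bias. The sketch below is only a rearrangement of the screening formula, not a vendor specification, and the function name is illustrative.

```python
def max_rms_jitter_ns(t_allow_ns: float, wander_pp_ns: float, bias_ns: float,
                      k_sigma: float = 3.0) -> float:
    """Largest RMS jitter satisfying k*sigma + wander/2 + bias <= t_allow."""
    headroom_ns = t_allow_ns - wander_pp_ns / 2.0 - bias_ns
    return max(headroom_ns, 0.0) / k_sigma


T_ALLOW_NS = 1.25  # 5% of the 25 ns symbol period

# Jitter ceiling with the initial and the corrected wander/bias values.
limit_initial = max_rms_jitter_ns(T_ALLOW_NS, 1.20, 0.25)
limit_corrected = max_rms_jitter_ns(T_ALLOW_NS, 0.35, 0.05)

print(f"jitter limit with initial wander/bias:   {limit_initial:.3f} ns RMS")
print(f"jitter limit with corrected wander/bias: {limit_corrected:.3f} ns RMS")
```

With the initial wander and bias the ceiling is about 0.13 ns RMS, so even the corrected 0.16 ns jitter would not have passed on its own. Under this budget, the wander and bias reductions were needed along with the jitter reduction.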

Step 4: Interpret EVM Evidence

Error vector magnitude can be used as a receiver health indicator. A common screening relation, with EVM_{rms} expressed as a fraction rather than a percentage, is:

SNR_{EVM,dB}\approx -20\log_{10}(EVM_{rms})

Before correction, the receiver reported:

EVM=8.2\%=0.082

Therefore:

SNR_{EVM}\approx -20\log_{10}(0.082)=21.72\ \text{dB}

After correction:

EVM=3.9\%=0.039
SNR_{EVM}\approx -20\log_{10}(0.039)=28.18\ \text{dB}

Engineering Comment

The EVM-derived value before correction (21.72\ \text{dB}) is about 7.8\ \text{dB} below the simple power-and-noise SNR (29.48\ \text{dB}) and also below the 26.0\ \text{dB} required detector SNR. That mismatch is important. It means the receiver impairment is not just additive thermal noise. Timing recovery, phase noise, equalization, nonlinear distortion or implementation effects are degrading the constellation.

After correction, the EVM-derived SNR (28.18\ \text{dB}) is within about 1.1\ \text{dB} of the power-based SNR (29.3\ \text{dB}). That agreement supports the clock-recovery diagnosis.
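The EVM comparison can be scripted as follows. This is a minimal sketch using the screening relation from this step; the function name is illustrative, and the power-based SNR figures are the rounded before/after values reported for this case.

```python
import math


def evm_to_snr_db(evm_fraction: float) -> float:
    """Screening estimate of SNR in dB from RMS EVM given as a fraction (0.082 = 8.2%)."""
    return -20.0 * math.log10(evm_fraction)


# (label, EVM as a fraction, power-based SNR in dB) from the case values.
cases = [("before", 0.082, 29.5), ("after", 0.039, 29.3)]

gaps = {}
for label, evm, power_snr_db in cases:
    evm_snr_db = evm_to_snr_db(evm)
    gaps[label] = power_snr_db - evm_snr_db
    print(f"{label}: EVM-derived SNR = {evm_snr_db:.2f} dB, "
          f"gap to power-based SNR = {gaps[label]:.2f} dB")
```

A large gap between the two SNR figures indicates impairments beyond additive thermal noise. A small gap indicates that the constellation quality is close to what the noise budget alone predicts.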

Step 5: Rule Out Competing Failure Modes

The investigation compared alternative causes.

| Candidate cause | Evidence | Result |
| --- | --- | --- |
| Weak received signal | Power and SNR budget | SNR passed with 3.48\ \text{dB} margin |
| OFDM delay spread | Channel impulse response and cyclic-prefix screen | delay spread remained inside the guard interval |
| RF intermodulation | Spectrum capture and strong-signal test | no in-band intermodulation product correlated with errors |
| Optical or cable impairment | Not in the affected wireless segment | not applicable to the failing hop |
| MCS policy alone | Forced lower mode reduced symptoms but did not remove jitter evidence | lower mode masked the fault |
| Receiver timing recovery | jitter logs, EVM, FEC bursts and correction test | supported |

Engineering Comment

Forcing a lower modulation mode was a useful temporary mitigation, but it was not the root fix. It increased decision margin enough to hide the timing problem. The permanent fix had to address the recovered clock and timing loop.

Step 6: Corrective Actions

The accepted corrective actions were:

  1. replace the noisy local reference with the approved disciplined reference;
  2. correct the receiver clock-tree configuration so the modem used the intended reference path;
  3. retune the timing-recovery loop bandwidth to track slow wander without passing excessive high-frequency noise;
  4. add monitoring for recovered-clock jitter, EVM, FEC correction rate and loss of lock;
  5. require a warm-soak validation test before re-enabling the higher-order mode.

The corrected release used the same RF path and antenna alignment. The fix was inside the receiver timing system.

Validation After Correction

The corrected receiver was tested under the same traffic load and warm equipment condition.

| Metric | Before correction | After correction | Acceptance |
| --- | --- | --- | --- |
| power-based SNR | 29.5\ \text{dB} | 29.3\ \text{dB} | both pass |
| timing error estimate | 2.11\ \text{ns} | 0.705\ \text{ns} | corrected passes |
| EVM | 8.2\% | 3.9\% | corrected passes |
| FEC corrected blocks per minute | high and bursty | near baseline | corrected passes |
| packet loss during warm soak | intermittent bursts | none observed in release window | corrected passes |
| clock recovery alarms | intermittent margin alarms | no alarms | corrected passes |

Engineering Comment

The most persuasive evidence is not a single improved number. It is the pattern: SNR stayed nearly unchanged while timing error, EVM, FEC counters and clock alarms improved together. That pattern matches a receiver timing fix.

Transferable Lessons

Digital receiver diagnosis should separate energy margin from decision margin. Received power and SNR answer whether enough signal energy reaches the receiver. Clock recovery, timing jitter, phase noise, EVM, equalization and FEC behavior answer whether the receiver can decide symbols reliably.

Practical release checks should include:

  1. received power and noise margin;
  2. EVM or constellation-quality evidence;
  3. recovered-clock jitter or timing-recovery status;
  4. FEC and packet-error trends under temperature and load;
  5. acquisition, loss-of-lock and recovery behavior;
  6. mode fallback criteria if timing margin degrades.
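The release checks above could be combined into a simple gate, sketched below. The SNR and timing limits come from this case; the EVM limit and the class and function names are hypothetical placeholders, since the case study does not state numeric EVM or FEC release thresholds. FEC and packet-loss trends are left out for the same reason.

```python
from dataclasses import dataclass


@dataclass
class LinkSnapshot:
    snr_db: float            # power-based SNR
    timing_error_ns: float   # 3*sigma + wander/2 + bias estimate
    evm_percent: float       # RMS EVM in percent
    clock_alarm_seen: bool   # any clock-recovery alarm during the test window


REQUIRED_SNR_DB = 26.0            # from the case baseline
ALLOWED_TIMING_ERROR_NS = 1.25    # 5% of the 25 ns symbol period
MAX_EVM_PERCENT = 5.0             # hypothetical limit, not stated in the case


def release_failures(s: LinkSnapshot) -> list[str]:
    """Return the names of failed checks; an empty list means the gate passes."""
    failures = []
    if s.snr_db < REQUIRED_SNR_DB:
        failures.append("snr")
    if s.timing_error_ns > ALLOWED_TIMING_ERROR_NS:
        failures.append("timing")
    if s.evm_percent > MAX_EVM_PERCENT:
        failures.append("evm")
    if s.clock_alarm_seen:
        failures.append("clock_alarm")
    return failures


before = LinkSnapshot(29.5, 2.11, 8.2, clock_alarm_seen=True)
after = LinkSnapshot(29.3, 0.705, 3.9, clock_alarm_seen=False)

print("before:", release_failures(before))
print("after: ", release_failures(after))
```

Returning the list of failed checks, rather than a single pass/fail flag, keeps the energy-margin result (SNR) visibly separate from the decision-margin results (timing, EVM, clock alarms).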

The engineering failure was not that the receiver lacked sensitivity. The failure was that the receiver did not preserve enough timing aperture for the selected modulation mode.
