Digital Receiver Clock Recovery Jitter Case Study
Telecommunications case study of a digital receiver that passed power and SNR checks but failed because of clock-recovery jitter. It covers sampling aperture loss, EVM degradation, growth in FEC corrections, and timing diagnostics.
This case study follows a digital receiver that passed received-power and SNR checks but produced bursts of corrected errors and occasional packet loss. The channel was not too weak. The dominant failure was excessive recovered-clock jitter, which moved the sampling instant close to symbol transitions and closed the receiver timing aperture.
The case teaches a practical telecommunications lesson: a link can have enough signal energy and still fail because the receiver samples that energy at the wrong time.
Case Summary
| Item | Detail |
|---|---|
| Service | fixed digital wireless link carrying operational data and voice coordination |
| Waveform | 64-QAM screening mode at 40\ \text{Msymbol/s} |
| Initial symptom | FEC corrections and short packet-loss bursts under warm equipment conditions |
| Misleading evidence | received power and average SNR looked acceptable |
| Hidden weakness | recovered-clock jitter and deterministic wander exceeded the sampling aperture budget |
| Corrective action | clock reference cleanup, timing-recovery loop retuning and release thresholds for jitter/FEC |
The central engineering question was:
Is the link failing because the channel has insufficient SNR, or because the receiver timing recovery is not stable enough for the selected modulation?
The evidence pointed to timing recovery.
Baseline Data
Use the following simplified values.
| Parameter | Value |
|---|---|
| Symbol rate | 40\ \text{Msymbol/s} |
| Symbol period | 25\ \text{ns} |
| Selected mode during the incident | 64-QAM, rate 3/4 |
| Receiver input signal | -62.5\ \text{dBm} |
| Receiver noise bandwidth | 40\ \text{MHz} |
| Receiver noise figure | 6.0\ \text{dB} |
| Required detector SNR including mode margin | 26.0\ \text{dB} |
| Allowed timing aperture error | 5\% of one symbol period |
| Initial random timing jitter, RMS | 0.42\ \text{ns} |
| Initial deterministic timing wander, peak-to-peak | 1.20\ \text{ns} |
| Initial static timing bias | 0.25\ \text{ns} |
| Corrected random timing jitter, RMS | 0.16\ \text{ns} |
| Corrected deterministic timing wander, peak-to-peak | 0.35\ \text{ns} |
| Corrected static timing bias | 0.05\ \text{ns} |
The case uses a simple timing budget. A real receiver review should also include vendor timing specifications, clock phase-noise measurements, equalizer status, loop bandwidth, acquisition behavior, temperature, firmware version, oscillator reference, and the exact test pattern.
Step 1: Prove That SNR Is Not the Primary Limit
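Receiver noise power, referred to the receiver input, is:
$$N_{\text{dBm}} = -174\ \text{dBm/Hz} + 10\log_{10}(B) + NF$$
where -174\ \text{dBm/Hz} is the thermal noise density at about 290 K, B is the noise bandwidth in Hz and NF is the receiver noise figure in dB.
For:
$$B = 40\ \text{MHz} = 40 \times 10^{6}\ \text{Hz}$$
the bandwidth term is:
$$10\log_{10}(40 \times 10^{6}) \approx 76.02\ \text{dB}$$
With:
$$NF = 6.0\ \text{dB}$$
the noise floor is:
$$N_{\text{dBm}} = -174 + 76.02 + 6.0 = -91.98\ \text{dBm}$$
Available SNR:
$$SNR = P_{rx} - N_{\text{dBm}} = -62.5 - (-91.98) = 29.48\ \text{dB}$$
Margin above the required detector SNR:
$$M_{SNR} = 29.48 - 26.0 = 3.48\ \text{dB}$$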
Engineering Comment
The SNR screen passes with about 3.5\ \text{dB} margin. That is not excessive, but it is enough to make a pure weak-signal diagnosis incomplete. If the team only checked received power and SNR, it would miss the timing failure.
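The Step 1 arithmetic can be reproduced with a short script. This is a minimal sketch, assuming the usual thermal noise density of about -174 dBm/Hz at 290 K; the function and constant names are illustrative and not taken from any receiver tool.

```python
import math

# Assumed reference value: thermal noise density kT at roughly 290 K.
THERMAL_NOISE_DBM_PER_HZ = -174.0


def noise_floor_dbm(bandwidth_hz: float, noise_figure_db: float) -> float:
    """Thermal noise floor referred to the receiver input, in dBm."""
    return THERMAL_NOISE_DBM_PER_HZ + 10.0 * math.log10(bandwidth_hz) + noise_figure_db


def snr_and_margin_db(rx_power_dbm: float, bandwidth_hz: float,
                      noise_figure_db: float, required_snr_db: float):
    """Return (available SNR, margin above the required detector SNR) in dB."""
    snr = rx_power_dbm - noise_floor_dbm(bandwidth_hz, noise_figure_db)
    return snr, snr - required_snr_db


# Baseline values from the case study table.
snr, margin = snr_and_margin_db(-62.5, 40e6, 6.0, 26.0)
print(f"SNR = {snr:.2f} dB, margin = {margin:.2f} dB")
# prints: SNR = 29.48 dB, margin = 3.48 dB
```

A positive margin here only says the link has enough energy; it says nothing about when that energy is sampled.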
Step 2: Build the Sampling Aperture Budget
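The symbol period is:
$$T_s = \frac{1}{R_s} = \frac{1}{40 \times 10^{6}\ \text{symbol/s}} = 25\ \text{ns}$$
Allowed timing aperture error:
$$T_{allow} = 0.05 \times T_s = 0.05 \times 25\ \text{ns} = 1.25\ \text{ns}$$
Use a conservative timing estimate:
$$T_{err} = 3\sigma_j + \frac{T_{wander,pp}}{2} + T_{bias}$$
where \sigma_j is RMS random jitter, T_{wander,pp} is deterministic peak-to-peak wander and T_{bias} is static timing offset.
Initial timing error:
$$T_{err,initial} = 3(0.42) + \frac{1.20}{2} + 0.25 = 1.26 + 0.60 + 0.25 = 2.11\ \text{ns}$$
Compare with the allowed aperture:
$$T_{err,initial} = 2.11\ \text{ns} > T_{allow} = 1.25\ \text{ns}$$
The initial estimate exceeds the aperture by 0.86\ \text{ns}.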
Engineering Comment
The timing budget fails even though SNR passes. The receiver is trying to decide symbols with a sampling instant that can move too close to transitions. Higher-order QAM is sensitive to this because timing error becomes amplitude and phase error after filtering and equalization.
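The aperture budget can be checked numerically. The sketch below assumes the conservative estimate is three times the RMS jitter plus half the peak-to-peak wander plus the static bias, a form that reproduces the 2.11 ns and 0.705 ns figures reported in the validation table. The function names and the `k_sigma` parameter are illustrative.

```python
def aperture_ns(symbol_rate_hz: float, fraction: float) -> float:
    """Allowed timing error as a fraction of the symbol period, in ns."""
    return fraction / symbol_rate_hz * 1e9


def timing_error_ns(sigma_j_ns: float, wander_pp_ns: float, bias_ns: float,
                    k_sigma: float = 3.0) -> float:
    """Conservative estimate: k*sigma random + half of p-p wander + static bias."""
    return k_sigma * sigma_j_ns + wander_pp_ns / 2.0 + bias_ns


allowed = aperture_ns(40e6, 0.05)            # 5% of a 25 ns symbol
initial = timing_error_ns(0.42, 1.20, 0.25)  # initial measurements from the table

print(f"allowed = {allowed:.2f} ns, initial = {initial:.2f} ns, "
      f"margin = {allowed - initial:.2f} ns")
# prints: allowed = 1.25 ns, initial = 2.11 ns, margin = -0.86 ns
```

A negative margin means the worst-case sampling instant falls outside the aperture even though the SNR screen passes.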
Step 3: Compare With Corrected Timing
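After correcting the reference-clock configuration and retuning the timing-recovery loop, measured jitter values are:
$$\sigma_j = 0.16\ \text{ns}, \quad T_{wander,pp} = 0.35\ \text{ns}, \quad T_{bias} = 0.05\ \text{ns}$$
Corrected timing error, using three times the RMS jitter plus half the peak-to-peak wander plus the static bias:
$$T_{err,corrected} = 3(0.16) + \frac{0.35}{2} + 0.05 = 0.48 + 0.175 + 0.05 = 0.705\ \text{ns}$$
Timing margin against the 1.25\ \text{ns} aperture:
$$M_{timing} = 1.25 - 0.705 = 0.545\ \text{ns}$$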
Engineering Comment
The corrected receiver has about 0.55\ \text{ns} of timing margin by this screening method. That result does not prove margin under every traffic, temperature and interference condition, but it shows that the main timing fault was removed.
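One way to read the corrected margin is to ask how much RMS jitter the budget could still tolerate. The sketch below inverts the screening estimate, assuming it has the form 3σ plus half the peak-to-peak wander plus the static bias. The helper name is hypothetical, and the resulting limits are derived here rather than quoted from the investigation.

```python
def max_rms_jitter_ns(aperture_ns: float, wander_pp_ns: float, bias_ns: float,
                      k_sigma: float = 3.0) -> float:
    """Largest RMS jitter that keeps k*sigma + wander/2 + bias inside the aperture."""
    return max(0.0, (aperture_ns - wander_pp_ns / 2.0 - bias_ns) / k_sigma)


APERTURE_NS = 1.25  # 5% of the 25 ns symbol period

corrected_limit = max_rms_jitter_ns(APERTURE_NS, 0.35, 0.05)  # corrected wander and bias
initial_limit = max_rms_jitter_ns(APERTURE_NS, 1.20, 0.25)    # initial wander and bias

print(f"jitter limit with corrected wander/bias: {corrected_limit:.3f} ns RMS")
print(f"jitter limit with initial wander/bias:   {initial_limit:.3f} ns RMS")
# prints 0.342 and 0.133 respectively
```

Under these assumptions the measured 0.16 ns RMS sits well inside the roughly 0.34 ns limit. With the initial wander and bias the limit would have been about 0.13 ns, so reducing random jitter alone to 0.16 ns would not have met the budget.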
Step 4: Interpret EVM Evidence
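Error vector magnitude can be used as a receiver health indicator. A common screening relation, with EVM expressed as an RMS fraction rather than a percentage, is:
$$SNR_{EVM} \approx -20\log_{10}(EVM)$$
Before correction, the receiver reported:
$$EVM = 8.2\% = 0.082$$
Therefore:
$$SNR_{EVM} \approx -20\log_{10}(0.082) \approx 21.7\ \text{dB}$$
After correction:
$$EVM = 3.9\% = 0.039, \quad SNR_{EVM} \approx -20\log_{10}(0.039) \approx 28.2\ \text{dB}$$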
Engineering Comment
The EVM-derived value before correction is much worse than the simple power-and-noise SNR. That mismatch is important. It means the receiver impairment is not just additive thermal noise. Timing recovery, phase noise, equalization, nonlinear distortion or implementation effects are degrading the constellation.
After correction, EVM-derived SNR and power-based SNR are much closer. That agreement supports the clock-recovery diagnosis.
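The EVM comparison can be sketched as follows, assuming the common screening relation SNR ≈ -20·log10(EVM) with EVM as an RMS fraction. The EVM and power-based SNR figures are the ones in the validation table; the function name is illustrative.

```python
import math


def evm_to_snr_db(evm_percent: float) -> float:
    """Approximate SNR implied by an RMS EVM reading given in percent."""
    return -20.0 * math.log10(evm_percent / 100.0)


# (label, reported EVM in percent, power-based SNR in dB)
readings = (("before", 8.2, 29.5), ("after", 3.9, 29.3))

gaps = {}
for label, evm, power_snr in readings:
    snr_evm = evm_to_snr_db(evm)
    gaps[label] = power_snr - snr_evm
    print(f"{label}: EVM-derived SNR = {snr_evm:.2f} dB, "
          f"gap to power-based SNR = {gaps[label]:.2f} dB")
# before: EVM-derived SNR = 21.72 dB, gap to power-based SNR = 7.78 dB
# after: EVM-derived SNR = 28.18 dB, gap to power-based SNR = 1.12 dB
```

A large gap between the two SNR figures suggests an impairment that is not additive thermal noise; the gap shrinking from several dB to about 1 dB is what the clock-recovery diagnosis predicts.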
Step 5: Rule Out Competing Failure Modes
The investigation compared alternative causes.
| Candidate cause | Evidence | Result |
|---|---|---|
| Weak received signal | Power and SNR budget | SNR passed with 3.48\ \text{dB} margin |
| OFDM delay spread | Channel impulse response and cyclic-prefix screen | delay spread remained inside the guard interval |
| RF intermodulation | Spectrum capture and strong-signal test | no in-band intermodulation product correlated with errors |
| Optical or cable impairment | Not in the affected wireless segment | not applicable to the failing hop |
| MCS policy alone | Forced lower mode reduced symptoms but did not remove jitter evidence | lower mode masked the fault |
| Receiver timing recovery | jitter logs, EVM, FEC bursts and correction test | supported |
Engineering Comment
Forcing a lower modulation mode was a useful temporary mitigation, but it was not the root fix. It increased decision margin enough to hide the timing problem. The permanent fix had to address the recovered clock and timing loop.
Step 6: Corrective Actions
The accepted corrective actions were:
- replace the noisy local reference with the approved disciplined reference;
- correct the receiver clock-tree configuration so the modem used the intended reference path;
- retune the timing-recovery loop bandwidth to track slow wander without passing excessive high-frequency noise;
- add monitoring for recovered-clock jitter, EVM, FEC correction rate and loss of lock;
- require a warm-soak validation test before re-enabling the higher-order mode.
The corrected release used the same RF path and antenna alignment. The fix was inside the receiver timing system.
Validation After Correction
The corrected receiver was tested under the same traffic load and warm equipment condition.
| Metric | Before correction | After correction | Acceptance |
|---|---|---|---|
| power-based SNR | 29.5\ \text{dB} | 29.3\ \text{dB} | both pass |
| timing error estimate | 2.11\ \text{ns} | 0.705\ \text{ns} | corrected passes |
| EVM | 8.2\% | 3.9\% | corrected passes |
| FEC corrected blocks per minute | high and bursty | near baseline | corrected passes |
| packet loss during warm soak | intermittent bursts | none observed in release window | corrected passes |
| clock recovery alarms | intermittent margin alarms | no alarms | corrected passes |
Engineering Comment
The most persuasive evidence is not a single improved number. It is the pattern: SNR stayed nearly unchanged while timing error, EVM, FEC counters and clock alarms improved together. That pattern matches a receiver timing fix.
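The before-and-after pattern can be expressed as a small screening routine. This is a sketch under stated assumptions: the SNR and aperture limits come from the baseline data, while the EVM check reuses the 26 dB required detector SNR through the -20·log10(EVM) relation, which is an assumption of this example rather than a threshold stated in the case. The `Snapshot` class and `screen` function are hypothetical names.

```python
import math
from dataclasses import dataclass

REQUIRED_SNR_DB = 26.0  # required detector SNR from the baseline table
APERTURE_NS = 1.25      # 5% of the 25 ns symbol period


@dataclass
class Snapshot:
    label: str
    power_snr_db: float
    timing_error_ns: float
    evm_percent: float


def screen(s: Snapshot) -> dict:
    """Separate the energy check (SNR) from the decision checks (timing, EVM)."""
    evm_snr_db = -20.0 * math.log10(s.evm_percent / 100.0)
    return {
        "snr": s.power_snr_db >= REQUIRED_SNR_DB,
        "timing": s.timing_error_ns <= APERTURE_NS,
        # Assumption: EVM passes when its implied SNR meets the required SNR.
        "evm": evm_snr_db >= REQUIRED_SNR_DB,
    }


before = Snapshot("before", 29.5, 2.11, 8.2)
after = Snapshot("after", 29.3, 0.705, 3.9)

for snap in (before, after):
    result = screen(snap)
    note = ""
    if result["snr"] and not (result["timing"] and result["evm"]):
        note = " -> energy margin passes, decision margin fails"
    print(snap.label, result, note)
```

With the table values, the "before" snapshot passes only the SNR check, while the "after" snapshot passes all three. FEC and packet-loss counters are left out because the case reports them qualitatively.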
Transferable Lessons
Digital receiver diagnosis should separate energy margin from decision margin. Received power and SNR answer whether enough signal energy reaches the receiver. Clock recovery, timing jitter, phase noise, EVM, equalization and FEC behavior answer whether the receiver can decide symbols reliably.
Practical release checks should include:
- received power and noise margin;
- EVM or constellation-quality evidence;
- recovered-clock jitter or timing-recovery status;
- FEC and packet-error trends under temperature and load;
- acquisition, loss-of-lock and recovery behavior;
- mode fallback criteria if timing margin degrades.
The engineering failure was not that the receiver lacked sensitivity. The failure was that the receiver did not preserve enough timing aperture for the selected modulation mode.