Glossary term

Preemption Latency

Engineering definition of preemption latency covering ready-to-run delay, scheduler dispatch, non-preemptive sections, wakeup latency, deadline margin and validation evidence.

Branch: Computer Engineering
Glossary type: metric
Content: Glossary term
Updated: Jun 26, 2026
Revision: v1.0.0 · reviewed

Definition

Preemption latency is the delay between a task becoming eligible to run and the scheduler actually switching execution to that task when preemption should be allowed.

Preemption latency matters in real-time operating systems, embedded firmware, control software, service schedulers and latency-sensitive applications because it limits how quickly important work can displace lower-importance work. A useful analysis states the ready event, priority policy, longest preemption-disabled or non-preemptive interval, scheduler dispatch cost, context-switch cost, same-priority queueing rule, measured worst case and validation evidence.

Preemption latency is a scheduler and system-integration metric, not just a CPU-speed metric.

The term appears in real-time operating systems, embedded firmware, control software, industrial gateways, robotics, medical devices, avionics and latency-sensitive services. A task can have a short execution time and still miss its response objective if it waits too long after becoming ready.

Ready-to-Run Boundary

Let a task become ready at:

T_{ready}

and start running at:

T_{run}

The preemption latency is:

T_{preempt}=T_{run}-T_{ready}

The ready event should be stated. It may be an interrupt unblocking a task, a timer release, a message arrival, a semaphore post, a condition-variable signal or a higher-priority task becoming runnable.
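
The boundary above can be sketched as a small record that keeps the ready event next to the two timestamps, so the measurement always states what made the task ready. This is a minimal illustration, not a tracing API: the `WakeupSample` type, its field names and the microsecond unit are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WakeupSample:
    """One ready-to-run observation (hypothetical record layout)."""
    ready_event: str   # e.g. "timer release", "semaphore post"
    t_ready_us: int    # timestamp when the task became eligible to run
    t_run_us: int      # timestamp when the task actually started running


def preemption_latency_us(sample: WakeupSample) -> int:
    """Return T_preempt = T_run - T_ready in microseconds."""
    if sample.t_run_us < sample.t_ready_us:
        raise ValueError("run timestamp precedes ready timestamp")
    return sample.t_run_us - sample.t_ready_us


sample = WakeupSample("timer release", t_ready_us=1_000_000, t_run_us=1_000_350)
print(sample.ready_event, preemption_latency_us(sample))  # timer release 350
```

Both timestamps are assumed to come from the same clock; mixing clock sources would make the difference meaningless.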

Latency Components

A useful first budget is:

T_{preempt}=T_{np}+T_{sched}+T_{ctx}+T_{runq}+T_{warm}

where T_{np} is remaining non-preemptive or preemption-disabled time, T_{sched} is scheduler decision time, T_{ctx} is context-switch overhead, T_{runq} is queueing behind eligible peers and T_{warm} covers cache, memory or pipeline warm-up effects before useful work starts.

The terms are implementation-dependent. Some real-time kernels bound them tightly. General-purpose operating systems may have much larger tails under I/O, interrupts, power management, page faults or background services.
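
As a rough sketch of the budget above, the components can be summed and the largest one reported, since the dominant term is usually where the engineering effort should go. The function name, the dictionary keys and the microsecond values are illustrative assumptions.

```python
def preemption_budget_us(t_np, t_sched, t_ctx, t_runq=0, t_warm=0):
    """Sum the first-budget terms and name the largest contributor.

    All arguments are in microseconds. Returns (total, dominant_term).
    """
    components = {
        "T_np": t_np,        # remaining non-preemptive time
        "T_sched": t_sched,  # scheduler decision time
        "T_ctx": t_ctx,      # context-switch overhead
        "T_runq": t_runq,    # queueing behind eligible peers
        "T_warm": t_warm,    # cache, memory or pipeline warm-up
    }
    total = sum(components.values())
    dominant = max(components, key=components.get)
    return total, dominant


# Illustrative values only, not measurements from any real system.
print(preemption_budget_us(2400, 180, 300, t_runq=0, t_warm=120))  # (3000, 'T_np')
```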

Non-Preemptive Sections

If preemption is disabled for a maximum interval:

T_{np,max}

then any high-priority task released during that interval may wait up to that long before the scheduler can dispatch it. The worst-case latency used in a deadline screen must therefore satisfy:

T_{preempt,max}\geq T_{np,max}+T_{sched,max}+T_{ctx,max}

Long critical sections, disabled interrupts, scheduler locks, flash operations, driver polling and blocking logging can all increase the bound. Reducing task execution time does not solve a missed response if the controlling term is non-preemptive time.
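
One way to obtain the maximum non-preemptive interval is to scan a trace of preemption disable and enable events. The sketch below assumes a hypothetical trace format of `(timestamp_us, kind)` tuples sorted by time, with `kind` being `"disable"` or `"enable"`; real tracers use their own event names and record layouts. Nested disables are treated as one region that ends when the outermost enable arrives.

```python
def max_non_preemptive_us(events):
    """Return the longest preemption-disabled interval in a trace.

    events: iterable of (timestamp_us, kind), sorted by timestamp,
    where kind is "disable" or "enable" (hypothetical format).
    A region still open at the end of the trace is ignored.
    """
    depth = 0      # current nesting depth of disable calls
    start = None   # timestamp of the outermost disable
    longest = 0
    for t_us, kind in events:
        if kind == "disable":
            if depth == 0:
                start = t_us
            depth += 1
        elif kind == "enable":
            if depth == 0:
                raise ValueError("enable without matching disable")
            depth -= 1
            if depth == 0:
                longest = max(longest, t_us - start)
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return longest


trace = [
    (0, "disable"), (100, "disable"), (150, "enable"), (400, "enable"),
    (1000, "disable"), (1200, "enable"),
]
print(max_non_preemptive_us(trace))  # 400
```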

Same-Priority Queueing

For equal-priority round-robin tasks with time slice:

q

and:

N

runnable tasks at the same priority, a simple worst-case wait before a task gets another slice, assuming every peer consumes its full slice and no higher-priority work intervenes, is:

T_{rr,max}=(N-1)q

This is not priority preemption, but it is often measured as wakeup or ready-to-run latency for interactive work. It explains why average CPU utilization can look acceptable while a particular class of work waits too long.
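
The round-robin bound is easy to evaluate for a given configuration. The sketch below is only the formula above in code form; the function name and the example values (five peers, a 10 ms slice) are assumptions chosen for illustration.

```python
def rr_max_wait_us(n_runnable: int, slice_us: int) -> int:
    """Worst-case wait (N - 1) * q before a task gets its next slice.

    Assumes every peer uses its full slice and nothing of higher
    priority runs in between.
    """
    if n_runnable < 1:
        raise ValueError("need at least one runnable task")
    if slice_us < 0:
        raise ValueError("time slice must be non-negative")
    return (n_runnable - 1) * slice_us


print(rr_max_wait_us(5, 10_000))  # 40000
```

With these example numbers a task can wait 40 ms even though no single peer runs longer than 10 ms at a time.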

Deadline Screen

If a released task then needs execution time:

C_i

and has response deadline:

D_i

a first response screen is:

R_i=T_{preempt}+C_i

with margin:

M_i=D_i-R_i

The task passes only when:

M_i\geq0

Shared-resource blocking, interrupt interference and release jitter should be added when they apply.
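
The screen can be sketched as a function that returns the response time, the margin and the pass flag together. The `extra_terms_us` parameter is an assumption added here as a simple way to include blocking, interference or release jitter when they apply; it is not a full response-time analysis.

```python
def deadline_screen(t_preempt_us, c_us, d_us, extra_terms_us=()):
    """First response screen: R = T_preempt + C (+ extras), M = D - R.

    extra_terms_us: optional additional delays in microseconds, such as
    shared-resource blocking, interrupt interference or release jitter.
    Returns (response_us, margin_us, passes).
    """
    response_us = t_preempt_us + c_us + sum(extra_terms_us)
    margin_us = d_us - response_us
    return response_us, margin_us, margin_us >= 0


# Illustrative values only.
print(deadline_screen(300, 500, 1000))                          # (800, 200, True)
print(deadline_screen(300, 500, 1000, extra_terms_us=(250,)))   # (1050, -50, False)
```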

Worked Example

A control task must start useful computation within:

W_{max}=2.5\ \text{ms}

after a timer release. A stressed trace shows remaining non-preemptive time:

T_{np}=2.40\ \text{ms}

scheduler dispatch:

T_{sched}=0.18\ \text{ms}

and context plus cache warm-up:

T_{ctx}+T_{warm}=0.42\ \text{ms}

The preemption latency is:

T_{preempt}=2.40+0.18+0.42=3.00\ \text{ms}

The start margin is:

M_W=2.50-3.00=-0.50\ \text{ms}

The target fails. After splitting the non-preemptive region, the maximum observed non-preemptive time falls to:

T_{np,new}=1.20\ \text{ms}

The new latency is:

T_{preempt,new}=1.20+0.18+0.42=1.80\ \text{ms}

and:

M_{W,new}=2.50-1.80=0.70\ \text{ms}

The screen now passes for the measured stressed case. Because that result holds only while the non-preemptive region stays short, each software release still needs a regression gate on maximum disabled-preemption time.
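
The arithmetic in this example can be checked in integer microseconds, which avoids floating-point rounding. The sketch also derives a candidate gate limit for non-preemptive time, under the assumption that dispatch and context plus warm-up costs stay at the measured 0.18 ms and 0.42 ms; the function names are hypothetical.

```python
W_MAX_US = 2500        # required start window: 2.5 ms
T_SCHED_US = 180       # measured scheduler dispatch
T_CTX_WARM_US = 420    # measured context switch plus cache warm-up


def start_margin_us(t_np_us: int) -> int:
    """M_W = W_max - (T_np + T_sched + T_ctx + T_warm)."""
    return W_MAX_US - (t_np_us + T_SCHED_US + T_CTX_WARM_US)


def np_gate_limit_us() -> int:
    """Largest T_np that still leaves a non-negative start margin."""
    return W_MAX_US - T_SCHED_US - T_CTX_WARM_US


def np_gate_passes(measured_np_max_us: int) -> bool:
    """Regression gate on maximum disabled-preemption time."""
    return measured_np_max_us <= np_gate_limit_us()


print(start_margin_us(2400))   # -500  (original trace fails)
print(start_margin_us(1200))   # 700   (after splitting the region)
print(np_gate_limit_us())      # 1900
```

Under these assumptions, any build whose longest non-preemptive interval exceeds 1.90 ms would fail the start window again, so a practical gate would sit at or below that value.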

Validation Evidence

Useful evidence includes ready timestamp, run timestamp, priority, scheduler class, run-queue length, context-switch trace, preemption-disabled interval, interrupt-disabled interval, lock owner, CPU affinity, cache or memory pressure, p95 and p99 wakeup latency, maximum observed latency and test workload.

Validation should include the background load that matters: communication bursts, logging, diagnostics, DMA, storage, garbage collection, lower-priority CPU work, equal-priority peer load and fault-recovery paths. A quiet-system wakeup measurement is not enough for a real-time claim.
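
Tail statistics such as p95, p99 and the maximum can be computed from recorded latency samples. The sketch below uses the nearest-rank percentile definition, which is one of several common conventions and is an assumption here; the function names are illustrative.

```python
def nearest_rank(sorted_samples, pct: int):
    """Nearest-rank percentile for an integer pct in 1..100."""
    n = len(sorted_samples)
    rank = (pct * n + 99) // 100   # integer ceil(pct * n / 100)
    return sorted_samples[max(rank, 1) - 1]


def latency_summary(samples_us):
    """Return p95, p99 and maximum of wakeup latency samples."""
    if not samples_us:
        raise ValueError("no samples recorded")
    ordered = sorted(samples_us)
    return {
        "p95": nearest_rank(ordered, 95),
        "p99": nearest_rank(ordered, 99),
        "max": ordered[-1],
    }


# Synthetic samples 1..100 us, only to show the mechanics.
print(latency_summary(list(range(1, 101))))  # {'p95': 95, 'p99': 99, 'max': 100}
```

The maximum is reported alongside the percentiles because a real-time claim depends on the worst observed case, not only on the tail shape.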

Relationship To Neighbor Terms

Interrupt latency measures the delay from interrupt request to ISR response. Preemption latency measures delay from task-ready to task-running. Jitter is variation in timing, while preemption latency is one source of that variation. Priority inversion, lock convoys and long critical sections can increase preemption latency. Task starvation is a more severe progress failure in which eligible work remains unserved beyond its bound.

In fixed-priority response-time analysis, preemption latency is often folded into blocking, release jitter or scheduler overhead. For engineering validation, it is still useful to measure it directly because it points to kernel, driver and workload causes.

Common Mistakes

The most common mistake is assuming a preemptive scheduler means immediate execution. Another is measuring from ISR entry instead of from the event that made the task ready. A third is reporting average wakeup latency while ignoring p99 or maximum values. A fourth is tuning task priority while leaving long non-preemptive sections unchanged.

A strong preemption-latency review states the ready event, scheduler policy, maximum non-preemptive time, dispatch overhead, queueing rule, measured worst case, deadline margin and regression gate.
