Glossary term

CPU Affinity

Engineering definition of CPU affinity covering processor pinning, scheduler migration, cache locality, latency margin, core overload and validation evidence.

Definition

CPU affinity is a scheduling constraint that limits where a task, thread, process or interrupt is allowed to run across available processor cores.

CPU affinity appears in operating systems, embedded Linux devices, real-time gateways, high-throughput services and performance tests when engineers pin work to cores to reduce migration, preserve cache locality, isolate critical paths or control interference. A useful review states the affinity mask, scheduler class, core budget, interrupt placement, migration rate, cache effect, overload risk, latency margin and validation evidence.

The constraint is usually expressed as an affinity mask: the work may run only on the cores included in the mask.

Affinity is used to reduce scheduler migration, preserve cache locality, isolate latency-sensitive paths, keep interrupt handling close to a device, or make a performance test repeatable. It can also create overload if too much work is pinned to the same core.

Affinity Mask

For a task i, define an allowed-core set:

A_i=\{c_1,c_2,\ldots,c_m\}

If the set contains one core, the task is pinned. If the set contains several cores, the scheduler may still migrate the task inside that set. A useful review records the mask for tasks, worker pools, interrupt handlers and background services, because one forgotten logging or network thread can invalidate a timing test.
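A minimal Python sketch of the allowed-core set, assuming a platform where os.sched_setaffinity is available (it is on Linux and absent on some other systems). The helper names cores_to_bitmask, is_pinned and apply_affinity are illustrative and not part of any library.

```python
import os


def cores_to_bitmask(cores):
    """Encode the allowed-core set A_i as an integer with bit c set for core c."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core indices must be non-negative")
        mask |= 1 << core
    return mask


def is_pinned(cores):
    """A task is pinned when its allowed-core set contains exactly one core."""
    return len(set(cores)) == 1


def apply_affinity(cores, pid=0):
    """Restrict pid (0 means the caller) to the given cores and read the mask back.

    Returns None on platforms where os.sched_setaffinity does not exist.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(pid, set(cores))
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = {0, 2}  # hypothetical mask: cores 0 and 2
    print(hex(cores_to_bitmask(allowed)), "pinned" if is_pinned(allowed) else "may migrate")
```

Reading the mask back after setting it is deliberate: the recorded value is then what the kernel accepted, not what the configuration intended.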

Migration Cost

Let the observed number of migrations be:

n_{mig}

over observation time:

T_{obs}

The migration rate is:

\displaystyle f_{mig}=\frac{n_{mig}}{T_{obs}}

If each migration has direct and locality cost:

T_{mig}

then a first CPU-cost screen is:

U_{mig}=f_{mig}T_{mig}

Treat this as a lower bound: cache misses, NUMA traffic, lock handoff delay and device-interrupt detours triggered by a migration add cost that a single per-migration figure tends to undercount.

Worked Migration Screen

Suppose a latency-sensitive service records:

n_{mig}=1800

migrations over:

T_{obs}=60\ s

The migration rate is:

f_{mig}=1800/60=30\ migrations/s

If a migration costs:

T_{mig}=35\ \mu s

the direct CPU fraction is:

U_{mig}=30(35\times10^{-6})=0.00105

or:

0.105\%

of one core. The CPU cost is small, but the latency effect can still be large if the migration occurs just before a deadline-sensitive path.
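The same screen can be written as a short Python sketch. The function names are illustrative, and the inputs are the figures from the worked example above.

```python
def migration_rate(n_mig, t_obs_s):
    """f_mig = n_mig / T_obs, in migrations per second."""
    if t_obs_s <= 0:
        raise ValueError("observation time must be positive")
    return n_mig / t_obs_s


def migration_cpu_fraction(n_mig, t_obs_s, t_mig_s):
    """U_mig = f_mig * T_mig, as a fraction of one core."""
    return migration_rate(n_mig, t_obs_s) * t_mig_s


if __name__ == "__main__":
    rate = migration_rate(1800, 60.0)
    fraction = migration_cpu_fraction(1800, 60.0, 35e-6)
    print(f"{rate:.0f} migrations/s, {fraction:.3%} of one core")
```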

Latency Margin

If a response path has base time:

R_{base}=1.60\ ms

and migration plus cache warm-up add:

T_{mig}=0.04\ ms
T_{cache}=0.45\ ms

then:

R=1.60+0.04+0.45=2.09\ ms

For a deadline:

D=2.50\ ms

the margin is:

M=2.50-2.09=0.41\ ms

If pinning reduces cache warm-up to 0.12 ms, the response becomes 1.60 + 0.04 + 0.12 = 1.76 ms and the margin becomes 2.50 - 1.76 = 0.74 ms.
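A small Python sketch of the margin calculation, comparing the unpinned and pinned cases with the numbers used in this section. The function names are illustrative only.

```python
def response_time_ms(r_base_ms, t_mig_ms, t_cache_ms):
    """R = R_base + T_mig + T_cache, all in milliseconds."""
    return r_base_ms + t_mig_ms + t_cache_ms


def latency_margin_ms(deadline_ms, response_ms):
    """M = D - R; a negative value means the deadline is missed."""
    return deadline_ms - response_ms


if __name__ == "__main__":
    deadline = 2.50
    for label, t_cache in (("unpinned", 0.45), ("pinned", 0.12)):
        r = response_time_ms(1.60, 0.04, t_cache)
        m = latency_margin_ms(deadline, r)
        print(f"{label}: R = {r:.2f} ms, margin = {m:.2f} ms")
```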

Interference Boundary

The affinity decision should state what interference is being controlled. A compute thread may need to stay close to its cache footprint. A data-acquisition thread may need to share a core with the interrupt that wakes it. A safety monitor may need distance from best-effort logging, compression or network processing.

The boundary also includes what remains movable. If background work can still land on the same core, the affinity mask alone does not prove isolation. If interrupts can migrate away from the pinned thread, locality may improve in one test and disappear after reboot or driver reconfiguration.

Core Budget

Affinity also concentrates load. For a pinned core k, the utilization estimate is:

\displaystyle U_k=\sum_i \frac{C_i}{T_i}+U_{irq}+U_{os}

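where C_i is the execution time task i needs in each period T_i, the sum runs over the tasks pinned to core k, and U_irq and U_os cover interrupts, kernel work and unavoidable background activity on that core.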

If:

U_k=0.62+0.18+0.12=0.92

and the engineering budget is:

U_{budget}=0.85

the pinning plan is unsafe even if it improves locality, because 0.92 exceeds the budget by 0.07. The design has traded migration jitter for core saturation.
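The budget check can be sketched in Python as follows. The two (C_i, T_i) pairs are hypothetical values chosen only so that the task term sums to the 0.62 used above; the function names are likewise illustrative.

```python
def core_utilization(tasks, u_irq, u_os):
    """U_k = sum(C_i / T_i) + U_irq + U_os for the tasks pinned to one core.

    tasks is an iterable of (C_i, T_i) pairs expressed in the same time unit.
    """
    return sum(c / t for c, t in tasks) + u_irq + u_os


def within_budget(u_k, u_budget):
    """True when the pinned core stays at or below the engineering budget."""
    return u_k <= u_budget


if __name__ == "__main__":
    # Hypothetical tasks: 3.1 ms every 10 ms and 6.2 ms every 20 ms (0.31 each).
    pinned_tasks = [(3.1, 10.0), (6.2, 20.0)]
    u_k = core_utilization(pinned_tasks, u_irq=0.18, u_os=0.12)
    verdict = "ok" if within_budget(u_k, 0.85) else "over budget"
    print(f"U_k = {u_k:.2f}: {verdict}")
```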

Validation Evidence

Useful evidence includes affinity masks, actual CPU residency, migration count, scheduler class, priority, interrupt routing, run-queue length per core, context-switch trace, cache-miss counters, NUMA locality, p95 and p99 wakeup latency, maximum ready-to-running delay and overload behavior with background services enabled.

The validation test should compare unpinned, partially pinned and fully pinned configurations under representative load. The result should report both latency tails and per-core utilization, not just a lower average runtime.
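Part of this evidence can be captured from the process itself. The sketch below assumes a Linux system, where /proc/<pid>/status reports Cpus_allowed_list and the voluntary and nonvoluntary context-switch counters. It does not report migration counts, which need a scheduler trace or a similar source. The function names are illustrative.

```python
WANTED_FIELDS = (
    "Cpus_allowed_list",
    "voluntary_ctxt_switches",
    "nonvoluntary_ctxt_switches",
)


def parse_proc_status(text):
    """Extract the affinity list and context-switch counters from status text."""
    found = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in WANTED_FIELDS:
            found[key] = value.strip()
    return found


def read_status(pid="self"):
    """Read /proc/<pid>/status (Linux only) and return the selected fields."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_proc_status(handle.read())


if __name__ == "__main__":
    try:
        print(read_status())
    except OSError:
        print("no /proc status file on this platform")
```

Taking this snapshot before and after each test run, for every configuration being compared, keeps the recorded mask and the context-switch deltas tied to the same measurement window as the latency figures.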

Design Levers

Useful levers include pinning only the critical thread, keeping related interrupt handlers on the same core, isolating a real-time core, limiting worker admission, reducing shared locks, separating noisy background services, and keeping enough spare capacity for operating-system work.

Pinning every thread is rarely a good default. Strong affinity can reduce migration while hiding load imbalance, starving lower-priority work or making failover harder after a core becomes unavailable.

Relationship To Neighbor Terms

Context switch overhead measures part of the cost paid when work changes execution context. CPU affinity controls where that work may run. Preemption latency and release jitter can improve when migrations are removed, but they can worsen if pinned work overloads one core. Thread-pool saturation, lock convoy and task starvation can all be masked or amplified by affinity choices.
