Formula sheet
Algorithmic Performance Formula Sheet
Core formulas and checks for algorithmic performance: Big O growth, loops, memory footprint, amortized cost, queueing, throughput, latency percentiles, and reliability.
This formula sheet collects common estimates used when reviewing algorithms, data structures, and computer-system performance. The equations are first-pass checks, not substitutes for profiling. Use them to compare growth rates, find capacity limits, set benchmark expectations, and identify workloads that need more detailed measurement.
Always state what n means. It may be number of elements, bytes, vertices, edges, requests, records, samples, tasks, or states. Also state whether the estimate is best case, average case, expected case, amortized case, or worst case.
Common growth classes
Constant time:
Logarithmic time:
Linear time:
Linearithmic time:
Quadratic time:
Exponential time:
Factorial time:
Asymptotic notation suppresses constants and lower-order terms. For engineering decisions, combine growth class with measured constants, memory traffic, allocation behaviour, and latency requirements.
Loop operation counts
A single loop over n items is usually:
so:
A nested loop over all pairs is usually:
so:
For triangular nested loops:
so the growth is still:
For a loop that repeatedly halves the remaining work:
so the growth is:
Recurrences
Binary search can be written as:
which gives:
Merge sort is commonly written as:
which gives:
A simple exhaustive subset search often has:
which gives:
Recurrences are estimates of structure, not performance proof. Constant factors, allocation, recursion depth, and cache behaviour still need review.
Data-structure operation estimates
Array index access:
Array scan:
Insertion or deletion in the middle of a contiguous array:
Balanced binary search tree lookup, insertion, or deletion:
Unbalanced binary search tree worst-case lookup, insertion, or deletion:
Hash table expected lookup, insertion, or deletion:
Hash table worst-case lookup under severe collision:
These are model values. Real systems should also check memory locality, allocation, resizing, synchronization, ordering requirements, and adversarial input behaviour.
Graph storage and traversal
For a graph with V vertices and E edges, adjacency-list storage is:
Adjacency-matrix storage is:
Breadth-first search or depth-first search with adjacency lists:
Checking whether an edge exists in an adjacency matrix:
Sparse graphs often satisfy:
Dense graphs approach:
The representation should match graph density, update frequency, memory limits, and the operations that dominate the workload.
Memory footprint
Approximate memory for an array of n fixed-size elements:
where S_e is element size and S_{header} is container overhead.
Approximate memory for a pointer-based node structure:
where p is number of pointers per node, S_p is pointer size, and S_m is metadata or allocator overhead per node.
Hash table load factor:
where n is stored entries and m is bucket capacity. Higher load factor reduces unused memory but usually increases collision cost and probe length.
Amortized cost
For a dynamic array that doubles capacity when full, a sequence of n appends has total copy cost bounded by a constant multiple of n:
Amortized append cost is therefore:
Amortized cost is not the same as worst-case cost for one operation. A resize can still create a latency spike. Real-time systems may need reserved capacity, bounded containers, incremental resizing, or static allocation.
Throughput and service time
If one worker completes one operation in average service time S_t, ideal throughput is:
With c identical independent workers, ideal throughput is:
Actual throughput is lower when there is contention, synchronization, serial work, I/O, retries, cache interference, or uneven load balancing.
If work has a serial fraction s and a parallel fraction 1-s, Amdahl’s Law gives ideal speedup on p processors:
This shows why serial bottlenecks can dominate even when many processors are available.
Queueing checks
Little’s Law:
where L is average number in the system, \lambda is arrival rate or throughput, and W is average time in the system.
For a simple service model with c parallel servers and service rate \mu per server, utilization is:
A stable queue requires:
As \rho approaches 1, waiting time can rise sharply. Design should include burstiness, variability, priority rules, backpressure, and tail latency, not only average utilization.
Latency decomposition
End-to-end latency can be estimated as:
Only include terms that apply to the system under review. The important point is to define start and stop events consistently.
Average latency:
Maximum observed latency:
Percentile latency P_q is the value below which q percent of observations fall. For interactive and real-time systems, P95, P99, and deadline misses are often more informative than the mean.
Bandwidth and transfer time
Transfer time for payload size B over effective bandwidth R:
If a fixed setup latency T_0 is present:
This distinction matters for small messages. A high-bandwidth link can still perform poorly for small requests if fixed latency dominates.
For a data bus with width w bits and clock rate f, ideal raw bandwidth is:
Effective bandwidth is lower after protocol overhead, idle cycles, arbitration, encoding, retries, and contention:
where \eta is efficiency between 0 and 1.
Reliability and failure probability
If an operation has independent failure probability p and is attempted n times, expected failures are:
Probability of no failures across n independent attempts:
Probability of at least one failure:
These simplified equations are useful for order-of-magnitude checks, but real systems often have correlated failures, retries, overload coupling, and common-cause faults.
Benchmark interpretation
Speedup from an optimization:
Percentage time reduction:
Operations per second:
Memory bandwidth used:
For a benchmark to be meaningful, report hardware, compiler or runtime version, input size, input distribution, warm-up method, sample count, measurement overhead, memory use, and latency distribution.
Practical checklist
Use these formulas with a short engineering checklist:
- Define n, workload mix, and input distribution.
- Estimate asymptotic growth and memory footprint.
- Check worst-case or adversarial behaviour.
- Check latency percentiles, not just average time.
- Compare theory with measurements on representative data.
- Review allocation, locality, synchronization, and I/O.
- Document invariants and failure behaviour.
A formula sheet can narrow the design space quickly. It cannot decide whether a structure is appropriate unless the workload, hardware, and operational constraints are clear.