The Hidden War Inside Your CPU: Throughput vs. Latency
CPU schedulers face an impossible tradeoff: maximizing throughput hurts latency, and optimizing for response time sacrifices efficiency. This article explores how Linux and modern schedulers balance the war between batch performance and interactive responsiveness, and what it means for your applications.
For decades, your computer's operating system has been fighting a quiet, invisible war. On one side: schedulers obsessed with squeezing every last drop of work through the CPU. On the other: applications that will stutter, freeze, or crash if they don't get attention right now.
The battlefield is your processor's run queue.
The Scheduler's Impossible Promise
Every OS scheduler faces the same core tradeoff: more throughput means worse latency, and vice versa.
When a scheduler prioritizes throughput, it stacks tasks like shipping containers — keeping the CPU 100% busy by batching work and minimizing context switches. This sounds efficient. And for a server farm running batch jobs, it is. But for a user hitting a key or a video stream buffering, it's a disaster.
The classic Completely Fair Scheduler (CFS) in Linux aims for "fairness" by giving each runnable task a share of CPU time proportional to its weight (its nice value), so tasks at the same nice level get equal shares. But equal doesn't mean fast. A high-throughput scheduler might let a video transcoder run for a full time slice before checking on your mouse click.
Where Latency Bleeds
Latency-sensitive workloads — think real-time audio, video conferencing, game engines, or trading algorithms — require deterministic response times. They don't care if the CPU is "efficiently" busy; they care if their task gets scheduled within 1 millisecond.
The problem is fundamental: the CPU can only run one thread at a time per core (ignoring hyperthreading for a moment). Every context switch is pure overhead: the kernel saves and restores registers, and the incoming task usually starts with cold caches and TLB entries. That overhead compounds when you squeeze in more switches for latency's sake.
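To put rough numbers on that tradeoff, here is a minimal back-of-the-envelope model of a single core running plain round-robin. The function name and the 5-microsecond switch cost are assumptions for illustration only; real switch costs vary with hardware and cache behavior, and real schedulers are not pure round-robin.

```python
def round_robin_tradeoff(n_tasks, slice_us, switch_cost_us):
    """Toy model: return (efficiency, worst_case_wait_us) for one core.

    efficiency         = fraction of time spent on useful work
    worst_case_wait_us = time a task waits while every other task
                         uses a full slice plus one context switch
    """
    efficiency = slice_us / (slice_us + switch_cost_us)
    worst_case_wait_us = (n_tasks - 1) * (slice_us + switch_cost_us)
    return efficiency, worst_case_wait_us


# Shrinking the slice cuts the wait but raises the share lost to switching.
for slice_us in (10_000, 1_000, 100):
    eff, wait_us = round_robin_tradeoff(n_tasks=8, slice_us=slice_us, switch_cost_us=5)
    print(f"slice={slice_us:>6} us  efficiency={eff:.2%}  worst wait={wait_us / 1000:.2f} ms")
```

Under these assumed numbers, a shorter slice buys a much shorter worst-case wait while the fraction of the CPU lost to switching grows, which is the tension in miniature.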
Modern schedulers employ three big tricks to balance this:
1. Priority Inversion and Its Fixes
A classic nightmare: a high-priority latency-sensitive task needs a lock held by a low-priority task, which gets preempted by a medium-priority task. Now the important work is stuck behind work that should never have outranked it. Real-time setups (for example, threads running under Linux's SCHED_FIFO or SCHED_RR policies) counter this with priority inheritance, which Linux provides through rt-mutexes and PTHREAD_PRIO_INHERIT mutexes: the lock holder temporarily inherits the waiter's priority so it can finish and release the lock.
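The scenario is easier to see in a toy simulation. The sketch below is not kernel code: it models one core, one lock, and three hypothetical tasks in whole ticks, and it assumes each locking task holds the lock for all of its work. The task names, arrival times, and workloads are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int        # larger number = more important
    arrival: int         # tick at which the task becomes ready
    remaining: int       # ticks of CPU work still needed
    needs_lock: bool     # True if all of its work happens under the lock


def simulate(inherit):
    """Run a fixed-priority preemptive core; return finish tick per task."""
    tasks = [
        Task("low", 1, 0, 4, True),
        Task("medium", 2, 1, 10, False),
        Task("high", 3, 2, 2, True),
    ]
    owner = None         # task currently holding the lock, if any
    finish = {}
    tick = 0
    while len(finish) < len(tasks):
        ready = [t for t in tasks if t.arrival <= tick and t.remaining > 0]

        def effective_priority(t):
            # With inheritance, the lock owner borrows the priority of
            # the most important task waiting for the lock.
            if inherit and t is owner:
                waiters = [w.priority for w in ready if w.needs_lock and w is not owner]
                return max([t.priority] + waiters)
            return t.priority

        # A task that needs the lock cannot run while someone else owns it.
        runnable = [t for t in ready
                    if not (t.needs_lock and owner is not None and owner is not t)]
        if runnable:
            current = max(runnable, key=effective_priority)
            if current.needs_lock:
                owner = current
            current.remaining -= 1
            if current.remaining == 0:
                finish[current.name] = tick + 1
                if owner is current:
                    owner = None
        tick += 1
    return finish


print("no inheritance:  ", simulate(inherit=False))
print("with inheritance:", simulate(inherit=True))
```

In this toy setup the high-priority task finishes at tick 16 without inheritance, because the medium task runs to completion first, and at tick 7 with it.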
2. The "BFS" and "MuQSS" Rebellion
Con Kolivas, a longtime Linux kernel developer, argued that CFS served desktop interactivity poorly. His Brain Fuck Scheduler (BFS) threw out per-CPU run queues and elaborate fairness accounting in favor of a single global run queue and a simple "virtual deadline" per task. Its successor MuQSS (Multiple Queue Skiplist Scheduler) kept the virtual deadlines but moved to per-CPU skip-list queues so the design would scale to more cores. Both aimed for near-instant response to interactive tasks, even at the cost of batch throughput, and both remained out-of-tree patch sets. Benchmark-obsessed sysadmins hated them. Desktop users loved them.
3. The New Contender: EEVDF
With Linux 6.6, released in late 2023, the kernel replaced CFS with the Earliest Eligible Virtual Deadline First (EEVDF) scheduler for ordinary tasks on all architectures. This is a direct response to the tension. EEVDF keeps weight-based fairness for throughput: a task that has already received more than its share is not eligible to run. Among the eligible tasks, it picks the one with the earliest virtual deadline, and a task that asks for a shorter slice gets an earlier deadline, which serves latency. Early reports suggested lower tail latencies for interactive workloads with throughput close to CFS for background jobs, though results vary by workload.
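The selection rule can be sketched in a few lines. This is a simplified illustration of the EEVDF idea, not the kernel's implementation: the `Entity` class, the field names, and the sample numbers are assumptions, and the real scheduler uses an augmented red-black tree and fixed-point arithmetic rather than a list scan.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    weight: float      # share of the CPU this task is entitled to
    vruntime: float    # weighted CPU time already received
    slice_ms: float    # how much CPU the task requests per turn

    @property
    def deadline(self):
        # Shorter requested slices produce earlier virtual deadlines.
        return self.vruntime + self.slice_ms / self.weight


def pick_next(entities):
    """Pick the eligible entity with the earliest virtual deadline."""
    total_weight = sum(e.weight for e in entities)
    avg_vruntime = sum(e.weight * e.vruntime for e in entities) / total_weight
    # Eligible = has not yet received more than its fair share.
    eligible = [e for e in entities if e.vruntime <= avg_vruntime]
    return min(eligible, key=lambda e: e.deadline)


run_queue = [
    Entity("batch", weight=1.0, vruntime=10.0, slice_ms=8.0),     # deadline 18.0
    Entity("ui", weight=1.0, vruntime=11.0, slice_ms=2.0),        # deadline 13.0
    Entity("overrun", weight=1.0, vruntime=12.5, slice_ms=0.25),  # deadline 12.75
]
print(pick_next(run_queue).name)
```

Here "overrun" has the earliest deadline but has already run ahead of the weighted average, so it is skipped, and "ui" wins over "batch" because of its shorter slice.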
The Real Cost of Winning the Wrong Battle
The most painful examples of throughput optimization gone wrong come from big.LITTLE architectures (like ARM mobile processors). The early schedulers for these heterogeneous CPU clusters would try to keep the big, power-hungry cores at 100% utilization. This meant a background app could pin the "big" core forever, while your foreground task — waiting for a frame to render — got stuck on a slow "LITTLE" core. The result? Worse battery life and worse performance.
Modern Energy-Aware Scheduling (EAS) flips this: it uses an energy model of the cores to place each waking task on the CPU that can handle its load at the lowest estimated energy cost, even if that leaves a fast core idle. Throughput may drop slightly on paper, but user-perceived latency and battery life tend to improve.
Can You Win Both? The Experimental Frontier
Several research projects and niche schedulers are trying to have it both ways:
- Deadline-based policies like SCHED_DEADLINE in Linux run latency-sensitive jobs alongside background tasks by giving them a "reservation" of CPU time (a runtime budget within each period), not just a priority.
- User-space scheduling (e.g., with librseq or DPDK) lets applications sidestep the kernel scheduler on critical paths, typically by dedicating cores to busy-polling threads that manage their own work.
- Hardware-assisted scheduling, such as Intel's Thread Director and the performance/efficiency core split in Apple's M-series chips, feeds runtime hints to the OS scheduler so it can put each thread on the right kind of core and avoid needless migrations.
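As a small taste of the "manage your own cores" approach, the sketch below pins the current process to one CPU using Python's `os.sched_setaffinity`, which is available on Linux. The helper name `pinned_to_one_cpu` is made up for this example. Pinning alone does not keep other tasks off that core; that needs extra system configuration, which is assumed and not shown here.

```python
import os
from contextlib import contextmanager


@contextmanager
def pinned_to_one_cpu():
    """Temporarily restrict the calling process to one allowed CPU."""
    if not hasattr(os, "sched_setaffinity"):
        yield None                       # no affinity API on this platform
        return
    original = os.sched_getaffinity(0)   # 0 means "the calling process"
    cpu = min(original)                  # pick any CPU we are allowed to use
    os.sched_setaffinity(0, {cpu})
    try:
        yield cpu
    finally:
        os.sched_setaffinity(0, original)  # always restore the old mask


with pinned_to_one_cpu() as cpu:
    # A poll-mode design would spin on its work queue here instead of
    # sleeping and waiting for the kernel to wake it up.
    print("pinned to CPU:", cpu)
```

The context manager restores the original CPU mask on exit, so the experiment does not leak into the rest of the program.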
The Bottom Line
No scheduler will ever "solve" the throughput-vs-latency problem, because physics won't let you. Every nanosecond spent deciding which thread to run is time not spent running one. The best we can do is:
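- Recognize your workload: batch processing is well served by the default CFS/EEVDF policies; real-time work wants SCHED_FIFO; interactive desktops have historically turned to out-of-tree options like MuQSS, and heterogeneous mobile chips rely on EAS.
- Tune carefully: on older Linux kernels, sysctl parameters like kernel.sched_latency_ns or kernel.sched_migration_cost_ns directly shift the balance. Newer kernels moved these scheduler tunables to debugfs (under /sys/kernel/debug/sched/), and the switch to EEVDF changed which knobs exist, so check what your kernel actually exposes. Cut migration cost, get better latency at the cost of throughput.
- Measure tail latency, not just average CPU utilization. A 99.9th percentile response time of 50ms is a failure for audio, even if 99% of tasks complete in 1ms.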
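The last point is easy to demonstrate. The sketch below uses made-up latency samples (9,950 responses at 1 ms and 50 at 50 ms) and a simple nearest-rank percentile; both the data and the helper are illustrative assumptions, not measurements.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: smallest value covering p% of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[min(rank, len(ordered)) - 1]


# Hypothetical wake-up latencies in milliseconds.
latencies_ms = [1.0] * 9_950 + [50.0] * 50

print("mean :", statistics.mean(latencies_ms))
print("p50  :", percentile(latencies_ms, 50))
print("p99  :", percentile(latencies_ms, 99))
print("p99.9:", percentile(latencies_ms, 99.9))
```

With this data the mean and the 99th percentile both look healthy, while the 99.9th percentile exposes the 50 ms stalls that an audio pipeline would actually hear.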
The war inside your CPU isn't ending. But with every kernel release, the generals are getting smarter about when to sacrifice throughput for the sake of your click.