The Linux Scheduler: a Decade of Wasted Cores
Abstract
As a central part of resource management, the OS thread scheduler must maintain
the following simple invariant: ready threads must be scheduled on
available cores. As simple as it may seem, we found that this invariant is
often broken in Linux. Cores may stay idle for seconds while ready threads are
waiting in runqueues. In our experiments, these performance bugs caused
many-fold performance degradation for synchronization-heavy scientific
applications, 13% higher latency for kernel make, and a 14-23%
decrease in TPC-H throughput for a widely used commercial database. The main
contribution of this work is the discovery and analysis of these bugs, together
with fixes for them. Conventional testing techniques and debugging tools are
ineffective at confirming or understanding this kind of bug, because its
symptoms are often evasive. To drive our investigation, we built new tools that
check for violations of the invariant online and visualize scheduling activity.
They are simple, easily portable across kernel versions, and run with
negligible overhead. We believe that making these tools part of the kernel
developers' tool belt can help keep this type of bug at bay.
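The invariant stated above can be illustrated with a minimal sketch. This is not the authors' tool; it only assumes a hypothetical snapshot of per-core runqueue lengths (counting the running thread plus any waiting ones) and flags every pair where one core is idle while another has at least one thread waiting. The function name `find_violations` and the snapshot format are assumptions made for illustration, and the sketch ignores CPU affinity and short-lived, harmless violations.

```python
from typing import List, Tuple


def find_violations(runqueue_lengths: List[int]) -> List[Tuple[int, int]]:
    """Return (idle_core, overloaded_core) pairs for one snapshot.

    runqueue_lengths[c] is the number of runnable threads on core c,
    including the one currently running (hypothetical input format).
    """
    # A core with no runnable threads is idle.
    idle = [c for c, n in enumerate(runqueue_lengths) if n == 0]
    # A core with two or more runnable threads has at least one waiting.
    overloaded = [c for c, n in enumerate(runqueue_lengths) if n >= 2]
    # Each idle/overloaded pair is a candidate violation of the invariant.
    return [(i, o) for i in idle for o in overloaded]


if __name__ == "__main__":
    snapshot = [2, 0, 1, 3]  # core 1 is idle; cores 0 and 3 have waiters
    print(find_violations(snapshot))  # [(1, 0), (1, 3)]
```

Since the bugs described here leave cores idle for seconds, a practical checker would presumably report a pair only if it persists across several consecutive snapshots rather than on a single observation.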
Article
The Linux Scheduler: a Decade of Wasted Cores
[PDF] [Slides] [Extended slides]
Jean-Pierre Lozi, Baptiste Lepers, Justin Funston, Fabien Gaud, Vivien Quéma, and Alexandra Fedorova
To appear in Proceedings of the Eleventh European Conference on Computer Systems (EuroSys '16), London, United Kingdom, 2016