The Linux kernel column #89
Last month saw the opening and subsequent closing of the 2.6.35 kernel’s merge window, the period of time during which all of the exciting new features that have been waiting in the wings (and in linux-next nightly kernels provided by Stephen Rothwell) are considered for merging into the official ‘mainline’ kernel source tree by Linus Torvalds…
Another nice feature in 2.6.35 is the ability for a host system running KVM virtualised guest machines to profile their performance using a new extension to the perf command. This is something that developers have been discussing for some time as a means to figure out ways to optimise guest machines for their host environment, or just to better understand what they are doing. Yet another of the new features in 2.6.35, and of particular personal interest to me, is the now in-kernel KDB debugger from Jason Wessel (of Wind River), which augments the KGDB stub previously added to the kernel. This means that the Linux kernel finally has an in-kernel debugger (KDB) available (a primitive one, but nonetheless a debugger), as well as the ability to be driven remotely from another system over a serial line using GDB (via the existing KGDB stub). I have been running the KGDB/KDB patches for some time now on some systems and am very encouraged by the enhanced ability to debug a crashed system. On a crashed system with these patches enabled, the debugger can be configured to start automatically, which greatly aids the kernel development process for some developers.
Thinking of crashes, error handling came up again this month in a big way. It turns out that April’s Linux Foundation Collaboration Summit had also featured a ‘Hardware Error Kernel Mini-Summit’. Mauro Carvalho Chehab sent a nice, detailed write-up of the event, at which various parties had attempted to put forth the case for a generic ‘hardware error’ handling infrastructure within the kernel. The wealth of existing solutions for tracking reportable, possibly correctable, errors includes such things as mcelog to capture Intel Machine Check Exception event data (temporary memory issues etc) and the AMD EDAC counterpart, but there are also various other devices such as hard disks and processors that can potentially survive a partial failure and for which generic infrastructure could be of use. Ingo Molnar is on record as favouring that his ‘performance events’ (perf) infrastructure simply be extended to support logging and reporting on hardware errors, with some support in user space for acting upon these error reports as appropriate. Whatever happens, expect to see a little churn in this area in future, perhaps as soon as 2.6.36 or 2.6.37 if people have time to work on it.
Android’s suspend blockers
One of the things that didn’t make it into Linux 2.6.35, but did spark a lot of discussion this month, was the ‘suspend blockers’ feature touted by Google. Like most mobile (cell) phone platforms, Google’s Android is very sensitive to achieving the best possible utilisation of limited battery resources. To do this, just like other embedded platforms, Android will frequently try to suspend the running system and enter a very low-power state. Unlike the iPhone (which offers an alternative to true multitasking for third-party app developers), Android does offer real multitasking support in apps, using clever techniques such as Bundles to allow certain background apps that consume too many resources to be effectively snapshotted, stopped and resumed in a transparent fashion at a later time (background services are handled differently). This is a nice feature, and a great reason to use an Android-based phone, but it comes with the cost that many applications will want to run at any one moment, and so waiting for a lull in system activity to perform a suspend could prove to be a real drain on battery life.
Rather than wait around for well-behaved applications (and bear in mind, Android devices often run numerous third-party apps downloaded from the Market that may not have been engineered with power consumption in mind), Android uses the pragmatic approach of ‘suspend blockers’. These are an explicit means through which software can inform the system that it must not suspend (during phone calls, or when the user is driving the display, for example). At all other times, Android will attempt to aggressively suspend the system, even if an application is running (a point at which regular Linux systems would not enter a suspend state). It’s a nice feature, but it is a little invasive, adds a new kernel API to user space and is not in the upstream kernel.
This last point has been a bone of contention, since it is the reason why some Android drivers have not yet made their way into the mainline Linux kernel. Developers are wary of taking the Google code outright, and favour instead evolving extensions to the existing cpuidle, scheduling and/or QoS features already available. Ingo Molnar summed up the situation in describing how Linux is an engineering effort that has literally cost about 10,000 man years’ worth of developer time, and so it “takes effort to keep that kind of work valuable”.
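To make the suspend blocker idea described above a little more concrete, here is a toy model in Python. It is only a sketch of the policy (suspend whenever no blocker is held, regardless of which apps are running) and is not the Android kernel or user-space API; the class name, method names and the 'phone_call' blocker are all invented for illustration.

```python
class SuspendPolicy:
    """Toy model of opportunistic suspend guarded by suspend blockers.

    Hypothetical illustration only; this is not the real Android interface.
    """

    def __init__(self):
        self._active = set()  # names of blockers currently held

    def block(self, name):
        # Software declares that the system must not suspend for now.
        self._active.add(name)

    def unblock(self, name):
        # Software releases its blocker; releasing twice is harmless here.
        self._active.discard(name)

    def may_suspend(self):
        # Running applications are deliberately ignored:
        # only held blockers keep the system awake.
        return not self._active


policy = SuspendPolicy()
running_apps = ["mail", "game"]  # hypothetical apps; they do not prevent suspend

states = [policy.may_suspend()]      # no blockers held
policy.block("phone_call")           # e.g. a call is in progress
states.append(policy.may_suspend())
policy.unblock("phone_call")         # call ended
states.append(policy.may_suspend())

print(states)  # [True, False, True]
```

Note that the list of running apps never enters the decision, which is exactly the behaviour that differs from a regular Linux system waiting for a lull in activity.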