The kernel column #91 by Jon Masters
In this month’s kernel column Jon Masters discusses another eventful kernel cycle, not to mention the latest round of Linus Torvalds (justified?) rants, the Kernel Summit 2010 and some pretty intense penguin-on-penguin action…
Linux 2.6.35 was finally released last month after what can only be described as a (comparatively) mundane development cycle. With the high drama of the previous cycle, that was hardly very difficult to achieve. Sure, there were the typical Linus rants of the month (the main one focused on Linus’s dislike of the ‘defconfig’ files that he sees as cluttering up the kernel tree with tens of thousands of lines of reference configuration files that could live elsewhere – like on the websites for the various supported architectures that create them) and there were a few harsh words for one of the C library maintainers. But there was no giant flame war related to graphics, or security modules, nor calls of protest at Linus’s ever ongoing effort to herd the developers into a focus on stability and regression-fighting prior to release. It was, in short, a rather sleepy summer month in which it seemed people were often busy being away on vacation or being at one of the usual round of conference events. I myself managed both of these things to a greater or lesser extent, and I was grateful for a little less mailing list traffic to catch up on.
It wasn’t all quiet, of course. The latest kernel brings with it many new and improved features. Among them, my personal favourites are the new support for Receive Packet Steering (RPS – a means through which the kernel can essentially multithread and dish out packet processing between the many threads and cores of modern CPUs – courtesy of our packet-heavy friends at Google) and Mel Gorman’s Memory Compaction patches. The latter provide a means through which the kernel can use spare (idle) cycles to effectively defragment physical memory into larger regions of contiguous memory pages – which should prove useful the next time the kernel needs a large allocation that isn’t in the mapped virtual memory given to regular apps. My conversations at the Linux Symposium suggest to me that the next stage should be to look at NUMA node-aware compaction, moving data back to retain its locality with regard to the system processor handling it (where memory accesses are faster), rather than the situation today in which we start out being very NUMA aware when creating processes with utilities like numactl, but don’t seem to quite have all of the pieces in place just yet to pull data back onto a specific node after it has been allocated elsewhere. In any case, you can find further information about this and the other latest bits in the 2.6.35 kernel at the Kernel Newbies site.
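For readers who like to see the idea in code, here is a toy sketch of the steering concept behind RPS: packets from the same flow are hashed to the same CPU, so that one connection stays on one core while different connections spread out. This is an illustration only, not the kernel's implementation. The function names are made up for this column, the CRC32 hash stands in for whatever hash the kernel or the network card actually computes, and the simple modulo is a simplification. The last helper builds the kind of hexadecimal CPU bitmask that is written to a receive queue's rps_cpus file in sysfs to say which CPUs may handle its packets.

```python
import zlib


def flow_hash(src_ip, src_port, dst_ip, dst_port):
    """Hash a flow's 4-tuple. CRC32 is a stand-in chosen for this sketch."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key)


def steer(flow, cpus):
    """Pick a CPU for a flow; the same flow always maps to the same CPU."""
    return cpus[flow_hash(*flow) % len(cpus)]


def rps_cpu_mask(cpus):
    """Build a hex bitmask with one bit set per allowed CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


if __name__ == "__main__":
    allowed = [0, 1, 2, 3]  # hypothetical set of CPUs allowed to process packets
    flows = [
        ("10.0.0.1", 40000, "10.0.0.2", 80),
        ("10.0.0.3", 40001, "10.0.0.2", 80),
        ("10.0.0.4", 40002, "10.0.0.2", 443),
    ]
    for flow in flows:
        print(flow, "-> CPU", steer(flow, allowed))
    print("CPU mask:", rps_cpu_mask(allowed))
```

The point of keying on the flow rather than dealing packets out round-robin is that the packets of a single connection are handled on one CPU, which avoids reordering them and keeps that connection's state warm in one cache.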
Perhaps the biggest thing we’ve learned from 2.6.35 is that the new model of development, one in which Linus is very strict about features he is willing to take into the kernel – and precisely when – is here to stay. The kernel merge window (the period of time during which new features can be merged into the kernel source) is these days being rigorously enforced. Linus was particularly proud of the way he handled things up until RC3, which he called out specifically as an achievement in a lengthy mail announcing 2.6.35 and reflecting upon the past development cycle. Interestingly, Linus also chose to make a point about the linux-next kernel development source code tree (the place where features ready for primetime are supposed to get one more soak prior to the opening of the merge window), saying that it wasn’t intended to be a general dumping ground: “If you’re nervous about the stability of your work, you should just admit that it’s not ready to be merged, shouldn’t go in the next release cycle, and shouldn’t be in linux-next yet and make life harder for people like Andrew [Morton – the second in command]”. Fighting words there from our fearless leader.
Kernel Summit 2010
When not working on the kernel, developers often like to discuss ways in which they might work on it in the future. One of those opportunities comes in the form of the (invitational) Kernel Summit, which is held in a different location each year, typically on the back of some other event. This year, the Kernel Summit is in my own fair city of Cambridge, Massachusetts, and will be held in early November, immediately after the Linux Plumbers Conference (LPC), but before the weather really takes a turn for the worse. It will be fun to have so many core kernel folks in town, but everyone can get something out of Kernel Summit, even if it’s just reading about the hot-button topics in Jon Corbet’s excellent summaries on the Linux Weekly News website. Speaking of websites, the Kernel Summit site is up and Ted Ts’o is soliciting input on the event. Ted also noted in a rare ‘Public Service Announcement’ on the kernel mailing list that over 40% of visitors to the Kernel Summit site have Adobe Flash browser plug-ins that are known to be vulnerable to a particularly bad security bug. Once you’ve heeded his excellent advice on patching your software, head on over to the Kernel Summit site. And don’t forget to check out the other events happening around the same time as the Kernel Summit and Linux Plumbers Conference – such as this year’s Power Management Minisummit – by visiting the LPC website.
As one kernel development cycle winds down, another prepares to gear up. A lot of developers have worked on exciting things for 2.6.36. Many of these continue to be focused on scalability and stability. For example, we heard of tests run on the latest SGI systems in which the ACPI tables describing multi-terabyte RAM systems can take dozens of minutes to fully populate into sysfs. That led to a discussion about storing copies of those tables in cacheable RAM for the ten minutes saved in boot time – something we would never have had to worry about a decade ago! Which brings us on to the 2TB+ disk capacity story. These larger disks exceed the number of sectors that can be adequately expressed using 32-bit quantities, for example in almost any shipping BIOS, and can impose limits on booting for the majority of PC users who don’t yet have EFI-enabled systems (Mac users already get EFI, and had Open Firmware before that anyway). Fortunately, Tejun Heo discovered that life isn’t as bad for large drive users as it might have seemed, and his testing of a 2.5TB drive with four different controllers was all very positive. There are still problems with future 4K sector drives, and these are noted in the excellent discussion available here.
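The 2TB figure falls straight out of the arithmetic, and a few lines of Python make it easy to check. The sketch below assumes the traditional 512-byte sector and the decimal terabytes (10^12 bytes) that drive vendors quote, and the function names are purely illustrative. With 32 bits there are 2^32 possible sector addresses, so the largest disk that can be described is 2^32 × 512 bytes, or 2 TiB (roughly 2.2TB in vendor terms).

```python
SECTOR_BYTES = 512  # assumed traditional sector size
ADDRESS_BITS = 32   # width of the sector quantities discussed above


def max_capacity_bytes(sector_bytes=SECTOR_BYTES, address_bits=ADDRESS_BITS):
    """Largest capacity addressable with the given sector size and address width."""
    return (2 ** address_bits) * sector_bytes


def sectors_needed(capacity_bytes, sector_bytes=SECTOR_BYTES):
    """Number of sectors needed to cover a capacity (rounded up)."""
    return -(-capacity_bytes // sector_bytes)


def fits_in_32_bits(capacity_bytes, sector_bytes=SECTOR_BYTES):
    """True if every sector of the disk can be addressed with 32 bits."""
    return sectors_needed(capacity_bytes, sector_bytes) <= 2 ** ADDRESS_BITS


if __name__ == "__main__":
    TB = 10 ** 12  # decimal terabyte, as used on drive labels
    print("32-bit limit in bytes:", max_capacity_bytes())
    for size_tb in (2, 2.5):
        size = int(size_tb * TB)
        print(f"{size_tb}TB drive: {sectors_needed(size)} sectors,",
              "fits" if fits_in_32_bits(size) else "does not fit")
```

A 2.5TB drive needs about 4.9 billion sectors, comfortably more than the roughly 4.3 billion that a 32-bit quantity can count, which is why these drives were the first to run into the limit.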
As the summer gives way to the fall, with it comes the end of another annual conference season. Last month, I wrote about my then forthcoming keynote on the state of the kernel at this year’s Linux Symposium. The conference was very interesting – smaller than in the past, but the talks were of high quality – and the talk was received well. In fact, I overran by a half hour (there was room in the schedule) in my quest to summarise everything that had happened over the past year, and a great deal had. It got me thinking that we need a rolling timeline of events in general. I’m going to try setting one up when I get the chance, tracking when features are proposed, and when they are actually accepted into the kernel, or dropped on the floor. And I could do with your input on how you would best like to see this done. But for now, as I write this I’m about to head off to one more conference, the second annual LinuxCon, where I am donning a penguin suit and running the final quiz event, just for fun…