Kernel 3.14 development – the kernel column
Jon Masters summarises the latest happenings in the Linux kernel community, as the final days of 3.13 development come to a close, and the 3.14 kernel development cycle begins
Linus Torvalds announced the Linux 3.13 Release Candidate (RC) 8 kernel, saying “Another week, another RC. And things look fine.” According to Linus, RC8 has “nothing particularly scary”, which means that 3.13 is cooked and will be released by the time you read this. All of this is hardly surprising given that Linus had pre-announced his intention to do the (typically optional) RC8 in his RC7 announcement, and only because he was on the road (at LinuxConf AU), meaning he was “decidedly *not* opening the merge window [for Linux 3.14], so I’ll do an rc8 next week instead, needed or not.”
Development of features targeted for the upcoming 3.14 kernel development cycle has of course been an ongoing affair, with most of these features already mature and present in the development trees of individual kernel subsystem maintainers. Most of these will have already landed in Stephen Rothwell’s nightly ‘linux-next’ compilation tree. The rest will have been soak tested in other places, but almost no entirely new code will land in 3.14 that hasn’t already been posted in public for review. These features will be posted for formal inclusion into the 3.14 kernel once Linus opens up the ‘merge window’, which is the period of up to several weeks between one kernel release and the initial release candidate for the following one.
Known exploit detection
Vegard Nossum (Oracle) posted a patch proposal entitled ‘Known exploit detection’. Quoting Vegard: “[t]he idea is simple – since different kernel versions are vulnerable to different root exploits [remote or locally executed code sequences that compromise a system, giving “root” access to the attacker], hackers most likely try multiple exploits before they actually succeed”. The idea really is that simple. It involves adding a new code annotation function named ‘exploit’ that can be called at points in the kernel where fixes are added for known security issues (exploits). As an exploit is patched, a one-line call to the exploit() function can (optionally) be added alongside the fix. That call logs that someone attempted a security exploit against which the kernel has already been patched through a software update.
Existing kernels don’t provide such a mechanism. They are typically either vulnerable to a security weakness, or have been patched with some change that works around the issue or removes it entirely. Indeed, there was some pushback against the exploit() idea insomuch as it may lead to the kernel containing a large quantity of such annotations over time.
Another argument against the (potential) new feature was that attackers could remove logs after compromising a system, rendering the exploit() logging worthless. The latter is easily and commonly solved through network-based real-time system logging and other Linux audit features designed to prevent attackers from destroying log files upon compromising a system. The former concern (that of extraneous exploit annotations in the kernel) was addressed with an assertion that only certain security issues would be so annotated, and only for a limited period: after about five years, the annotations would be removed.
The notion of exploit catching is an interesting one, and Vegard is likely right that most hackers will try multiple exploits before succeeding. Successful exploits are typically a silent affair: the system is ‘rooted’ but there is little or no indication in the logs or to the user that this has occurred. Failed attempts are likely because most ‘hackers’ are really ‘script kiddies’, people who have downloaded a ‘rootkit’ of pre-created attacks that are automatically attempted against the intended target. Overall, the exploit logging mechanism is straightforward, simple, and seems to have buy-in from some of the key stakeholders in the Linux kernel security community (such as Kees Cook, Google, who is a machine). This means that there is a very high chance of this feature (or one like it) being added.
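To make the annotation idea concrete, here is a minimal userspace sketch in C. It is not Vegard’s patch: the body of exploit(), the counter, the patched_copy() helper and the identifier string are all assumptions made up for illustration, and a real kernel version would log through the kernel’s own facilities rather than stderr.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 16

/* Hypothetical counter so callers can observe how often the hook fired. */
static unsigned int exploit_attempts;

/* Userspace stand-in for the proposed annotation: record that a code
 * path guarding an already-fixed bug was reached. */
void exploit(const char *id)
{
    exploit_attempts++;
    fprintf(stderr, "warning: possible %s exploit attempt\n", id);
}

/* Copy len bytes into a fixed-size buffer. The bounds check is the
 * "fix"; the exploit() call is the one-line annotation added with it. */
int patched_copy(char dst[BUF_SIZE], const char *src, size_t len)
{
    if (len > BUF_SIZE) {
        exploit("EXAMPLE-0001"); /* made-up identifier, not a real CVE */
        return -1;
    }
    memcpy(dst, src, len);
    return 0;
}
```

In this sketch an in-range copy proceeds silently, while a call such as patched_copy(buf, data, 64) is rejected and leaves one warning line behind, which is the signal an administrator would look for.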
Miklos Szeredi posted version 3 of his ‘cross rename’ patch series. The series introduces a new system call (syscall) named ‘renameat2’, which behaves in a manner similar to the existing ‘renameat’ syscall, but has an additional argument named flags. This additional argument supports the cross-rename concept, which allows (for example) atomically swapping a directory tree with a symbolic link. Atomic here means that running programs see the two path components either as they were before the swap or as they are after it; there is no mid-point in which programs see an undesirable state. That would not be the case with existing implementations, which would need to use a series of system calls to emulate such behaviour by first tearing down the entries and then recreating them in stages.
More to the point, cross-rename is useful to overlay filesystems, which combine two existing filesystems to create a (virtual) third one. Overlays are typically used with flash or ‘live’ systems in which a read-only (eg root) filesystem is made modifiable through the addition of a small lookaside space (another filesystem) that stores deltas between that read-only filesystem and the desired modified state.
New files are simply added to the lookaside overlay, but deletions or replacements of existing files are provided as ‘whiteout’ entries that indicate the system should ignore an existing file and use a replacement one (from the overlay). (‘Whiteout’ is the American term for what is known in other parts of the world by brand names such as Tipp-Ex: a substance that covers up existing written text on printed pages and allows replacement text to be written.) By using cross-rename, an overlay filesystem can atomically present either one view of the filesystem, or the whited-out replacement state. Miklos intends for cross-rename to land in the next Linux (3.14) kernel as a dependency for future overlays work.
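The atomic swap that cross-rename provides can be sketched from userspace as follows. This assumes the interface as it later appeared in mainline Linux, where the exchange flag is called RENAME_EXCHANGE and has the value (1 << 1); the posted series may have differed in detail, and swap_paths() is a hypothetical helper name. The raw syscall() is used so the sketch does not depend on a C library wrapper.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>       /* AT_FDCWD */
#include <sys/syscall.h> /* SYS_renameat2, where the headers define it */
#include <unistd.h>      /* syscall() */

/* Flag value used by mainline Linux; defined here in case the C library
 * headers in use do not provide it. */
#ifndef RENAME_EXCHANGE
#define RENAME_EXCHANGE (1 << 1)
#endif

/* Atomically swap two existing paths (hypothetical helper).
 * Returns 0 on success, or -1 with errno set on failure, for example
 * ENOSYS or EINVAL when the kernel or filesystem lacks support. */
int swap_paths(const char *a, const char *b)
{
#ifdef SYS_renameat2
    return (int)syscall(SYS_renameat2, AT_FDCWD, a, AT_FDCWD, b,
                        RENAME_EXCHANGE);
#else
    (void)a;
    (void)b;
    errno = ENOSYS;
    return -1;
#endif
}
```

Both paths must already exist for an exchange. A caller should be prepared for failure on kernels or filesystems without support, and fall back to whatever non-atomic sequence it used before.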
Josh Triplett posted a question around usage of the compiler directive ‘#pragma once’, which is a feature provided by some modern compilers that allows the classic ‘#ifndef-#define-…-#endif’ code sequence to be replaced with something a little more elegant. These code sequences are known as ‘include guards’ in that they are designed to ensure a header file is used only once in a chain of included headers by wrapping its entire contents in a conditional preprocessor directive. The ‘#pragma once’ directive, supported by GCC, ICC (the Intel C++ Compiler), and other modern compilers, is shorter and less error prone. There was a general consensus that enough tools and code checkers support this (including older versions in use by distros) that a transition could take place over future kernel cycles. Those tools broken by this change would need to be modified, but this is unlikely to impact most kernel developers.
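For readers who have not looked at an include guard lately, the sketch below simulates a header being included twice by pasting its contents twice into one file. The header name point.h, the macro EXAMPLE_POINT_H and the struct are hypothetical and are not taken from the kernel tree.

```c
/* First "inclusion" of a hypothetical point.h. */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H

struct point { int x; int y; };

static inline int point_sum(struct point p)
{
    return p.x + p.y;
}

#endif /* EXAMPLE_POINT_H */

/* Second "inclusion": EXAMPLE_POINT_H is now defined, so the
 * preprocessor skips the body and struct point is not redefined.
 * Without the guard this copy would be a compile error. */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H

struct point { int x; int y; };

static inline int point_sum(struct point p)
{
    return p.x + p.y;
}

#endif /* EXAMPLE_POINT_H */

/* With the directive discussed above, the real header would instead
 * begin with the single line
 *
 *     #pragma once
 *
 * and need neither the macro name nor the closing #endif. */
```

The guard relies on every header choosing a unique macro name and keeping the three directives in sync, which is where the directive form removes room for error.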
Dave Young noted that kernel.org has followed through on the plan to remove support for generating new ‘tar.bz2’ archives. Instead, future kernels will be shipped using the legacy Unix ‘tarball’ (tar.gz) file format as well as the newer ‘tar.xz’ format, using the more modern ‘xz’ compression, which provides more efficient space utilisation than even the BZ2 format. Removing legacy tarball support is unlikely to be possible, due to the number of tools out there that still rely upon it, but the same is not really true of existing users of bz2 archives, whose systems support xz too.
Finally, this month’s humour award goes to Steven Rostedt, for his suggestion that security could be improved by simply renaming the root account to ‘walley’ and using ‘root’ for his regular user account. Then ‘[i]f anyone tries to break into ‘root’ they will just gain access to a normal account and nothing more’. Unfortunately, Steven, your plan is now public record.