Linux Kernel Build

Every piece of hardware needs specialized code in the kernel to operate it, the so-called driver, so a kernel that included code for all existing hardware would be very large. To keep the kernel size manageable, many kernel components are modules: pieces of code that are linked into the running kernel dynamically, as needed. Most other operating systems have an equivalent. OS X calls them kernel extensions, or "kext". In Linux they are also called kernel objects and carry the file extension ".ko".

Although kernel objects greatly reduce the size of the running kernel, their code still has to be compiled. One topic covered here is kernel configuration, which includes telling the build which drivers to omit. Building every driver can take a good portion of a day.

Linux comes in distributions, or "distros". A distro combines the Linux kernel, GNU software, and distro-specific code for installing and administering the system. There may also be third-party code that lives outside the distro's packaging system or is poorly integrated with it. Whatever distro you use, they all build on the same kernel, whose mainline source is maintained by Linus Torvalds. After we install the kernel sources, we configure the kernel, selecting a minimal set of drivers along with the build options for the code inside the kernel proper.

Equipment run-down

Installing Ubuntu on VirtualBox

Preparation for a build

You will need to work as root for some or all of the build. On a production machine, or to save your sanity, you tend not to work as root. Root has all powers, including the ultimate power to kill your machine in the blink of an eye. As in, no recovery. None. No "undo destruction of machine", no "restore to known state". Gone.

NAT vs. Bridged networking

In the VirtualBox manager panel, highlight a VM and select "Settings", then the "Network" tab. The network is most likely attached to NAT, but the pull-down menu offers alternatives. Bridged networking (the "Bridged Adapter" entry) has advantages. Try selecting it and rebooting the machine.

Bridged networking will more closely simulate a separate machine, a box within your box, that you can log into using SSH. In the virtual machine, use ifconfig to find the IP address, and in the host machine, at a terminal, use this IP address and the username in an ssh command, ssh _username_@_ipaddress_. You can have multiple windows open like this. You can even use sftp-aware editors such as TextWrangler to edit on your host and have the changes immediately saved to the virtual machine.
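As a rough sketch of the address lookup: on Ubuntu releases that do not install ifconfig by default, the ip command from iproute2 reports the same information. The helper name first_ipv4 is made up for this example, and it assumes the one-line-per-address format printed by ip -4 -o addr show.

```shell
# Print the first non-loopback IPv4 address found in the output of
# "ip -4 -o addr show" (read from stdin). first_ipv4 is a hypothetical helper.
first_ipv4() {
    awk '$2 != "lo" && $3 == "inet" {
        sub(/\/.*/, "", $4)   # drop the /prefix-length suffix
        print $4
        exit
    }'
}

# Inside the virtual machine:
#   ip -4 -o addr show | first_ipv4
# On the host, with the address it printed:
#   ssh _username_@_ipaddress_
```

With NAT the printed address is private to VirtualBox and not reachable from the host, which is one reason to switch to bridged mode first.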

How to use a config file:

Before building a kernel, you need to configure the build. The configuration states which options the kernel will include, whether a given piece of functionality is built in or placed in a separate module, and which functionality and drivers are omitted from the build altogether. A full build takes hours. Fortunately, on all Unix platforms the desktop is not part of the operating system. The last time I did a full build that included the desktop, it took over a day.
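To make the three choices concrete: in a .config file a built-in option appears as CONFIG_NAME=y, a module as CONFIG_NAME=m, and an omitted option as a comment of the form "# CONFIG_NAME is not set" (or not at all). The sketch below reports which of the three applies. The function name config_state is invented for this example, and the option names in the usage line are only illustrations.

```shell
# Report how one option is set in a kernel .config file.
# config_state is a hypothetical helper, not part of the kernel tree.
config_state() {
    option="$1"                   # e.g. CONFIG_EXT2_FS
    config_file="${2:-.config}"   # defaults to .config in the current directory
    if grep -q "^${option}=y\$" "$config_file"; then
        echo "built-in"
    elif grep -q "^${option}=m\$" "$config_file"; then
        echo "module"
    else
        # Either "# CONFIG_... is not set" or no mention at all.
        echo "omitted"
    fi
}

# config_state CONFIG_EXT2_FS /usr/src/linux-source-xyz/.config
```

Options that carry a string or number rather than y or m are reported as "omitted" by this simple check, so it is only meant for drivers and features.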

A suitable dot-config file has been placed in the class Subversion repository, under the class folder.

This configuration has been stripped down, and it took 90 minutes to build on my system. All USB and audio drivers have been removed, as have unneeded Ethernet drivers, all Wi-Fi drivers, extras such as touch screen and joystick drivers, all networking protocols other than IPv4, and several file systems. Once a full build is done, later builds are much shorter, because they rebuild only what has changed or what depends on something that changed.

The config file for a Linux build is a hidden file named ".config", placed in the root of the build tree. You unpacked your source into the directory /usr/src/linux-source-xyz, so the full pathname of the config file is /usr/src/linux-source-xyz/.config. You can't just drop the file there and build. The configuration is normally generated by a script that reaches into every directory of the source tree, looking for the files that define the overall set of options, and .config is built automatically from that process.

If you have just renamed your configuration file to .config, what you need to do is run the following in the root of the source (/usr/src/linux-source-xyz/):
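The command is not reproduced in these notes. The usual choice for this situation is make oldconfig, so the sketch below assumes that is what was meant. The wrapper function refresh_config and the MAKE override are inventions of this example, added only so the command can be previewed safely.

```shell
# Reconcile an existing .config with the source tree it now sits in.
# "make oldconfig" keeps every answer already in .config and asks only about
# options the file does not mention. Set MAKE=echo to preview the command.
refresh_config() {
    src="$1"      # root of the kernel source tree
    [ -f "$src/.config" ] || { echo "no .config in $src" >&2; return 1; }
    "${MAKE:-make}" -C "$src" oldconfig
}

# refresh_config /usr/src/linux-source-xyz
```

If the prompts become tedious, make olddefconfig accepts the default for every new option instead of asking.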

To manipulate the .config file, you could edit it by hand or use the menu-driven approach that helps guide you through the options:

make menuconfig screen

Building the kernel

There are several philosophies of Linux kernel building. A simple, standard kernel build uses the Makefile in the root of the source tree, and follows these steps:
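The step list is not reproduced in these notes. A sequence commonly used for this kind of build, offered as an assumption rather than as the author's exact steps, is sketched below. The function build_and_install and the MAKE override are made up for this example; the make targets themselves are the stock ones from the kernel Makefile.

```shell
# A common sequence for a standard build; the targets are the stock ones
# listed by "make help". Set MAKE=echo to preview without building.
build_and_install() {
    src="$1"                        # root of the kernel source tree
    make_cmd="${MAKE:-make}"
    # 1. Compile the kernel image and every module selected in .config.
    "$make_cmd" -C "$src" || return 1
    # 2. Copy the .ko files into /lib/modules/<kernel release> (needs root).
    "$make_cmd" -C "$src" modules_install || return 1
    # 3. Copy the kernel image into /boot (needs root).
    "$make_cmd" -C "$src" install
}

# build_and_install /usr/src/linux-source-xyz
```

On a machine with several cores, adding -j followed by the core count to the first make call usually shortens the compile step considerably.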

Run make help in the root of the build source for information on the build targets. Modules are each a separate file, with a .ko extension. They are installed by being copied into the special directory /lib/modules/[kernel revision name]. Kernel modules are parts of the kernel that are dynamically loaded as needed.
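A quick way to confirm that the modules landed where expected is to count the .ko files under the directory for a given kernel. This is only a sketch: count_modules is a hypothetical helper, and the wildcard also matches compressed modules (for example .ko.xz), which some releases use.

```shell
# Count the module files installed for one kernel release.
# modules_root is a parameter only so the function can be tried on a scratch tree.
count_modules() {
    release="${1:-$(uname -r)}"          # defaults to the running kernel
    modules_root="${2:-/lib/modules}"
    # Match plain .ko files and compressed variants such as .ko.xz.
    find "$modules_root/$release" -type f -name '*.ko*' | wc -l
}

# count_modules            # modules for the running kernel
# count_modules 3.2.0-burt # modules for a kernel named with a local suffix
```

A count of zero after make modules_install suggests the install step did not run or used a different release name.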

Note that once you have built, as long as you do not run make clean or update the configuration file, make compiles and links only what has changed. This should take only a few minutes.

Booting your kernel

A RAM disk is a disk-like storage system that uses RAM, rather than a magnetic disk, as the final destination for the data. It is preferred for speed, but it is volatile: its contents go away on reboot. Sometimes that's a plus, as with temporary files that need to be erased on reboot anyway.

Linux uses a very intelligent buffer cache to shuttle data blocks to and from the disk. Each buffer in the buffer cache remembers which disk block it mirrors. A read from disk can be satisfied without going out to the disk if the block has recently been read or written and the buffer that held it is still in the cache.

With such a sophisticated buffer cache, it was discovered that a RAM disk can be created more efficiently by stopping the transfer at the buffer cache, and marking the buffer in the cache which carries data to the RAM disk as "permanent". No need to copy an in-RAM buffer in the buffer cache to an in-RAM buffer in the RAM disk. Just sitting permanently in the buffer cache is sufficient to create an in-RAM file system.

This scheme was christened the ramfs.

After building and installing the kernel and the modules, you need to build the initial root directory. In case you google for additional help on this, beware that the initial root directory process has changed recently. It now uses a simple cpio into a ramfs, rather than imaging a RAM-disk.

At boot, the boot loader knows only enough about reading from disk so that it can copy the kernel from a file on disk into the memory, and then begin running the kernel (early boot). The kernel itself might rely on modules to provide the code necessary to access the file systems on disk. These modules are files that might reside on the file systems which the kernel cannot yet read.

Linux solves this problem by keeping a partition on disk in a file system format that the boot loader and the initial kernel know how to read. Generally this is ext2, and the partition is later remounted into the runtime file tree at /boot. The kernel is stored in this partition, along with an archive of the initial file system that the kernel requires during early boot. The kernel creates a ramfs, unpacks the archive into it, and mounts it as the initial root directory.

This root serves the kernel during early boot, but is later jettisoned when the real root directory is discovered on disk and mounted over the initial root, called re-rooting or pivoting on root.

The process of creating an initial root directory is automated by the update-initramfs script. It creates a file called initrd.img-[kernel revision name] in the /boot directory. You need to run this program or your kernel won't complete boot. If you watch your boot output carefully, you will see that the boot process goes well up until the first access to the file system. The error messages will be about not finding a root file system, or not recognizing it. That means the initrd was not properly formed or placed.
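Before rebooting, it can help to confirm that the three pieces for the new kernel are in place. The sketch below is an assumption-laden check, not part of any Ubuntu tool: check_boot_files is a made-up name, and the vmlinuz-[kernel revision name] file name is the convention Ubuntu's install step normally uses.

```shell
# Check that the kernel image, the initrd and the module directory exist
# for one kernel release. The optional second argument is an alternate root
# directory, so the check can be tried on a scratch tree.
check_boot_files() {
    release="$1"          # e.g. 3.2.0-burt
    root="${2:-}"
    status=0
    for path in "$root/boot/vmlinuz-$release" \
                "$root/boot/initrd.img-$release" \
                "$root/lib/modules/$release"; do
        if [ -e "$path" ]; then
            echo "ok      $path"
        else
            echo "MISSING $path"
            status=1
        fi
    done
    return $status
}

# check_boot_files 3.2.0-burt
```

If only the initrd is reported missing, running update-initramfs -c -k followed by the kernel revision name, as root, creates it.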

A single Linux install can boot to any of several kernels. You will not be overwriting the old kernel with the new kernel. You will be providing an option at boot to run either the old kernel or the new kernel. Since the modules are in directories with names that correspond to the kernel names, the option will be to run the new kernel and all the new modules or the old kernel and all the old modules.

Kernel names differ because of the various revision numbers, or because you added a local suffix to the name, such as "-burt". (Note hyphen as first character.) You can specify a local suffix as an option in the .config file, using menuconfig.

The boot loader boots the system. There are three boot loaders you might see: lilo, grub and grub2. The Ubuntu we are using installs grub2. Before rebooting, you have to inform grub2 of your new kernel, so it can put it in the boot menu. You also have to modify its behavior slightly. By default, grub2 does a quiet boot, and it does not show the boot menu. (Booting with the shift key held down will show the boot menu, and I think it cancels quiet boot.)
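On Ubuntu, both changes are normally made in /etc/default/grub, followed by update-grub to regenerate the menu. The sketch below prints an edited copy of that file rather than changing it in place. The function name show_menu_and_messages is made up, and which of the GRUB_ variables are present depends on the Ubuntu release, so treat the exact set of edits as an assumption to check against your own file.

```shell
# Print an edited copy of the grub defaults file that shows the boot menu
# and drops the quiet boot. Nothing is written back to the input file.
show_menu_and_messages() {
    grub_defaults="${1:-/etc/default/grub}"
    # - comment out the hidden-menu settings used by older releases
    # - ask for a visible menu where the newer setting is used
    # - clear the default kernel arguments ("quiet splash")
    sed -e 's/^GRUB_HIDDEN_TIMEOUT/#&/' \
        -e 's/^GRUB_TIMEOUT_STYLE=.*/GRUB_TIMEOUT_STYLE=menu/' \
        -e 's/^\(GRUB_CMDLINE_LINUX_DEFAULT=\).*/\1""/' \
        "$grub_defaults"
}

# show_menu_and_messages > /tmp/grub.new   # review the result first
```

After reviewing the output, copy it over /etc/default/grub as root and run update-grub, which also scans /boot and adds the new kernel to the menu. GRUB_TIMEOUT should be a positive number of seconds, or the menu will not stay on screen.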

Now you are ready to try your new kernel. Type reboot and your machine will reboot. At the boot menu, select your kernel. Hopefully it will boot properly. Once booted, use uname -r to see the revision number of the kernel you are running. If it is yours, congratulations!
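If you gave your kernel a local suffix, the check can be scripted. This is a small sketch: is_custom_kernel is a hypothetical helper, and "-burt" stands in for whatever suffix you chose.

```shell
# Succeed when the running kernel's release string contains the local suffix.
# The second argument exists only so the check can be tried with a sample string.
is_custom_kernel() {
    suffix="$1"                       # e.g. -burt
    release="${2:-$(uname -r)}"
    case "$release" in
        *"$suffix"*) return 0 ;;
        *)           return 1 ;;
    esac
}

if is_custom_kernel "-burt"; then
    echo "running the custom kernel: $(uname -r)"
else
    echo "still on another kernel: $(uname -r)"
fi
```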

Grub boot screen

Changes in Grub2 make it likely that your kernel can be selected by going to a submenu, labeled "Previous Linux versions" in this image. For people going back and forth between kernels, this is unfortunate. It is difficult to convince Ubuntu to place all (or even a few more) kernels on the main boot menu.