Tales from a Core File

Three years ago I first came to Sun Microsystems working in Microelectronics and had a chance to work on verification for KT – what most of us now know as the SPARC T3. After a great time there, I found myself given the opportunity to join the Fishworks team twice, first for an unforgettable internship and again to continue on working full time. Throughout all of this I have had the opportunity to work with a number of outstanding engineers, seen innovative products and technologies thrive and launch, and learned innumerable things. I wouldn’t choose to spend my time at Fishworks any other way.

It has come time to move on and do something else. I am going to be switching gears and doing something different outside of Sun/Oracle. I’ll be heading off and tackling something new with a different company and some familiar faces.

One of the things I’ve worked with a decent amount is prototyping filesystems in FUSE. For the uninitiated, FUSE allows you to implement the traditional VFS API in userspace. Of course, running your filesystem in userspace comes at an obvious performance cost: serializing to and from userland is not the cheapest operation. However, a lot of filesystems that are not native to a platform are being written in FUSE. For example, stock Ubuntu ships with NTFS-3G, an NTFS implementation in FUSE. Similarly, if you want ext2 support on OS X, there is a FUSE filesystem that supports this.

However, those are some of the more mundane uses of FUSE. Rather, one could look at FUSE as a way to go about reimplementing a lot of the ideas from Plan 9 from Bell Labs. Plan 9 really added a lot of interesting things to the filesystem namespace. In particular, take a look at exportfs, ftpfs, /proc, etc. The scope was very broad: the goal was to provide a file interface to a lot of different things. While in some cases this may not have been the right way to go, it provides a fascinating look into where the original Unix guys thought to go next. You should definitely read up more about Plan 9 here.

While the means of implementation is obviously vastly different, what you can do with FUSE is give really anything a file interface or filesystem that makes sense. Perhaps one of the more useful FUSE-based filesystems is sshfs, which is similar to the ftpfs mentioned above. This allows you to effectively mount a system that you have ssh access to and browse it like any other filesystem. There are a lot of interesting things you could present through a filesystem interface.

Now, while there is a plethora of such systems and other things you can represent, there is a more interesting thing you can do: rather than extending the namespace, change what happens when you access some portion of it. This can be done thanks to a new round of system calls that were defined and published as part of a 2008 standard from the Open Group (POSIX.1-2008). These include functions like openat, unlinkat, mkdirat, fstatat, etc. The main difference between these functions and their non-at forefathers is that they all take a file descriptor and a path relative to that file descriptor to determine what to operate on.
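To make the difference concrete, here is a minimal sketch of fstatat resolving a name relative to a directory descriptor instead of the current working directory. The helper name size_relative_to is made up for illustration; only open, fstatat and close are real calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return the size of the entry `name` inside the directory `dirpath`,
 * or -1 on failure. (Hypothetical helper, for illustration only.)
 */
static off_t
size_relative_to(const char *dirpath, const char *name)
{
        struct stat st;
        int dfd, ret;

        /* Pin the directory with a descriptor */
        dfd = open(dirpath, O_RDONLY | O_DIRECTORY);
        if (dfd < 0)
                return (-1);

        /* Resolve name relative to dfd rather than the cwd */
        ret = fstatat(dfd, name, &st, 0);
        (void) close(dfd);

        return (ret == 0 ? st.st_size : -1);
}
```

The important part is that once dfd is open, the lookup no longer depends on what the path to that directory currently resolves to.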

With these, what you can do is open a file descriptor to the directory you want to mount over before you call into FUSE. That way, when FUSE mounts over that part of the filesystem namespace, you can still access what lies underneath. This means that you now have the ability to interpose on all of the various system calls coming into this filesystem while still presenting all the data that the underlying filesystem has always had. Your main method might look something like this:

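#define FUSE_USE_VERSION 26     /* assumed API level for the 4-argument fuse_main */
#include <fuse.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Descriptor for the directory underneath the mount point */
static int rfd;

/* Handler table; the callbacks themselves are filled in elsewhere */
static struct fuse_operations fuseops;

int
main(int argc, char *argv[])
{
        /* Throw in some sanity checking, expect mountpoint as last arg */
        rfd = open(argv[argc-1], O_RDONLY);
        if (rfd < 0) {
                fprintf(stderr, "Unable to open mountpoint - %s\n",
                    strerror(errno));
                return (1);
        }

        return (fuse_main(argc, argv, &fuseops, NULL));
}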

This opens up a lot of possibilities for what we can do with our filesystem. Of course, not quite all of it makes sense to be done in this form. That is something that we’ll bring up and talk about in more detail later on.
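As a rough sketch of what one of the interposing handlers could look like, here is a getattr-style function that answers from the underlying directory through the saved descriptor. The names under_path and under_getattr are hypothetical, and the descriptor is passed as a parameter only to keep the example self-contained; in a real FUSE handler it would be the global rfd from main.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

/*
 * FUSE hands handlers paths that are absolute within the mount
 * ("/", "/a/b"). Turn one into a path relative to the saved descriptor.
 */
static const char *
under_path(const char *path)
{
        while (*path == '/')
                path++;
        return (*path == '\0' ? "." : path);
}

/*
 * Hypothetical getattr-style handler: stat the underlying entry through
 * dfd, the descriptor opened before mounting. Returns 0 or -errno, which
 * is the convention FUSE handlers use.
 */
static int
under_getattr(int dfd, const char *path, struct stat *stbuf)
{
        if (fstatat(dfd, under_path(path), stbuf, AT_SYMLINK_NOFOLLOW) != 0)
                return (-errno);
        return (0);
}
```

Any logging, filtering, or rewriting you want to interpose would go before or after the fstatat call.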

A while back I was feeling like experimenting with a BSD just for a change of pace. My first inclination was to take a look at FreeBSD because of the ZFS and DTrace support that it has. I had heard the news that Debian/kFreeBSD was going to be considered stable and get the same treatment for security as the traditional GNU/Linux Debian Distributions. One of the things I’ve always liked about Debian is that they do try and stay active with security with their stable releases, which is what you want when you have a machine that is going to be out there braving everything the Internet might throw at it. So I decided to give it a whirl and investigate its current state. While the main installer doesn’t have ZFS support, the Debian Developer Robert Millan put together an installer with ZFS support.

So I booted up and went through the installer. One currently known limitation of the installer is that you can’t create a ZFS mirror, though you can simply zpool attach a disk later to create one. Be careful, though: some versions of GRUB2, which Debian/kFreeBSD is using, don’t handle mirrors well. From here, I wanted to get the familiar help message that comes from running dtrace with no arguments. You can imagine my surprise when I ran:

~ $ dtrace
-bash: dtrace: command not found

Well that’s no good. A quick scour of the package list confirmed my suspicions that there was no DTrace package. While obviously people end up thinking about something like ZFS much more than they might DTrace, after you get used to using DTrace it isn’t something you want to give up.

So, I began to take a look at what it might take to get DTrace working on the system. There are two major hurdles. The first comes in the form of making sure that the kernel components are in place. The second part is getting all the userland pieces in place: libdtrace, dtrace(1), etc. Now the first part is substantially easier than the second. One of the decisions that Debian/kFreeBSD made was switching from the BSD libc to glibc, as well as some other changes to make things fit more inside the Debian system.

The first step was to see if we had the kernel module and maybe it wasn’t loaded. To deal with this I ran:

# kldload dtrace

Syslog had some nasty messages waiting for me:
link_elf_obj: symbol lapic_cyclic_clock_func undefined
linker_load_file: Unsupported file type
KLD dtrace.ko: depends on cyclic - not available or version mismatch
linker_load_file: Unsupported file type

Well, time to go try and take a look at rebuilding the kernel and get the cyclic module in there. To get started, you’re going to have to get several packages for building the kernel. This should be most of the ones that you need from a fresh install, though some may be missing: dpkg-dev debhelper quilt sharutils flex gcc-4.3 libdb-dev libbsd-dev libsbuf-dev.

The next thing you need to do is set up the appropriate environment. The FreeBSD build utils are all prefixed with freebsd-, but the kernel compilation calls the traditional make, lex, yacc, etc., which resolve to the GNU counterparts. As you could imagine, this causes some issues. The trick is to alias, symlink, or use whatever method you prefer so that all of the commands that start with freebsd- are reachable without the prefix, i.e. freebsd-make will be the make you call.

Now we need to get the source to actually work with. So find a directory that you want to work in and run the following:

# apt-get source kfreebsd-source-8.1

This gets us the source of the package so we can eventually build a new kernel and install it in the proper Debian fashion. From here you need to cd into kfreebsd-8-8.1/debian/arch/amd64/. Note the directory may not be kfreebsd-8-8.1 and may have a slightly different version number depending on when you go about this. Fire up vim (or your other text editor of choice) and edit the file amd64.config.

Add the following lines:

  • option DDB_CTF
  • makeoptions WITH_CTF=1
  • option KDTRACE_HOOKS
  • option KDTRACE_FRAME (Only necessary for amd64 based kernels)

Now that we’re all set with this, time to let rip! From the root of the package, i.e. kfreebsd-8-8.1 in this case, run the following:

# dpkg-buildpackage -B -uc

This may say you don’t have a needed dependency; if that happens, install it and try again. If it doesn’t, then just go for it and let rip. Once this is done, you should see a lot of packages one directory level up. From here, install the image. Once that is all set, time to reboot and switch kernels.

Now that we have rebuilt and installed our kernel, we can go ahead and verify that we actually managed to do things correctly. The first trick is to go ahead and again run:

# kldload cyclic
# kldload dtrace

This should work without any scary error messages in dmesg. If it didn’t, you’ll want to confirm you booted into the right kernel and do all those fun things.

We have a kernel module, where next?

Well there are still two major things to get working. The first and more important is that we need the userland pieces for DTrace to work. Unfortunately the transition between the BSD libc and glibc is not a painless one and may be something I take a look at again at a later date. The second piece of the puzzle is to get CTF data in place. Unfortunately this requires that we have the ctfconvert and ctfmerge tools, which are not part of the stock Debian install. These will get built as a part of getting all the userland DTrace tools in place.

Welcome to this humble blog. I’m currently an engineer on the Fishworks team. I’ll have some interesting things to talk about from the work I do. Before that I graduated from Brown’s CS Department. During my time there I did research with Prof. Tom Doeppner, I helped run the TA Program, and TAed several courses, most notably the Operating Systems Course and the Distributed Systems Course.

When I have spare cycles to spend on technical projects I like to look at silently layering filesystems via FUSE, experimenting with media control via mplayer and Wiimotes, and tinkering around with systems in whatever seems intriguing.

Many thanks to Bryan for giving me some space up here.
