Operating Systems and Personal Computing

“An operating system is a collection of things that don’t fit into a language. There shouldn’t be one.”

–Dan Ingalls, Design Principles Behind Smalltalk, Byte Magazine, August 1981.

Background

Back around 1994 or so I got interested in operating system kernels. A friend had lent me the book on Minix, including complete source code, and I started building my own software heavily inspired by the Minix 1.5 way of doing things. My first attempt was actually a standalone bootable Scheme-like-language interpreter for 8086-class machines, a simple modification to one of the little experiments I had lying around at the time; sadly, I’ve lost the code to that system (not that it was particularly impressive).

The next few iterations were all much more Minix-like than that first experiment. All of them were for 8086-class systems in real mode. By the end of the series, the system had a hard-disk driver (read-only: I only had one disk, and no backups!), a simple FAT-compatible file system server, and a couple of “hello world” applications.

A couple of years later I got my first 386, and in 1996 I found a text-file copy of the Intel 80386 Programmer’s Reference Manual online and started building a protected-mode kernel, using an early Linux kernel as a kind of cheat-sheet for when I got stuck. As I was at university at the time, the project didn’t move anywhere near as quickly as previous iterations, and in 1998 I graduated, got a job, and stopped work on the system entirely.

The code lay dormant for ten years, until 2008, when I picked it up again and drove it forward to its current state: a booting, protected-mode kernel for Pentium-or-better machines, with one device driver (the keyboard) and a simple message-passing facility. I experimented very briefly with constructing an L4-like IPC facility, but quickly realised that I’d reached a point where further work on this codebase, written in C as it was, would be futile, given my (sketchy) plans for a to-the-metal reflective language environment.

Roughly what I’m aiming for

Something like SqueakNOS, or a Lisp machine, but with a language (or library, if you like) that is built around:

shared-nothing isolation boundaries
capability-based security
message passing
recursively nested virtual-machines/images
a different notion of global-variable/file-system (Smalltalk’s global dictionary is too simple)

Within the Smalltalk world, Spoon comes close to the kind of thing I have in mind.

Isolation boundaries, capabilities, and recursive nesting

Unix’s isolated processes are a great start, but they don’t nest; this is like a flat file system. You can always simulate a directory structure by encoding file paths into a non-hierarchical file system, but it really isn’t like the real thing; in the same way, you can simulate nested virtual-machine structure with a flat process table, but it really isn’t quite right. Witness the recent rise of virtualization technology: clearly, while processes give some of the illusion of a program having the whole computer to itself, they don’t quite close all the nasty little corner-cases.
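To make the flat-versus-nested contrast concrete, here is a rough Python sketch. Everything in it (the Vm class, spawn, kill, flat_kill) is hypothetical and invented for illustration; it is not code from the kernel described above. The tree version gets "killing a machine kills everything inside it" from its structure, while the flat version has to recover the nesting by parsing path-like names.

```python
# Hypothetical sketch: nested virtual machines as a tree, versus a flat
# process table that only encodes nesting in path-like names.


class Vm:
    """A toy virtual machine that can contain child machines."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.alive = True

    def spawn(self, name):
        """Create a machine nested inside this one."""
        child = Vm(name, parent=self)
        self.children[name] = child
        return child

    def kill(self):
        """Stop this machine and, recursively, everything nested in it."""
        self.alive = False
        for child in self.children.values():
            child.kill()

    def path(self):
        """Name of this machine relative to the outermost one."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


def flat_kill(table, prefix):
    """The same operation on a flat table keyed by path strings.

    The nesting exists only as a naming convention, so the code has to
    scan every entry and compare prefixes.
    """
    for path in table:
        if path == prefix or path.startswith(prefix + "/"):
            table[path] = False


root = Vm("root")
editor = root.spawn("editor")
plugin = editor.spawn("plugin")
shell = root.spawn("shell")

editor.kill()  # takes the plugin down with it, leaves the shell alone
```

In the flat version it is easy to get the convention subtly wrong: forgetting the trailing "/" in the prefix test would also kill an unrelated entry named "root/editors". That is the kind of nasty little corner-case the tree version does not have.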

Smalltalk’s images are ‘object soup’. There are no internal isolation boundaries. The only reason Smalltalk images survive as well as they do is discipline on the part of the programmer. In this respect, they’re very similar to DOS and to early versions of RISC OS.

I’d like to look at ways of extending a language calculus (e.g. the lambda- or pi-calculus) with enough machinery that it can represent its own environment: to make a meta-aware calculus. Cardelli and Gordon’s ambient calculus could be an interesting starting point, given that it has a primitive awareness of the location a computation takes place within, but it’s not quite what I want, since it has no equivalent of the 3-Lisp metainterpreter or of Smalltalk’s Interpreter and ObjectMemory classes. The rho calculus of Meredith and Radestock could be a good starting point for approaching those.

The E language uses object capabilities for security. They are overwhelmingly the simplest yet most effective access control mechanism I have ever heard of. Capabilities should form the core of the IPC mechanism of the system; this has ramifications for addressing and directory services, as well as for the reflective facilities offered by the environment.
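As a small illustration of the object-capability idea, here is a Python sketch of two standard patterns: handing out separate capabilities for separate powers, and wrapping a capability in a revocable forwarder (often called the caretaker pattern). The names make_counter, make_revocable and Revoked are made up for this example. Note the large assumption: Python does not actually enforce any of this, since closures can be inspected, so this shows only the shape of the mechanism, not a secure implementation.

```python
# Hypothetical sketch of object capabilities using closures.
# Holding a function reference is the only way to exercise the power
# it represents; there is no ambient "counter" to look up by name.


class Revoked(Exception):
    """Raised when a withdrawn capability is used."""


def make_counter():
    """Return two separate capabilities to one hidden counter."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    def read():
        return count

    return increment, read


def make_revocable(cap):
    """Caretaker pattern: wrap a capability so it can be withdrawn later."""
    target = cap

    def forwarder(*args, **kwargs):
        if target is None:
            raise Revoked("capability has been revoked")
        return target(*args, **kwargs)

    def revoke():
        nonlocal target
        target = None

    return forwarder, revoke


increment, read = make_counter()

# Give a guest read-only access that we can take back.
guest_read, revoke = make_revocable(read)

increment()
seen = guest_read()  # the guest can observe the counter...
revoke()             # ...until the grant is withdrawn
```

The guest never receives increment, so it cannot modify the counter, and after revoke() its forwarder is useless while the original read capability keeps working. An IPC mechanism built on capabilities would presumably pass such references inside messages rather than looking targets up in a global table.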

Message passing

Smalltalk gets off to a great start, with ‘objects all the way down’: a uniform message-passing model. It doesn’t have a very sophisticated addressing or layering topology, though; it’d be interesting to apply some of the lessons of AMQP and John Day’s NIPCA to extend message passing to distributed systems. How are nested virtual-machines addressed? Does it correspond at all with the addressing for inferior/supporting Day-style networking layers? If so, why?

Namespace management

Smalltalk’s global variable dictionary (in practice holding classes and little else) is far too simple for even a single-user operating system. Unix barely gets by with its hierarchical database, the filesystem. Past operating systems, for instance Pick, have experimented with full databases as the global store for applications; what kinds of database suit a meta-aware system? What kinds of “global” (really, per-isolated-process) database fit well with capability-based security? What kinds of names line up with modern DVCSs? Is it possible to name code by some DVCS-related identifier? Perhaps Zooko’s Triangle is involved here? How does all this interact with language vocabularies (like XML namespaces) and with language module and package systems?
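One concrete reading of "naming code by a DVCS-related identifier" is content addressing, the way Git names blobs: the name of a piece of data is a hash of its bytes. The sketch below computes Git's blob identifier (SHA-1 over a "blob <length>\0" header followed by the content) using only Python's standard hashlib. The ContentStore class is a hypothetical toy added for illustration, not part of any real system.

```python
# Sketch: naming data by its content, using Git's blob hashing scheme.
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the identifier Git would assign to a blob with this content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


class ContentStore:
    """Hypothetical toy store in which every object is named by its hash."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        """Store the data and return its content-derived name."""
        name = git_blob_id(data)
        self._objects[name] = data
        return name

    def get(self, name: str) -> bytes:
        """Fetch by name, checking that the content still matches it."""
        data = self._objects[name]
        if git_blob_id(data) != name:
            raise ValueError("stored content does not match its name")
        return data


store = ContentStore()
name = store.put(b"hello world\n")
```

Names like these are global and self-verifying, but they are not human-meaningful, which is exactly where Zooko’s Triangle comes in: some separate, local mapping from friendly names to hashes is still needed, and that mapping is itself a candidate for the per-isolated-process database discussed above.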