Microkernel Characteristics and Features

What is a Microkernel

Early computers ran single jobs one at a time. The programs for these early systems often included all the software necessary to support both the program and the computer hardware. It wasn’t until computer users wanted to share the computer for several jobs running simultaneously that the concept of a privileged mode kernel was born. The responsibility of such a kernel was to partition the machine to allow more than one job to run, switching between the jobs when any one of them was waiting for some I/O operation to complete. These kernels grew in size and complexity as they incorporated the concepts of files, shared memory, inter-job communication, and other features. Modern kernels are generally large, complex, and expensive to maintain.

A Microkernel is an attempt to return to the concept of a small kernel without losing the ability to share the computer among several jobs. To accomplish this goal, all of the functions that implement files, shared memory, inter-job communication, and other features must be removed from the kernel. The kernel must be designed to allow the complete separation of jobs and to allow the implementation of privileged functions (such as file and memory allocation) outside of the kernel. Several of these Microkernels have made appearances recently.

All of the modern Microkernels are based on messages sent by the tasks running on the computer. Some of the messages are directed at the Microkernel and some are directed at other tasks in the system. The Microkernel’s responsibility is now limited to allocating the memory and CPU resources among the various tasks and to the routing of messages between tasks. All of the tasks on the system run in the non-privileged instruction set of the computer All privileged function is obtained via messages and not because some of the tasks are granted special privileges (access to the disks, for instance) by virtue of being recognized by the kernel as a special task.

The KeyTECH Microkernel

The KeyTECH Microkernel is based on objects, capabilities, and messages. Objects are an encapsulation of some data and a program that manipulates that data. Objects are generally programs outside the Microkernel, much like the servers of some systems or the tasks of other systems. Capabilities are special tokens that designate Objects and allow the holder to send a message to the designated Object. Messages are composed of a string of bytes and 4 capabilities.

What sets the KeyTECH Microkernel apart from other Microkernels is the pervasive use of capabilities to the exclusion of other more traditional mechanisms. While a complete description of the KeyTECH architecture is available in KeyKOS Architecture (KL028), a summary is provided here. There are many details omitted from this summary in order to make it concise.

The Microkernel is responsible for allocating the CPU and memory resources of the computer system to tasks. To this end, the KeyTECH Microkernel provides Nodes and Pages. Pages are used to store 4096 bytes of data (including programs and program variables), and Nodes are used to store 16 capabilities (called Keys in the KeyTECH system). Every other construct is represented by a Node or collection of Nodes and the Keys in those Nodes.

Objects

An Object is made up of a Domain, a Segment, and a Meter, each one of which is itself an Object (a Primary Object). The Domain is the description of a task and is the dispatching unit. A Domain is much like a virtual machine and has an address space described by a Segment and a Meter that allocates the CPU resources for the Domain. A Domain is described by three (3) Nodes and the Keys in those Nodes. The root Node of the Domain is what is designated by any capability that designates the Object which the Domain implements. In the root Node are Keys to two other Nodes. The keys Node contains the Keys that the Domain may use and the register Node is where the state of the Domain is stored when it is not running. The root Node also contains a Key that designates the Segment that defines the address space and a Key which designates a Meter. All manipulation of the attributes and state of the Domain is accomplished by altering the Keys in these three Nodes. All state information about the Object, the process running in the Domain, and the program variables of the Domain is stored in the Nodes and Pages that make up the Object. There are no other mechanisms.

Segments

A Segment is a collection of Nodes and Pages that describes an address space. A Segment is an Object in its own right and need not be part of any other Object. A Segment is often used as a file to store temporary or permanent data. However, when the Key to a Segment is found in the correct slot (there are 16 slots in a node) of the root Node of a Domain, the Segment becomes the address space of the Object. A Page is the smallest Segment. Larger Segments are a tree of Nodes with Pages at the leaves of the tree. Each node describes a span of 16 sub-Segments. The Segment may be sparse, and any portion of the tree may overlap with any other Segment with the same or different attributes (Read-Only or Read/Write). All manipulations of the Segment (and the resulting address space) are accomplished by altering the Keys in the tree of Nodes. All information describing the size of the Segment, the portions of address space that are backed by pages, and the data in the Segment is stored in the Nodes and Pages that make up the Segment. There are no other mechanisms.

Meters

A Meter is a Node with a Key to a super-ordinate Meter and some Keys that represent counts of available resources. If and only if the CPU resource counter is non-zero, the Object will be scheduled for the CPU. As in Domains and Segments, all information about the meter is represented as Keys in the Node.

Keepers

Each of these three Primary Objects (Domain, Segment, and Meter) has a Key that designates another Object that is the Keeper of the Object. In the case of the Domain, the keeper Key is in the root Node. In the case of the Segment, the keeper Key is in one of the slots of a Node that normally holds a sub-segment Key. In the case of the Meter, the keeper Key is in one of the slots in the meter Node. The Keeper receives a message whenever the kept Object encounters some exception. This message is fabricated by the kernel and is sent to the Keeper as if the Object with the exception sent the message explicitly. When the exception is processed by the Keeper, the Keeper may restart the Object by sending it a message using the Key provided by the Microkernel.

The Segment Keeper handles exceptions related to the address space of a Domain. These exceptions include attempts to write on a read-only page and attempts to reference an undefined address (an address that is not defined by a Page Key in some Node in the memory tree). The Domain Keeper handles CPU exceptions such as illegal instructions, divide by zero, and memory exceptions not handled by a Segment Keeper. The Meter Keeper handles conditions when a metered resource is exhausted.

Messages

Objects communicate by sending messages. All messages are sent synchronously. When a Domain invokes a Key, the message is copied directly from the invoker to the invokee. If the destination Object is not ready to receive the message (it is currently busy processing a previous message), then the invoker is backed up and will attempt to send the message again when the receiver is ready.

There are three ways to invoke a key. A CALL invocation sends a message and waits for a reply. In order to allow the invoked Object to return to the invoker, the kernel provides a special Key called a Resume key as the fourth key in the message. This key is used by the invoked Object (or any Object that is passed the resume Key) to reply to the message. A RETURN invocation sends a message and leaves the Domain available to receive another message. A RETURN is like a reply. A FORK invocation sends a message and continues running without awaiting a reply.

System State

The system outside the Microkernel is completely described by the contents of Nodes and Pages. This state includes files, programs, program variables, instruction counters, I/O status, and any other information needed to restart the system at any instant in time. It is this feature of the KeyTECH system that allows for system-wide checkpoints.

The Microkernel itself does not have any state information that is not derived from the state in Nodes and Pages. For reasons of efficiency, the Microkernel does reformat state information in private storage. However, the private storage is merely a cache of the state information and can be recycled at any time. When the discarded information is needed again, it is reconstructed from the information in Nodes and Pages. This lack of storage allocation in the Microkernel has several ramifications.

It is faster, since no complicated storage allocation code is ever run.
The Microkernel never has to worry about running out of space.
There is no Microkernel storage (such as message queues) that must be a part of the checkpoint.

Any time that a Domain cannot proceed (wait for I/O, wait for Caller to become ready, ...), the Microkernel backs up the instruction counter of the Domain so that it will reissue the instruction that caused the blockage.

Checkpoints

All Nodes and Pages have a permanent position on the disk called the home position. When a Node or Page is referenced, it is brought into main memory. If the Node or Page is changed, it may be written into a Swap Area if the main memory is needed for some other Node or Page. The Swap Area is always checked first for the most recent version of a Node or Page that is not in main memory. When a checkpoint is taken, any Nodes or Pages in main memory that have been changed since they were last read in from disk are written to the Swap Area. At this time the complete state of the system is on disk. The system can be restarted from that state, in which case all processes would resume running as if no interruption had occurred.

After the checkpoint is complete, the system resumes running but has switched to a second Swap Area. The Pages and Nodes in the first Swap area are migrated back to their home positions. When the migration is complete the system may take another checkpoint. All of the checkpoint and migration activity is cleverly hidden from the users. As long as there are some unused CPU cycles and unused disk I/O bandwidth, the user will not notice the checkpoints.

It is worth pointing out that the checkpoint is not simply of files, but consists of all processes as well. If an update of a file involves two different pages and only one of the pages has been modified at the time of the checkpoint, the file will not be damaged if the system is restarted. When the system is restarted the process that was performing the update is also restarted and the second page of the file is modified as if there had been no interruption. A power outage or hardware fault does not leave the system in some confused and damaged state. The state at the last checkpoint is completely consistent and the system may be restarted from that state without concern about damaged files. There are journaling facilities to handle the logging of transactions between checkpoints.

Other Microkernels

What distinguishes the KeyTECH Microkernel from other Microkernels is the total reliance on capabilities without any other mechanisms. There are no other mechanisms that add complexity to the ideas or to the implementation. In a KeyTECH system, a “task” is created, described, and manipulated by capabilities. KeyTECH itself knows very little about such tasks. From the KeyTECH view there are only Nodes and Pages. Some of the Nodes are on the CPU queue or various waiting queues and thus have a process associated with them. Some Nodes are referenced from the root Node of a Domain (one of those Nodes on the CPU queue) and therefore describe memory. If a Node describes memory, KeyTECH will build the mapping structures required by the hardware.

KeyTECH knows nothing of files; in fact, it does not allow Domains to reference the disks explicitly at all. From the Domain’s view there is simply permanent memory.

In most other Microkernels there are concepts associated with “tasks”, “threads”, and even files, though file concepts are fairly abstract. There are often data structures in the Microkernel that hold valuable state information such as a queued message. KeyTECH does not queue messages.

It is the simplicity of concepts that allows KeyTECH to support the system-wide checkpoint, the total reliance on capabilities that provides the basis for a wide variety of security policies to be implemented, and the small size and simplicity that make it possible to provide a reliable and fast Microkernel.