Kernel Objects

This is to record some notes about the implementation of parts of the kernel. We will concentrate on kernel objects at first. This may perhaps grow to be a major addition to the kernel-logic manual.

Kernel objects are supposed to act very nearly the same as objects defined by domain code.

There are these restrictions on kernel objects:

They require a resume key if the response from the object is to be received.
The string argument is padded with zeros whenever the object needs more bytes than are in the argument.

Both of these are conventions that could be overridden but it would be difficult to contravene the first.
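The zero padding convention can be sketched in a few lines of C. This is only an illustration of the rule stated above; the helper name fetch_arg_padded and its parameters are made up for this note and are not taken from the kernel source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a kernel routine: deliver `want` bytes of the
 * message string into `dst`, supplying zeros for any bytes that the
 * argument (of length `arglen`) does not contain. */
static void fetch_arg_padded(void *dst, size_t want,
                             const void *arg, size_t arglen)
{
    size_t have = arglen < want ? arglen : want;

    if (have > 0)
        memcpy(dst, arg, have);                  /* bytes actually supplied */
    memset((char *)dst + have, 0, want - have);  /* pad the rest with zeros */
}
```

An object that wants a four byte parameter but is handed a two byte string would, under this sketch, see the two supplied bytes followed by two zero bytes.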

The returner object was originally invented to allow domain code to define objects that behaved like kernel objects in this regard.

This is an attempt to describe the logic of the kernel sufficiently to enable a programmer to add function without first having to read the whole kernel, or even code by analogy. The design of the kernel has largely sacrificed modularity for performance and the kernel programmer must keep track of necessary data invariants as he or she adds code.

The routine “check” is fairly easy to read and many bugs introduced by new kernel code will provoke a complaint from check. All of the patterns required by check are explained, I think, in the old kernel logic manual.

The original Keykos kernel was written in 370 assembler. The data layouts of today's C kernel remain remarkably similar to the originals. A great deal of the C version of the kernel was transcribed manually from the assembly listings by someone who was familiar with the whole kernel. The kernel has lived exclusively in a 32-bit world with two’s complement data.

A tour thru some node logic

There is an order on a node to fetch a key. It is a medium difficulty exercise and a guided tour will allow opportunity to point out many interesting features.

When domain code invokes a key it executes a command that causes entry into the kernel. The first kernel code is written in assembler as it involves instructions that no compiler knows of and also register conventions very unlike the habits of compiled code.

There may be a brief evaluation as to whether assembled code can perform the function by itself, but normally steps are taken to prepare for executing compiled kernel C code. This typically means adopting a kernel stack and, in the case of register windows, some sort of truce between domain windows and kernel windows. IHRW (I hate register windows.)

Details of privileged code after the key invocation instruction by the domain code and before execution of the first compiled kernel code are highly machine dependent and will not be further covered here. In the case of the SPARC there are a few generations of debugging hooks here in various states of conditional inclusion.

For our node key invocation the compiled code begins with routine gate in file gatec.c. There are extensive comments there describing what information from the jump has been placed in kernel variables that are accessible to the C code. We adopt in this note the notion that the domain code has produced a message. In short, the registers loaded by the domain code that describe the message have been placed in global variables. Strategic to understanding this is that variable cpudibp always locates the DIB of the running domain. When the routine gate returns, the value in cpudibp may locate another domain, which will then begin to run. This routine, gate, is poorly named, for it is used for all keys, not just gates. Portions of the information of the DIB reside in machine registers as the domain code runs. Other portions are merely close at hand in the DIB for quick access by privileged code that responds to entries to the kernel stimulated by actions of domain code.

When a domain is running there are various kernel states that are devoted to that domain. cpudibp is merely the most prominent. All of the others will necessarily emerge as we follow a gate key invocation.

There are a few things to do even before the invoked key is examined. The routine gate does these steps which include arranging for subsequent uniform access to the portions of the message formed by the domain code.

The first test in gate is to verify that key invocation is permitted. After we have gone thru the typical case we will come back to this case and follow the fault logic as it is a particularly simple case of a domain suffering a fault when jumps (invocations) are disallowed.

An easy step is to set variable je_key to locate the invoked key.

Now we cause the string, if any, of the message to appear at some address in the kernel’s virtual memory.

Next we worry about the case where the domain has pointed to a virtual address holding the string portion of the message and that virtual address is not in RAM or even more obscurely resides in a portion of storage undefined by the domain’s memory tree.

Let’s be optimistic and assume that the string is in domain memory that is even now mapped into the domain’s space. This is indeed the common case. We take the arg_memory branch of the switch statement. The routine map_arg_string is savvy about memory map things and in our case allocates one or two kernel virtual addresses to map over the domain’s message string. The value returned is a pointer in kernel virtual space locating that string.

keyjump

However we get to keyjump, variable cpuargaddr is a kernel virtual address of the message string, at least if the string length is greater than 0.

je_key locates the key being invoked. je_key->type is the value of the key type byte, and je_key->type & keytypemask knocks off three bits that are crowded into the byte but are not part of the type. We switch on that and go about 17 ways. In our case we go to branch “nodekey”. A priori it is not necessary to ensure that the node is prepared, but in actuality it is convenient and not inefficient to ensure that the invoked node is in RAM and the key thereto prepared. In this case the virtual address of the node will appear explicitly in the prepared key. An artifact may be worth noting here. If the node has been deleted since this key was last used, branch prepkey_notobj of the switch provides the illusion that the invoked key was a number key (data key), which it soon will be.
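The masking and dispatch just described might look roughly like the sketch below. Only the names je_key, keytypemask and nodekey come from the text. The mask value, the type codes, the one-byte struct and the handler ids are assumptions chosen to make the example compile, and the real switch has about 17 branches rather than three.

```c
#include <stdint.h>

/* Every numeric value here is an illustrative assumption. */
enum { keytypemask = 0x1f };               /* assumed: low 5 bits hold the type */
enum { datakey = 0, nodekey = 3 };         /* hypothetical type codes */
enum handler { H_DATA, H_NODE, H_OTHER };  /* hypothetical handler ids */

struct key { uint8_t type; };              /* only the type byte is modeled */

static enum handler dispatch(const struct key *je_key)
{
    /* Mask off the three non-type bits before switching. */
    switch (je_key->type & keytypemask) {
    case nodekey: return H_NODE;           /* would lead on to jnode */
    case datakey: return H_DATA;
    default:      return H_OTHER;          /* stands in for the other branches */
    }
}
```

The point of the mask is that a key whose byte carries any of the three extra bits still lands in the branch for its type.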

Finally we call jnode (in file jnodec.c) which is the routine that responds to the various node key invocations.

jnode

Knowing that the key being invoked is prepared, we reach in directly and extract the address of the node frame of the node. This imposing qualified name is due, in part, to the fact that there are various kinds of keys and various states those keys can be in. There is a tree of structs and unions that serves well but does not fit in my head.

The routine jnode1 provides code for those orders in common between node keys and fetch keys, i.e. ad-hoc polymorphism. Our order is in common so we go to jnode1.

We fall quickly into the top branch of the switch statement which performs pointer arithmetic to locate the slot we are to fetch.
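As a rough picture of that pointer arithmetic, the sketch below locates a slot from the order code. The struct layouts, the name fetch_slot, and the assumption that fetch orders are numbered 0 through 15 (one per slot of a 16 slot node) are illustrative and are not copied from jnodec.c.

```c
#include <stddef.h>

enum { NODE_SLOTS = 16 };                  /* assumed slot count per node */

/* Hypothetical layouts; the real key is a tree of structs and unions. */
struct key  { unsigned char type; unsigned char body[15]; };
struct node { struct key keys[NODE_SLOTS]; };

/* Return the slot named by a fetch order, or NULL if the order code is
 * not a fetch in this sketch. */
static struct key *fetch_slot(struct node *node, unsigned ordercode)
{
    if (ordercode >= NODE_SLOTS)
        return NULL;
    return node->keys + ordercode;         /* pointer arithmetic to the slot */
}
```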

Now comes some black magic in the name of performance.

We jump up a level or so of abstraction to see if we are being invoked by a domain doing a call. If so we will be able to avoid both the production and interpretation of the resume key conceptually present in the semantics of the call jump.

We must dwell a while on the condition:

  if (cpuexitblock.jumptype == jump_call
     && (!(s->type & involvedr)
          || (node != cpudibp->rootnode && (s = readkey(s)) != NULL)))

If this is so then we can do it “the fast way”.

Both fork and return jumps are uncommon and would complicate our logic, so we exclude them from the fast way.

If the key were “involvedr” (involved, don’t read), then the logical state of the key we are to copy is not what it appears to be in the slot, and more general logic is required. We thus exclude it from the fast way, unless it is easy to un-involve the key. The routine “readkey” can usually show us the logical state of the key, perhaps by un-involving it, or alternatively by making a copy of the key that holds the logical state of the involved key. Even this fancy case is excluded if the domain is using a node key to fetch from a slot in its own domain root. Perhaps we couldn’t carry out the proof that this case was safe, and certainly it was not worth the effort.

Normally we will take the fast way, perhaps with key locator s diverted to a copy instead.
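It may help to see the condition unrolled into separate tests. The sketch below is a stand-alone model, not kernel code: the types, the bit value given to involvedr, and the two readkey stand-ins are assumptions, and the real condition reads cpuexitblock and cpudibp directly rather than taking parameters. It returns the key to copy on the fast way, or NULL when the general path is needed.

```c
#include <stddef.h>

enum jumptype { jump_call, jump_fork, jump_return };
enum { involvedr = 0x80 };                 /* hypothetical bit position */

struct key  { unsigned char type; };       /* only the type byte is modeled */
struct node { struct key keys[16]; };

typedef struct key *(*readkey_fn)(struct key *);

/* Stand-ins for readkey: one always fails, one un-involves in place. */
static struct key *readkey_fails(struct key *s) { (void)s; return NULL; }
static struct key *readkey_uninvolves(struct key *s)
{
    s->type &= (unsigned char)~involvedr;
    return s;
}

static struct key *fast_way_key(enum jumptype jt, struct key *s,
                                const struct node *node,
                                const struct node *rootnode,
                                readkey_fn readkey)
{
    if (jt != jump_call)
        return NULL;                       /* fork or return: general path */
    if (!(s->type & involvedr))
        return s;                          /* slot already shows logical state */
    if (node == rootnode)
        return NULL;                       /* fetching from own domain root */
    return readkey(s);                     /* s, a copy, or NULL on failure */
}
```

Because && and || evaluate left to right and stop early, the original condition calls readkey only for an involved key in a node other than the caller's root, which is the order of tests shown here.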

On the other hand