The Keykos Binder

The binder for Keykos is a complex program to link relocatable code and ultimately provide segments and entire address spaces for Keykos application programs; i.e. it loads domain code. It is probably too complex and the external contract needs improvement. I make some notes about it here because it is nonetheless interesting from several points of view. I have no simpler scheme in mind for this important function but the Unix ld command, while less general, has much to recommend it and might serve to found a function as general as the binder.

The very first relocatable loaders resolved external references where a value is defined in one module and required in another. The binder must and does do this.

IBM’s TSS (1967) first provided shared code in core (SIC) that could be mapped into various address spaces even when the owners of those address spaces had not, or could not have arranged to put that shared code at the same virtual address. Such code was called address free and compilers and assemblers know how to produce such code. The IBM 360 instruction set was an early adopter of the idea that data addresses should not be embedded in the instruction stream. They went farther and did not embed code addresses either. Such a feature is not necessary for address free code but it helps greatly. The binder accommodates this feature.

TSS ran on the IBM 360 model 67 which, like modern processors, required agreement in the low 12 bits of the various virtual addresses that might access the same real core. In other words the 4KB page was the unit of sharing. That machine and many modern machines gain still more efficiency when larger hunks may be shared. In particular hardware defined segments can be shared and the page tables required by the hardware to locate their pages may be shared as well. Such gains accrued on the 370 when the low 16 address bits agreed. The binder exploits this advantage as well.

Relocatable formats for IBM’s machines supported a concept called named control sections and this could be used to segregate code and data to be loaded into various parts of the virtual address space. For instance there might be good reason to put all writable storage at smaller virtual addresses than all read-only space. Sections with the same name from different compilations are given adjacent virtual addresses. There are several reasons for such segregation. The Keykos binder does this as well; or more properly it works with the binder’s invoker to do this.

Loaders must be able to load initial data into storage that will be modified when the program runs. Copy on write is often strategic here to minimize work. The binder does its part here.

The binder will reserve portions of an address space that are to be defined, and redefined, at runtime. Access to a slot of a node is provided to the program. The program may put a memory key in the slot and the defined segment appears at the reserved addresses.

The Binder’s Life Cycle

Binders are conceptually distinct even though they share immutable parts. In fact they depend on the continued existence of the constant parent binders.

The binder was designed with persistence in mind. In general a binder has a number of code elements already available inside it, in various degrees of fixedness. The idea is that an ISO C program will be processed with a binder with the necessary libraries already included. These libraries will usually be in the form if the real pages that will occupy the address space of the running program. Note that the library code, and not merely its name is in the binder.

Binders form a tree with older binders near the root. All binders except those at the leaves are immutable. We say that binder depends on these binders towards the root. The real storage occupied by a binder is actually shared and distributed among the binders upon which it depends. The binder contract abstracts this fact and talks as if the binders are standalone. At any point a mutable binder can be frozen, yielding a binder factory that yields new mutable binders which are discreet according to binder logic. The binder is thus not a covert channel.