Program Segmentation and Security
When a program in user mode invokes the OS, it is much like calling another program.
The OS is in a position to damage the state of its caller.
More generally, when is a program vulnerable to those that it calls?
The machine language abstraction of call exposes a program to any subroutines that it invokes, but any two programs running in the same space are already vulnerable to each other.
In Multics there were several rings in place of just user mode and privileged mode.
The less privileged outer rings were vulnerable to the code in inner rings.
The pattern that developed in Multics was to call only rings with ring numbers less than the caller's.
The ring numbers of the stack frames were thus monotonic, and so was their authority.
A program in Keykos can “call” another without being vulnerable to it.
The two run in different address spaces and the call has the flavor of IPC (InterProcess Communication).
As in a conventional call, the caller is normally blocked awaiting a response from the callee, and the CPU normally proceeds immediately to obey the callee.
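What follows is a rough Python sketch, not Keykos code, of a call with the flavor of IPC: the callee runs in a separate address space and cannot touch the caller's memory, while the caller blocks awaiting the reply just as at an ordinary call site. The names here (`callee`, the pipe plumbing) are invented for illustration.

```python
from multiprocessing import Pipe, Process

def callee(conn):
    # Runs in its own address space; it cannot reach the caller's memory.
    x = conn.recv()
    conn.send(x * x)

if __name__ == "__main__":
    caller_end, callee_end = Pipe()
    p = Process(target=callee, args=(callee_end,))
    p.start()
    caller_end.send(7)
    print(caller_end.recv())   # 49; the caller blocks here, as at a call site
    p.join()
```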
When a routine relies upon those that it calls, what is gained by not being vulnerable to the very code it relies upon?
Are there degrees of reliance? Perhaps the greatest vulnerability is that the subroutine may not return, and so the caller cannot be sure of completing its task.
Nonetheless, several patterns gain:
- We may call an untrusted routine hoping for a solution to a puzzle when such a proffered solution can easily be checked (see the sketch after this list).
- An auctioneer may call a bidding agent.
The agent should be unable to view the earlier bids of other agents, which the auctioneer holds.
- Understanding the logic of a program is simplified when it is known that changes to and uses of a variable are confined to occurrences of that variable in the lexical scope of its declaration.
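Here is a minimal Python sketch of the first pattern; `untrusted_factor` is a hypothetical stand-in for code we do not trust. We rely on it for progress, never for correctness, since its answer is checked:

```python
def check_factors(n, factors):
    # Verifying a proffered factorization is cheap even when
    # finding one is hard.
    prod = 1
    for f in factors:
        if f < 2:
            return False
        prod *= f
    return prod == n

def untrusted_factor(n):
    # Hypothetical stand-in for an arbitrary, possibly lying, solver.
    return [3, 5, 7]

claimed = untrusted_factor(105)
print(check_factors(105, claimed))   # True only if the product checks out
```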
Whether or not there is an adversarial relationship between routine and subroutine, there may be a requirement to time out a call.
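Continuing the IPC sketch above, such a time-out might be imposed by bounding the wait for the reply rather than blocking indefinitely:

```python
# In place of the blocking recv() in the earlier sketch:
if caller_end.poll(2.0):           # wait at most two seconds for a reply
    print(caller_end.recv())
else:
    p.terminate()                  # give up on an unresponsive callee
```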
This observation raises the question: why a call, rather than other means of communication between programs? Here are some advantages to calling that may apply:
- Calling is an efficient way to schedule a processor for a hierarchically conceived problem.
- For problems such as sqrt(sqrt(x)), the first performance of the routine must finish before the second can start, even if there are multiple CPUs (see the sketch after this list).
- Certainly not least is that programming languages support calling with a convenient and familiar syntax.
- CPUs can schedule execution of a subroutine in as little as two clock ticks, whereas arranging for concurrency among CPUs conventionally requires thousands.
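The data dependence in the sqrt example, rendered trivially in Python; the outer call consumes the inner call's result, so a second processor cannot help:

```python
import math

x = 16.0
inner = math.sqrt(x)       # must complete first
outer = math.sqrt(inner)   # its argument is the inner result
print(outer)               # 2.0
```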
See this for fragmentary notes on this question.
In calling for bids, it is presumably strategic to initiate the requests more or less simultaneously and not to wait for one response before requesting another.
This raises the seldom discussed question, “Why is call such a popular programming pattern?”
Perhaps the plainest answer is that we understand the problems to be solved by recursively breaking them down into smaller, simpler problems.
Given the imperative nature of the von Neumann style computer, the subroutine was an invention, but one that could scarcely have been missed.
On this point, Babbage’s design for his Analytical Engine and Konrad Zuse’s first computers kept the program in mechanically sequential memory, which did not allow for a simple implementation of subroutines.
IBM’s Card Programmed Calculator of the late 40’s had the same limitation.
In each of these cases there were provisions for very limited subroutines, expressed in another medium, to provide the commonest functions such as square root.
I think that it was Turing who first mentioned the idea of addressable storage for the program, with which the modern concept of the subroutine became feasible.
Computer languages before Keykos protected the routine from its subroutines,
at least to the degree that the subroutine respected the language constructs.
Type-safe languages like Scheme, for instance, enforced this protection.
Scheme introduced a novel problem, however: not only could a routine fail to return,
it could return twice.
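Python lacks Scheme's call/cc, but the flavor of returning twice can be sketched in continuation-passing style, where “returning” means invoking a continuation `k`, and nothing stops the callee from invoking it twice:

```python
def solve(x, k):
    # In continuation-passing style, "returning" means invoking k.
    # A routine that captured its continuation (as Scheme's call/cc
    # permits) can resume its caller more than once.
    k(x + 1)
    k(x + 2)    # the caller's "rest of the program" runs a second time

results = []
solve(10, results.append)
print(results)   # [11, 12]: the caller was resumed twice
```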
There was another reason for the subroutine call, even when the problem naturally decomposed into inherently parallel tasks: there was just one processor, and the subroutine construct was an efficient and familiar means to schedule that hardware.
If the logic of responding to a call for bids runs on another machine, then the subroutine pattern loses this advantage.
Keykos was designed when representatives from at least two industries had engaged us in discussion of the possibility of such logic running on the same machine as the program that had submitted the request.
Patterns with Security and Call
Often a call is made to code with more authority than the caller, and the called program’s responsibility is to attenuate its authority in service to its caller.
In capability terminology, X holds a capability to call Y, which holds a strong capability Z.
Y’s job is to limit, and often enhance, the function of Z.
Some cases (a sketch in code follows the list):
- Z is the ability to read and write anywhere on a disk.
Y is the file system code that provides the abstraction of files and limits its callers
to just those files they should see.
- Z is raw access to an ethernet interface card and Y provides access to a particular IP circuit.
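A minimal Python sketch of the first case, with invented names: `RawDisk` stands in for Z, and `FileFacet` for Y, which both limits Z (its holder cannot leave its file) and enhances it (with bounds checking over a convenient block numbering):

```python
class RawDisk:
    """Stands in for Z: read and write anywhere on the disk."""
    def __init__(self, nblocks):
        self._blocks = [b""] * nblocks
    def read(self, n):
        return self._blocks[n]
    def write(self, n, data):
        self._blocks[n] = data

class FileFacet:
    """Stands in for Y: confines its holder to one file's blocks."""
    def __init__(self, disk, first, count):
        self._disk, self._first, self._count = disk, first, count
    def read(self, n):
        if not 0 <= n < self._count:
            raise IndexError("outside this file")   # the limiting half
        return self._disk.read(self._first + n)
    def write(self, n, data):
        if not 0 <= n < self._count:
            raise IndexError("outside this file")
        self._disk.write(self._first + n, data)

disk = RawDisk(100)
f = FileFacet(disk, first=10, count=5)   # X holds only this facet
f.write(0, b"hello")
print(f.read(0))                          # b'hello'
```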
Two patterns can be seen here: attenuation, the weakening of another capability, and augmentation, the improvement of a raw functionality for convenience.
It is often strategic and convenient to combine these two.
Indeed the combination may be necessary for important kinds of resource sharing such as disks and communications links.
An important pattern is multiplexing, by which some facility can be shared, with the multiplexing logic protecting one user of the resource from another.
Multiplexing attenuates and may augment, but not all attenuation is multiplexing; I may provide a capability to read a variable, a getter, that attenuates the stronger capability of reading and writing that variable.
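The getter, sketched in the same spirit (Python is not capability-secure, since reflection can pierce such wrappers, so this only shows the shape of the pattern):

```python
class Cell:
    """The stronger capability: read and write a variable."""
    def __init__(self, value):
        self._value = value
    def get(self):
        return self._value
    def set(self, value):
        self._value = value

def make_getter(cell):
    # Attenuation: expose only the read half of the cell.
    return cell.get

cell = Cell(42)
getter = make_getter(cell)
print(getter())   # 42; no path to cell.set through `getter` alone
```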
In current systems such wrappers are built by the platform provider to whom one is already highly vulnerable.
In the first cases that come to mind, the facility to be multiplexed or attenuated is provided by some sort of central authority, such as the provider of the computing platform, whom one must trust entirely.
As cyberspace extends to a plurality of services, there will arise services not provided by the central platform provider.
Some of these can be exploited without entirely trusting them.
Here the wrapper code is not to be trusted with the innards of the invoker (caller).
Fuzzy Claim
I think that a large part of the issue is whether a program can proceed to its goal by imposing the rigid order implied by the call discipline.
Early manuals from Apple advised the Mac application programmer to avoid “modal constructs” where the program would insist that the user answer some question about the job before proceeding on to other aspects of the job.
The modal dialog box would come up when the program call stack held a nest of uncompleted tasks that the program’s call logic demanded answers to before proceeding.
This is perhaps the natural inclination of the programmer who was accustomed to prescribing the order of the steps in a task.
It was at odds with collaborative work where the user was likely to require another order.
The new regime added extra complexity to the application in the form of code to audit what remained
to be done, or alternatively to provide default answers to questions that the user might have no concepts for.
The new style prescribed an event loop with an empty stack as the application awaited the next input from
the user.
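A skeletal Python rendering of that style, with invented event names: between events the stack is empty, and the user's next input, not the program's call structure, determines what runs:

```python
import queue

events = queue.Queue()
handlers = {
    "click": lambda arg: print("clicked", arg),
    "key":   lambda arg: print("key", arg),
}

def event_loop():
    # The stack is empty each time around; no modal nest of
    # uncompleted calls is waiting on the user.
    while True:
        kind, arg = events.get()
        if kind == "quit":
            break
        handlers[kind](arg)

events.put(("key", "a"))
events.put(("click", "OK"))
events.put(("quit", None))
event_loop()
```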
It is sort of like “Who takes the initiative?” or “Who is to take the lead?”.
It seems that for any task someone must take the lead.
All of this speculation bears on the question of when service requests between independent agencies of cyberspace can be organized on the conventional pattern of the stack.
With this pattern callers must be protected from their callees.
Keykos was designed with the call stack mentality even though there is no stack per se.
It supports other control structures as well.
The unstated assumption was of predominantly stack-oriented scheduling among mutually untrusting objects.
What are the circumstances where the event loop wins? The non-blocking property is often mentioned, but the Keykos patterns avoid blocking hazards through the tendency of subcontractors to be more primitive than contractors, thus establishing a simple ordering of types of contractor, which in turn precludes the call loops that lead to deadlock.
Actually it is merely necessary to arrange a simple ordering of instances, which is easier still.
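A sketch of that ordering discipline, with invented names: give each instance a level and permit calls only toward strictly lower levels, so the call relation is acyclic and circular waits cannot arise:

```python
class Contractor:
    def __init__(self, level):
        self.level = level
    def call(self, subcontractor, request):
        # Only call "downward": this keeps the call relation acyclic.
        assert subcontractor.level < self.level, "ordering violated"
        return subcontractor.serve(request)
    def serve(self, request):
        return f"level {self.level} handled {request!r}"

top = Contractor(2)
primitive = Contractor(1)
print(top.call(primitive, "read block 7"))
# primitive.call(top, ...) would trip the assertion: no upward calls,
# hence no loops, hence no deadlock from circular waits.
```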
I don’t know whether this bears on the dual problem of protecting callees from callers, but I think that is required too.
Keykos and Scheme do both.
Is this a useful dichotomy?
Ousterhout argues for the event loop in “Why Threads Are A Bad Idea”.
I need to square my arguments with his.