KeyKOS - A Secure, High-Performance Environment for S/370

Copyright © Key Logic, Inc., 1988. All rights reserved.
Permission to reproduce and redistribute this document in paper or electronic form is hereby granted, provided that this copyright notice remains intact.

This document first appeared in Proceedings of SHARE 52 I, March 1988, pp 3-17.

KeyKOS is a trademark of Key Logic, Inc.
All other trademarks are the property of their respective holders.

Abstract

KeyKOS is an operating environment for S/370 computers which provides a high level of security, reliability, performance, and productivity. It allows emulation of other environments such as VM, MVS, and POSIX.

This paper includes: A brief history of KeyKOS, a brief description of the system’s features, an introduction to the KeyKOS architecture, an overview of the CP simulator, further discussion of security architecture, an introduction to security, an overview of the implementation of DOD security policy, a short discussion of the performance implications of the architecture, and a description of the features of the S/370 KeyKOS system.

History

When Tymshare started work on KeyKOS in the early 1970s, there were solid business requirements justifying the project. With the price of main storage dropping, applications were too tightly bound to disk storage. Because Tymshare’s systems were accessed from around the world, continuous operation was a requirement. Existing systems were prone to failure from many causes, both hardware and software. They did not recover from these failures gracefully. These systems required significant operator intervention in both normal operation and during recovery. They did not provide the security needed to allow competing organizations to share programs and data in a controlled manner where it made economic and social sense.

Because of these deficiencies, Tymshare decided its best option was to build a system of its own. This system had a number of design goals including: high security, high reliability, economical processing of high transaction volumes, and enhanced productivity for managers, programmers, users, operators, and hardware.

Here are some of the important milestones in the development of KeyKOS. As can be seen, developing a new system with new concepts takes a lot of time.

1972	First written description
1974	First development funding
1976	First terminal message
1979	First public presentation at Share 52 [1]
1981	First IPL without VM
1983	First production application
1985	First execution of CMS as guest
1985	Key Logic founded
1986	First customer
1987	First production transaction processing application

System Features

KeyKOS provides persistent virtual address spaces where programs may keep data. The system caches frequently referenced data in main storage. When several processes are accessing the same data, for example the CMS “S” disk, the data blocks involved are likely to already be in main storage, improving access times. Only one copy will be maintained in main storage, improving storage utilization. Persistent virtual storage allows the kernel to globally optimize disk arm movement and rotational latency. The KeyKOS implementation also provides complete separation of physical and logical DASD management. No unprivileged program is aware of the type or configuration of real DASD in the system.

KeyKOS has a system-wide checkpoint which periodically saves the state of the entire system. If a system outage occurs, the system will restart from the last checkpoint with all data and processes in a consistent state as of that checkpoint. The KeyTXF transaction processing system will recover database updates to the point of failure. Should a CPU fail, the DASD can be shared with or switched to a backup CPU to quickly restore service by restarting from the last checkpoint.

Data mirroring stores multiple copies of data for reliability and performance. The KeyKOS system continues to operate if a mirrored disk fails. When the disk is repaired, or a replacement disk is formatted and brought online, the mirrored data is automatically restored to that disk. Performance is enhanced by having several paths to a particular piece of data.

The full function of the system is available in essentially any S/370 computer language. A standard invocation protocol permits high level languages to invoke low level function and low level languages to invoke high level function, enhancing the usefulness of all languages.

The KeyKOS system is designed for unattended operation. The only common operator functions are mounting tapes and servicing the printer.

The KeyKOS system is designed for continuous operation. Full system backup dumps may be taken while the system is running. When a dump has completed, the backup tapes contain an image of all data and processes in the system at a consistent instant of time, avoiding inconsistency in the data. These “tape checkpoints” are conceptually independent of the physical DASD type or configuration. They may be restored to different physical devices if necessary.

Architecture

In a simplified view, the KeyKOS kernel takes a two state computer and turns it into a many state machine. The resulting fine grain authority is completely controlled by the system. A many state system permits many of the concepts used in other operating systems to be moved out of the privileged kernel and into the user controlled portions of the system. These include: directories, files, and address spaces.

Some concepts, such as node, are used differently in KeyKOS than they are in other systems. In general, when first approaching KeyKOS remember the Monty Python saying, “And now for something completely different” and there will be less difficulty.

A useful way to view the facilities of KeyKOS is through the object paradigm. A KeyKOS system runs on a hardware base. The KeyKOS kernel, the only code to run in supervisor state, controls the hardware and implements the primary objects. All the rest of the system consists of a non-hierarchical set of objects. These objects communicate with each other and with the kernel by using what we call Keys. Since the objects running outside the kernel use logical resources, the underlying physical reality may be changed without disturbing them.

A KeyKOS object consists of both data and the code that acts upon that data, similar to objects in systems such as Smalltalk. This paper uses “object” to mean a KeyKOS object unless otherwise qualified. Objects run in their own address space, protected from examination or alteration by other objects in the system. Objects may communicate only via the keys they hold.

Keys are the unforgeable tokens of authority referred to in the computer science literature as “capabilities”.[2] A key designates a particular object and specifies what authorities the key’s holder has with respect to the designated object. A key is the only way one object can gain authority to access another object. Keys may be invoked directly to communicate with the object they designate or may be passed as a parameter to some other object. Passing a key makes a copy of that key. The function of a key does not depend on what object holds it. A particular key will perform the same function for anyone who holds it. Keys are not implemented in the address space of the program, so they may not be directly examined or counterfeited.

Keys which designate the same object may allow differing authority over that object. One distinction is the Use vs. Service distinction. A Use key allows the holder to obtain services from the object. A Service key allows the holder to access the internal state of the object. An analogy is a car with one lock on the ignition and another on the hood. The ignition key allows the holder to drive the car. The hood key allows the holder to perform service operations on the engine.
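
The key model described above can be illustrated with a small Python sketch. It is a toy under stated assumptions, not KeyKOS code: the names Key, Counter, invoke, and the rights strings "use" and "service" are invented for the example, and a frozen dataclass merely stands in for the kernel-held keys that a real program can never examine. The sketch also assumes that a service key permits ordinary use, which the text does not state.

```python
from dataclasses import dataclass


class Counter:
    """A hypothetical object: internal state plus the code that acts on it."""

    def __init__(self):
        self._count = 0

    def increment(self):
        self._count += 1
        return self._count


@dataclass(frozen=True)
class Key:
    """Designates one object and records the authority granted over it."""
    target: object
    rights: str  # "use" or "service" (names assumed for this sketch)

    def invoke(self, operation):
        if operation == "increment":      # an ordinary service of the object
            return self.target.increment()
        if operation == "peek_state":     # access to the object's internals
            if self.rights != "service":
                raise PermissionError("a use key cannot inspect internal state")
            return self.target._count
        raise ValueError(f"unknown operation: {operation}")


counter = Counter()
use_key = Key(counter, "use")
service_key = Key(counter, "service")

# Passing a key makes a copy; the copy behaves the same for any holder.
passed_copy = Key(use_key.target, use_key.rights)
```

Both use_key and passed_copy drive the same counter, while only service_key can look under the hood.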

The KeyKOS kernel implements keys. It also implements some basic object types. Primary objects are the primitive objects out of which all other objects are constructed. Page objects store 4096 bytes. Node objects store 16 keys. Note that keys are not stored with data.

Fundamental objects are objects which are built from one or more nodes and/or pages and are interpreted directly by the kernel. Domains provide a place for programs to execute. The keys they hold determine the authority of their program. Segments define a portion of storage as a linear address space. Each domain uses a segment to define its address space. Meters control and measure computer resources. Each domain must hold a key to a meter in order to execute.

Domains are analogous to CPUs, segments to main storage, and meters to a power supply.

Domains have a lot in common with virtual machines. They have a PSW, registers, and direct access to the problem mode S/370 instruction set. They hold a key to a segment that resolves the addresses generated by programs running in them. In addition to these familiar S/370 facilities, domains have 16 key registers which define their privileges. These registers are accessed with the CALL, FORK, and RETURN instructions, implemented as SVCs 253, 254, and 255.

The key registers and additional instructions permit domains to communicate by sending messages, a facility which is analogous to the VMCF and IUCV communication paths provided by VM/SP. One important difference is that the communication is authorized by the fact that the domain holds the key, rather than by code in the message’s receiver or system directory.

Messages are passed from one object to another to request or provide an object’s service. They are the KeyKOS form of remote procedure call. The CALL invocation automatically provides for a return, the RETURN invocation indicates an object has completed some operation, and the FORK operation starts a parallel process. Messages contain both data (a four byte parameter and a string up to 4096 bytes long) and up to four keys. KeyKOS uses the rendezvous system of message passing so the kernel need not buffer the messages. If buffering is desired it may be constructed outside the kernel by using domains.
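
The message limits given above can be captured in a short Python sketch. The limits (a four byte parameter, a string of at most 4096 bytes, at most four keys) come from the text; everything else is an assumption made for illustration, including the names Message and call, the treatment of the parameter as an unsigned value, and the use of an ordinary function call to stand in for a rendezvous between two domains.

```python
from dataclasses import dataclass

MAX_STRING_BYTES = 4096   # limit stated in the text
MAX_KEYS = 4              # limit stated in the text


@dataclass
class Message:
    parameter: int        # the four byte parameter (treated as unsigned here)
    data: bytes = b""     # the string portion of the message
    keys: tuple = ()      # keys passed along with the data

    def __post_init__(self):
        if not 0 <= self.parameter < 2**32:
            raise ValueError("parameter must fit in four bytes")
        if len(self.data) > MAX_STRING_BYTES:
            raise ValueError("string longer than 4096 bytes")
        if len(self.keys) > MAX_KEYS:
            raise ValueError("more than four keys")


def call(receiver, message):
    """CALL: hand the message to the receiver and wait for its RETURN.

    The caller blocks until the receiver answers, so nothing is buffered.
    """
    reply = receiver(message)
    if not isinstance(reply, Message):
        raise TypeError("receiver must RETURN a Message")
    return reply


def echo_upper(message):
    """A hypothetical receiving object that upper-cases the string it is sent."""
    return Message(parameter=0, data=message.data.upper())


reply = call(echo_upper, Message(parameter=1, data=b"hello"))
```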

A segment is the way an address space is defined in KeyKOS. A segment is defined by a node and the keys stored in it. A segment may define some pages as an address space.

A segment may group other segments into one address space. One segment might contain three subsegments: a program segment (typically read-only and shared), its working storage (read-write and private), and a shared data segment (read-only or read-write as required).

A segment may use a moveable window on another segment to define part of its address space. This facility allows access to segments that are up to 2⁴⁸ bytes long. Windows are also used by debuggers so they need not reside in the address space of the program they are debugging.

By directly manipulating its segment nodes, an object may address as many pages as it has keys to (up to the implementation limit, currently about 500,000 terabytes).
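
The relationship between node fan-out and segment size can be worked through numerically. The page size (4096 bytes) and node size (16 keys) are taken from the text. The sketch assumes, purely for illustration, that a segment is a uniform tree of nodes with pages at the leaves; the function names are hypothetical and the real kernel's layout may differ.

```python
PAGE_BYTES = 4096       # bytes per page object (from the text)
KEYS_PER_NODE = 16      # keys per node object (from the text)


def bytes_addressable(levels):
    """Bytes spanned by a uniform tree of nodes `levels` deep with page leaves."""
    return PAGE_BYTES * KEYS_PER_NODE ** levels


def levels_needed(size_bytes):
    """Smallest tree depth whose span is at least `size_bytes`."""
    levels = 0
    while bytes_addressable(levels) < size_bytes:
        levels += 1
    return levels


one_node_span = bytes_addressable(1)        # a single node of page keys
depth_for_full_segment = levels_needed(2**48)
```

Under this assumption each node level contributes four address bits on top of the twelve bits within a page, so nine levels of nodes are enough to span a 2⁴⁸ byte segment.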

Exceptions are handled by domains called Keepers. Domains, segments, and meters may each designate a keeper (called a domain keeper, segment keeper, and meter keeper) to handle their exceptions (e.g., a program interrupt on a domain results in its keeper being called). Keepers remove exception policy issues from the kernel, making the kernel smaller and allowing an object designer to either use a standard policy or define a specific policy for each object.

A domain keeper is invoked when a program in a domain encounters a program interrupt or an SVC that is not directly handled by the kernel. It is also invoked when a segment fault is encountered and there is no segment keeper defined for that segment. A mask in the domain’s PSW controls whether the CALL, FORK, and RETURN SVCs perform their KeyKOS function or trap to the domain’s keeper.

Segment keepers are invoked when a domain attempts to access an address in the segment which does not have a page defined for it or when a domain attempts to store in a read-only page. Segment keepers may either correct the condition (e.g., copy the read-only page and replace it with the copy) or cause the accessing domain to trap to its domain keeper. Segment keepers are also invoked when a segment key to the segment they are keeping is explicitly invoked.
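
The copy-on-write policy mentioned above can be sketched as follows. This is a toy model and not the KeyKOS interface: Page, Segment, store, and copy_on_write_keeper are hypothetical names, the keeper is an ordinary callable rather than a separate domain, and raising MemoryError stands in for trapping to the domain keeper when no segment keeper is defined.

```python
PAGE_BYTES = 4096


class Page:
    def __init__(self, data=None, read_only=False):
        # bytearray(int) yields zero-filled storage; bytearray(bytes) copies.
        self.data = bytearray(data if data is not None else PAGE_BYTES)
        self.read_only = read_only


class Segment:
    def __init__(self, keeper=None):
        self.pages = {}          # page number -> Page
        self.keeper = keeper     # called on a missing or read-only page

    def store(self, address, value):
        page_no, offset = divmod(address, PAGE_BYTES)
        page = self.pages.get(page_no)
        if page is None or page.read_only:
            if self.keeper is None:
                raise MemoryError("segment fault with no segment keeper")
            self.keeper(self, page_no)       # keeper corrects the condition
            page = self.pages[page_no]
        page.data[offset] = value


def copy_on_write_keeper(segment, page_no):
    """Supply a fresh page, or replace a read-only page with a private copy."""
    old = segment.pages.get(page_no)
    if old is None:
        segment.pages[page_no] = Page()
    else:
        segment.pages[page_no] = Page(old.data, read_only=False)


shared = Page(b"\x01" * PAGE_BYTES, read_only=True)
seg = Segment(keeper=copy_on_write_keeper)
seg.pages[0] = shared
seg.store(5, 0xFF)               # faults, keeper copies, store is retried
seg.store(PAGE_BYTES + 3, 7)     # faults on an undefined page, keeper adds one
```

The shared page is left untouched; only the private copy in seg sees the store.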

A meter keeper is invoked when a resource counter in a meter is decremented to zero. This occurs when resources are used by domains running under the meter. Typically the meter keeper will replenish the resources and restart the domain that faulted. Meters are a principal tool of external schedulers. The kernel implements a primitive scheduling algorithm designed to permit an external scheduler to obtain enough service to perform its scheduling function.
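
A minimal sketch of the meter and meter keeper interaction follows. The names Meter, QuotaKeeper, and run are hypothetical, a single abstract "unit" stands in for whatever resources a real meter counts, and the budget policy is only one example of what an external scheduler might choose to do.

```python
class Meter:
    def __init__(self, units, keeper=None):
        self.units = units
        self.keeper = keeper

    def consume(self):
        """Use one unit. Returns False once the meter is exhausted."""
        if self.units == 0:
            return False
        self.units -= 1
        if self.units == 0 and self.keeper is not None:
            self.keeper(self)    # counter decremented to zero: call the keeper
        return True


class QuotaKeeper:
    """Replenishes the meter in fixed grants until an overall budget is spent."""

    def __init__(self, refill, budget):
        self.refill = refill
        self.budget = budget
        self.calls = 0

    def __call__(self, meter):
        self.calls += 1
        grant = min(self.refill, self.budget)
        self.budget -= grant
        meter.units += grant


def run(meter, steps_wanted):
    """Run a hypothetical domain one step per unit until it finishes or stalls."""
    done = 0
    while done < steps_wanted and meter.consume():
        done += 1
    return done


keeper = QuotaKeeper(refill=10, budget=25)
meter = Meter(units=10, keeper=keeper)
completed = run(meter, 100)
```

Here the domain asks for 100 steps but completes only 35: the initial 10 units plus the keeper's total budget of 25.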

The kernel also implements device access keys which permit their holders to directly access an I/O device such as a terminal, network, or printer, and timer keys which allow their holder to set a wakeup time and wait until that time has passed.

The CP Simulator

Key Logic’s CP Simulator can be used to run CMS/SP Release 4 as a guest in KeyKOS. At Key Logic we use familiar CMS facilities such as AUX files and updates to maintain our software. We generally do our edits and compiles in CMS and then export the program to a KeyKOS environment for execution. One difference from a “normal” VM shop is that our programmers may create and use simultaneous additional CMS environments as needed. Additional CMSes allow productive work to continue while the machine is executing a lengthy process such as compiling all the modules of a large subsystem. The CP simulator is also used as a component of special purpose objects (e.g., the XEDIT object).

A programming user of KeyKOS is connected to a program we call a context switcher. This program has two major functions. The first is to create environments, called contexts, in which to test and run programs. The context switcher provides facilities to measure the resources used by a context, stop and restart a context, and destroy all the objects built in a particular context, freeing their space. The other function is to provide a set of virtual terminals for programs in the contexts. The real terminal may be attached to one of these virtual terminals or connected to the context switcher command system.

A programmer may define a new context whenever a need arises. One context typically defines the user’s CMS “A” disk. Additional contexts hold objects being debugged, CMS systems, data bases, or any other objects that the user creates.

The CP simulator is implemented in several domains. The CMS domain runs the unmodified CMS code and the code of all the programs that are run in that CMS. Its domain keeper is the main CP simulator domain. This domain holds the keys to the virtual terminal which acts as the console. As a domain keeper, it receives the trap that occurs when the CMS domain uses a privileged instruction, and simulates it.

The CP simulator domain also simulates the function of disks, tapes, and unit record equipment. It has a keeper which allows it to recover from errors encountered while simulating I/O. For example, when a read-only segment is used to simulate a disk and a write CCW is issued to that disk, the CP simulator will trap on a write protection violation. Its keeper will allow it to recover from the error and simulate a real disk with the write protect switch on.

The segment which makes up the CMS domain’s virtual storage has a keeper which adds pages within the virtual machine’s size and implements the “release pages” function. The CP simulator domain uses a segment that has several windows for accessing other segments. One of these windows is permanently assigned for accessing the first megabyte of the CMS domain’s address space. One other segment is used to window over the rest. Four I/O windows are used to map segments associated with active spool files and disk simulation. These windows are assigned on demand with LRU replacement.
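
The on-demand assignment of the four I/O windows with LRU replacement can be sketched as below. The count of four windows and the LRU policy come from the text; the WindowPool class, its map method, and the segment names are hypothetical, and the sketch tracks only which segment each window shows, not the mapping itself.

```python
from collections import OrderedDict


class WindowPool:
    """Assigns a small, fixed set of windows to segments on demand."""

    def __init__(self, window_count=4):
        self.window_count = window_count
        self._mapped = OrderedDict()   # segment -> window index, oldest first

    def map(self, segment):
        """Return the window showing `segment`, evicting the LRU one if needed."""
        if segment in self._mapped:
            self._mapped.move_to_end(segment)      # now most recently used
            return self._mapped[segment]
        if len(self._mapped) < self.window_count:
            window = len(self._mapped)             # a window is still free
        else:
            _, window = self._mapped.popitem(last=False)   # evict the LRU entry
        self._mapped[segment] = window
        return window


pool = WindowPool()
first_four = [pool.map(name)
              for name in ("disk190", "disk191", "spool1", "spool2")]
```

After these four mappings, touching "disk190" again and then mapping a fifth segment evicts "disk191", which is by then the least recently used.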

Each Minidisk is simulated by a KeyKOS Segment. IBM 3330, 3350, 3370, and 3380 DASD formats are supported. CKD formats are simulated by treating the segment as an array of tracks. FBA is simulated by treating the segment as an array of 512 byte blocks. When disk I/O is simulated, part of its segment is mapped into one of the CP simulator’s I/O windows.
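
The two mapping schemes can be expressed as simple offset calculations. The 512 byte FBA block size is from the text. For CKD, the text says only that the segment is treated as an array of tracks, so the fixed slot size per track and the cylinder-major track ordering used here are assumptions, as are the function names and the example geometry values.

```python
FBA_BLOCK_BYTES = 512   # block size stated in the text


def fba_offset(block_number):
    """Byte offset within the segment of an FBA block."""
    if block_number < 0:
        raise ValueError("block number must be non-negative")
    return block_number * FBA_BLOCK_BYTES


def ckd_track_offset(cylinder, head, heads_per_cylinder, track_slot_bytes):
    """Byte offset of a CKD track, assuming one fixed-size slot per track.

    Tracks are numbered cylinder by cylinder (an assumed ordering).
    """
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range for this geometry")
    track_index = cylinder * heads_per_cylinder + head
    return track_index * track_slot_bytes


# Illustrative values only, not the layout of any particular device type.
block_eight = fba_offset(8)
track_at_2_3 = ckd_track_offset(cylinder=2, head=3,
                                heads_per_cylinder=15,
                                track_slot_bytes=65536)
```

An offset computed this way tells the simulator which part of the minidisk's segment to bring into an I/O window.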

One of the advantages of this method of disk simulation is the automatic caching of active disk blocks in main storage. If the disk is shared, then the active blocks are also shared. Another advantage is that all the data are checkpointed along with the CMS. Being fully checkpointed allows a CMS session to last through both scheduled and unscheduled IPLs.

Each spool file is simulated by a KeyKOS segment. The segment is formatted in a KeyKOS standard format for sequential data. Each record is preceded by a carriage control byte. When the segment is passed to the printer or a virtual machine’s reader, the additional CP data (name, type, tag etc.) is passed separately from the spool file. Spool files use an I/O window while data is being read or written. One advantage of representing spool files in a standard format is they may be directly passed to other programs, such as programs using the KeyKOS OS simulator’s QSAM simulation.

Security Architecture

Factories are the patented [3] standard facility for creating new objects in KeyKOS. There is a CP simulator factory which produces new instances of the CP simulator. The factory also performs a major security role for both the builder and the requestor of its object. For the builder of the factory (the program owner), factories provide assurance that the people who call the factory to build instances of its product will not automatically obtain service keys to the domains created. As a result, the internal structure of those domains cannot be modified or even examined by their callers. For the requestor of the object (the object user), factories provide a way to audit the communication paths available to the objects they create. A requestor may use this audit ability to ensure that proprietary data entrusted to the object is not communicated to the builder or other unauthorized recipients.

Factories can perform this audit since all authority in KeyKOS is controlled by keys. Note that all communication paths available to the factory’s product are either keys passed by the builder when the factory was defined or keys passed to the resulting object by the requestor of that object. The keys which must be examined are those provided by the builder.

Since objects typically need other objects to perform their function, factories permit these other objects (defined by keys) to be included when the factory is defined. All keys provided by the builder are classed as either read-only, other factories, or “holes”.

Read-only keys can not be used as two way communication paths. Keys to other factories have known communication paths which are included in the list of paths available to this factory’s objects. Holes are keys to objects that may provide a communication path from the object to the factory builder. They are registered so the requestor may check for their presence.
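
The audit a requestor performs can be sketched as a walk over the keys the builder installed. The three classes of key (read-only, other factories, holes) come from the text; the Component and Factory classes, the holes and acceptable functions, and the example names are hypothetical, and the sketch assumes factories do not refer to one another in a cycle.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    kind: str                # "read-only", "factory", or "hole"
    factory: object = None   # the designated Factory when kind == "factory"


@dataclass
class Factory:
    name: str
    components: list = field(default_factory=list)

    def holes(self):
        """All holes reachable from this factory, including nested factories."""
        found = set()
        for component in self.components:
            if component.kind == "hole":
                found.add(component.name)
            elif component.kind == "factory":
                found |= component.factory.holes()
            # read-only components add no outward communication path
        return found


def acceptable(factory, tolerated_holes=frozenset()):
    """A requestor's check: every hole must be one it is willing to tolerate."""
    return factory.holes() <= set(tolerated_holes)


sorter = Factory("sorter", [Component("sort code", "read-only")])
editor = Factory("editor", [
    Component("editor code", "read-only"),
    Component("sorter", "factory", sorter),
    Component("builder mailbox", "hole"),
])
```

A requestor with proprietary data would accept sorter as it stands, but would accept editor only after deciding that the "builder mailbox" path is tolerable.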

Factories provide a solution to the mutually suspicious user problem. A Key Logic technical report describes this solution in more detail. [4]

One frequently cited problem with implementing security policies in other capability systems is the ability to use a read-only capability to fetch a read-write capability from a capability segment. If the security policy is such that information may flow from A to B, but no information may flow from B to A, it is not sufficient in traditional systems to give B a read-only capability to A. A Trojan horse in B could fetch a read-write capability through the read-only capability and use it to move information from B to A in violation of the policy.

KeyKOS provides the Sense Key to avoid this problem. A sense key designates a node and may be used to fetch keys from that node. When a sense key is used to fetch other keys, their access rights are automatically diminished so they are read-only. For example: a page key becomes a read-only page key, a segment key becomes a read-only, no-call segment key, and a node key becomes a sense key. As a result of this behavior, objects accessed through a sense key can not be changed, nor can additional privileges be obtained through this facility.
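
The weakening rule can be sketched as a function applied on every fetch through a sense key. The three transformations are those listed in the text; the Key dataclass, the weaken and fetch functions, and the use of a Python list to stand in for a node are assumptions for illustration. Key kinds not named in the text are rejected rather than guessed at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    kind: str               # "page", "segment", "node", or "sense"
    target: object
    read_only: bool = False
    no_call: bool = False


def weaken(key):
    """Diminish a key as described for fetches through a sense key."""
    if key.kind == "page":
        return Key("page", key.target, read_only=True)
    if key.kind == "segment":
        return Key("segment", key.target, read_only=True, no_call=True)
    if key.kind in ("node", "sense"):
        return Key("sense", key.target, read_only=True)
    raise ValueError(f"weakening not described for key kind: {key.kind}")


def fetch(sense_key, slot):
    """Fetch the key in `slot` of the designated node, weakened."""
    if sense_key.kind != "sense":
        raise PermissionError("fetch in this sketch requires a sense key")
    node = sense_key.target     # a list standing in for a node's key slots
    return weaken(node[slot])


inner_node = [Key("page", "page-7")]
outer_node = [Key("node", inner_node), Key("segment", "seg-A")]
sense = Key("sense", outer_node, read_only=True)

via_sense = fetch(sense, 0)          # the node key arrives as a sense key
deeper = fetch(via_sense, 0)         # so the page behind it is read-only too
```

Because a fetched node key is itself a sense key, the read-only property carries through any depth of node structure.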

Introduction to Security

Different people and organizations have many differing ideas of “What is security?” Is it keeping secrets? Is it protecting data and programs from unauthorized alteration? Does it mean that the system is reliable? How about available? What about restart and recovery? Is the system still secure? At Key Logic we believe all of these features are important for meaningful security.

Here are a few general observations about security:

Different organizations require a very wide range of policies which define under what conditions programs and information can be shared. New applications and changing environments further require policies which change with time. The policies needed in the future can not always be anticipated today.
A rigid system which has been “enhanced” with add-on security systems quickly becomes even more complex, inefficient, inconsistent, difficult to understand, and difficult to use. Add-on controls increase both system and administrative overhead.
A system, such as KeyKOS, which has all security enforcement mechanisms built in to the compact kernel, and which uses simple capabilities (or keys) to enforce any installation defined security policy, can be simple, efficient, easy to understand, and easy to use.

A basic security principle is the Principle of Least Privilege. This principle is recognized in the Department of Defense’s “need to know” policy for information access. To apply the principle of least privilege to software systems, the underlying security mechanisms must first provide for strong isolation. The KeyKOS design provides such isolation. It also provides a programming paradigm where it is both natural and efficient to use that isolation to enhance the security and reliability of the system by giving each object only the minimum necessary privilege.

Observing the principle of least privilege limits the ability of Trojan horse programs or viruses to steal or alter data or programs. It also increases the reliability of the system since a programming error can only damage data the program must deal with, not other data the program can access due to excessive privilege.

In KeyKOS, as in many other systems, privileges are first controlled by running non-privileged code in problem state. Privileged instructions are reserved exclusively for the kernel. Unlike other systems, a domain’s privilege is specified and controlled directly by the keys it holds. Other systems have a rigid, frequently hierarchical structure of privileges, defined by mechanisms such as passwords, access lists, and rings. These mechanisms tend to protect only large structures, such as files, rather than small structures such as records and fields. KeyKOS encourages many small protection domains. Each protection domain is flexibly linked, via keys, only to the other domains it must access to perform its function in accordance with the principle of least privilege.

DOD Security Policy

One well defined computer security policy is the Department of Defense’s policy as specified in Department of Defense Trusted Computer System Evaluation Criteria [5], more familiarly called the “Orange Book”. Since this policy is well defined (and there is interest in systems that meet it), it makes a good example to use in illustrating implementation of security policy in KeyKOS.

The Orange Book rates systems according to how they meet criteria tied to the following scale:

D	does not provide meaningful access protection.
C1	provides discretionary access controls (DAC). Each user may control who may access data that user owns.
C2	introduces requirements to prevent object reuse (e.g., erase the data before reassigning the disk extent), control access propagation, and audit security-relevant actions.
B1	introduces labeled objects and mandatory access controls (MAC) based on them. With mandatory protection, a security officer can impose an access policy on users without their cooperation.
B2	has additional requirements for covert channel analysis (storage channels), a trusted login path, and additional label requirements (for I/O devices and all other user accessible objects).
B3	has an additional requirement for “trusted recovery” (i.e., the system recovers from a crash in a trusted state). It also has requirements for additional covert channel analysis, particularly in timing channel analysis, and a structured, compact kernel.
A1	is the highest level specified in the Orange Book. It provides no functional enhancements over B3, but includes additional formal verification steps. The Orange Book mentions levels of assurance beyond A1, but these have not been fully defined.

In July 1987 a study group from the National Computer Security Center, the group responsible for evaluating systems against the criteria of the Orange Book, visited Key Logic. Their visit had two major objectives: (1) Review capability based systems, using KeyKOS as an example, to see if they can achieve high levels of Orange Book security. (2) Look specifically at the security features of KeyKOS. Based on their preliminary review, their conclusion was that KeyKOS, with certain enhancements described but not yet implemented, was a good candidate for evaluation at the high B level.

The basis for Orange Book MAC and DAC security policy implementations, and other “data sharing” policy implementations in KeyKOS, is what we call a compartment. This use of compartment should not be confused with the DOD use of the term “compartment” as an implementation of need to know. A KeyKOS compartment is a collection of objects which do not have keys that designate objects outside the compartment. The collection of objects can be small, such as one interactive program, or large, such as a collection of CMS objects that use shared data. Compartments support a policy of complete isolation from all other compartments.

To implement the Orange Book MAC policy there needs to be one compartment for each user at each sensitivity level for which that user is authorized. Since the Orange Book requires labels for all objects visible across compartment boundaries (i.e., “named objects”), we add labels to each compartment. These compartments are built from no-hole factories, so they start with no external communication paths. After they are built, they are each given a unique key to a trusted system facility called a guard. The system recognizes which compartment is importing or exporting data by the particular guard key being used. Within a compartment, all invocations proceed at full speed.

The guard is part of a reference monitor which validates accesses to shared data named in a global directory. The reference monitor keeps the necessary audit trails for all the security relevant interactions between the compartments and the shared data.

When compartment Lab1 exports a piece of data it specifies the external directory name for the data and the discretionary access rights associated with that data. The system adds the user ID of the exporter and the sensitivity label of the compartment (Lab1) to the global directory entry.

When another compartment imports the data it specifies the external name and the type of access requested. The system provides the importer’s user name and the compartment’s sensitivity label for the mandatory and discretionary access control checks.

If the access is permitted a new front-end object is created and saved in the external directory along with the importing user name and sensitivity label. A key to this front-end object is returned to the importing compartment. Future invocations of this key allow authorized access to the object without requiring any reference monitor or access list checking overhead. If it is desired to rescind access, the front-end object is destroyed. This action severs the connection between the importing compartment and the data. The added directory entry makes it easy to audit the connections currently in effect.
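
The export, import, and rescind steps above can be sketched as a small reference monitor. This is an illustration under several assumptions that the text does not spell out: sensitivity labels are plain integers, the mandatory check is a Bell-LaPadula style rule (read only at or below the importer's level, write only at or above it), and discretionary rights are a per-user set of access types. All class, method, and user names are hypothetical.

```python
class FrontEnd:
    """The object whose key is handed to the importing compartment."""

    def __init__(self, entry, access):
        self._entry = entry
        self.access = access
        self.alive = True

    def read(self):
        # No monitor or access list check here: only liveness is tested.
        if not self.alive:
            raise PermissionError("access has been rescinded")
        return self._entry["value"]


class ReferenceMonitor:
    def __init__(self):
        self.directory = {}   # external name -> directory entry
        self.audit = []       # trail of security relevant events

    def export(self, user, level, name, value, acl):
        self.directory[name] = {"value": value, "owner": user, "level": level,
                                "acl": acl, "front_ends": []}
        self.audit.append(("export", user, name))

    def import_(self, user, level, name, access):
        entry = self.directory[name]
        if access == "read":
            mac_ok = level >= entry["level"]      # assumed: no read up
        else:
            mac_ok = level <= entry["level"]      # assumed: no write down
        dac_ok = access in entry["acl"].get(user, set())
        self.audit.append(("import", user, name, access, mac_ok and dac_ok))
        if not (mac_ok and dac_ok):
            raise PermissionError("access denied")
        front_end = FrontEnd(entry, access)
        entry["front_ends"].append((user, level, front_end))
        return front_end

    def rescind(self, name, user):
        """Destroy the user's front-end objects, severing the connection."""
        for holder, _, front_end in self.directory[name]["front_ends"]:
            if holder == user:
                front_end.alive = False


monitor = ReferenceMonitor()
monitor.export("alice", 2, "report", "Q3 figures", {"bob": {"read"}})
bob_key = monitor.import_("bob", 3, "report", "read")
```

Once bob_key has been issued, reads go straight to the front-end object; the checks are paid once at import time, and rescinding access only requires disabling that one object.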

Policies for data sharing can be implemented with a reference monitor (using MAC and DAC rules) as described for the DOD Orange Book policy.

Other applicable data sharing policies can be implemented in commercial environments, and several policies can be supported on one system including:

Discretionary policies in which users control who can access their data
Mandatory policies in which a security officer controls who can access what data
Commercial policies designed to control access to corporate data such as customer lists and payroll files or to ensure compliance with laws such as the privacy act.

In summary, KeyKOS security is not just keeping secrets, but also the built-in reliability, integrity, and availability of the system. KeyKOS allows flexible local policies enforced by high performance mechanisms which are built into the base level of the system.

Performance Implications

The KeyKOS system delivers high performance for several reasons. The kernel provides the minimum level of function needed to allow a high function system to be implemented in domains. Since many important applications do not need the complete level of function, they do not need to pay the costs of supporting that function. For example, an ATM transaction processing application does not need multiple window support, and in KeyKOS need not pay the costs involved; data transfers are only checked and audited for security purposes when they are security relevant, not on every transfer.

Keys directly designate an object. Direct designation allows higher performance communication between objects than occurs in systems such as VMCF since there is no need to perform a search operation to locate the intended receiver of the message.

Since the basic security enforcement mechanisms are built into the KeyKOS kernel and not just added on, a high level of consistent security control may be implemented efficiently. Since security policy is defined outside the KeyKOS kernel, many different security policies can be implemented. These policies may range from the highest level of DOD style security, to the generally open access policies favored in research environments. Even within the same policy, functionality (and the resulting overhead) can be varied to meet specific application needs.

Features of the S/370 KeyKOS System

KeyKOS/370 runs on System/370-compatible single processor CPUs. It currently supports 3330, 3350, and 3380 count key data format disks and 3370 FBA format disks. System software includes the context switcher and two command systems.

Communications support includes X.25 and SNA networks. SNA is supported through HCF as LU0 or LU2 devices. Channel attached 3270s are supported for terminal access.

The KeyTXF transaction processing subsystem supports high performance transaction processing with record locking and commit/abort access to indexed data storage objects.

Two emulators are currently supported. The OS emulator supports a subset of the access methods and system services provided by IBM’s MVS. The CP emulator supports IBM’s CMS operating system and certain programs which run under it.

Supported language processors include IBM’s PL/I, Pascal, Cobol, Fortran and Assembler and Watcom’s C. A C preprocessor, and PL/I and Assembler macros are available to simplify the management of keys and invocations in these languages. Object programs from these compilers are supported in either the OS or CMS environments if they do not depend on unsupported facilities. Kolinar’s XMENU product is supported under the CP and OS emulators.

Other utilities, subsystems, and applications are provided. For example, a symbolic assembler debugger with extensions to aid in debugging PL/I programs, a mail system for sending text messages and keys, and an on-line documentation system with hyper-text features are supported.

References

[1] “GNOSIS - A Prototype Operating System for the 1990’s”, Proceedings of Share 52 I, Share Inc., Chicago IL.
[2] Levy, H.M., Capability-Based Computer Systems, Digital Press, 1984.
[3] U.S. Patent 4,584,639.
[4] “KeyKOS and Mutually Suspicious Users”, Key Logic Document KL108, Key Logic, Inc., Santa Clara CA.
[5] Department of Defense Trusted Computer System Evaluation Criteria, Department of Defense Computer Security Center, DOD 5200.28-STD, December 1985.