When a CPU includes hardware that decides to redirect the CPU’s attention to a new program, regardless of the nature of the current program, we have an interrupt. This decision is made on the basis of signals from outside the program. We will get to the nature of these signals later. The decision is made between instructions; each instruction is either completed or not begun.
Another situation is so much like this that we include it in the same discussion. A program comes to a presumably rare situation where the CPU cannot obey the program given the unusual state of the computation. This is a situation that is either unanticipated by the programmer, or anticipated but inefficient to explicitly program for. An example of the latter is division by zero where the application author has not yet decided what to do in such an unlikely event. I call these traps here and say that the program has been trapped. More later on how this can arise. Traps involve the program whereas interrupts do not. One might say that interrupts are due to exogenous signals where traps are due to endogenous signals.
In either of these cases the new program must, as its very first action, capture the information needed to restart the old program or at least document the reason why the trapped program cannot be obeyed—in short reify the state of the old computation. All of the machines that I will talk about here have an architected state that is in a set of registers whose values may be germane to the meaning of the interrupted program. This state includes at least one program counter indicating the address of the instruction next to be obeyed upon program resumption. An interrupt is an event tied to one CPU and these registers belong to that CPU. The new program will normally preserve this state in RAM as its first action.
My first experience with a trap was the IBM 704 (1957) feature called transfer trapping. This was a program-settable hardware mode where every successful transfer (branch) would cause a trap in place of the transfer. We used this feature at Livermore to great advantage in debugging programs. For example, some variable that is designed to accumulate the sum of positive values turns up negative. All of the symbolic references to the variable seem certain only to increase the value. With transfer trapping one can test the value upon each successful transfer. Along with the test, a record is made of the effective address of the trapped instruction, and its target. When the observed value first decreases from one trap to the next, the content of memory (core) is recorded (on paper in 1958) along with a descriptor of a guilty block of contiguous instructions, undiverted by transfers, and in the context which produced the anomalous value. It was seldom hard to find the bug with such information. Typical programs ran at about 1/2 speed in this mode. Only a few modern machines have such a simple, general and convenient feature, and few of those have operating systems that make the feature available. John McCarthy reminisces about early interrupt ideas and practice.
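The debugging scheme described above can be sketched on a toy machine. This is only an illustration in Python, not 704 code: the three-opcode instruction set (ADD, JLT, HALT), the dictionary used as memory, and the function name are all assumptions made for the example. On every taken branch the trap logic compares the watched value with the value seen at the previous trap and, at the first decrease, reports the branch address, its target, and the start of the branch-free block that must contain the culprit.

```python
def run_with_transfer_trap(program, memory, watch, max_steps=1000):
    """Run a toy program, trapping on every taken branch.

    Returns (branch_address, target, block_start) for the first trap at
    which memory[watch] is lower than at the previous trap, else None.
    """
    pc = 0
    block_start = 0             # where the current branch-free run began
    last_seen = memory[watch]   # value observed at the previous trap
    for _ in range(max_steps):
        if pc >= len(program):
            return None
        op, *args = program[pc]
        if op == "ADD":                  # memory[a] += k
            a, k = args
            memory[a] += k
            pc += 1
        elif op == "JLT":                # branch to target if memory[a] < k
            a, k, target = args
            if memory[a] < k:
                # Trap taken in place of the transfer: test the variable.
                if memory[watch] < last_seen:
                    return (pc, target, block_start)
                last_seen = memory[watch]
                block_start = target     # the next block starts at the target
                pc = target              # the trap routine performs the transfer
            else:
                pc += 1                  # untaken branches do not trap
        else:                            # HALT
            return None
    return None


# Hypothetical buggy program: "total" should only ever grow.
program = [
    ("ADD", "total", 5),     # 0
    ("ADD", "i", 1),         # 1
    ("JLT", "i", 3, 4),      # 2: taken while i < 3
    ("HALT",),               # 3
    ("ADD", "total", -7),    # 4: the bug
    ("JLT", "i", 3, 0),      # 5
    ("HALT",),               # 6
]
memory = {"total": 0, "i": 0}
result = run_with_transfer_trap(program, memory, "total")
```

Here the result names the block running from address 4 to the branch at address 5, which is where the offending instruction sits.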
The 704 had a 38 bit accumulator and a 36 bit word size. The two extra bits were vestigial remains from the time before floating point, and a few programs used them. It took several extra instructions to record the extra bits and a few more to put them back upon finishing the interrupt program. There were several other miscellaneous program states, one bit each, with idiosyncratic instructions to test and set them. The interrupt overhead was thus considerable.
The IBM 709 (1959) included concurrent I/O but early 709s provided no interrupts at the end of such I/O. Livermore did not upgrade their machines. Op codes TCOA through TCOF addressed the six possible channels to test whether they were still operating. The main pattern for using this hardware was:
The IBM 7090 (1961) included the interrupts that had become optional on the 709. There was nothing resembling an operating system at Livermore for the 7090 and most production continued on as it had on the 709s. Multiprogramming had not yet arrived at Livermore. I am not aware of multiprogramming on any 7090 anywhere. There was a library routine that wrote a magnetic tape with decimal data and used interrupts to write successive blocks while the CPU had otherwise resumed doing physics. I recount IBM’s ideas behind their 7090 interrupt design. Few machines had interrupts before then. MIT’s CTSS timesharing project for a highly modified 7094 relied heavily on interrupts (with a slight problem).
The IBM Stretch (1961) had an incomplete privileged architecture. It could protect some of memory but the lock was equally settable by any program. At Livermore we built a rudimentary operating system that tried to provide a stable environment for a sequence of jobs. The only successful multiprogramming was a scheme for the operator to declare to the OS what magnetic tapes were mounted so that when an application came to the point of needing them, the OS would grant the drive and the mounted tape. An unsuccessful attempt was made to copy card decks, representing jobs, to the disk during previous jobs, and to copy printer output to the printer during subsequent jobs. We realized that our foundations were not up to that degree of multiprogramming. The machine architecture for interrupts was greatly improved over the 709 and 7090. All relevant state had memory addresses and one transmit instruction (memory to memory) would move the old program state aside.
The Harvest was a one-of-a-kind streaming computer that was closely attached to a Stretch. The streaming unit and CPU were closely connected and did not operate at the same time. To interrupt while streaming the hardware would wait until relevant data were in latches, cease streaming, and then cause the CPU to begin executing the interrupt code. Many of these latches were not addressable and it seemed difficult or impossible to build a program, with or without streaming instructions, that would extract and restore all such data. Most interrupts would finish without the need to perform streaming operations; such interrupts need merely restore CPU state and resume whereupon streaming would resume where it left off. When it was necessary to divert the streaming hardware to another streaming task while retaining the state of the interrupted stream process, the interrupt routine would set a special hardware bit and resume the old streaming task. The special bit would cause the streaming hardware to cease initiation of certain activities and consequently all of the process state would flow to the architected parts of the machine with memory addresses. A new clean interrupt would then soon be taken with the entire combined process state residing in addressable memory where it could be saved for later resumption while the streaming hardware was diverted to another task. This would not have worked with page faults but the Stretch memory was not mapped.
See the section about the CDC 1604 (1962) for a short note on 1604 interrupts.
The IBM 360 (1963) did not play a significant role at Livermore but I became aware of its design. The 360 had a well designed privileged mode and some of the high end OSes from IBM succeeded in separating one application from another. Multiprogramming began to be practical. The 360 interrupt structure was coherent enough to make its description interesting. The 360 architecture was designed when microprogramming was deemed strategic and with the plan to make a series of compatible computers of greatly different performance that appeared uniform, even to the OS. This plan succeeded very well. A 64 bit register in the CPU was called the PSW. The PSW held the address of the next instruction along with other state bits which included:
The CDC 6600 (1964) took a distinctly different approach to interrupts. The PPUs had no semblance of interrupt logic but they did have the power to divert the CPU from one task to another. When this happened all of the CPU’s program state (8 60 bit registers and 16 18 bit registers) was swapped with the contents of an area of RAM specified by the PPU. One drawback of this scheme was that when the main CPU became unable to proceed usefully, it could only loop at an address which the master PPU would soon notice. There was thus a tradeoff in how often the master PPU should look. Intel’s Pentium super privileged mode works somewhat like this except the address at which to swap the state is held in a register that even the privileged code cannot access.
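The state swap can be pictured with a few lines of Python. This is a simplified sketch, not the 6600's actual exchange format: it assumes one register per RAM word and ignores how the real hardware packed fields together, and the names exchange_jump, regs and ram are invented for the example.

```python
def exchange_jump(cpu_regs, ram, base):
    """Swap the CPU's register state with the package at ram[base:]."""
    for i in range(len(cpu_regs)):
        cpu_regs[i], ram[base + i] = ram[base + i], cpu_regs[i]


# 24 registers stand in for the 8 60 bit and 16 18 bit registers.
regs = list(range(24))                 # state of the task now running
ram = [0] * 100
for i in range(24):
    ram[40 + i] = 1000 + i             # saved state of another task

exchange_jump(regs, ram, 40)           # the other task is now in the CPU
after_first = (list(regs), ram[40:64])
exchange_jump(regs, ram, 40)           # a second exchange restores the first
```

The appeal of the design is visible even at this scale: one operation both saves the old task and installs the new one, and doing it twice at the same address is the identity.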
The Remington Rand LARC (1960) was like the 6600 in that an I/O processor polled I/O without the aid of interrupts and could divert the main CPU.
The SPC-12 (1971) was a very small machine without an interrupt mechanism but with something very much like one. An “interrupt” would happen only when a particular fast one byte interrupt poll instruction was encountered in the instruction stream. While this sounds awkward it did have a significant advantage in the design of critical sections of code. Those sections merely omitted the interrupt poll instruction. All code running in the machine was required to insert these instructions so that no long time periods would pass without executing them. This does not work well for large applications but the SPC-12 had only 4K of 12 bit words for memory. Here was a precursor.
This solution could almost certainly have been used for the Apple Lisa computer that used a Motorola 68000 and an off-chip memory map. The 68K could not be told to redo a store. I should have beaten down Apple’s door but it did not occur to me.
Another interesting feature of the PDP-6 design was the semantics of interrupts and traps. Most machines included displacement of the program counter value as part of the interrupt action. By contrast the PDP-6 hardware interrupt action was to interpolate the execution of an extra instruction, fetched from a fixed location, without modifying the program counter. The location from which the extra instruction was fetched depended on the cause of the interruption. For the effect of a traditional interrupt the OS would put a subroutine call instruction there and the instruction stream was thus diverted. Special provisions for entry into privileged mode were required and provided. A common alternative to such a subroutine call was an I/O instruction that would move one word between core and an I/O device. This instruction would address a 36 bit operand that held the state of this ongoing asynchronous I/O operation. DEC’s KL-10 superseded the PDP-6 and had its own style of memory map.
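To make the interpolated-instruction idea concrete, here is a toy model in Python. It is a sketch under heavy assumptions: the opcodes NOP, JSR and XFER_IN, the vector locations 40 and 42, and the layout of the pointer word are all invented for illustration and are not the PDP-6 encodings. The point it shows is that the interrupt action itself executes one extra instruction and leaves the program counter alone; the stream is diverted only if that instruction happens to be a subroutine call.

```python
class ToyCPU:
    def __init__(self, memory, vectors):
        self.memory = memory        # address -> data word or instruction tuple
        self.vectors = vectors      # interrupt cause -> fixed location
        self.pc = 0
        self.device_queue = []      # words waiting to arrive from a device

    def execute(self, instr):
        op, *args = instr
        if op == "JSR":             # save PC at target, continue at target + 1
            (target,) = args
            self.memory[target] = self.pc
            self.pc = target + 1
        elif op == "XFER_IN":       # move one device word via a pointer word
            (ptr_addr,) = args
            dest = self.memory[ptr_addr]
            self.memory[dest] = self.device_queue.pop(0)
            self.memory[ptr_addr] = dest + 1   # advance the transfer state
        # NOP: nothing to do

    def step(self):
        instr = self.memory[self.pc]
        self.pc += 1
        self.execute(instr)

    def interrupt(self, cause):
        # Interpolate one instruction; the hardware does not touch the PC.
        self.execute(self.memory[self.vectors[cause]])


memory = {0: ("NOP",), 1: ("NOP",), 2: ("NOP",),
          40: ("XFER_IN", 50),      # vector for a hypothetical "io" cause
          42: ("JSR", 60),          # vector for a hypothetical "clock" cause
          50: 100}                  # pointer word: next destination address
cpu = ToyCPU(memory, {"io": 40, "clock": 42})
cpu.device_queue = [7, 8]
cpu.step()                          # ordinary instruction; pc becomes 1
cpu.interrupt("io")                 # one word moved; pc is still 1
pc_after_io = cpu.pc
cpu.interrupt("clock")              # JSR saves pc at 60 and diverts to 61
```

In this model the I/O interrupt costs one instruction and no state saving at all, which is the attraction of the scheme for word-at-a-time devices.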
The IBM 370 (1971) was essentially like the 360 regarding interrupts. It had a memory map and two novel instructions, MVCL and CLCL, with operands as large as 16MB. Aside from MVCL and CLCL, instructions required at most 8 pages to be mapped in order to complete their whole function. Such instructions would not begin to execute until enough pages were valid for the instruction to finish. The MVCL moved a string of bytes from memory to memory. 16MB was larger than any real memory at that time, so it was impossible to map the whole operand at once. The MVCL would therefore start without first verifying that its operands were mapped. The length and address of an operand came from a pair of registers. When the instruction came to an unmapped page, it would put back into those registers the length and address of the operand portions that remained unprocessed, so that upon resumption the MVCL would resume where it had left off. The program counter delivered to the interrupt routine points to the MVCL, and thus the interrupt routine requires no special precautions.
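The resumable-instruction trick is easy to model. The following Python sketch is an assumption-laden simplification, not the real MVCL: it uses a single length shared by both operands, no padding, a dictionary as memory, and invented names (mvcl, run_with_pager, PageFault). What it preserves is the essential property that the registers always describe the work remaining, so the fault handler can simply map the page and re-execute the same instruction.

```python
PAGE = 4096


class PageFault(Exception):
    def __init__(self, address):
        super().__init__(f"unmapped address {address:#x}")
        self.address = address


def mvcl(regs, memory, mapped_pages):
    """Move regs['len'] bytes from regs['src'] to regs['dst'], resumably."""
    while regs["len"] > 0:
        for addr in (regs["src"], regs["dst"]):
            if addr // PAGE not in mapped_pages:
                raise PageFault(addr)    # regs already describe the remainder
        memory[regs["dst"]] = memory.get(regs["src"], 0)
        regs["src"] += 1
        regs["dst"] += 1
        regs["len"] -= 1


def run_with_pager(regs, memory, mapped_pages):
    """Re-issue the same instruction after each fault; return the fault count."""
    faults = 0
    while True:
        try:
            mvcl(regs, memory, mapped_pages)
            return faults
        except PageFault as fault:
            mapped_pages.add(fault.address // PAGE)   # the "OS" maps the page
            faults += 1


memory = {i: i % 256 for i in range(10000)}
regs = {"src": 0, "dst": 20000, "len": 10000}
faults = run_with_pager(regs, memory, set())
```

Note that the pager never inspects the instruction or adjusts a program counter; it only makes the page valid and lets the move continue from the updated registers.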
The Motorola 68000 (1981) was chosen for Apple’s Lisa, which was an expensive precursor to the Mac. Apple had decided to employ virtual memory but the 68K had the same problem as the PDP-6. The memory map for the Lisa was off chip and not really part of the 68K architecture. The 68K could recover after being trapped on a fetch from an invalid address but program state was ruined upon a trap caused by a store. Apple adopted a programming convention to do a fetch upon the allocation of new storage such as stack growth. The 68K was microprogrammed and the hardware would deposit an opaque (undocumented) record of program state on the stack upon interrupts. The hardware would reload such a record upon resumption of the program. This interfered with an OS design that would resume a task on another processor in an MP configuration or when a program was check-pointed to be resumed on another system. Such resumption was permitted only if the CPUs were of exactly the same vintage. Motorola officially supported virtual memory with the 68020, which documented where in the opaque package the OS could find the frustrated stores or, for that matter, frustrated loads. This meant that the OS could provide the content of a virtual address without mapping that address. Most other hardware architectures would require the OS to interpret the faulting instruction to accomplish this. This general plan allowed an instruction that required several references to memory to cause a separate interrupt for each reference, so it was unnecessary to have all of its virtual addresses mapped at one time.
Motorola’s 88K (1985) had two program counters to be preserved upon interrupt. This was because the effect of a conditional branch was delayed by one instruction. The successful branch would change one PC but not the one that located the instruction executed just after the branch. The SPARC and MIPS machines worked this way as well. The 88K followed the pattern of the 68K (and to some extent our PDP-6) in reporting the meaning of trapped loads and stores. The 88K could report more than one frustrated store. The 88K was the first machine that I know of that provided a few (4) privileged general registers. The use of these registers was entirely up to the privileged code. These registers could be used to save a few general registers for very short interrupts, or to keep the address of a control block for the user mode program currently running.
See RAP for some issues related to entry to the OS upon interrupts.