IBM delivered CP-67 in roughly 1967; it provided virtual 360s. CP-67 ran only on the 360 Model 67, which, unlike the rest of the 360s, had virtual memory. IBM called the hardware feature DAT (Dynamic Address Translation). It was common knowledge that a virtual machine could not be equipped with the virtual memory feature. (See virtual PC hack for the Model 67.)
Partway through the life of the 370 series, virtual memory was added to the entire line. IBM announced VM/370 at that point and indicated that the virtual 370s it provided would indeed be complete, including virtual memory. Common knowledge had been wrong. Knowing that it was possible led quickly to the solution of the puzzle of how to do it. Just now I could not find a description of the scheme on the web, so I decided to record the solution. In retrospect it is a straightforward extension of the other schemes for virtualization.
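When guest code first encounters an LPSW instruction that switches to mapped mode, the VM kernel gets control and must decide how to extend the illusion of the real machine. Prior to this point the virtual machine illusion had been supported with a memory map controlled by the VM kernel. For virtual virtual memory three maps are required: the VM kernel's map from the virtual machine's real addresses to real addresses, the guest kernel's map from guest virtual addresses to the virtual machine's real addresses, and the shadow tables, which the VM kernel builds from the first two for use by the real hardware. The shadow tables play the role for the virtual machine that the TLB plays for a real machine: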
Real Machine | Virtual Machine |
TLB | Shadow tables |
Purge TLB instruction in privileged mode | Purge TLB issued by the guest kernel and interpreted by the VM kernel, resulting in total or selective destruction of the shadow tables |
TLB miss | Real hardware finds an invalid segment or page table entry and traps to the VM kernel, which consults the first two maps and usually produces a new entry in the shadow table |
Page fault | Starts out like the above, but the VM kernel discovers that one of the two consulted maps is invalid. If it is the first map, this is an ordinary page fault serviced by the VM kernel. If it is the second map, the VM kernel notifies the guest kernel just as the hardware notifies a normal kernel of a page fault. |
Invariant is that TLB describes access to a subset of what the maps describe. | Invariant is that shadow tables describe a subset of the access described by the composition of the two maps. |
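
The correspondence in the table can be sketched as a toy model. This is only an illustration under stated assumptions, not VM/370's actual data structures: pages are plain integers, the three maps are Python dictionaries, and the names `ShadowMMU`, `GuestPageFault`, `translate`, `purge_tlb` and `_page_in` are all invented here.

```python
class GuestPageFault(Exception):
    """Reflected to the guest kernel: its own map has no entry for the page."""


class ShadowMMU:
    def __init__(self, host_map, guest_map):
        self.host_map = host_map    # first map: guest-real page -> real page (VM kernel)
        self.guest_map = guest_map  # second map: guest-virtual page -> guest-real page (guest kernel)
        self.shadow = {}            # shadow table: guest-virtual page -> real page (seen by hardware)
        self.host_faults = 0        # page faults serviced silently by the VM kernel

    def translate(self, vpage):
        # Hardware path: a valid shadow entry is used without any trap.
        if vpage in self.shadow:
            return self.shadow[vpage]
        # "TLB miss": trap to the VM kernel, which consults the first two maps.
        if vpage not in self.guest_map:
            raise GuestPageFault(vpage)           # second map invalid: notify the guest
        gpage = self.guest_map[vpage]
        if gpage not in self.host_map:            # first map invalid: ordinary page fault
            self.host_faults += 1
            self.host_map[gpage] = self._page_in(gpage)
        self.shadow[vpage] = self.host_map[gpage]  # lazily record one composed entry
        return self.shadow[vpage]

    def _page_in(self, gpage):
        # Hypothetical stand-in for fetching the page and choosing a real frame.
        return 1000 + gpage

    def purge_tlb(self):
        # Guest PTLB, interpreted by the VM kernel: discard the shadow entries.
        self.shadow.clear()
```

Entries enter `shadow` only on a miss, so between purges it holds a subset of the composition of the two maps. If the guest changes `guest_map` and omits `purge_tlb`, `translate` keeps returning the old real page, just as a stale TLB entry would.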
VM/370 assumed that its guests would conform and so invalidated the composed map upon PTLB. The composed map was much larger than a typical TLB, and this introduced the hazard that old programs might have gotten by with fewer PTLBs than were strictly necessary. This did not turn out to be a problem. There were some OSes that indeed failed immediately to run on VM/370. When these problems were traced to missing PTLBs, and the fixed kernel was then put into the field on real machines, some once-per-year crashes were fixed. Stress testing of software had already been invented, but was rediscovered here.
We took this as a lesson and systematically changed table sizes in KeyKOS to absurd values which stressed various kernel algorithms. We found a couple more bugs that way.
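
Why a larger translation cache exposes a missing purge can be shown with a small experiment. The sketch below is an assumption-laden toy, not a model of any real TLB or of VM/370: `TranslationCache` uses simple first-in-first-out eviction, and `buggy_guest` is a made-up guest that remaps a page and forgets the purge.

```python
from collections import OrderedDict


class TranslationCache:
    """Toy cache of translations with FIFO eviction (an assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, page_table, vpage):
        if vpage in self.entries:          # hit: may be stale if no purge was done
            return self.entries[vpage]
        frame = page_table[vpage]          # miss: read the current table entry
        self.entries[vpage] = frame
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        return frame


def buggy_guest(cache):
    table = {v: 100 + v for v in range(16)}
    cache.lookup(table, 0)        # translation for page 0 gets cached
    for v in range(1, 16):        # unrelated activity touches other pages
        cache.lookup(table, v)
    table[0] = 999                # remap page 0 but omit the purge
    return cache.lookup(table, 0)
```

With a capacity of 8, the entry for page 0 has been evicted by the time it is remapped, so `buggy_guest(TranslationCache(8))` happens to return the new frame 999 and the bug stays hidden. With a capacity of 1024 the stale entry survives and the call returns the old frame 100. Changing the size to an extreme value in either direction is what makes the latent bug show itself.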
It is slightly more complex. It is not necessary to do a PTLB immediately after a modification of a map, but it is necessary to do a PTLB after the last modification and before the first use of the table by the hardware after that modification. As PTLBs were expensive on some machines, the kernel developed the habit of batching them. This was tricky and thus buggy.
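
One way to picture the batching rule is a pending flag that is set on every modification and cleared by a single purge just before the hardware can use the table again. This is a minimal sketch with invented names (`BatchingKernel`, `set_mapping`, `dispatch`, `hardware_access`), not the code of any kernel mentioned here.

```python
class BatchingKernel:
    def __init__(self):
        self.page_table = {}        # the map the kernel edits
        self.tlb = {}               # what the hardware has cached
        self.purge_pending = False  # a modification has not yet been followed by a purge
        self.purges = 0             # count of (expensive) purges actually issued

    def set_mapping(self, vpage, frame):
        self.page_table[vpage] = frame
        self.purge_pending = True   # defer the purge instead of issuing it now

    def dispatch(self):
        """Must run after the last modification and before the hardware uses the table."""
        if self.purge_pending:
            self.tlb.clear()
            self.purges += 1
            self.purge_pending = False

    def hardware_access(self, vpage):
        if vpage not in self.tlb:
            self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage]
```

Three consecutive `set_mapping` calls followed by one `dispatch` cost a single purge. The scheme is correct only if every path back to running mapped code goes through `dispatch`; a path that skips it leaves stale entries in `tlb`, which is the kind of bug the batching habit invited.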