This is a memory map idea that requires some knowledge of classic memory map architectures.

Years ago an otherwise clever colleague of mine, involved in hardware design, suggested placing address maps used by the hardware in virtual memory. Virtual memory doesn’t work until the hardware has a chance to access real memory in order to fill the TLB. I thought at the time that he was badly confused about how hardware did and could work. Such access must not require virtual memory lest we require recursive hardware.

Much later I realized that there is a very clever design here that is, in a way simple conceptually and simple in hardware. Nonetheless it is confusing. The simplest description is indeed to say that the real address of the j’th virtual page is found at virtual address 8*j. All other talk of division of the virtual address into portions is thereby subsumed. I assume here that 8 bytes are ample to store the real page frame address as well as a few control bits. 4 bytes may do.

(We assume 4KB pages here.) Consider the hardware failing to find an entry in the TLB for the virtual address k0. k0>>12 is the virtual page number and serves as the index into the table, whose origin is virtual address 0. The hardware thus seeks the datum at virtual address k1 = (k0>>12)<<3. As usual it consults the TLB to learn the real address of this datum. It probably finds it in the TLB and a new TLB entry for k0 is created. If the TLB fails for k1, however, the hardware computes k2 = (k1>>12)<<3. Each iteration consults the TLB and very soon one of the following occurs:

  1. a valid TLB entry is found which leads to making a new TLB entry, whereupon the original memory reference can be reinitiated,
  2. A page table entry is found indicating that software must intervene,
  3. The real address of virtual page 0 is required which is found in a special privileged register, or perhaps a locked down TLB entry.
The hardware need not even keep track of the iteration number as it need not keep track of intermediate goals and is thus not recursive. It suffices to go back to the normal loop of executing instructions after creating a new TLB entry. It actually seems simpler than the classic scheme.

It is not yet clear to me how simple it is for the kernel but it is clearly no worse than the classic scheme. Keykos gains conceptual simplicity by avoiding page faults in privileged code. Rather than building recursive page fault handlers to serve the privileged code that builds the maps, the privileged code can view the page table at its real addresses as a classic hierarchical table.

Note that an invalid TLB entry may be encountered where the real address of yet another portion of the memory map is expected. It is not to be imagined that such a page will be swapped in. Such a page will be created on demand as the kernel determines the meaning of virtual addresses defined by this previously invalid entry.

This plan is just a bit oversimplified. The memory map must be inaccessible to user mode programs. My favorite solution is for the hardware to support a “context number” which resides in a privileged control register and extends the virtual address to the left by several more bits, providing a super virtual address to serve in lieu of the “virtual address” above. Different context numbers are for different user address spaces. The super virtual space now comprises the various user address spaces. Address space 0 belongs to the kernel. The super virtual address is the concatenation of the context number from the control register and normal virtual address computed conventionally.

The allusion to “Tiny Alice” is from a play by Edward Albee by that name. The play is set in a vast mansion. A conspicuous element of the main set is a scale model of the mansion. References are made to the model seemingly confusing the map and the mapped.

Intel’s IA-64 architecture (Itanium) suggests such a map mechanism but implemented in software. They call it the virtual linear page table.

There is a problem in the above design, or perhaps an opportunity. Good classic memory maps (in my opinion) provide RO bits in table entries prior to the page table entry. In the TA scheme (above) entries at the beginning of the map play a curious double role. They at once define the real addresses of virtual memory, and they define where to find other such maps. How might we describe the meaning of a bit in a page table entry so that it restricts writes even to those addresses found indirectly thru the entry?

I think that the solution is to provide a bit in each of the PTE and TLB that marks them sensory and provides “sensory access”. This idea comes from viewing the PTE page, as analogous to the Keykos node or EROS capage. The sensory PTE provides access to its page as the Keykos sense key provides access to its node. While Keykos data pages hold no capabilities, pages holding PTEs hold something very much like capabilities. These capabilities indeed do not reside in an address space out side the kernel. They are written (and read) by the kernel and read by the hardware. The TLB entry must have a “sense bit” marking that entry as sensory. TLB entries formed consulting sensory TLB entries must be sensory. Stores enabled by sensory entries must not be allowed.

I would design the hardware to switch context values upon interrupts to the kernel’s context and dispense with any mechanism such as the IA-64 Protection Key Registers.

The IA64’s 8 regions per space and the PowerPC’s 16 segments are each determined by bits at the left end of the effective address. These are translated into global ids by a small table and become part of the virtual address that is matched against entries in the TLB. This allows sharing of TLB entries that locate page frames of shared data. While the scheme that I describe here shares page tables and the work to build them, it does not provide TLB sharing unless combined with hardware such as the IA64 regions or PowerPC segments.

If locating the page table at virtual address 0 is awkward then the formula “k2 = (k1>>12)<<3” can be relaxed by “k2 = (k1>>12)<<3 + C” for some constant C. There will be an X such that X = (X>>12)<<3 and the hardware must be able to translate X without reference to the formula. X will be an address within the table.
This scheme does not obviously support different size pages such as provided by SPARC maps. In the 32 bit SPARC maps each entry in a mapping table may declare that it represents a page and not yet another level of mapping table.