IBM’s AS/400 (née System/38; now a component of the iSeries, since renamed “IBM i”) has elements of capabilities in the hardware. In particular, it has memory in which some 64-bit words (128 bits now) may be marked as capabilities. The marking prevents user access to the bits of these capabilities and thus prevents counterfeit capabilities. It is many years since I knew a bit about the System/38. I think that there were user-mode instructions to load memory capabilities into addressing registers, and that these registers provided access to yet more memory. Protection of these capabilities from corruption is at the bottom level of system integrity.
Reading the recent architecture overview leads me to speculate that the technology-independent machine interface (TIMI) is trusted to emit distinguished load and store commands for access to the special words that hold capabilities. The hardware protection is necessary for efficiency, for there is no logic in TIMI to remember which big words hold capabilities. Addresses are formed by the logic of untrusted HILI (HI Level Instructions) code. The translation by TIMI from HILI to the real instructions is without benefit of a theory of types in RAM. Which registers hold addresses derived from capabilities is the business of the translator. I think that there are user-mode commands in HILI to safely interconvert words between capabilities and data. The hardware verifies that the capability loads and stores access only capability words, and likewise that data loads and stores access only data words.
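The tagging idea can be illustrated with a toy model. This is a minimal sketch, not a description of the real AS/400 hardware: the names `TaggedMemory`, `TagViolation`, `load_data`, `store_data`, `load_cap` and `store_cap` are hypothetical, and the rule that a store of either kind overwrites the word and sets the tag to match is my simplifying assumption. What it shows is only the general point that bits written with data instructions can never be read back as a capability.

```python
class TagViolation(Exception):
    """Raised when a word is accessed with the wrong kind of instruction."""


class TaggedMemory:
    """Toy memory with one hidden tag bit per word (hypothetical model)."""

    def __init__(self, size):
        self.words = [0] * size
        self.is_cap = [False] * size  # tag bits, invisible to data instructions

    def load_data(self, addr):
        # Data loads may not see the bits of a capability.
        if self.is_cap[addr]:
            raise TagViolation("data load from a capability word")
        return self.words[addr]

    def store_data(self, addr, value):
        # Assumption: a data store overwrites the word and clears its tag.
        self.words[addr] = value
        self.is_cap[addr] = False

    def load_cap(self, addr):
        # Capability loads succeed only on words that carry the tag.
        if not self.is_cap[addr]:
            raise TagViolation("capability load from a data word")
        return self.words[addr]

    def store_cap(self, addr, cap):
        # Only trusted code would be allowed to emit this instruction.
        self.words[addr] = cap
        self.is_cap[addr] = True


mem = TaggedMemory(4)
mem.store_cap(0, 0xCAFE)    # a genuine capability, written by trusted code
mem.store_data(1, 0xCAFE)   # the same bit pattern, written as plain data

forged = True
try:
    mem.load_cap(1)         # the forgery attempt
except TagViolation:
    forged = False          # the tag, not the bit pattern, decides
```

In this model the bit pattern alone confers nothing; only a word written through `store_cap` can later be loaded as a capability.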
I don’t like mixing capabilities with data. It works, but too many familiar patterns are damaged. No longer can you blindly copy regions of memory; you must feel for the capabilities and use other instructions to copy them. I suppose that traps can ease the cost of testing, but not the cost of encountering them. No longer can you merely swap out a page, for the special words must preserve their specialness. The one fundamental advantage of intermixed capabilities that I am aware of is that data structures, à la C’s struct, can include capabilities. This is a natural and useful thing to do, but I think it is not worth the cost. See this.
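The damage to the copy pattern can be made concrete with another toy model. Again this is a sketch under assumptions: memory is represented as two parallel Python lists (`words` and `tags`), the function names `blind_copy` and `tag_aware_copy` are hypothetical, and the source and destination regions are assumed not to overlap. The blind copy uses only data moves and traps on the first capability it meets; the tag-aware copy must test every word and pick the matching kind of move.

```python
class TagViolation(Exception):
    """Raised when a data instruction touches a capability word."""


def blind_copy(words, tags, src, dst, n):
    """memcpy-style copy that uses only data moves."""
    for i in range(n):
        if tags[src + i]:
            # In the toy model a data load of a capability word traps.
            raise TagViolation(f"capability found at word {src + i}")
        words[dst + i] = words[src + i]
        tags[dst + i] = False


def tag_aware_copy(words, tags, src, dst, n):
    """Copy that feels for capabilities and moves each word appropriately."""
    for i in range(n):
        if tags[src + i]:
            # Stands in for a capability load followed by a capability store.
            words[dst + i] = words[src + i]
            tags[dst + i] = True
        else:
            # Stands in for an ordinary data load and store.
            words[dst + i] = words[src + i]
            tags[dst + i] = False


# A three-word record with a capability in the middle, followed by free space.
words = [10, 0xCAFE, 30, 0, 0, 0]
tags = [False, True, False, False, False, False]

tag_aware_copy(words, tags, 0, 3, 3)
```

The per-word test in `tag_aware_copy` is the cost referred to above. Swapping a page out has the same shape: the tags must be saved alongside the words and restored with them.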
Pure software capability systems, and some hardware capability systems, segregate capabilities in their own pages. Keykos runs on conventional (non-capability) hardware. In Keykos, capability pages are smaller than data pages and are pure software constructs called nodes. In the Plessey 250, capabilities live in standard hardware-defined pages, and the page map provides access to such pages only as capabilities. As an additional protection, capabilities were stored with a different parity than data.
Here is a cryptographic proposal to mix data with capabilities.
Ronald Pose’s later designs.
Segregating capabilities from data in registers is independent of segregating them in memory. Systems of all four possible designs exist.