Let me first present the best defense I can for RAP before I argue for another conventional plan. I invite other defenses of RAP.
As best I can determine, the plan is that certain addresses be permanently reserved to the privileged code and that, when a CPU runs in privileged mode, these reserved addresses are used to access both:
Ostensibly this is convenient to the design of the privileged code because it may directly access the user space by the same addresses used by the user and, at the same time, access its own privileged data by their reserved addresses.
I argue below that this convenience is illusory for any kernel that claims to continue correctly after bugs in user mode programs. Such kernels can use RAP, but for those, the ostensible convenience is gone; and further, RAP is prone to security errors. These pitfalls are analogous to the Confused Deputy Problem.
Imagine, however, that an erroneous user mode program provides an address that is reserved according to the RAP. Without an extra programmed check by the privileged code, this will result in erroneous access to privileged data. If the nature of the system call is to copy data from user space to user space, then the result will be to copy between user space and privileged space, where the addresses in the privileged space are determined by the user code. To prevent this, every reference by privileged code to user memory must be accompanied by a programmed check to ensure that the reference is not to a reserved address. A loop that follows a chain through user space must perform the check on each iteration. Unix does this in some cases that I have checked, but not in others. Unchecked accesses are likely to be exploitable as serious security flaws, and they are not found by testing with correct user code. This is why such systems are prone to security flaws.
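The programmed check described above can be sketched in a toy model. This is only an illustration under stated assumptions, not the code of any real kernel: the flat array `mem`, the boundary `RESERVED_BASE`, and the names `user_range_ok` and `sys_copy` are all hypothetical. The point is that the range test must cover the whole span, must not overflow, and must be applied to every user-supplied address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MEM_SIZE      256u  /* toy address space, assumed for illustration */
#define RESERVED_BASE 192u  /* hypothetical: [192, 256) is reserved to the kernel */

/* One array stands in for the single address space seen in privileged mode. */
static unsigned char mem[MEM_SIZE];

/* True only if [addr, addr + len) lies wholly below the reserved region.
   The subtraction form avoids overflow in addr + len. */
static bool user_range_ok(size_t addr, size_t len)
{
    if (addr > RESERVED_BASE)
        return false;
    return len <= RESERVED_BASE - addr;
}

/* Toy "system call": copy len bytes from user address src to user address dst.
   Returns 0 on success, -1 if either range touches a reserved address. */
static int sys_copy(size_t dst, size_t src, size_t len)
{
    if (!user_range_ok(src, len) || !user_range_ok(dst, len))
        return -1;
    assert(src + len <= RESERVED_BASE && dst + len <= RESERVED_BASE);
    memmove(&mem[dst], &mem[src], len);
    return 0;
}
```

Omitting either call to `user_range_ok` leaves a path by which user code chooses which privileged bytes are read or overwritten, and a test suite of well-behaved callers would never notice.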
The Intel 386 had a closely related design flaw in its memory map.
There is yet another possible vulnerability as the kernel accesses data in the user’s memory. Kernel code that examines some data byte in memory via more than one fetch may tacitly depend on each fetch finding the same value. Since that is normally assured in benign environments, it is plausible that some kernel code is sensitive to this. This paper reports an exploit of this vulnerability. In that exploit the time separation of the two fetches was significant; fetches closer in time would be harder to exploit. Yet the perils of not copying data across protection domains seem real.
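A minimal sketch of the usual remedy, copying once across the protection boundary and then looking only at the copy, might read as follows. The `struct request` layout and the name `handle_request` are assumptions made for illustration. In a real kernel the snapshot would be taken with a checked copy routine rather than a plain structure assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PAYLOAD 16u

/* Hypothetical request layout that the caller places in user memory. */
struct request {
    size_t len;
    unsigned char payload[MAX_PAYLOAD];
};

/* Snapshot the request once, then validate and use only the snapshot.
   Returns the number of bytes delivered to out, or -1 if len is too large. */
static int handle_request(const struct request *user_req,
                          unsigned char out[MAX_PAYLOAD])
{
    assert(user_req != NULL && out != NULL);

    struct request snap = *user_req;  /* the only read of user memory */

    if (snap.len > MAX_PAYLOAD)
        return -1;

    /* Later changes to *user_req cannot affect the length validated above. */
    memcpy(out, snap.payload, snap.len);
    return (int)snap.len;
}
```

Had the code tested `user_req->len` and then used `user_req->len` again in the `memcpy`, another thread sharing the user's memory could enlarge the length between the two fetches.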
Linus Torvalds addresses these problems and others with this scheme. Unfortunately it is not part of the normal debug cycle.
One plan that seems feasible on some such machines is to reserve different addresses at different instants and thus present the illusion to the user code that no addresses are reserved. Perhaps there is just one page of privileged code that is mapped into user space (without user access) so that this code can switch to another map private to the kernel. This physical page can be mapped at different addresses at different times. The page contains address-free code. The address of that page within a user’s space would indeed be reserved, but when the user code uses those addresses the virtual address of that page is changed. The IA32 segment registers can be used to support this trick efficiently.
Almost every system call takes an address of some string or structure in the user’s memory and must either read or modify that location as part of its function.
See this on space checks. Linus explains all. As far as user addresses go, I think it would have been OK to type them as long ints. That way it wouldn’t be easy to dereference them. C (and many other languages) needs distinct types that are all storage equivalent. For instance you could declare floating point inches or floating point centimeters and the compiler would notice unit flaws. Ada and Euclid had this.
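C can approximate such distinct, storage-equivalent types with one-member structures. The sketch below is only an illustration: the type names `user_addr`, `inches`, and `centimeters` and the helper functions are hypothetical. The size checks hold on common ABIs but are not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>

/* A user address wrapped in a struct: it cannot be dereferenced or passed
   where a kernel pointer is expected without an explicit conversion. */
typedef struct { uintptr_t raw; } user_addr;

/* Two length units with identical storage but incompatible types. */
typedef struct { double v; } inches;
typedef struct { double v; } centimeters;

/* Storage equivalence, assumed true on mainstream ABIs. */
static_assert(sizeof(user_addr) == sizeof(uintptr_t), "same storage as an integer");
static_assert(sizeof(inches) == sizeof(double), "same storage as a double");

/* The only sanctioned way to cross from one unit to the other. */
static centimeters to_cm(inches in)
{
    centimeters c = { in.v * 2.54 };
    return c;
}

static centimeters add_cm(centimeters a, centimeters b)
{
    centimeters c = { a.v + b.v };
    return c;
}

/* Advance a user address without ever turning it into a C pointer. */
static user_addr user_addr_add(user_addr a, uintptr_t offset)
{
    user_addr r = { a.raw + offset };
    return r;
}
```

With these definitions a call such as `add_cm(x, y)` where `y` has type `inches` is rejected by the compiler, and so is any attempt to apply `*` to a `user_addr`.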
Apple explains these issues for the OS X kernel.
Just now (2006 May 10) a Slashdot article says that Torvalds has written a note on the micro kernel issue in which he reportedly says that the issues discussed here are critical to it. Some of the comments are interesting. I agree with those things that he says and that I understand. I don’t know what he means by ‘micro kernel’ and so I cannot understand his objections to them. It depends somewhat on hardware architecture.