Research
on Proof-Carrying Code for Mobile-Code Security, Peter Lee and George
Necula (CMU).
Infrastructure for Proof-Referencing Code, Carl Gunter (Penn),
Peter Homeier (Aerospace Corp), and Scott Nettles (Penn).
Trust Management and Proof-Carrying Code in Secure Mobile-Code Applications,
Joan Feigenbaum (AT&T Labs) and Peter Lee (CMU).
Turpin’s PhD thesis,
“Programming Data Structures in Logic”
recent interesting comments
Verve at MS
A few years ago I began to develop some ideas on a proof system for code. These were compromises between extremes of earlier ideas and are meant to be contributions to the engineering of proof systems rather than any fundamental ideas. My ideas have never been nearly complete.
I will be concerned here mainly with proofs that programs are well behaved, rather than correct. By well behaved I mean that they respect access boundaries without the need for runtime checking, and that they finish certain tasks in a known time.
Such theorems are typically necessary for a particular program before correctness of any sort can be proven.
By “program” I mean the binary form that the processor obeys—after the compilers, linkers and loaders are done. I anticipate that the checker could run without significant cost after loading of critical code. Obviously some would trust more of the infrastructure.
There are two ideas here:
The basic idea is to present a plan for a program. The plan can be roughly described as an encoding of the data declarations of the program. Some languages, perhaps most or all, will require information beyond the declarations, however, to build a suitable program plan.
Our design goal is to devise a definition of what it is to be a plan for a program so that:
We speak of pointers within the plan. That would normally indicate data with addresses within some containing address space. Here a pointer is relative to the beginning of the plan. Alternatively the compiler would produce the plan which would be loaded in its own section along with the code. Internal pointers within the plan would be adjusted as a normal loader function. Note that the loader need not be trusted. The check will fail if the plan is misloaded. If the misloaded plan passes anyway, then the code is safe for reasons not comprehended by the prover.
Closely connected to this are the several general plans to release storage:
For instance if the source were the language C and there were a struct declared in that source, then there would be a plan element (PE) that described each field of that struct and the PE for the struct would be an array of pointers to other PEs that describe the successive fields of the struct. The stack frame’s shape would likewise conform to some plan element, one such element for each routine that allocates stack frames.
While the PE for a struct is an array, the PE for an array is a struct, or perhaps more likely, a tagged union of PEs depending on the discipline used to bound the indexes used to access the array.
The above formats of meta data appeared in the earliest PL/I systems to support debugging and safe access to arrays whose size was unknown at compile time.
For polymorphism we introduce a PE, called a QPE (quantified PE) consisting of a prefix and another PE, p2. The prefix introduces a new type identifier which refers to no particular type. The identifier can appear in p2 in any context requiring a plan pointer. This is formally like a universal or existential quantifier in predicate calculus. A block of storage conforms to the QPE plan if for some plan z (not necessarily stored anywhere just now) that block would conform to p2 if a pointer to a plan for z were substituted for each occurrences of x in p2. Un settled issue: may z occur in plans referenced by p2? I think so but now the question of the scope of z arises. In predicate calculus scopes are contiguous and nested. We probably need a kindred discipline.
For each block of storage described by this PE there is some type x. The block has two words: The first word locates code in the form of as subroutine that takes a pointer to a block whose form is x. The second word is a pointer to data in form x.A PE in the above form can be used to describe a procedure value (a closure) such as provided by Algol, PL/I or Pascal. The first word locates the code for the procedure and the second locates the specific activation record that holds the values for the free variables in the source for the procedure.