Types and Programming Languages, by Benjamin Pierce

This is a highly technical book which has the combined styles of a math book and a book on software. There are theorems and proofs. There are many references to works from the 20’s thru the 50’s that were pure mathematics and were seen by few then as relevant to computers which were then somewhat over the horizon. The book has interesting perspectives on commercial computer languages such as Java, C and C++. The book even descends to issues of producing machine language for certain constructs.

I find a fair amount of information that is new to me. The book pays close attention to what can be built upon what. I was startled to see that cons, car and cdr can be built from pure lambda. I was amused that conflating true with 1 has roots far older than C.

The following is tentative. I have thought of types as sets of values. A particular formal system would first define values (with an eye towards organizing them into types of course) and then the types. This subtle distinction emerges, I think, in Pierce’s treatment of subtyping. We agree that until subtyping, a value is of exactly one type. (Algol 68 sticks to this and introduces unions which are sets of types. A union is not a type but a mood and variables and parameters can be declared as such.) Pierce says that a language might make the type int a subtype of float. In this case it is not clear to me whether he intends that there remain distinct values 1 and 1.0. Mathematicians are themselves vague and often inconsistent on this point within a short span in a book. [he seems to limit this to values whose machine representations are naturally substitutable.] I grant that this is the important case but in theory subtyping should allow other ‘embeddings’ of one type within another. His motivating example is that a record with field names and corresponding field types should be welcome anywhere that a record with a subset of those names and types is used. In this case the embedding is natural and should be honored. The machine representation may be impacted by this strategy however. The case of embedding integer values within floating point seems different to me somehow. I am not convinced we can limit ourselves to cases where the embedding is ‘natural’.

The integers as a subset of reals (with types as value sets) is an example of a type being a subset as well as a subtype. Types {x: Nat; y: float; ...} as a subtypes of {x: Nat} is an example of a vastly larger sets of values composing a suptypes. Such subtypes seem much different and certainly confusing. The book is sporadically concerned with implementation issues. One use of subtyping is to enable the scenario where

The goal is that neither producers of subtype values, nor consumers of the supertype values, need pay extra for correct and safe function. It may be possible for the platform (language system) to find a place to put code that transforms any such values after production and before consumption, and that this code need not necessarily run as often as either the producer or consumer. Consider, for instance, a call site that passes an integer value to a routine value that takes a float. The call site can include conversion code. Another way to achieve this efficiency is to find a form of expressing these values so that the transformation is vacuous. This later is achieved with the vtables of C++ but only by imposing limitations on subtyping. These rules allow the representation of subtypes (a block with a pointer to a vtable) to be interchangeable with representations of the supertype. This is so because the vtable format for the subtype includes the vtable format for the supertype as an initial portion. (Java relaxes these limitations and pays a call cost with their Interface mechanism.) I find this an annoying, yet very useful hack due to its efficiency. I have not understood multiple inheritance in this context but I suppose it is not obviously impossible.

There are many ramifications of subtyping, dependent compilation and independent compilation that escape me. Is the notion of subtyping more or less than C’s conventional notion of coercion of arguments to a subroutine?

Things get clearer to me if we posit that the language designer, not the programmer, determines the subtype relationships. That ints can be used in strong floating contexts (as in C) is an inflexible language feature. This perspective suggests that a Java compiler could produce code to dynamically create a new v-table at a call site where the argument type was a sub-type of the parameter type by virtue of an Interface. This might be faster than the current practice which, I think, is to search on each call and perhaps cache the search result.