There are several sorts of ‘union type’ about in various languages. They have in common that an expression denoting a value can denote values of different types at different times, in a way not determined by the immediate context of the expression. The expression most commonly of a union type is an identifier, but the application of a function may also be of a union type. As the program runs, the same expression produces values of different types at different times.

The first discrepancy between languages is how the current type is known. In C it is the responsibility of program logic to know which type the value currently has. In C the following program is a deliberate violation of type:

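#include <stdio.h>
typedef union {int i; float f;} u;
int main(void) {
  u x;
  x.i = 42;              /* store the value as an int */
  printf("%e\n", x.f);   /* read the same bits back as a float */
  return 0;
}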

Only program logic determines that the variable x holds an integer at the point where its floating point interpretation is used to print a value; the language itself keeps no record. I do not know whether language theorists consider this a valid program, but such patterns are commonly used by useful programs predicated on certain bit patterns representing computational values.
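The same reinterpretation of bits can be written without a union. The following is a minimal sketch, not a claim about how any particular program does it: the helper names bits_to_float and float_to_bits are hypothetical, and the code assumes that float is 32 bits wide.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float occupies exactly 32 bits on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: view the bits of a 32-bit integer as a float.
   memcpy copies the raw bytes; no numeric conversion takes place. */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper for the reverse direction: expose a float's bit pattern. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

As with the union, it is still program logic, and not the language, that knows which interpretation of the bits is meaningful at any moment.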

The next version of union requires that a value of union type include information about which alternative is the case for that particular value. Algol 68 was the first language I used that took that approach. The extra information is sometimes called the ‘tag’, and the language prevented constructs such as the above. The union type was conceived after the notion that a type is a set of values and that the tag identifies one of the types. This meant that the alternatives of a union had to be distinct types. In Algol 68, such a variable was said to be of some “mood” which itself depicted a set of types. A union of two moods was merely the set theoretic union of the two.
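C does not provide such a tag, but a program can carry one by hand. The sketch below is an illustration in C and not Algol 68 semantics: every name in it (kind, tagged, make_int, get_int and so on) is hypothetical, and the checking happens at run time through assert where Algol 68 would reject the program at compile time.

```c
#include <assert.h>

/* Hypothetical tag: records which alternative is currently stored. */
typedef enum { TAG_INT, TAG_FLOAT } kind;

typedef struct {
    kind tag;                        /* which alternative is live */
    union { int i; float f; } as;    /* storage shared by the alternatives */
} tagged;

/* Constructors set the tag and the payload together. */
tagged make_int(int i)     { tagged v; v.tag = TAG_INT;   v.as.i = i; return v; }
tagged make_float(float f) { tagged v; v.tag = TAG_FLOAT; v.as.f = f; return v; }

/* Checked accessors: refuse to read an alternative that is not live. */
int   get_int(tagged v)   { assert(v.tag == TAG_INT);   return v.as.i; }
float get_float(tagged v) { assert(v.tag == TAG_FLOAT); return v.as.f; }

/* Dispatch on the tag to treat either alternative uniformly. */
double as_double(tagged v) {
    switch (v.tag) {
    case TAG_INT:   return (double)v.as.i;
    case TAG_FLOAT: return (double)v.as.f;
    }
    return 0.0;  /* not reached while the tag is valid */
}
```

With this discipline, calling get_float(make_int(42)) stops the program at the assertion instead of quietly printing the floating point interpretation of an integer.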

The next union notion arose in ‘algebraic type theory’, where the program provides an arbitrary identifier for each alternative of a union type. Unlike the fields of a C union, which form a name space limited to that particular union type, the alternative names (called ‘constructor names’ in OCaml) have the same scope as the name of the union type. This construct serves the purpose of C’s enum as well. Here is some OCaml code:

type 'w tree = Leaf of 'w | Node of ('w tree * 'w tree);;
Node (Leaf 3, Leaf 5);;
(* - : int tree = Node (Leaf 3, Leaf 5) *)

Two leaves of the same tree must hold values of the same type, since the parameter 'w is fixed for the whole tree. I wonder if the implementations exploit this economy.
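To make that economy concrete, here is a hand-written C analogue of an int tree. It is only a sketch under stated assumptions and says nothing about what OCaml implementations actually do: the names shape, tree, leaf, node, sum_leaves and free_tree are hypothetical. Each cell carries a tag that distinguishes Leaf from Node, but no per-leaf record of the element type, because every leaf is known to hold an int.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tag distinguishing the two constructors. */
typedef enum { LEAF, NODE } shape;

typedef struct tree {
    shape tag;
    union {
        int leaf;                                    /* payload when tag == LEAF */
        struct { struct tree *left, *right; } node;  /* children when tag == NODE */
    } as;
} tree;

/* Allocate a leaf holding one value. */
tree *leaf(int value) {
    tree *t = malloc(sizeof *t);
    assert(t != NULL);
    t->tag = LEAF;
    t->as.leaf = value;
    return t;
}

/* Allocate an interior node that takes ownership of both children. */
tree *node(tree *left, tree *right) {
    tree *t = malloc(sizeof *t);
    assert(t != NULL);
    t->tag = NODE;
    t->as.node.left = left;
    t->as.node.right = right;
    return t;
}

/* Sum the leaves, dispatching on the constructor tag only. */
int sum_leaves(const tree *t) {
    if (t->tag == LEAF)
        return t->as.leaf;
    return sum_leaves(t->as.node.left) + sum_leaves(t->as.node.right);
}

/* Release a whole tree, children first. */
void free_tree(tree *t) {
    if (t->tag == NODE) {
        free_tree(t->as.node.left);
        free_tree(t->as.node.right);
    }
    free(t);
}
```

Building node(leaf(3), leaf(5)) mirrors the OCaml value Node (Leaf 3, Leaf 5). A tree of some other element type would need a separate C declaration, whereas the OCaml definition covers all element types at once.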