I am currently intrigued by OCaml. Their approach to typing is promising; they call it type inferencing. I have seen and understood fragmentary stories about how to use this to gain the combined benefits of static typing (efficiency) and latent typing (expressiveness). I have seen enough to be intrigued, but not yet convinced.
First a short comparison with Scheme. Scheme is a very much different language and the Scheme report can be read by an intelligent person with scarcely any computer background. The OCaml definition requires many small concepts that are anything but elementary and are not explained in the official document. I think this is a property of the language and not the skill of the report authors. I think a more elementary description of OCaml is possible but that would serve a different audience. I would like to see how big it would be. I suspect that Haskel is worse on this spectrum. See this for more comparison.
OCaml gives up several venerable language features. Type coercion and polymorphic infix operators and functions are gone. Addition requires different infix operators for int and float. “3 + 4.2” has been valid since FORTRAN I. All expression contexts in OCaml are weak, except at the read-eval-print loop. This may be a small price to pay for the benefits.
Whereas an informal intuition of Scheme’s dynamic types suffices for much Scheme programming, a more formal understanding of types and Caml’s language for expressing them is necessary for Caml programming. A conversation between the programer and compiler is typically necessary before the compiler will accept a program and this conversation is mostly about types in the meta-language about types.
Two months have passed and I see more strengths and weaknesses of OCaml.
I feel more certain now that I have not found anything like a complete language definition.
(Indeed see this.)
I think that a brief definition is possible of the meaning of accepted programs.
My recent reading of Benjamin Pierce’s book “Types and Programming Languages” confirms my suspicion that there is a broad variety to the possible sets of acceptable programs.
There is a gamut of type inferencing schemes.
I don’t know yet which set Inria’s compiler accepts.
If you have a compiler to interact with as you write your programs, this is OK for some purposes.
It is not OK for some design processes where you plan a large body of code based on a misunderstanding of what the compiler will accept.
You may then have produced much code predicated upon your misunderstanding.
Pierce illustrates this with the marvelous example of the fix operator:
fix = λf. (λx. f (λy. x x y)) (λx. f (λy. x x y))
which provides for recursion but in a way that OCaml’s type inferencing cannot grok.
Type inferencing is a sort of proof seeking which fails safely.
It does not suffice to provide a correct program; OCaml requires one that it can understand.
It works in Scheme.
My division algebras code had to be rewritten to accommodate such limitations as well, but in that case it was easy and OCaml’s infix operators make the code look more nearly like the original 19th century notation.
This more commonly useful program may be impossible in OCaml.
Another worry here is longevity of code. Will future compilers accept code that is accepted by today’s compilers? One possible solution to this conundrum is to provide trusting compilers, which either check dynamically or don’t check at all. Another comfort is that today’s compilers are open source.
There are plausible rumors of good performance.
I get the feeling that OCaml may be suitable for large software; their concept of signature seems more principled and uniform than C’s prototypes.
It would be good to have a tool that reports inferred types for a particular program. The read-eval-print loop reports on types, but that tool is not easily used to study a program packaged for production.
I have a problem with both Scheme and OCaml. There are two textual styles of code. One binds several values in the top level and is thus easy to debug using a REPL. This goes against excessive scope for identifiers however. This thru this are a progression of modification from the verbose to the concise. I understand that module systems are supposed to ameliorate this. I dislike this extra conceptual overhead but I have not seen a solution to this problem.
Unification seems fundamental to OCaml. I think it makes a large class of code more readable. The construct match pattern with alternatives is perhaps mere sugar, but type inferencing to deduce the signature of a function is not even syntax, let alone syntactic sugar. I think there is shared mechanism here.
# let a b = match b with (x, y) -> 2 | [p; q; r] -> 3;; This pattern matches values of type 'a list but is here used to match values of type 'b * 'c # [3, 5; 6, 8];; - : (int * int) list = [(3, 5); (6, 8)] # type 'p btree = Empty | Node of 'p * 'p btree * 'p btree;; type 'a btree = Empty | Node of 'a * 'a btree * 'a btreeThe “'p” in the type definition is a mere transient parameter and not part of the type any more than 'x' is part of the Scheme function
# type xxx = Empty | Intg of int;; type xxx = Empty | Intg of int # Empty;; - : xxx = Empty # type 'p btree = Empty | Node of 'p * 'p btree * 'p btree;; type 'a btree = Empty | Node of 'a * 'a btree * 'a btree # Empty;; - : 'a btree = EmptyThe identifier “Empty” is silently swept away by a new type definition. Maybe batch compilation has warnings. In practice such identifiers can be localized to OCaml modules which reduces symbol pollution.
Clearly functions have one argument and achieve their normal ends thru Currying. The syntax and tutorial are clever enough so that some users will never need to understand Currying and that ‘->’ is right associative. Parts of the manual, such as section 6.4, require understanding Currying. It is a bit brutal to learn the concept of Currying without at least a small warning.
A value cannot be created until its type is defined. It seems that the type definition must precede the value creation and the expression that creates the value must be in scope of the type definition. I don’t know whether an expression can have a value whose type is not in scope. (6 weeks later scopes seem conventional and obvious. One pitfall that confused me is that "let x = 5;;" at the top level of the read-eval-print loop, or in a module definition bind x in the following text. "let x = 5 in exp" binds it only in xx.)
In section “6.3 Names” I think the following should be said:
# let a = 42;; val a : int = 42 # a;; - : int = 42 # module Zz = struct let a = 3 and b = a end;; module Zz : sig val a : int val b : int end # Zz.b;; - : int = 42
I presume that a typexpr of form “' ident” has no defining occurrence but that one should be imagined at the head of each typedef(?).
Is not “/” an infix-op?
In the Appendix there should be an index of technical terms that are not part of the language, such as “scope”, “binding”, “compilation unit”, “constructor”, “type constructor”, “class function” (as in the explanation of “Class application” in section 6.9.2), “compatibility of types” (as in “This type is still compatible with a variant type containing more tags.” in “Variant types” in section 6.4) etc. (The index of meta-concepts) The syntactic categories should be included or put in their own index such as this crude start at a Syntax index.
The syntax given in section 6.11 seems to produce “module Xx = struct module = struct end end;;” but the OCaml program dislikes that text.
I can’t find the semantics to go along with the syntax: let-binding ::= pattern = expr ∣ value-name { parameter } [: typexpr] = expr which is given here.
(let sm ((s 0)(k 99)) (if (< k 1) s (sm (+ (f k) s) (- k 1))))I have not found a similar OCaml construct. Google(OCaml "named let") corroborates that OCaml indeed lacks this and that some miss it. It seems I must give the recursive function a name even if I need it only once, and that name pollutes too large a scope.
let rec sm s k = if k<1 then s else sm ((f k) + s) (k - 1) in sm 0 99Perhaps this is as general. These both need tail-recursion.
type growth = Taller | Shorter | Same
type btype = LeftLean | RightLean | Balanced
and 'a tree = {key: 'a; tag: btype; left: 'a node; right: 'a node}
and 'a node = Tree of 'a tree | Empty;;
let w = {key = 42; tag = 3; left = Empty; right = Empty};;
it knows that w is an 'a tree for some 'a.
It has also confirmed that Empty in left = Empty is of type 'a node, etc.
See my transcription of a Scheme program.