I am currently intrigued by OCaml. Their approach to typing is promising; they call it type inferencing. I have seen and understood fragmentary stories about how to use this to gain the combined benefits of static typing (efficiency) and latent typing (expressiveness). I have seen enough to be intrigued, but not yet convinced. (As of 2010 I am convinced that there is a significant application realm where OCaml has a substantial advantage over other languages that I know.)
First, a short comparison with Scheme. Scheme is a very much different language and the Scheme 5 report can be read by an intelligent person with scarcely any computer background. The OCaml definition requires many small concepts that are anything but elementary and are not explained in the official document. I think this is a property of the language and not the skill of the report authors. I think a more elementary description of OCaml is possible but that would serve a different audience. I would like to see how big it would be. I suspect that Haskel is worse on this spectrum. See this for more comparison.
OCaml gives up several venerable language features. Type coercion and polymorphic infix operators and functions are mostly gone. (= < <= etc. remain.) Addition requires different infix operators for int and float. “3 + 4.2” has been valid since FORTRAN I. All expression contexts in OCaml are weak, except at the read-eval-print loop. This may be a small price to pay for the benefits.
Whereas an informal intuition of Scheme’s dynamic types suffices for much Scheme programming, a more formal understanding of types and Caml’s language for expressing them is necessary for Caml programming. A conversation between the programmer and compiler is typically necessary before the compiler will accept a program and this conversation is mostly about types in the meta-language about types.
Two months have passed and I see more strengths and weaknesses of OCaml.
I feel more certain now that I have not found anything like a complete language definition.
(Indeed see this.)
I think that a brief definition is possible of the meaning of accepted programs.
My recent reading of Benjamin Pierce’s book “Types and Programming Languages” confirms my suspicion that there is a broad variety to the possible sets of acceptable programs.
There is a gamut of type inferencing schemes.
I don’t know yet which set Inria’s compiler accepts.
If you have a compiler to interact with as you write your programs, this is OK for many purposes.
It is not OK for some design processes where you plan a large body of code based on a misunderstanding of what the compiler will accept.
One tactic is to interactively test the types you plan for your project.
The OCaml REPL is good for this.
The remaining uncertainty of inferencing rules can supposedly be accommodated by including explicit types when the compiler fails to infer.
Pierce illustrates this with the marvelous example of the fix operator:
fix = λf. (λx. f (λy. x x y)) (λx. f (λy. x x y))
which provides for recursion but in a way that OCaml’s type inferencing cannot grok.
Type inferencing is a sort of proof seeking which fails safely.
It does not suffice to provide a correct program; OCaml requires one that it can understand.
Here is a good description of simple type inferencing.
fix works in Scheme.
My division algebras code had to be rewritten to accommodate such limitations as well, but in that case it was easy and OCaml’s infix operators make the code look more nearly like the original 19th century notation.
This more commonly useful program may be impossible in OCaml.
Another worry here is longevity of code. Will future compilers accept code that is accepted by today’s compilers? One possible solution to this conundrum is to provide trusting compilers, which either check dynamically or don’t check at all. But then you must fall back on dynamic typing which is slow and an entirely different technology. Another comfort is that today’s compilers are open source.
let | f | x y | = | x + y/2 | in | (f 3 4)*(f 4 6) |
There are plausible rumors of good performance. Here is one point.
I get the feeling that OCaml may be suitable for large software; their concept of signature seems more principled and uniform than C’s prototypes and certainly the type system is stronger.
It would be good to have a tool that reports inferred types for a particular program. The read-eval-print loop reports on types, but that tool is not easily used to study a program packaged for production.
I have a problem with both Scheme and OCaml. There are two textual styles of code. One binds several values in the top level and is thus easy to explore using a REPL. This results in excessive scope for identifiers however. This thru this are a progression of modification from the verbose to the concise. I understand that module systems are supposed to ameliorate this. I dislike this extra conceptual overhead but I have not seen a solution to this problem.
Unification seems fundamental to OCaml. I think it makes a large class of code more readable. The construct match pattern with alternatives is perhaps mere sugar, but type inferencing to deduce the signature of a function is not even syntax, let alone syntactic sugar. I think there is shared mechanism here.
module Xx = struct module = Z struct end end;;
# let a b = match b with (x, y) -> 2 | [p; q; r] -> 3;; This pattern matches values of type 'a list but is here used to match values of type 'b * 'c # [3, 5; 6, 8];; - : (int * int) list = [(3, 5); (6, 8)] # type 'p btree = Empty | Node of 'p * 'p btree * 'p btree;; type 'a btree = Empty | Node of 'a * 'a btree * 'a btreeThe “'p” in the type definition is a mere transient parameter and not part of the type any more than 'x' is part of the Scheme function
# type xxx = Empty | Intg of int;; type xxx = Empty | Intg of int # Empty;; - : xxx = Empty # type 'p btree = Empty | Node of 'p * 'p btree * 'p btree;; type 'a btree = Empty | Node of 'a * 'a btree * 'a btree # Empty;; - : 'a btree = EmptyThe identifier “Empty” is silently swept away by a new type definition. Maybe batch compilation has warnings. In practice such identifiers can be localized to OCaml modules which reduces symbol pollution.
module Q = struct type a = A | B let z = B type b = B | C end;; # Q.z;; - : Q.a = Q.B # Q.B;; - : Q.b = Q.B
File aa.ml | let p n = print_int n |
File aa.mli | val p : int -> unit |
File ab.ml | Aa.p 13 |
Shell command | ocamlc aa.mli aa.ml ab.ml |
Shell command | ./a.out |
type a = int * 'b as 'b;;and functions taking and delivering such values:
# ((fun (i, r) -> r) : (a -> a));; - : a -> a = <fun>Such values may be created thus: let rec a = (1, (2, a)) in exp wherein the expression a=a is fatal.
type a = int * (int * (int * 'a)) as 'a type b = int * (int * 'b) as 'b;;a and b define the same type, or at least compatible types according to the compatibility test above.
let rec a = [a];; val a : 'a list as 'a = [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[…]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
type a = A1 | A2;; let y x = match x with A1 -> 1 | A2 -> 2;; type b = A1 | B2;; type c = A1 | A2;; (A1 : b), (A1 : a), (A1 : c);; => - : b * a * c = (A1, A1, A1) ((A1 : b), (A1 : a), (A1 : c)) = (A1, A1, A1) => - : bool = true y A1;; => - : int = 1I cannot explain this. Type definitions generates new type constructors (if constructor is new), but not new types. Type constructors have their independent existence and types are mere sets thereof. When REPL wants to report the type of a constructor it just grabs the the most recent type that includes the constructor.
type a = A1 | A2;; let y x = match x with A1 -> 1 | A2 -> 2;; type b = A1 of int | B1;; y (A1 4);; => Error: The constructor A1 expects 0 argument(s), but is applied here to 1 argument(s)
Clearly functions have one argument and achieve their normal ends thru Currying. The syntax and tutorial are clever enough so that some users will never need to understand Currying and that ‘->’ is right associative. Parts of the manual, such as section 6.4, require understanding Currying. It is a bit brutal to learn the concept of Currying without at least a small warning.
A value cannot be created until its type is defined. It seems that the type definition must precede the value creation and the expression that creates the value must be in scope of the type definition. I don’t know whether an expression can have a value whose type is not in scope. (6 weeks later scopes seem conventional and obvious. One pitfall that confused me is that “let x = 5;;” at the top level of the read-eval-print loop, or in a module definition bind x in the following text. “let x = 5 in exp” binds it only in exp.) Algol 68, for all its documentation faults, made the scope of a definition abundantly clear.
In section “6.3 Names” I think the following should be said:
# let a = 42;; val a : int = 42 # a;; - : int = 42 # module Zz = struct let a = 3 and b = a end;; module Zz : sig val a : int val b : int end # Zz.b;; - : int = 42
I presume that a typexpr of form “' ident” has no defining occurrence but that one should be imagined at the head of each typedef(?).
Is not “/” an infix-op?
The syntax given in section 6.11 produces “”.
I can’t find the semantics to go along with the syntax: let-binding ::= pattern = expr ∣ value-name { parameter } [: typexpr] = expr which is given here.
(let sm ((s 0)(k 99)) (if (< k 1) s (sm (+ (f k) s) (- k 1))))I have not found a similar OCaml construct. Google(OCaml “named let”) corroborates that OCaml indeed lacks this and that others miss it. It seems I must give the recursive function a name even if I need it only once, and that name pollutes too large a scope.
let rec sm s k = if k<1 then s else sm ((f k) + s) (k - 1) in sm 0 99Perhaps this is as general. These both need tail call optimization.
Quote:
This page should point to an index of technical terms.
type growth = Taller | Shorter | Same type btype = LeftLean | RightLean | Balanced and 'a tree = {key: 'a; tag: btype; left: 'a node; right: 'a node} and 'a node = Tree of 'a tree | Empty;; let w = {key = 42; tag = 3; left = Empty; right = Empty};;it knows that w is an 'a tree for some 'a. It has also confirmed that Empty in left = Empty is of type 'a node, etc.
;; normal factorial definition: let rec fac n = if n= 0 then 1 else n * (fac (n - 1)) ;; tail call definition: let fac n = let rec fx a j = if j=0 then a else fx (a*j) (j-1) in fx 1 n ;; sum function: sum n f = (floating) sum of f(j) for 0<=j<n let sum n f = let rec s a j = if j=0 then a else s (a +. (f (j - 1))) (j - 1) in s 0. n
See my transcription of a Scheme program.
Some simple graphics
OCaml abstraction failure
Some OCaml programs:
Boundary of Complex
Sort