The opening sections introduce a simple tree-walk problem and then propose three different ways you might want to extend the program, each of which would conventionally use mutation.
Wadler then shows how each of these can be done within the emergent monad pattern.
Haskell compilers have come and gone since the paper was written and I collect here versions that work with the “Glasgow Haskell Compiler, Version 7.10.2” (“**ghc**” herein).
Note that I do not approve of omitting trailing semicolons.
Here are those results.
I have written very little Haskell code and I am impressed with the quality of diagnostics I get from the compiler.
Only once have I had one of these programs report a runtime error, and then it identified the culprit code.
Otherwise, if it compiles, it gets the right answer.

I cannot fault the problem, the program, or the three extensions. That the Haskell sampler code is readable even to an OCaml programmer is, however, dubious.

Simple code => 42

Variation one => "42; divide by zero"

Variation two => (42,2)

Variation three =>

```
eval (Con 1972) <= 1972
eval (Con 2) <= 2
eval (Div (Con 1972) (Con 2)) <= 986
eval (Con 23) <= 23
eval (Div (Div (Con 1972) (Con 2)) (Con 23)) <= 42
42
```

I have several things to say about these pieces of code.
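For readers without the paper at hand, the simple tree-walk problem is a tiny evaluator for division expressions. The following is my transcription of the paper's basic (pre-monadic) code for ghc; `answer` is the paper's example term.

```haskell
-- The paper's basic evaluator: a tree of divisions, walked with no effects.
data Term = Con Int | Div Term Term

eval :: Term -> Int
eval (Con a)   = a
eval (Div t u) = eval t `div` eval u

-- The running example: (1972 / 2) / 23
answer :: Term
answer = Div (Div (Con 1972) (Con 2)) (Con 23)

main :: IO ()
main = print (eval answer)  -- prints 42
```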

Some may prefer this alternative for variation one, which yields “"Raise \"divide by zero\"; Return 42"”.

Variation two requires special attention. The counting here is done as (((((((0 + 1) + 1) + 1) + 1) + 1) + 1) + 1), whereas we could exploit the associativity of + and do (((1 + 1) + 1) + ((1 + (1 + 1)) + 1)) as we follow the shape of the input tree, like this (=> (42,2)). That is a nice Haskell program, but it does not fit the monad pattern introduced in this paper. The associative form allows parallel computing, as in variation three, but some applications require the non-associative pattern. The sampler that Wadler gives us ignores associativity but allows application logic to control the order of additions. Sometimes this serialization is logically necessary. It should not be done unless necessary, but Wadler’s code shows us how to do it:

- We pass down a State value at every call site for `eval`.
- The second call site passes the State returned from the first, and thus the machine is required to finish the first before beginning the second.
- The temporally first call is in the last line, which provides the initial 0.
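The bullets above describe the hand-threaded state of variation two before the monad is introduced. A minimal sketch of that style, as I read the paper (the names follow Wadler's text):

```haskell
type State = Int

data Term = Con Int | Div Term Term

-- Hand-threaded state: every call site passes the State along explicitly.
eval :: Term -> State -> (Int, State)
eval (Con a) x   = (a, x)
eval (Div t u) x =
  let (a, y) = eval t x   -- the first subterm uses the incoming state
      (b, z) = eval u y   -- the second must wait for y from the first
  in (a `div` b, z + 1)   -- count this division

answer :: Term
answer = Div (Div (Con 1972) (Con 2)) (Con 23)

main :: IO ()
main = print (eval answer 0)  -- prints (42,2); the last line supplies the initial 0
</imports>
```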

We collect the monads for comparison.

```haskell
data M a = Raise Exception | Return a
type M a = State -> (a, State)
type M a = (Output, a)
```

In each case we have:

```haskell
eval :: Term -> M Int
```

Note that in variation two when

The intended audience for this paper is unclear, but to follow the paper one must keep in mind that “`a -> b -> c`” is always to be read as “`a -> (b -> c)`”, just as “`f x y`” is to be read as “`(f x) y`”.
Another bit of notation: “`(*) a b`” is equivalent to “`a * b`”.
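A tiny illustration of both notational points (the names `addC`, `addOne`, and `six` are mine, chosen for the example):

```haskell
-- Currying: Int -> Int -> Int means Int -> (Int -> Int).
addC :: Int -> Int -> Int
addC x y = x + y

-- Partial application: (addC 1) is itself a function Int -> Int.
addOne :: Int -> Int
addOne = addC 1

-- Prefix operators: (*) a b is the same as a * b.
six :: Int
six = (*) 2 3
```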

Now I critically examine the sentence beginning “What sort of operations”.
“*M*” is only metaphorically a type.
More precisely it is informally a function (a ‘constructor’) that runs at compile time to map one type to another.
But one learns by metaphors, so we proceed.

Now I work through the critical expression “m ★ λa.n” for my sake and perhaps the sake of some readers of this note.

```
m ★ λa.n  ≡  (★) m (λa.n)
```

from which we deduce that we expect m to be of type:

```
m        :: M a
a        :: a
n        :: M b
\a -> n  :: a -> M b
(★)      :: M a -> ((a -> M b) -> M b)
(★) m    :: (a -> M b) -> M b
m ★ λa.n :: M b
```

This reminds me of balancing chemistry equations.

Regarding the parens in the third equation after the sentence with “easily rewritten”, I think that learning to grok the equation without parens is part of the monad initiation and I do not mean this in a pejorative sense.
```
eval (Div t u) = eval t ★ λa. eval u ★ λb. unit (a ÷ b)
```

This may be a faithful version of the code from section 2.6. Wadler seems to assume the constructor “I” as a primitive, perhaps so that he can later say “M = I” in characterizing the ‘identity monad’ but as far as I know such an equation is too meta for ghc. Beware the UTF-8 five pointed star (★); ghc does not approve of “*” in this role.
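Here is a sketch of the identity-monad evaluator of section 2.6 as ghc will accept it, with the explicit “I” constructor discussed above so that `M a` is distinguishable from `a`:

```haskell
-- The identity monad with an explicit I wrapper.
data M a = I a deriving (Show, Eq)

unit :: a -> M a
unit a = I a

-- ghc accepts the Unicode star as an operator name; plain * collides
-- with multiplication.
(★) :: M a -> (a -> M b) -> M b
I a ★ k = k a

data Term = Con Int | Div Term Term

eval :: Term -> M Int
eval (Con a)   = unit a
eval (Div t u) = eval t ★ \a -> eval u ★ \b -> unit (a `div` b)

main :: IO ()
main = print (eval (Div (Div (Con 1972) (Con 2)) (Con 23)))  -- prints I 42
```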

The code of 2.7 lives here and yields “"42; divide by zero"”, or here which yields “"Raise \"divide by zero\"; Return 42"”.
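For readers without the linked files, a sketch along the lines of section 2.7's exception monad (the linked versions format their output differently):

```haskell
type Exception = String

data M a = Raise Exception | Return a deriving (Show, Eq)

unit :: a -> M a
unit = Return

-- A raised exception short-circuits the rest of the computation.
(★) :: M a -> (a -> M b) -> M b
m ★ k = case m of
  Raise e  -> Raise e
  Return a -> k a

data Term = Con Int | Div Term Term

eval :: Term -> M Int
eval (Con a)   = unit a
eval (Div t u) =
  eval t ★ \a ->
  eval u ★ \b ->
  if b == 0 then Raise "divide by zero" else unit (a `div` b)

main :: IO ()
main = do
  print (eval (Div (Con 1) (Con 0)))                    -- Raise "divide by zero"
  print (eval (Div (Div (Con 1972) (Con 2)) (Con 23)))  -- Return 42
```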

The code of 2.8 lives here and yields “(42,2)”. (OCaml & Scheme too)
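A sketch of section 2.8's state monad, which hides the threading that variation two did by hand:

```haskell
type State = Int
type M a = State -> (a, State)

unit :: a -> M a
unit a = \x -> (a, x)

-- Bind threads the state: k a receives the state left behind by m.
(★) :: M a -> (a -> M b) -> M b
m ★ k = \x -> let (a, y) = m x in k a y

-- Increment the counter.
tick :: M ()
tick = \x -> ((), x + 1)

data Term = Con Int | Div Term Term

eval :: Term -> M Int
eval (Con a)   = unit a
eval (Div t u) =
  eval t ★ \a ->
  eval u ★ \b ->
  tick   ★ \_ ->
  unit (a `div` b)

main :: IO ()
main = print (eval (Div (Div (Con 1972) (Con 2)) (Con 23)) 0)  -- prints (42,2)
```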

The code of 2.9 lives here and yields:

```
eval (Con 1972) <= 1972
eval (Con 2) <= 2
eval (Div (Con 1972) (Con 2)) <= 986
eval (Con 23) <= 23
eval (Div (Div (Con 1972) (Con 2)) (Con 23)) <= 42
42
```

In the revisited version two there is the expression “
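A sketch of section 2.9's output monad, which accumulates the trace shown above (my `line` helper stands in for the paper's output formatting, which it may not match character for character):

```haskell
type Output = String
type M a = (Output, a)

unit :: a -> M a
unit a = ("", a)

-- Bind concatenates the outputs of the two computations, in order.
(★) :: M a -> (a -> M b) -> M b
m ★ k = let (x, a) = m
            (y, b) = k a
        in (x ++ y, b)

out :: Output -> M ()
out x = (x, ())

data Term = Con Int | Div Term Term deriving Show

line :: Term -> Int -> Output
line t a = "eval (" ++ show t ++ ") <= " ++ show a ++ "\n"

eval :: Term -> M Int
eval (Con a)   = out (line (Con a) a) ★ \_ -> unit a
eval (Div t u) =
  eval t ★ \a ->
  eval u ★ \b ->
  out (line (Div t u) (a `div` b)) ★ \_ ->
  unit (a `div` b)

main :: IO ()
main = let (x, a) = eval (Div (Div (Con 1972) (Con 2)) (Con 23))
       in putStr (x ++ show a ++ "\n")
```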

We contrast the four definitions of the ★ operator now. In each case the type is:

```haskell
(★) :: M a -> (a -> M b) -> M b
```

The respective definitions are:

```haskell
a ★ k = k a
m ★ k = case m of Raise e -> Raise e; Return a -> k a
m ★ k = \x -> let (a, y) = m x in k a y
m ★ k = let (x, a) = m in let (y, b) = k a in (x ++ y, b)
```

These four definitions may cover most common monad patterns.
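The paper's unit and ★ are what ghc's Monad class calls `return` and `(>>=)`. A sketch of the exception monad fitted into that class (ghc 7.10 also requires the Functor and Applicative instances):

```haskell
import Control.Monad (ap, liftM)

data M a = Raise String | Return a deriving (Show, Eq)

instance Functor M where
  fmap = liftM

instance Applicative M where
  pure  = Return
  (<*>) = ap

-- The paper's ★ is ghc's (>>=); the paper's unit is return/pure.
instance Monad M where
  return = pure
  m >>= k = case m of
    Raise e  -> Raise e
    Return a -> k a

safeDiv :: Int -> Int -> M Int
safeDiv _ 0 = Raise "divide by zero"
safeDiv a b = Return (a `div` b)

example :: M Int
example = return 1972 >>= \a -> safeDiv a 2 >>= \b -> safeDiv b 23

main :: IO ()
main = print example  -- prints Return 42
```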

Conforming to the monad pattern is error-prone. As I adapted a Scheme program to this pattern, I had to resort to Scheme’s print statement.

See this too.

When I took physics I was impressed at the pattern of deriving, from simple fundamental equations, complex equations to predict some complex phenomenon. Then, using only math, one simplifies those equations, turns them into programs which give the right answer provided you did the math and code correctly. The intermediate expressions are often obscure or even opaque; intermediate formulae often have no physical interpretation that I can discover. Yet in the end your program predicts observed results. I find some Haskell programs like this. I have not learned the program transformations that correspond to those of physics.

Just now, a few days after writing the above, I was forced to do just such a mechanical transformation of physics equations and was aided by expressing the equations in functional form, which I could code in Scheme and relate to the physics in my head. I then tediously transformed that Scheme code to a form that I could feed into the Wolfram indefinite integral server. I failed at the latter transformation when I tried to do it in my head. I left crumbs behind as I modified the definition of pqc here. Each line that now begins with a semicolon (comment) was tested against a non-trivial test case during the transformation. More physics.

Haskell does not memoize for you. See Fibonacci, which takes 111 ns per bottom call. I don’t know how it could feasibly have been otherwise. I can think of no general feasible memoizing plan.
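Memoization can, however, be done by hand when you want it. A common sketch (names mine) backs the function with a lazy list of its own results, so each value is computed once:

```haskell
-- Naive version: exponentially many calls, nothing is cached.
fibNaive :: Int -> Integer
fibNaive n
  | n < 2     = fromIntegral n
  | otherwise = fibNaive (n - 1) + fibNaive (n - 2)

-- Hand-rolled memoization: fibs is a lazy list whose cells are
-- computed at most once, because the list itself is shared.
fibs :: [Integer]
fibs = map fib [0 ..]
  where
    fib n
      | n < 2     = fromIntegral n
      | otherwise = fibs !! (n - 1) + fibs !! (n - 2)

fibMemo :: Int -> Integer
fibMemo n = fibs !! n

main :: IO ()
main = print (fibMemo 50)  -- prints 12586269025; fibNaive 50 would not finish soon
```

The sharing is the point: `fibs !! k` forces a list cell that is then reused by every later lookup, which is exactly the caching the compiler will not do for you.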