Thesis: the Scheme construct (set! (string->symbol ...) ...) is almost as bad as C’s construct (int *)(random()) = 0;
Scheme constructs with “!” are annoying, yet seem occasionally vital. I tried to write an efficient simulation of a mutable array based upon immutable arrays. There was a logarithmic overhead with a bad constant coefficient.
I think that OCaml addresses this problem better than either Scheme or my proposal below. Perhaps OCaml’s compromise is better engineering. The pattern below is consonant the Keykos solution where it works well. The pattern is perhaps not so good for a language.
Mutable Scheme things are said to denote or be bound to locations. To evaluate a variable conceptually you first determine the location to which it is bound and then fetch what is currently stored in that location.
I explore here ramifications of binding a variable directly to a value instead of to a location. In the process replace “variable” with “identifier”. In official Scheme (lambda (x y) ...) binds x and y to new locations and then puts the arguments in those locations. Rees has proposed a new construct, the cell which, like the current location is mutable. In this scheme a cell is a first class value in its own right. I propose to go farther than Rees. Here is a table of before and after situations.
R5RS | Whatever |
location is not a type | Cell is a type: (cell? (make-cell 3)) => #t |
Lambda binds variable to new location which initially holds the argument. | Lambda binds identifier to argument. |
set! is Syntax | set! is a Standard procedure |
Locations cannot be passed in messages | Cells, being fist class, can be arguments. |
A pair has two locations within | A pair is immutable and has two values, one or both of which may be a cell, of course. |
A vector or string has a location per element. | Vectors of cells are practically necessary.
Do we need immutable vectors? Another type?? |
If “x” is a variable then the expression “x” evaluates to the value currently held in the location to which x is bound. | If “x” is a variable (bound to a cell) then fetching the contents of that cell must be explicit or involve syntactic sugar not characteristic of Scheme. I tentatively propose “.x” as short for Rees’s (cell-ref x). This is from Bliss. |
If you need to mutate a parameter you can just do it: (lambda (x) ... (set! x 42) ... x ...). | You must plan a bit in order to mutate a parameter:
(lambda (y) (let ((x (new-cell y))) ... (set! x 42) ... .x ...) |
The <init>s are evaluated in some order and then bound to the respective variables. It is an error to refer to such a variable before it is bound.It would then be better to replace “<init>” by “expression” and also “variable” by “identifier”, as they can no longer vary. It must be made clear that the evaluation of “(lambda (x) (cons x y))” refers to neither x nor y while the evaluation of “(cons x y)” refers to both.
Scheme code to double a number is still (lambda (x) (+ x x)) but no locations are conceptually involved.
(lambda (x)(let* ((n (vector-length x))(y (make-vector n))) (do ((j 0 (+ j 1)))((= j n) y)(vector-set! y (vector-ref x j) j))))requires some sort of mutable arrays. It is not clear whether an immutable vector is practically required as well.
The yield of (make-vector 10 (new-cell)) will not serve as a mutable vector for “(new-cell)” is evaluated just once and we need 10 cells.
We need a new vector forming standard procedure such as:
Procedure: (proc->vector k n proc)
which returns a vector with n elements, the i’th being the yield of (proc i). proc must be a procedure of one argument.
Now we can rewrite the above program:
(lambda (x)(let* ((n (vector-length x)) (y (proc->vector n (lambda (i) (new-cell))))) (do ((j 0 (+ j 1)))((= j n) y)(set! (vector-ref y .(vector-ref x j)) j))))The above is obscure and incompatible with some of the rationale of the original Scheme vector. The initialization of the new vector: (proc->vector n (lambda (i) (new-cell))) seems to allocate storage locations for cells obliviously to the storage for the array. Enough hackery in the interpreter could ameliorate this but this makes it more difficult for the programmer to manage the efficiency of his program. Algol68 supports just such constructs as “array of references to int”, but such constructs are seldom useful. An Algol68 programmer would instead use the construct “reference to array of ints” which is just Scheme’s vector, except the “int” is not required in Scheme.
I see no computational requirement for a Scheme vector beyond a vector of locations, which is the current Scheme vector. Pedagogically I would introduce the vector as a close relative to the cell.
There may however be a reason for sharing vectors between procedures that should be unable to communicate via the locations within a vector. I address this issue in a not so orthogonal proposal to add sensory access to Scheme.
I presume that the top level environment would be immutable. Much grief stems from trying to provide the correct side effects for: (set! car cdr). Doing it correctly produces even more grief! If I understand correctly some (most?) Scheme implementations run slower due to the requirement to constantly monitor the meaning of car. I know no reason why it is good that the meaning of car can change half way thru evaluating some unrelated expression, except for remaining faithful the semantics of variables. Each separate yield of the Scheme Factory must have its own copy of locations holding all the Standard Procedures, lest instances gossip among themselves. The solution is to banish the variables, except in those few cases where they are needed.