Some Implementation Ramifications of Sensory Access in Scheme
A pair and its seepair, as proposed, imply
that the underlying pointer carries a bit to distinguish which construct
is identified.
This bit is the read-only bit. It is also needed for strings and vectors.
I don’t know how Scheme systems typically encode the dynamic type,
but it is tempting to put the run-time type in the pointer.
If the object's representation in RAM is large, then the type information can go there instead.
If a cons cell is a
pair of 32 bit words, each word holds a value reference, and some bits
are reserved for type, then the address portion of the pointer is restricted.
It is probably feasible to steal the three bits at the right end of the address, knowing
that all values are located at addresses which are multiples of 8. That yields
eight pointer codes. How many do we need?
- cons cells are ubiquitous and deserve their own code so that they need
not be marked in RAM.
- The seepair needs a separate code.
- Vectors, strings, ports and procedures can be distinguished in RAM and
share a pointer code.
- Sensory versions of the above share a different pointer code.
- For the vector there are both the rovec and the seevec, which must share storage
and thus may not share a pointer code. An additional indirection layer for
the rarer of these is possible.
- Small immediate integers are frequent and need their own code.
- Values for booleans and chars are few enough to share space with integers,
perhaps.
It looks like there is room.
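
As a rough check on the count, here is one hypothetical assignment of codes sketched in C. The numbering, the names (`TAG_PAIR`, `make_ref`, `as_seepair` and so on) and the choice of code 0 for integers are assumptions made for illustration, not part of the proposal. The sketch also assumes that a pair and its seepair refer to the same cell, so that deriving the seepair is only a change of code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t value;             /* one tagged word */

#define TAG_MASK ((value)7)          /* low 3 bits, free because addresses are multiples of 8 */

/* Hypothetical numbering: five codes used, three spare. */
enum tag {
    TAG_FIXNUM  = 0,  /* small immediate integer (booleans and chars might share this) */
    TAG_PAIR    = 1,  /* ordinary cons cell, nothing marked in RAM */
    TAG_SEEPAIR = 2,  /* sensory (read-only) reference to a cons cell */
    TAG_OBJECT  = 3,  /* vector, string, port or procedure, told apart in RAM */
    TAG_SEEOBJ  = 4   /* sensory reference to one of the above */
    /* 5, 6 and 7 remain, for instance for the rovec and seevec */
};

static inline enum tag tag_of(value v) { return (enum tag)(v & TAG_MASK); }

static inline void *address_of(value v) { return (void *)(v & ~TAG_MASK); }

/* Combine an 8-byte-aligned address with a pointer code. */
static inline value make_ref(void *p, enum tag t) {
    assert(((uintptr_t)p & TAG_MASK) == 0);
    return (value)(uintptr_t)p | (value)t;
}

/* Same cell, different code: the seepair shares storage with the pair. */
static inline value as_seepair(value pair) {
    assert(tag_of(pair) == TAG_PAIR);
    return (pair & ~TAG_MASK) | (value)TAG_SEEPAIR;
}

/* Integers live in the upper bits; multiplying by 8 leaves code 0 in place. */
static inline value make_fixnum(intptr_t n) { return (value)n * 8; }

/* Assumes the usual two's complement conversion from unsigned to signed. */
static inline intptr_t fixnum_value(value v) { return (intptr_t)v / 8; }
```

With this numbering, code that receives a seepair can refuse to store through it merely by looking at the low bits, without touching RAM.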
There is a too-clever hack here that works on some machines.
Code that dereferences one of these pointers presumably always knows which type it expects.
Instead of spending an extra instruction setting the low 3 bits to zero, supply an offset in the load instruction that cancels the expected code, so that the effective address has zero low bits.
If the machine does not support unaligned loads, then a pointer carrying some other code produces a misaligned address and the load faults, indicating a type fault.
On some machines this saves an instruction and also provides a free check.
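
The difference between the two styles of access can be sketched in C. This is only an illustration under assumptions: the code value `TAG_PAIR = 1`, the `cell` layout and the function names are hypothetical, and whether the subtraction is actually folded into the load's displacement depends on the compiler and the machine.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t value;

enum { TAG_PAIR = 1 };               /* hypothetical pointer code for cons cells */

/* A cons cell: two value words, forced onto an 8-byte boundary. */
typedef struct {
    _Alignas(8) value car;
    value cdr;
} cell;

static inline value make_pair_ref(cell *c) {
    assert(((uintptr_t)c & 7) == 0);
    return (value)(uintptr_t)c | TAG_PAIR;
}

/* Straightforward version: clear the low 3 bits, then load. */
static inline value car_masked(value p) {
    return ((cell *)(p & ~(value)7))->car;
}

/* The hack: subtract the expected code as part of the address arithmetic.
   If p really carries TAG_PAIR, the resulting address is aligned.
   If it carries another code, the address is misaligned; on hardware that
   traps on unaligned loads that is the type check. (In portable C such a
   load is undefined behaviour, so this only models the machine-level idea.) */
static inline value car_offset(value p) {
    return *(value *)(p - TAG_PAIR);
}

static inline value cdr_offset(value p) {
    return *(value *)(p - TAG_PAIR + sizeof(value));
}
```

One caveat worth keeping in mind: a 4-byte load checks only the low two bits of the address, so two codes that differ by exactly 4 would not be told apart by the trap.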