I describe the implementation ideas in machine language for that is how they were conceived. I suspect that implementing these ideas in C might cost very substantial performance. I would love to be proven wrong. (2004: Perhaps I was wrong.) It is not clear to me how best to express this in C except to simulate a machine obeying pseudo machine language. I see now that the scheme amounts to an efficient use of continuations. Perhaps Scheme continuations could do the trick, but again not efficiently. Perhaps we could describe this code plan as heavy use of continuations in a context without garbage collection. And in Keykos?
As in SVR4 streams, the idea is to provide a small fixed set of stream transformations but with the ability to dynamically compose these into data-flow-like networks of nodes. Each node in the net performs some particular transformation from the fixed set. Each node is implemented as a small memory block holding a pointer to the code for its transformation and pointers to the downstream nodes that are to receive its output.
The Varian Data Machines 620i had three registers, A, B and X which fit our scheme as if it had been designed for us. The entry point for each transformation routine assumed that X would locate the node, Register A would hold the next character in the input stream to be processed by the node and B would locate the node to pull whenever more characters from the stream were required. In other words A is the first character and B is a promise for the rest. B is sort of like Scheme’s continuation.
Each node pointer within a node would correspond to a conceptual character stream. Each time that pointer was used, by branching to the code that it located, the next character of that stream had its fleeting existence.
So far we have described the situation for nodes that produce one output for each input but we have not described “pull”. Some routines have pull entry points that assume no input character. A node that produces an infinite stream of ‘a’s will have only a pull entry point. Branching to that entry point produces yet another ‘a’. At such an entry point neither A nor B is meaningful. In all cases the first word of the block for a node will locate the entry point for some transformation. Both push and pull entry points assume that X locates a node which will know where any yield is to be delivered.
The above is the idea in a nutshell. Below are some examples of stream processing. Dispatch is merely shared code.
There are probably several bugs in the above “code”. Note that most of the steps amount to one machine instruction; we average about 7 machine instructions per node per character processed, even on a severely “register starved” machine. Note also that, in contrast to the presumably more efficient batching of blocks of characters, this scheme runs the character stream through the processing nodes without the characters visiting RAM, or even the cache, between nodes.
Note that the code is entirely reentrant and uses no stack. It might be described as using register B as a link register and doing tail call optimization. I don’t know how to code the stutter routine that way, however, as that seems to require explicitly naming the continuation.
At the edge of the network of nodes would be buffers for comm links in the case of Tymnet. These buffers are to improve link efficiency at minimal cost to latency.
A similar dichotomy bears on a node with two outputs. The deterministic discipline would store the continuation that arrived with an incoming character along with the character, and send the character along to one of the downstream nodes with a continuation back to this node. When control is regained the same character is sent to the other downstream node with the continuation originally received with the character.
The first discipline, if adhered to by all nodes, leads to output streams whose content is not sensitive to the order of events.
There may be design issues here to avoid deadlocks, but I have seen none.