I now have the reference:
T. H. Myer and I. E. Sutherland, "On the Design of Display Processors", Communications of the ACM, Vol. 11, No. 6, June 1968.
This paper is the original and much more complete.
It also tells how the cycle was broken.
The time was the early '60s. The milieu was Digital Equipment Corporation and MIT. Early computer displays plotted points. The DEC PDP-1 and the IBM 704 had remarkably similar hardware, for which the program would load an x and a y value into common registers and issue a plot command (140 μs on the 704). The hardware would briefly display a small point on the screen at those coordinates. If the screen had a persistent phosphor, the program plotted points fast enough, and the room lights were low, then a very useful image might appear on the screen. While television was already common, the cost of refresh memory ruled out today's raster scan: the PDP-1 and the 704 had neither the megabit of memory required to serve as a raster-scan refresh buffer nor adequate memory bandwidth. When the program was not plotting points, the screen was black. The PDP-1 had a light pen, a hand-held photocell that provided a program-testable state which was briefly true after a point had been plotted within the pen's view. This design followed the work on the TX-0 and TX-2 at Lincoln Labs; the TX-2 supported the pioneering work of Ivan Sutherland and his famous Sketchpad program.
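To make the mechanism concrete, here is a minimal sketch in C of that style of CPU-driven refresh. The point structure and the plot_point routine are my inventions standing in for the load-registers-and-plot sequence; the real code was of course machine-specific assembly.

    /* Hypothetical stand-in for loading x and y into the display
       registers and issuing the plot command. */
    void plot_point(int x, int y);

    struct point { int x, y; };

    /* The CPU itself keeps the image alive: while it loops here the
       picture persists on the phosphor; the moment it turns to other
       work, the screen fades to black. */
    void refresh_forever(const struct point *pts, int n)
    {
        for (;;)
            for (int i = 0; i < n; i++)
                plot_point(pts[i].x, pts[i].y);
    }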
Then as now there was a continuing decrease in hardware costs. Architects wondered how they could overcome the awkward fact that the screen went black whenever the CPU diverted its attention from the consuming task of refreshing the screen. The solution was the display list. Hardware was built to read an array of point coordinate pairs from core memory and plot them on the screen, processing the array over and over until the program told it to stop. The CPU could then construct display lists and go on to other tasks while the user examined the static image.
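The display-list hardware amounted to the refresh loop above, moved out of the CPU. A sketch, again with invented names: the "processor" just walks an array of coordinate pairs in core while the CPU is free.

    void plot_point(int x, int y);          /* as in the earlier sketch */

    struct point { int x, y; };

    volatile int running = 1;               /* the program clears this to stop */

    /* What the new display hardware did: read coordinate pairs from
       core and plot them, over and over.  The CPU builds the list once
       and then goes about its other work. */
    void display_processor(const struct point *display_list, int n)
    {
        while (running)
            for (int i = 0; i < n && running; i++)
                plot_point(display_list[i].x, display_list[i].y);
    }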
The program was most likely to want to modify part of the image while avoiding the cost of recomputing, or even copying, the entire display list. The display refresh hardware was enhanced with the ability to recognize a special pattern, stored among the plot data, that directed it to another area of memory for more display data. The program could then construct a cycle of several lists, which it could modify in a dialog with the user.
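One way to picture the enhancement, with an invented word format: each display word is either a point or a jump to another block of display data, so a cycle of blocks refreshes the whole picture while the CPU edits individual blocks in place.

    void plot_point(int x, int y);

    enum op { OP_POINT, OP_JUMP };

    struct display_word {
        enum op op;
        int x, y;                         /* used by OP_POINT */
        struct display_word *target;      /* used by OP_JUMP  */
    };

    void display_processor(struct display_word *dw)
    {
        for (;;) {                        /* refresh until told to stop */
            if (dw->op == OP_POINT) {
                plot_point(dw->x, dw->y);
                dw++;                     /* next word in this block */
            } else {
                dw = dw->target;          /* continue in another block */
            }
        }
    }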
It soon became evident that some patterns were repeated at different places on the screen. The letter A might be required several times in the display list at different coordinates, and each occurrence of A would require a list of points with x and y adjusted for that location. A subroutine-like facility was added to the display hardware: it passed an x and y coordinate offset as a parameter while displaying points from another location in memory, and a return instruction went back to the saved location of the invocation command. A sketch follows.
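The opcode names, the one-level-deep return register, and the offset handling below are all my assumptions; the point is only to show how a single stored list of points for "A" can be displayed at several screen positions.

    void plot_point(int x, int y);

    enum op { OP_POINT, OP_CALL, OP_RETURN, OP_JUMP };

    struct display_word {
        enum op op;
        int x, y;                        /* point, or the call's x/y offset */
        struct display_word *target;     /* callee (OP_CALL) or jump target */
    };

    void display_processor(struct display_word *dw)
    {
        struct display_word *saved = 0;  /* return address, one level deep */
        int ox = 0, oy = 0;              /* current coordinate offset      */

        for (;;) {
            switch (dw->op) {
            case OP_POINT:               /* plot relative to the offset    */
                plot_point(dw->x + ox, dw->y + oy);
                dw++;
                break;
            case OP_CALL:                /* e.g. "draw the letter A here"  */
                ox = dw->x;  oy = dw->y;
                saved = dw + 1;          /* remember where to resume       */
                dw = dw->target;
                break;
            case OP_RETURN:              /* back to the caller             */
                ox = oy = 0;
                dw = saved;
                break;
            case OP_JUMP:
                dw = dw->target;
                break;
            }
        }
    }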
A common pattern was the line segment. The CPU spent too much time doing the simple arithmetic required to construct the display list for the points of a segment, so arithmetic and conditional capabilities were added to the display hardware.
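The arithmetic involved is trivial but repetitive, which is exactly why it was worth moving out of the CPU. A simple DDA-style generator is one possible formulation (not necessarily what the hardware did); screen coordinates are assumed non-negative.

    void plot_point(int x, int y);

    /* Generate the points of a segment by repeated addition: after the
       setup, each point costs two additions and a rounding. */
    void plot_segment(int x0, int y0, int x1, int y1)
    {
        int dx = x1 - x0, dy = y1 - y0;
        int adx = dx < 0 ? -dx : dx, ady = dy < 0 ? -dy : dy;
        int steps = adx > ady ? adx : ady;
        double x = x0, y = y0;
        double xinc = steps ? (double)dx / steps : 0;
        double yinc = steps ? (double)dy / steps : 0;

        for (int i = 0; i <= steps; i++) {
            plot_point((int)(x + 0.5), (int)(y + 0.5));
            x += xinc;
            y += yinc;
        }
    }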
With conditional capability it was discovered that a circle could be plotted, if only the hardware had one more register. A new yet familiar problem arose: the display hardware was becoming occupied with complex algorithms beyond simply keeping points on the screen. Should we add a simple display list mechanism to off-load the display processor?
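The text does not say which circle algorithm was meant; the midpoint circle is one incremental scheme in that spirit, needing only additions, sign tests, and a single extra register for the decision variable.

    void plot_point(int x, int y);

    /* 'err' is the "one more register": sign tests on it (the new
       conditional capability) decide when x steps down as y steps up,
       tracing one octant; eight-way symmetry gives the rest. */
    void plot_circle(int cx, int cy, int r)
    {
        int x = r, y = 0, err = 1 - r;

        while (x >= y) {
            plot_point(cx + x, cy + y);  plot_point(cx - x, cy + y);
            plot_point(cx + x, cy - y);  plot_point(cx - x, cy - y);
            plot_point(cx + y, cy + x);  plot_point(cx - y, cy + x);
            plot_point(cx + y, cy - x);  plot_point(cx - y, cy - x);
            y++;
            if (err < 0)
                err += 2 * y + 1;
            else {
                x--;
                err += 2 * (y - x) + 1;
            }
        }
    }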
At this point it had become clear that the display hardware had grown into a general-purpose computer. The idea of multiprocessing was already well known but seldom practiced. The conventional wisdom was that multiprocessing interfered with the economy of scale that then ruled computer design: bigger computers were more cost-effective than simpler ones. Yet the foregoing design cycle had seemed, at each step, to yield an improved design, a paradox like Escher's circling monks. It would seem that 'better' is not transitive; more likely there were several different notions of 'better' in play.
The resolution of the paradox was highly contingent on hardware design rules that are no longer relevant. The pitfall of reincarnation of function is still with us. Sometimes the layers are inhomogeneous, which carries a conceptual cost: the layered system becomes too hard to understand. Sometimes the inhomogeneities linger in legacy architectures long after their original justifications are forgotten.
From what I hear of current (2005) GPU design, we are now in the midst of another display reincarnation cycle! See "General-Purpose Computation Using Graphics Hardware" too.