Many programs had to keep a numerical model of a simulated physics problem that was much larger than core (RAM) memory. For each of a few thousand time steps, all the model data would be passed over to compute the same data for the next time step. On the Univac I and the IBM 701 and 704, magnetic tapes were fast enough and capacious enough to do the job. Typically the model was a 2D array of points in a mesh, with 10 to 30 floating point numbers per mesh point. The obvious way was to put the model sequentially on mag tape, one tape block per row of the mesh. Two or three rows would be read in from tape, which was enough to compute the first row of the next time step. Thereafter a row would be read, a row calculated and a row written, until the next time step had been entirely calculated and written to tape. The two tapes would then be rewound and the process repeated with the roles of the two tapes reversed.
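The row-at-a-time sweep described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "tapes" are plain lists that are read and appended to strictly in order, the update rule in `step_row` is a made-up average of a row with its two neighbours, and the names `step_row`, `sweep` and `simulate` are hypothetical. None of it is the original physics code.

```python
from collections import deque


def step_row(above, here, below):
    """Hypothetical update rule: average each point with its vertical neighbours."""
    return [(a + h + b) / 3.0 for a, h, b in zip(above, here, below)]


def sweep(in_tape, out_tape):
    """Compute one time step, holding at most three rows in memory at once."""
    rows = iter(in_tape)              # the input tape is only ever read in order
    first = next(rows)
    # Assumed boundary treatment: the missing row above the first row is a copy of it.
    window = deque([first, first], maxlen=3)
    for row in rows:
        window.append(row)            # read one row ...
        out_tape.append(step_row(*window))   # ... compute and write one row
    # Assumed boundary treatment again: duplicate the last row below itself.
    window.append(window[-1])
    out_tape.append(step_row(*window))


def simulate(mesh, steps):
    """Run several time steps, swapping the roles of the two tapes after each."""
    tape_in, tape_out = [list(r) for r in mesh], []
    for _ in range(steps):
        tape_out.clear()              # stands in for rewinding and overwriting
        sweep(tape_in, tape_out)
        tape_in, tape_out = tape_out, tape_in
    return tape_in
```

For example, `simulate([[0.0], [3.0], [0.0]], 1)` streams a three-row mesh through one time step while never holding more than three rows in the window.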
Rewinding tapes was not much faster than reading or writing them. Usually four tape drives were available, and the model data would be split equally between two tape reels. After the first half of the array had been read from the first input reel, that reel would be rewound; shortly thereafter, when half of the next time step had been written out, the first output reel would be rewound as well. There was thus no waiting for rewinds.
With the advent of the IBM 709 with its ‘data channels’, this tape I/O was overlapped with computing. This provided the expected performance improvement but did not change the tape logic described above.
About 1957 a new programmer at Livermore came up with the following clever scheme; I can’t imagine how he thought of this simple trick. He divided the mesh into 3 equal parts. Parts 1 and 2 were initially recorded on reel A, which was then rewound. Part 3 was recorded on reel B, which was not rewound. The calculation now commenced, reading part 1 from reel A and writing onto the end of reel B. When part 1 of the array had been processed, reel B was rewound and writing was switched to reel C. Part 2 of the mesh was read from reel A and written to reel C. (Reel A, now read to its end, was presumably rewound at this point, since it is written from the start in the next sweep.) When this was done, reel B had finished rewinding and was ready to supply part 3, which was read while output went to reel C. When part 3 was finished, reel C was rewound. At this point the next time step of the simulation was complete: part 1 was on the end of reel B, which had finished rewinding, while parts 2 and 3 were on reel C, which was rewinding.
We now begin the next complete sweep of the mesh by reading part 1 from reel B and writing on reel A. When part 1 is done we rewind reel B and continue by reading part 2 from reel C, which has just finished rewinding. Part 2 is also written to reel A. When part 2 is done we rewind reel A and write part 3 to reel B. This is where we came in. We have overlapped tape rewind with the rest of the tasks, using only three tape drives and reels.
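The three-reel schedule is easier to check when it is traced mechanically. The following Python sketch is a toy model, not a reconstruction of the original program. Its assumptions: each third of the mesh takes one time unit to read or write, a rewind takes time proportional to the length of tape wound back, and rewinding runs `rewind_speed` times faster than reading. It also assumes that reel A is rewound after part 2 is read in the first sweep and that reel C is rewound at the end of the second sweep. The names `Reel`, `transfer` and `run_three_reels` are hypothetical.

```python
class Reel:
    def __init__(self, name):
        self.name = name
        self.parts = []       # (part, generation) records in tape order
        self.head = 0         # head position, counted in parts
        self.ready_at = 0.0   # time at which any pending rewind completes

    def rewind(self, now, speed):
        # Assumed timing: rewind time is proportional to the tape wound back.
        self.ready_at = now + self.head / speed
        self.head = 0


def transfer(src, dst, part, gen, now):
    """Read one part from src and write its next generation to dst."""
    start = max(now, src.ready_at, dst.ready_at)   # wait for rewinds if needed
    assert src.parts[src.head] == (part, gen), "wrong data under the read head"
    src.head += 1
    del dst.parts[dst.head:]          # writing overwrites the tape from the head on
    dst.parts.append((part, gen + 1))
    dst.head += 1
    return start + 1.0, start - now   # finish time, time spent waiting


def run_three_reels(cycles, rewind_speed=2.0):
    """Run `cycles` pairs of sweeps; return (total time, time lost to rewinds)."""
    a, b, c = Reel("A"), Reel("B"), Reel("C")
    a.parts = [(1, 0), (2, 0)]            # parts 1 and 2 on A, rewound
    b.parts, b.head = [(3, 0)], 1         # part 3 on B, not rewound
    now = stall = 0.0
    gen = 0

    def move(src, dst, part):
        nonlocal now, stall
        now, waited = transfer(src, dst, part, gen, now)
        stall += waited

    for _ in range(cycles):
        # First sweep: A -> end of B, then A -> C, then B -> C.
        move(a, b, 1); b.rewind(now, rewind_speed)
        move(a, c, 2); a.rewind(now, rewind_speed)
        move(b, c, 3); c.rewind(now, rewind_speed)
        gen += 1
        # Second sweep: B -> A, then C -> A, then C -> B.
        move(b, a, 1); b.rewind(now, rewind_speed)
        move(c, a, 2); a.rewind(now, rewind_speed)
        move(c, b, 3); c.rewind(now, rewind_speed)
        gen += 1
    return now, stall
```

The assertion inside `transfer` checks that every read finds the expected part of the expected time step under the head, so the trace confirms that the state after two sweeps matches the starting state. In this toy model the schedule runs without waiting when `rewind_speed` is 2.0 and stalls when it is 1.0; that threshold is a property of the assumed timings, not a claim from the account above.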
A Univac I had ten magnetic tapes. I think that the only other I/O device on the machine was the operator’s typewriter. There were certainly no card devices in the Univac system. Tapes were critical to the 2D Lab applications. The 1000 words of memory would not hold even one row of our 2D meshes. As each point was processed, a point’s worth of data was written out and another point read in. One output record was data to inform the physics as it passed this point in the adjacent row. Another output record, on a different tape, was for when it came time to do the next time cycle for this point. The physicist Bryce DeWitt wrote the first 2D hydro application for the Univac I. That was all before I got there in 1955. Four tape drives would suffice; with eight drives rewind time was avoided. With the above trick six drives would have sufficed.