The Small Bootstrap Routine

Prerequisites to this story:

On the IBM 704 an address designated a 36 bit word, not a byte.
An instruction occupied one word.
how the IBM 704 performed IO

Bootstrapping relied on the follow special hardware function: A button on the operator’s console caused the machine to stop whatever it was doing and pretend to branch to a routine such as the following:

Read select card reader;
copy (word) to address 0;
copy (word) to address 1;
transfer unconditionally to 0.

The first two instructions on the card would thus appear in first two words of memory and the CPU would begin to obey the code at 0. The hardware then was back in ordinary mode except, of course, the card reader was still moving and offering 22 more words to be copied to where ever the new program said.

The index registers of the 704 were new to this machine and the conventional bootstrap would appear on the card as:

Load some index register (XR) with 22
A: Copy word to (24 − (XR val))
Decrement XR by 1 and branch to A if still positive.

The effective address is computed by subtracting the index register value from the address in the instruction. Subsequent instructions on the card were able to save the accumulator and other index registers which would useful for routines to report machine state after a program failure. (Such routines typically wrote contents of core to magnetic tape for later offline printing. Such a printout was called a “dump”.)

It was long thought that any routine would need to destroy at least 10 words in low core to hold code to copy the rest of core memory to drum for subsequent copying to tape.

One day one of the new programmers came in and said that he had a version that destroyed only 8 words. I looked at the first word and it was a NOP! I put down the card and constructed the entire short routine. Everyone else who tried the challenge—make the first instruction a NOP—was also able to construct such a program. Knowing that the first word was unnecessary so limited the options that each move was forced until the rest was trivial. This is the most striking example that I know of a harder problem being easier to solve than the more liberal problem. Of course that strategy fails when there is no solution to the harder problem.

This was starling because the parsimonious hardware copied only two words to memory before leaving the rest to software. This was seen as the scarcest resource and to squander 50% of the most precious resource, and then supply a superior solution was remarkable.

Here is a 2006 reconstruction of the necessary solution: The instruction at location 1 on the card must copy something into location 2 lest the machine obey whatever was there when the bootstrap began. Thus the program on the card reads so-far: NOP; COPY 2;. The naïve thinking would demand that location n of the card would be COPY N+1; but this is infinite regress and does not account for the next card to be read. There is one alternative: the program can be extended thus: NOP; COPY 2; TRA 1;. (The TRA command sets the program counter and is the old name for “branch”.) This too is not fruitful so we code another copy instead and then a TRA: NOP; COPY 2; COPY 3; TRA 1;. Now we can begin to save the machine state: the pre boot accumulator is still intact and can now be saved with a STO instruction. We have: NOP; COPY 2; COPY 3; TRA 1; STO 7; STX 6,1. Instructions 4 and up on the card are successively copied into memory location 2 (by the instruction at 1), executed there. Eventually one wants more code to be placed in memory and this can be done by another copy instruction on the card, whose operand is the card word following the copy.

The 704 was over-designed! It would have sufficed for the special hardware boot function to copy just one word. This elegant hack also solved a problem for loading hardware diagnostics when index register function was suspect.

8 Queens

Another such problem is placing 8 queens on a chess board so as not to attack each other. Trying systematically, without a computer, in the obvious way leads to an hour or so of unfruitful work. Requiring central symmetry leads to a solution in about a minute.