The naïve division (used in many early computers) was to repeatedly:

- subtract divisor from current remainder (initially the numerator),
- Wait for carries to propagate (Where all the time goes in naïve implementations)
- If the result is negative:
- Retrieve the old remainder if available and otherwise recompute it by adding,
- record a 0 as a quotient bit.

- If the result is not negative record 1 as another quotient bit.
- Shift remainder and quotient left by one.

The **CDC 6600** did just about the same thing.
Being binary, it produced two quotient bits per clock (100 ns).

Each cycle **the Stretch (IBM 7030)** did a table lookup using several high order bits from each of the normalized remainder and divisor.
The result was an approximation of the next several bits of quotient.
I think that it was guaranteed to produce four bits of quotient each cycle (300 ns) but would take advantage of up to nine bits when luck prevailed.
It averaged about six bits per cycle.
Here is a recent paper on this algorithm.
It differs in some respects from what I remember but I will leave my recollections in place.
I can vouch for Dura Sweeney (the ‘S’ in ‘SRT’) being there during the design of the Stretch.

The **Illiac II** is rumored to have done non restoring division without performing carry assimilation each cycle.
In other words the running remainder was expressed redundantly.
I know no other details (not even these!).
It seems that they could have done carry assimilation on say the top ten bits so as to see farther ahead.
(Wikipedia says that the Illiac II divide was designed by Robertson who co-invented the SRT division algorithm.
This fits my rumor.)

The **IBM/360 model 91** did a table look up on the high bits of the divisor at the beginning of the divide.
This provided a few bits, n, of approximation to the reciprocal of the divisor.
Numerator and denominator were multiplied by this so that the divisor of the resulting division problem would be close to one.
This makes guessing quotient bits trivial.
(divide 3.141593 by .999543.) This resulted in a quotient that might be one off.
This was documented.
See this.

The Motorola AltiVec architecture uses this scheme.

I would guess that the **Pentium** used tables like the 195.
I have heard that the problem was tracked down to an error when someone cut and pasted some table from an earlier design.
I now (2006) have better information.
There was a proof that the division algorithm was correct.
This was an excuse not the check that part of the design as carefully as they might have.
It seems no one read the proof carefully.
After the error was discovered, they found the bug in the proof.