This is aimed at IEEE double precision floating point division. I record here the images that occupy my head as I translate what seems to be a high level view of a fast and easily understood division process into an orgy of boolean gates. This is a related historic multiply scheme.

Given two positive normalized D’s (double precision IEEE floats) n and d, ½ ≤ n, d < 1, find q = n/d which means find the largest q (which is a D) such that q*d ≤ n. This means rounding towards 0 which I will content myself with for now. I speak here as if D’s actually denote numbers in the mathematical sense.

Here is the ‘high level view’ of the scheme which rumor says was proposed by Gene Amdahl for the IBM 360 model 195. We replace the n/d problem with (Nn)/(Nd) where N is a small positive integer and both Nn and Nd are computed exactly, and for which the same q is also the largest D such that q*Nd ≤ Nn. N is chosen so that the high several bits (about 9) of Nd are all 1’s. This choice may be from a table of 9 bit integers indexed by bits 12 thru 19 (counting the leftmost bit as bit 0) of the IEEE representation of d. Good ‘trial divisors’ as taught in grammar school in the 1940’s may now be found as the high 9 bits of the successive remainders of the grammar school long division process. This should yield 7, 8 or 9 bits of quotient per ‘cycle’. Such cycles can employ carry save adds to reduce the levels of logic.

Further details may be divided into three subjects:

  1. Choosing N, table, table with interpolation, ad-hoc fixes of edge cases;
  2. Iterations given a good lower bound on Nd;
  3. Optimizing the constituent multiplications and coping with reduced intermediate precision.

Here are details for subject 2.

Here is some early work that I do not now rely on, except for the ideas that emerged there.