To divide two IEEE 64 bit floating point numbers we consider a simpler problem: to divide two integers n and d such that 252 ≤ n, d < 253. We take this simplification to be unproblematical, even in regards to the various rounding modes.

Below mathematical (not software) conventions are in effect! The following expressions are exact reals and for positive y, x/y denotes infinite precision division ((x/y)y = x) and x÷y denotes the integer quotient of dividing x by y and discarding any remainder. I.e. x÷y is the largest integer k such that ky ≤ x. The remainder x%y for positive y, is x−(x÷y)y. For positive integers x and y, x%y is exactly C’s “x%y”. Theorem: 0 ≤ x/y − x÷y < 1.

The floating point division without rounding means finding the largest IEEE representable real q such that qd ≤ n.
If n < d then q = ((253n)÷d)2−53 is exactly the number we need.
If d < n then q = ((252n)÷d)2−52 is exactly the number we need.
If n = d then q = 1 is exactly the number we need.
Routine dvg here manifestly does this calculation
dvg(num, den)<<(num>=den) = 253q. We take dvg to be our gold-standard for unrounded division.

For positive integers r ((253rn)÷rd)2−53 = ((253n)÷d)2−53 and ((253rn)%rd) = r((253n)%d). We can choose an r, 28 < r < 29 that depends on d so that 253+8 < rd < 253+9.

For every integer d such that 28 ≤ d < 29 there is an integer r such that 28 ≤ r < 29 and
130339=0x1fd23 ≤ rd < r(d+1) ≤ 217 = 131072= 0x20000. We tabulate these values of r indexed by d−28.

A recurrent worry throughout these plans is that even the exponent of the quotient is unknown until the very end. It can be determined early if necessary and here is how at fairly low cost. The result exponent is the difference of the operand exponents unless the fraction of the denominator is larger than the fraction of the numerator whereupon the exponent is one less. There is no danger of a rounding action increasing the exponent for the nearest this comes to happening is the calculation (1 − 2−52)/(1 − 2−53) under directions to round away from 0. For this to happen the unrounded ratio would have to be > (1 − 253).
(1 − 2−53)2 = (1 − 2−52 + 2−106) > (1 − 2−52).
(1 − 2−52) < (1 − 2−53)2. Dividing by (1 − 2−53) yields:
(1 − 2−52)/(1 − 2−53) < (1 − 2−53).
End of Worry!

We must consider rounding and whether to round before or after knowing the final exponent. If we round first then we may find that we must round again and I see no guarantee that either step will avoid a long carry. I think that two rounds will not both cause long carries but exploiting the unproven theorem that they will not both be slow seems difficult and error prone. We pursue establishing the exponent early, or at least well before the end of the division. Facilely since “true” in C is 1 and “false” is 0 we have that exponent of result is (exponent of numerator) − (exponent of denominator) + (fraction of numerator ≥ fraction of denominator). Comparing fractions may be more than a clock cycle but it can be done in time to know how many quotient bits are needed so as to do the rounding just once.

This version of the code performs all the fairly simple test cases correctly while collecting up to 7 bits of quotient per cycle. It veers toward comprehensibility and away from some tricky logic on how to avoid waiting for an initial threshold comparison between numerator and denominator.

There are constants rs, lts, bi which are are design parameters. The following combinations now work: (9, 8, 1), thru (9, 8, 7) and (10, 9, 1) thru (10, 9, 8). The code omits most of the checks and reporting of intermediate values so the central logic can be seen better.

There might be something faster or cheaper than table lookup — simple interpolation, for instance. is yet to be incorporated.

Adding CSA (not now in working order)


by computing q=(n2k)÷d and keeping the remainder r. q is an integer and (q+R)2−53 is the desired real number that can be expressed exactly as a 64 bit IEEE floating point value. R is either 0 or 1 depending on the remainder and d.