Finding Quadratic Residues

Gauss spent a good deal of his life worrying about quadratic residues. The digits 0, 1, 4, 5, 6 and 9 are the quadratic residues mod 10. They are just those digits which appear in the units position of squares. They are values of k for which the equation x2 = k (mod 10) may be solved. We are concerned about quadratic residues modulo n, the number that we wish to factor.

Actually it would work if we could find almost any numbers xi where we could factor (xi2 + kn) into our small primes, but it seems prudent to attempt this for the smallest possible numbers of this form. The easiest possible case is where k=0. Alas that leads to the trivial solutions and we must abandon it. Here’s Why. Choosing k=−1 seems promising.

The magic of the algorithm is that the product of numbers of the form (xi2 + kn) is also of that form. Numbers of that form expressible in our prime base will also be expressible in our prime base.

We will take yi to be (floor(sqrt(n))+i)2 − n. These will produce positive yi’s whose size is only a little more than half the size of n. There remains the difficult problem of determining which of these yi’s produce values (yi2−n) which are products of our prime base and discarding the rest.

Now it is time for hundreds of computers and brute force, or Shamir’s sieve. I gather from Shamir’s paper that in the software solution, an array of about 108 counters is allocated, one counter for each i in some range of i’s. Then a double loop begins. The outer loop considers in succession, each element pj of the prime base. It adds a few bits of log(pj) to each counter i if (yi2−n) is divisible by pj. This is easier than it may sound for those form a pattern of period pj in the array. Here’s why. After each prime has been considered those counters with a large accumulation are examined more closely. If the (yi2−n) for that counter is indeed factored by the factor base then its counter will be large. yi is discarded unless the factor base completely factors (yi2−n). The necessity to economize on bits in approximating the log accounts for some trade offs here.

Most such runs result in no hits, Different computers simultaneously do different ranges and one range after another. The hits are called “relations” and are gathered centrally.