Quadratic Residues

Finding Quadratic Residues

Gauss spent a good deal of his life worrying about quadratic residues. The digits 0, 1, 4, 5, 6 and 9 are the quadratic residues mod 10. They are just those digits which appear in the units position of squares. Equivalently, they are the values of k for which the congruence x² ≡ k (mod 10) can be solved. We are concerned with quadratic residues modulo n, where n is the number that we wish to factor.
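As a small check of the definition, the residues mod n can be listed by squaring every x below n. This is only a sketch for tiny moduli; the function name is made up for illustration.

```python
def quadratic_residues(n):
    """Return the sorted k in [0, n) for which x*x = k (mod n) has a solution."""
    return sorted({(x * x) % n for x in range(n)})

print(quadratic_residues(10))  # [0, 1, 4, 5, 6, 9]
```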

Actually, it would work if we could find almost any numbers x_i for which (x_i² + kn) factors into our small primes, but it seems prudent to attempt this for the smallest possible numbers of this form. The easiest possible case is k=0. Alas, that leads only to trivial solutions: x_i² is already a perfect square, so the congruence it yields is just x² ≡ x² (mod n), which reveals no factor. We must abandon it. Choosing k=−1 seems promising.

The magic of the algorithm is that the product of numbers of the form (x_i² + kn) is also of that form. And a product of such numbers, each expressible in our prime base, is itself expressible in our prime base.
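The closure under multiplication follows from expanding (x₁² + k₁n)(x₂² + k₂n) = (x₁x₂)² + (k₁x₂² + k₂x₁² + k₁k₂n)n. Here is a minimal sketch that computes the new x and k; the function name and the example n = 1649 are chosen only for illustration.

```python
def combine(x1, k1, x2, k2, n):
    """Given a = x1^2 + k1*n and b = x2^2 + k2*n, return (x, k) with a*b = x^2 + k*n."""
    x = x1 * x2
    k = k1 * x2 * x2 + k2 * x1 * x1 + k1 * k2 * n
    return x, k

n = 1649
a = 41 * 41 - n   # 32, i.e. x = 41 with k = -1
b = 43 * 43 - n   # 200, i.e. x = 43 with k = -1
x, k = combine(41, -1, 43, -1, n)
print(a * b == x * x + k * n)  # True
```

In this example the product 32 · 200 = 6400 happens to be the perfect square 80², which is exactly the kind of coincidence the algorithm hunts for.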

We will take y_i to be floor(sqrt(n))+i for i = 1, 2, 3, and so on. These produce positive values (y_i²−n) whose size is only a little more than half the size of n. There remains the difficult problem of determining which of these y_i’s produce values (y_i²−n) which are products of primes in our prime base, and discarding the rest.
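To make the selection problem concrete, here is a naive sketch, assuming y_i = floor(sqrt(n)) + i and testing each y_i² − n by trial division. The function names, the toy n = 1649 and the prime base [2, 3, 5, 7] are assumptions for illustration; trial division is far too slow for real sizes, which is why a sieve is wanted.

```python
from math import isqrt

def candidates(n, count):
    """Yield (y, y*y - n) for y = isqrt(n) + 1, ..., isqrt(n) + count."""
    root = isqrt(n)
    for i in range(1, count + 1):
        y = root + i
        yield y, y * y - n

def factor_over_base(value, base):
    """Return the exponent of each base prime if value factors completely, else None."""
    exponents = []
    for p in base:
        e = 0
        while value % p == 0:
            value //= p
            e += 1
        exponents.append(e)
    return exponents if value == 1 else None

base = [2, 3, 5, 7]
kept = [(y, v) for y, v in candidates(1649, 5)
        if factor_over_base(v, base) is not None]
print(kept)  # [(41, 32), (43, 200)]
```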

Now it is time for hundreds of computers and brute force, or Shamir’s sieve. I gather from Shamir’s paper that in the software solution, an array of about 10⁸ counters is allocated, one counter for each i in some range of i’s. Then a double loop begins. The outer loop considers, in succession, each element p_j of the prime base. It adds a few bits of log(p_j) to counter i whenever (y_i²−n) is divisible by p_j. This is easier than it may sound, for those i’s form a pattern of period p_j in the array: since y_(i+p_j) = y_i + p_j, if p_j divides (y_i²−n) then it also divides (y_(i+p_j)²−n). If the (y_i²−n) for a counter is indeed factored by the factor base, then that counter will be large. So after each prime has been considered, those counters with a large accumulation are examined more closely, and y_i is discarded unless the factor base completely factors (y_i²−n). The necessity to economize on bits in approximating the log accounts for some trade-offs here.
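The counter scheme can be sketched in a few lines. This is not Shamir’s device nor anyone’s production code, only a toy rendering of the double loop described above, under several assumptions: logs are rounded to whole bits, prime powers are not sieved separately, the square roots of n mod p are found by brute force, and a hypothetical slack parameter absorbs the resulting error. Every name here is made up for illustration.

```python
from math import isqrt, log2

def sieve_relations(n, base, size, slack=8):
    """Return [(y, y*y - n)] for y = isqrt(n) + i, 1 <= i <= size, smooth over base."""
    root = isqrt(n)
    counters = [0] * (size + 1)        # counters[i] belongs to y_i = root + i
    for p in base:                     # outer loop: each prime of the base
        weight = round(log2(p))        # "a few bits" of log p
        for r in range(p):             # brute-force roots of r*r = n (mod p)
            if (r * r - n) % p != 0:
                continue
            start = (r - root) % p     # smallest i >= 0 with y_i = r (mod p)
            if start == 0:
                start = p              # index 0 is unused
            for i in range(start, size + 1, p):   # period-p pattern
                counters[i] += weight
    relations = []
    for i in range(1, size + 1):
        y = root + i
        value = y * y - n
        if counters[i] + slack < value.bit_length():
            continue                   # accumulation too small, skip
        rest = value                   # closer examination by trial division
        for p in base:
            while rest % p == 0:
                rest //= p
        if rest == 1:
            relations.append((y, value))
    return relations

print(sieve_relations(1649, [2, 3, 5, 7], 5))  # [(41, 32), (43, 200)]
```

With numbers this small the slack lets every counter through, so the threshold does no real filtering; it starts to matter only when the values (y_i²−n) are many bits long. Too small a slack would miss values such as 32 = 2⁵, whose repeated factor this sketch counts only once.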

Most such runs result in no hits. Different computers simultaneously work on different ranges, taking one range after another. The hits are called “relations” and are gathered centrally.