Some of the math ideas go back to about 1950 and I reminisce here. I described some of the math here.

I describe the meaning of a signal that flows between two boxes in the larger machine.
Perhaps there are millions of such messages each clock cycle within the machine, each traversing its own **bus** (set of wires).
By “factory parameter” we mean a value associated with some individual piece of hardware whose actual value is fixed upon manufacture.
Text in italics is redundant.

All the primes below some limit B ⋍ 3⋅10^{10} are divided into a nested sequence of sets (the prime tree) thus:

S_{0} is the set of all primes less than B;

S_{n} = S_{4n+1} ⋃ S_{4n+2} ⋃ S_{4n+3} ⋃ S_{4n+4}.

S_{4n+1}, S_{4n+2}, S_{4n+3} and S_{4n+4} are pairwise disjoint.
The sets are fixed at the factory.

There are many arithmetic sequences of interest.
A_{i} = {n⋅p_{i} + r_{i}|n∈Z}.
Collisions are sought among these sequences.
The p_{i}’s are primes less than B.
There are typically two sequences per prime.
The r_{i}’s are system configuration parameters which are constant for long durations; they depend on the number being factored.

- For some j and for some s, s⊆S

For each bus there are integers J, j and m such that for every message on that bus n = j mod 2^{J} and s = S_{m}.
J, j and m are factory parameters.
Each bus thus has a collection of factory parameters (J, j, m).
*On a particular bus the only values that vary from one message to the next is n and k.
The actual message caries only information about n and k.*

On a particular bus successive messages are for increasing n.
When the log value is zero the message may be omitted (and usually is).
*The messages on a particular bus are for some factory fixed set of primes.
The messages on a particular bus are for some factory fixed number modulo 2 ^{J−2}*
Design goal: Most buses carry such a message each cycle but there will be bubbles on the buses.

Buses carry messages in only one direction. The n’s of successive messages on a bus are strictly monotonic. Most buses have a box as described below at either end.

Data flow from boxes each specializing in different small ranges of primes to boxes specializing in sets of numbers partitioned by their particular residue modulo some power of 2.

Boxes also retain a brief history of the sources of messages so as to determine the ancestry of messages with large k. Access to this history is not described yet. The box keeps a record of recent decisions sufficient to recall how each message was produced while effects of that message remain in the system.

While the system has a shared clock there will be a latency due to buffers at the input to boxes that accommodate bursty traffic. This may result in a skew of several hundred cycles. When a valuable result is found we stop the system clock and probe the memories of the relevant boxes to identify the contributors. I don't know how many results are wanted and this stopping may be too expensive.

As a message arrives at a box it is placed in one of 4 buffers depending on bits J and J+1 of n. Each such buffer may have up to 4 messages placed there in each cycle.

Shamir’s paper segregates primes into three size categories. I suspect that this is necessary.

I feel an itch to write some code to determine some of these parameters more closely.