Below, “ACK” stands for “acknowledgment”.
Here is a list of the per-interface data structures that I foresee—presuming Format D.
“The fiber” below refers either to the incoming fiber for the interface, or the outgoing fiber.
An “Input indexed” or “Output indexed” ring means successive entries of the ring correspond to successive packets as they respectively arrive or leave on the interface.
Indexed rings have entries of uniform size, so an index locates an entry directly.
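As a minimal sketch of what "indexed" means here, the following models a ring whose slots are addressed by a running packet index. The class name `IndexedRing`, the `capacity` parameter, and the method names are hypothetical, chosen only for illustration; the real rings live in memory shared with hardware.

```python
class IndexedRing:
    """Fixed-capacity ring addressed by a running packet index.

    Hypothetical illustration: names and layout are assumptions,
    not part of the design notes.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity   # uniform-size entries
        self.next_index = 0              # index the next appended entry gets

    def append(self, entry):
        """Store the entry for the next packet and return its index."""
        index = self.next_index
        self.slots[index % self.capacity] = entry
        self.next_index += 1
        return index

    def get(self, index):
        """Return the entry for a packet index still held in the ring."""
        oldest = self.next_index - self.capacity
        if not (oldest <= index < self.next_index):
            raise IndexError("index no longer (or not yet) in the ring")
        return self.slots[index % self.capacity]


ring = IndexedRing(capacity=4)
for name in ["p0", "p1", "p2", "p3", "p4"]:
    ring.append(name)
latest = ring.get(4)   # packet 4 reuses the slot that packet 0 occupied
```

Successive packets get successive indexes, and an old index stops being valid once its slot has been reused.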
- IPR: Input payload ring
- Sized to hold 2 or 3 fibers' worth of data; receives successive payloads from the fiber.
For most packets and most output fibers, packets stay here until downstream ACK.
Some bit in IIB is the official indication of a payload being allocated.
- IHR: Input header ring
- receives successive packet headers from the fiber, promptly consumed by software without fiber latencies.
- OHR: Output header ring
- Built by software, 1 by 1, as input headers are consumed.
Also holds occasional syndrome commands.
Read by hardware to fetch data to be transmitted on the fiber.
- IIOHLR: Input indexed ring of locators of output headers
- The jth entry locates the output header, in the output header ring of another interface, by which the jth input packet is scheduled to be transmitted. This is needed to drive the sweep up of the input payload ring.
Used to process NACKs from downstream to find payloads that must be retransmitted.
- IIB: Input indexed busy bit ring parallel to IIOHLR,
- used to reclaim space in Input payload ring upon downstream ACKs
- ELR: ring of syndrome command locators.
- Locates commands to send syndrome in output header ring.
Used to process NACKs.
ELR entry also records output index.
Indexed by syndrome serial number.
- OIL: Output Indexed Locator
- Output indexed ring of bit pointers into IIBs—maps output indexes to input (interface num and index).
Used to process ACKs and NACKs from downstream.
The only rings that the hardware knows are IPR and IHR which it writes, and OHR which it reads.
Sweep Up
A critical issue is end of life for a payload in the IPR.
Liveness is recorded in IIB which, being an array of bits, can be efficiently searched for stragglers.
- The happy termination of a payload in IPR is an ACK from a downstream node.
Such ACKs arrive in batches, carried by error syndrome reports from downstream.
The syndrome index in that report locates an entry in ELR.
That in turn determines a span of output indexes which are used to deallocate payloads (bit in an IIB) via OIL.
- A NACK is more complex since the payloads must be located.
The NACK provides a syndrome index which locates a syndrome command (via ELR) which locates the first output header in OHR for the packets that must be retransmitted.
Janus moves the payloads out of the way of new incoming payloads.
- One design would be to declare a NACK when the timeout for an ACK from downstream has expired, provided that the timeout allows sufficient time for the above contingency to be employed.
If that is not feasible then a scan of the IIB can find and move the stragglers.
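The ACK and NACK paths above can be sketched as follows. This is an illustration under stated assumptions, not the design itself: the function names are hypothetical, the rings are modeled as plain Python lists without wraparound, and syndrome s is assumed to cover the output indexes from the one recorded for syndrome s-1 up to (not including) the one recorded for syndrome s.

```python
def syndrome_span(syndrome_index, elr):
    """Output indexes covered by one syndrome report.

    elr[s] is the output index recorded when syndrome command s was queued.
    Assumption: the span starts where the previous syndrome's span ended.
    """
    first = elr[syndrome_index - 1] if syndrome_index > 0 else 0
    return range(first, elr[syndrome_index])


def process_ack(syndrome_index, elr, oil, iibs):
    """Clear the busy bit of every payload covered by an ACKed syndrome.

    oil[k]  : (input interface number, input index) for output index k
    iibs[i] : list of busy bits for input interface i
    """
    freed = []
    for out_index in syndrome_span(syndrome_index, elr):
        interface, in_index = oil[out_index]
        iibs[interface][in_index] = 0
        freed.append((interface, in_index))
    return freed


def process_nack(syndrome_index, elr, oil):
    """Locate, without freeing, the payloads that must be retransmitted."""
    return [oil[k] for k in syndrome_span(syndrome_index, elr)]


elr = [2, 5]                                      # two syndrome commands
oil = [(0, 0), (1, 0), (0, 1), (0, 2), (1, 1)]    # five output packets
iibs = {0: [1, 1, 1, 1], 1: [1, 1, 1]}
freed = process_ack(1, elr, oil, iibs)            # frees output indexes 2..4
```

The NACK path leaves the busy bits set, since the payloads are still needed.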
One activity runs far enough ahead of the input payload cursor (IPR:WriterE) to make way for new packets.
For the input payload ring (IPR) of each particular interface we frequently check whether the write cursor (IPR:WriterE) of that ring is too close to IPR:ReaderB, that is, whether (ReaderB − WriterE) is too small.
If so we consult IIB for bits indicating straggling payloads.
We build (or augment) a pseudo output header ring which directs the DRAM mover to move the payload into cold storage.
When an interrupt or poll tells us that this move is done we increment ReaderB.
The input from the fiber will not overwrite our data, because it consults ReaderB.
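A minimal sketch of the check described above, assuming cursors are byte offsets into a ring of `ring_size` bytes and the IIB is held as an integer bit mask. The names `too_close`, `set_bits`, `stragglers_to_move`, and the `margin` threshold are hypothetical. The sketch scans the whole mask rather than only the region near ReaderB, and it treats equal cursors as "no room", which the notes do not specify.

```python
def set_bits(word):
    """Yield the positions of the 1 bits in word, lowest first."""
    while word:
        low = word & -word            # isolate the lowest set bit
        yield low.bit_length() - 1
        word ^= low                   # clear that bit


def too_close(reader_b, writer_e, ring_size, margin):
    """True if (ReaderB - WriterE), taken modulo the ring size, is too small."""
    return (reader_b - writer_e) % ring_size < margin


def stragglers_to_move(reader_b, writer_e, ring_size, margin, iib_word):
    """Input indexes of busy payloads to hand to the DRAM mover, if any."""
    if not too_close(reader_b, writer_e, ring_size, margin):
        return []
    return list(set_bits(iib_word))


moves = stragglers_to_move(reader_b=100, writer_e=1000, ring_size=1024,
                           margin=256, iib_word=0b100100)
```

Here the writer is 124 bytes short of ReaderB, below the 256-byte margin, so the two busy payloads (input indexes 2 and 5) are selected for moving.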
There is the issue of the output header in the OHR, which will inform, or already has informed, the hardware of the location of the payload that we moved.
Here is one solution.
In a transaction that may or may not be feasible:
- stop the world.
- see if OHR:ReaderE has reached the output header of the moved payload.
- If so
- declare train wreck, which maintains integrity and does not lose packets, but causes retransmission because of inadequate storage planning.
- If not
- then
- change OHR:WriterB back(SIC) to protect the header, if that would not violate ReaderE < WriterB. (otherwise declare train wreck, or argue that the transmission hardware will keep ahead of the receiving hardware.
I know no simple check that this likely event has happened.)
- Change address of payload in output header,
- change OHR:WriterB back to where you found it.
Even in the good case, the input hardware may drop bits on the floor as it runs into our new boundary at OHR:WriterB.
- start the world.
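The transaction above might look like the following sketch. It assumes cursors are monotonically increasing header indexes (no wraparound), that a header is "reached" once `reader_e` exceeds its index, and that headers are dicts with a `payload_addr` field; all of these, and the names `OutputHeaderRing` and `relocate_payload`, are hypothetical. Stopping and starting the world is not modeled.

```python
from dataclasses import dataclass


@dataclass
class OutputHeaderRing:
    headers: list      # each header is a dict with a "payload_addr" key
    reader_e: int      # assumed: hardware has consumed headers below this index
    writer_b: int      # assumed: hardware may read headers below this index


def relocate_payload(ohr, header_index, new_addr):
    """Point one output header at a moved payload, or report a train wreck."""
    # Has the hardware already reached the header of the moved payload?
    if ohr.reader_e > header_index:
        return "train wreck"
    # Pulling WriterB back to the header must keep ReaderE < WriterB.
    if not ohr.reader_e < header_index:
        return "train wreck"
    saved_writer_b = ohr.writer_b
    ohr.writer_b = header_index                    # protect the header
    ohr.headers[header_index]["payload_addr"] = new_addr
    ohr.writer_b = saved_writer_b                  # restore the boundary
    return "patched"


ohr = OutputHeaderRing(
    headers=[{"payload_addr": 10}, {"payload_addr": 20}, {"payload_addr": 30}],
    reader_e=0,
    writer_b=3,
)
result = relocate_payload(ohr, header_index=2, new_addr=99)
```

In the sketch both failure cases return the same result; the notes leave open whether the second case could instead be argued away.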
Train wrecks are bad because they are produced by excess load and negatively impact throughput, in a sort of conflagration or chain reaction.
Presumably you buy more capacity or raise prices when this happens.
Here is a list of the activities that I foresee:
- Examining input headers and creating corresponding output headers.
- Moving stragglers from input payload rings after most have been sent and ACKed.
This reads and resets the Input busy bits.
Perhaps this includes packets launched on long fibers with high latency ACKs.
- Moving stragglers from input header rings.
Perhaps this is unified with the former.
The most primitive pressure is to move the input payloads out of the way of newer payloads.
Such a payload lives until:
- it is ACKed,
- it is marshaled to depart on a long fiber,
- retransmission is required.
The last two cases require moving the payload out of the input stream area.
Long outgoing fibers are detected upon examining the input header.
Moving the payload may be done with asynchronous hardware, in essence with a pair of interfaces with no intervening fiber thus making all fibers short.
We learn or require retransmission either as a result of timeout or NACK.
Scenarios for a datagram passing thru a node: (“A:” for amortized)
1. Hardware deposits header in input header ring.
2. Hardware deposits payload in payload ring.
3. A: Hardware deposits first error syndrome after our packet in header ring.
4. A: Hardware updates input header ring cursor.
5. A: Program verifies error syndrome.
6. Program visits header and copies it to output header ring.
7. A: Program puts ACK in output header for upstream node.
8. A: Hardware transmits ACK.
9. Hardware reads and transmits output header.
10. Hardware transmits payload.
11. A: Hardware deposits ACK from downstream.
12. A: Program sees ACK and learns span of output serial numbers of packets on outgoing link.
13. Program passes over output headers in OIL between error commands and turns off bit in IIB.
At step 12 there are the following data to be freed:
- payload in input payload ring,
- header in output header ring.
The ACK from downstream identifies the syndrome number which
a block of output packets by an ACK count which indexes into the ‘ring of first output header ring offsets’.
The logic that placed a “send error syndrome” in the output header ring recorded the cursor for the ring at that time.
The output header for the packet
Do we need the ‘Output indexed ring header locator’?
Perhaps to mop up the output headers.
Kaput?
There is a 1-1 correspondence between output headers and output header locators.
They are in the same order. The output header includes the index of the locator and the locator includes the ring offset of the header. (We need better names!)
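The mutual references between headers and locators can be sketched as below. The field names `locator_index` and `header_offset`, the function names, and the use of a dict keyed by ring offset are all assumptions made for illustration.

```python
def queue_output(headers, locators, header_offset, payload_addr):
    """Create an output header and its locator, each pointing at the other."""
    locator_index = len(locators)
    headers[header_offset] = {
        "locator_index": locator_index,   # header -> locator
        "payload_addr": payload_addr,
    }
    locators.append({"header_offset": header_offset})   # locator -> header
    return locator_index


def links_consistent(headers, locators):
    """Check that every locator and its header refer to each other."""
    return all(
        headers[loc["header_offset"]]["locator_index"] == i
        for i, loc in enumerate(locators)
    )


headers = {}     # keyed by ring offset; headers may differ in size
locators = []    # uniform entries, in the same order as the headers
first = queue_output(headers, locators, header_offset=0, payload_addr=4096)
second = queue_output(headers, locators, header_offset=48, payload_addr=8192)
ok = links_consistent(headers, locators)
```

Because the two sequences are created together and in the same order, either one can be walked to find the other.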
Output busy bit has same index as output header locator.
The output header includes the index into the input busy bit ring.
OIB: Output indexed busy bit ring
not needed ?