Managing Coroutines

I use “coroutine” to refer to a class of solutions to problems where the fundamental subroutine facility of the platform is employed symmetrically to better solve a class of problems. The first language, other than assembler, that I know that could do this sort of coroutines is Scheme. Very recently the GO language expresses coroutines.

We describe here the first stage of a coroutine technology for the Intel x86-64 bit environment. This technology is in the same category as the compiler. It depends greatly on the ABI of the platform as does the compiler. It is expressed in assembler because the necessary function cannot be expressed in C. This particular version is debugged on Mac OS and clang. It is not much technology however.

The Scene

There are two bodies of C code known here only as left and right. There are call sites in the respective bodies to PutGetL and PutGetR. In general one of these bodies is running and the other is waiting at one of its PutGetx call sites. As the running body reaches a PutGetx(z) call site its argument z becomes the value of the PutGetx'(z') expression in the other body; that body then proceeds with value z while the first awaits a value. A value of type vR is transferred as left calls PutGetL and thus delivers that value as the value of PutGetR where right is waiting for such a value—and conversely.

The technology is all in the 30 machine instructions of PutGet.s. The header file x.h is universal for now. The header file defines the types vL of the messages flowing to the left and vR for the messages flowing to the right. Perhaps surprisingly the machine language code is not very sensitive to these types but for now we will be inflexible. We assume here that left launches right by calling begin which actually returns the argument provided by the right body in its first invocation of PutGetR. Calling begin(x) creates a new stack and then calls startR(x) which should be provided by the technology user in the form of an entry to a subroutine which initiates the right body of the coroutine.

Mac Demo

As the most trivial demo we supply the following: main.c assumes the rôle of left and calls begin. R.c defines startR and holds the right body in one line of C code. Now:
clang main.c R.c PutGet.s -Wmost -O3
./a.out
works for me.

Linux Demo

gcc main.c R.c PutGetLin.s -O3 -Wall -std=c99
./a.out
works for me on Linux.

Quitting, a semantic quandary

It is unclear how to reclaim the stack space when done. Typically one side discovers that there is no more to do and must somehow signal this to the other side. Sometimes a special reserved value can serve to signal. A shared global variable will do. It is tempting to allow the right body to return from the startR function which could lead to recovery of the stack space, but that leaves the left body in an awkward state. Just now I do not solve this problem.

Perhaps the side that called begin should be responsible for disbanding its partner. Perhaps the partner has some cleanup to do. (??) A common case is for the partner to be accepting no values. It is an awkward modification to take a value and test it on each transfer.

Modifying vL and vR

The machine code for PutGetX is fairly insensitive to the types of the arguments. It depends on the ABI. Just now PutGetL is identical to PutGetR and the code is not duplicated. The type int is 32 bits but I think that any 64 bit type works as well. In fact the x86 ABI allows a double floating value to be passed without changing the machine code. Other CPU architectures, even other ABI's might not work so well.

Loose Ends

Reclaim the extra stack when done.

No guidance for multiple coroutines in one program (But see this.)

Perhaps first argument to PutGetR should not appear as begin’s value.

No stack overflow protection.

No test for malloc failure.


Notes to myself as I debugged this.
What I forget between my bouts with gdb.
Here is another discussion of the problem along with several ugly solutions. My solution is ugly in its own way.
Wikipedia discusses solutions such as this under “stack-based coroutines” here.
I do not understand all of this.
this may be worth attention.
I need to understand this for Haskell.