There are some applications where the checkpoint-restart
facility is not adequate. A transaction-oriented system
will want to accept a transaction to update a data base
and at some point give the user a positive acknowledgement
that the transaction will be remembered. The application
cannot normally afford to wait until the next system-wide
checkpoint to give the acknowledgement.
The following argument shows that in such a system, transactions
that change the data base {"write transactions"} must be
idempotent; that is, doing one twice has the same effect as doing it once.
Suppose a user submits a write transaction and Gnosis
crashes before he gets any acknowledgement. He knows Gnosis
has crashed because his Tymnet circuit is zapped. {It will
at least be zapped when Gnosis restarts.} He cannot know
whether Gnosis has committed to remember the transaction.
Gnosis may have crashed just before the transaction reached
it from Tymnet, or it may have crashed just after the acknowledgement
was sent to Tymnet {and the acknowledgement was lost by
the circuit being zapped}. Therefore he must resubmit the
transaction when Gnosis comes up. Because Gnosis may already have remembered
the transaction, the transaction must be idempotent.
Transactions that are not idempotent can usually be
made so by a simple expedient. For example, "transfer $100
from account A to account B" is not idempotent. But "transfer
$100 from account A to account B associating unique transaction
number N with this transaction, unless number N has been
used before" is.
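The transaction-number expedient can be sketched in Python as follows. This is only an illustration: the `Bank` class, its `used_numbers` set, and the account names are hypothetical and are not part of Gnosis. A real system would also have to keep the set of used numbers in storage that survives a crash and update it atomically with the balances.

```python
class Bank:
    """Toy account store that remembers which transaction numbers it has applied."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.used_numbers = set()  # hypothetical record of numbers already used

    def transfer(self, number, source, target, amount):
        """Apply the transfer unless `number` has been used before.

        Returns True if the transfer was applied, False if it was a duplicate.
        """
        if number in self.used_numbers:
            return False  # resubmission: already remembered, do nothing
        self.balances[source] -= amount
        self.balances[target] += amount
        self.used_numbers.add(number)
        return True


bank = Bank({"A": 500, "B": 0})
first = bank.transfer(17, "A", "B", 100)   # original submission
second = bank.transfer(17, "A", "B", 100)  # resubmission after a crash
```

Submitting transaction 17 a second time leaves the balances unchanged, so the user can safely resubmit whenever he is unsure whether the first attempt was remembered.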
See also:
Gray, J. Notes on operating systems. Report RJ 3120,
IBM Res. Ctr., San Jose, Calif., Oct. 1978. "A definitive
report of locking and recovery in a database system."
Lampson, B., and Sturgis, H. Crash recovery in a distributed
system. Xerox Res. Ctr., Palo Alto, Calif., 1976 {working
paper}.
Such applications will have to store a record
of recent transactions in some non-volatile storage. We
call that record a journal and we call the procedure (_journalizing).
One way to obtain such storage is to send the information to
a different computer over a network. The storage in the
other computer may be volatile, but if the failures of the
two computers are uncorrelated and infrequent, the storage
is essentially non-volatile.
There is a proposal to allow pages in Gnosis to hold
information which is less volatile than normal.
It is significant that only data, not keys, are stored.
To support journalizing
we provide a special (_journal page).
Only the kernel can write into the journal page.
At restart, the kernel sets the values specified below before any processes run.
Specifications:
0-7: The TOD {time-of-day clock in standard epoch} of
the most recent checkpoint. {This field is updated after
every checkpoint.}
8-15: The TOD when the checkpoint was taken from which the
system was restarted.
16-23: The TOD at the last restart.
The rest of the journal page is currently zero but may be
redefined.
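As an illustration of the layout above, the following Python sketch decodes the three fields from a copy of the journal page. It assumes that the offsets are byte offsets, that each TOD is an unsigned 64-bit big-endian integer, and that a page is 4096 bytes; none of these is stated in this section. The names `JournalPage` and `parse_journal_page` are hypothetical.

```python
import struct
from typing import NamedTuple

PAGE_SIZE = 4096  # assumed page size; the text does not state it


class JournalPage(NamedTuple):
    last_checkpoint_tod: int     # bytes 0-7: TOD of the most recent checkpoint
    restart_checkpoint_tod: int  # bytes 8-15: TOD of the checkpoint restarted from
    last_restart_tod: int        # bytes 16-23: TOD at the last restart


def parse_journal_page(page: bytes) -> JournalPage:
    """Decode the three defined fields; the rest of the page is ignored."""
    if len(page) < 24:
        raise ValueError("journal page image is too short")
    # ">QQQ" = three big-endian unsigned 64-bit integers (an assumption)
    return JournalPage(*struct.unpack_from(">QQQ", page, 0))


# A made-up page image: restarted at 250 from the checkpoint taken at 200,
# and checkpointed again at 300. The remainder of the page is zero.
page = struct.pack(">QQQ", 300, 200, 250) + bytes(PAGE_SIZE - 24)
fields = parse_journal_page(page)
```

A journalizing program needs only the third field: it changes at every restart, so comparing it with a saved copy reveals that a restart has occurred.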
GNOSIS MACLIB member JOURNALP is a macro which defines a
DSECT for the journal page.
See (p2,jp) for how to obtain
the page key to the journal page.
Here is an example of journalizing, written in Algol68.
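Sema mutex = Level 1;
Int local restart count := 0;
Ref Int restart count = locations 16 to 23 in journal page;
Flex [0:] Transaction nonvolatile storage;
Int next serial number := 0;
Proc process a transaction (Transaction transaction) = Acknowledgement:
(
   Down mutex; # only one process at a time here #
   While local restart count < restart count Do
      # replay transactions since the restart #
      local restart count := restart count;
      While Upb(nonvolatile storage) >= next serial number Do
         # loop body missing from the source; assumed reconstruction #
         update database(nonvolatile storage[next serial number]);
         next serial number +:= 1
      Od
   Od;
   If modifies database(transaction) Then
      # branch body missing from the source; assumed reconstruction #
      nonvolatile storage[next serial number] := transaction;
      update database(transaction);
      next serial number +:= 1
   Fi;
   Acknowledgement a = read from database(transaction);
   Up mutex;
   a )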
# Program Notes:
Both read and write transactions must go through this
procedure. An Acknowledgement includes any data read.
The acknowledgement returned will be sent to the user through
Tymnet; if the Tymnet circuit is zapped, the acknowledgement
should be discarded {it may be incorrect}. #
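For readers who want to experiment, here is a rough Python simulation of a journalizing procedure of this kind, including a simulated crash and restart. It is a sketch under several assumptions: a Python list stands in for the non-volatile storage, a dictionary stands in for the journal page, a deep copy stands in for a system-wide checkpoint, and write transactions have the idempotent form "set key to value". The class and all its names are hypothetical.

```python
import copy
import threading


class Journalizer:
    """Hypothetical stand-in for a journalizing procedure."""

    def __init__(self, journal, read_restart_tod):
        self.mutex = threading.Lock()
        self.journal = journal                    # survives a crash (non-volatile)
        self.read_restart_tod = read_restart_tod  # reads bytes 16-23 of the journal page
        # Everything in `state` is ordinary checkpointed storage: a restart
        # rolls it back to the last checkpoint.
        self.state = {"database": {}, "next_serial": 0, "local_restart_tod": 0}

    def _update_database(self, txn):
        _, key, value = txn
        self.state["database"][key] = value  # "set key to value" is idempotent

    def process(self, txn):
        with self.mutex:  # only one process at a time here
            s = self.state
            while s["local_restart_tod"] < self.read_restart_tod():
                # A restart has happened: replay journaled transactions
                # that the restored checkpoint does not include.
                s["local_restart_tod"] = self.read_restart_tod()
                while s["next_serial"] < len(self.journal):
                    self._update_database(self.journal[s["next_serial"]])
                    s["next_serial"] += 1
            if txn[0] == "write":
                self.journal.append(txn)  # journal first, then apply
                self._update_database(txn)
                s["next_serial"] += 1
            return s["database"].get(txn[1])  # the acknowledgement


journal_page = {"restart_tod": 1}
journal = []
j = Journalizer(journal, lambda: journal_page["restart_tod"])

ack1 = j.process(("write", "x", 1))
checkpoint = copy.deepcopy(j.state)  # system-wide checkpoint
ack2 = j.process(("write", "x", 2))  # acknowledged after the checkpoint

j.state = copy.deepcopy(checkpoint)  # crash: checkpointed state rolls back
journal_page["restart_tod"] = 2      # kernel records the new restart time
after_restart = j.process(("read", "x"))
```

The write of 2 was acknowledged after the checkpoint, so the rollback loses it from the database, but the first transaction after the restart replays it from the journal before reading.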
Exercise: Prove this works. {Knuth would rate this
about M37.}
See (p3,lc) for some speculative
ideas.