Many years ago I read some papers on the architecture of the phone company’s ESS systems (before SS7, I think). I was impressed that they had a theory of resets that helped organize ideas on how to build systems with high availability, even amid functional upgrades. I cannot now find those documents, but I will recount here some of the ideas that led us to adopt persistence for the Keykos design.

The ESS had several reset levels, perhaps four or five. They were in a partial ordering, even a simple ordering in the cases they explained. I do not recall them in detail but here are a few:

I recall that they used terms such as “level three reset”, but Google is ignorant of that phrase. Low-numbered resets were mild. Such terms came from hardware engineering. Resets could be triggered by various means, but after installation the most severe resets were not expected within the lifetime of a typical installation.

Application to Keykos

I took from their theory of resets the idea that application upgrades should reset what is necessary for the task at hand, and no more. Most systems today have a meagre set of resets: These are crude tools, special forms of reset. Keykos meters can stop portions of the system but retain the option to continue. Which portions can be controlled in this way is itself part of the application design, just as in the design of digital hardware.
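
The meter idea can be illustrated with a toy model in Python. This is only a sketch and not Keykos code: the names Meter, Task, stop and resume are invented for the illustration, and the budget is not consumed as a real meter's would be. It assumes that meters form a tree and that a task runs only while every meter between it and the root holds a positive budget, so zeroing one meter halts just the subtree beneath it while leaving all state intact.

```python
class Meter:
    """Toy meter (hypothetical): a tree node holding a budget of time units."""

    def __init__(self, budget, parent=None):
        self.budget = budget
        self.parent = parent
        self._saved = 0

    def stop(self):
        # Remember the budget and zero it; nothing beneath this meter can run.
        if self.budget > 0:
            self._saved = self.budget
            self.budget = 0

    def resume(self):
        # Restore the remembered budget; work continues where it left off.
        if self.budget == 0:
            self.budget = self._saved

    def can_run(self):
        # Runnable only if every meter up to the root has budget left.
        meter = self
        while meter is not None:
            if meter.budget <= 0:
                return False
            meter = meter.parent
        return True


class Task:
    """Toy unit of work whose state survives being stopped."""

    def __init__(self, meter):
        self.meter = meter
        self.steps = 0

    def step(self):
        if not self.meter.can_run():
            return False
        self.steps += 1
        return True


root = Meter(budget=100)
billing = Meter(budget=10, parent=root)
reports = Meter(budget=10, parent=root)
billing_task = Task(billing)
report_task = Task(reports)

billing_task.step()                  # runs normally
billing.stop()                       # halt only the billing subtree
stopped = billing_task.step()        # refused while the meter is zero
still_running = report_task.step()   # the other subtree is unaffected
billing.resume()
resumed = billing_task.step()        # continues with its state intact
```

Which tasks hang under which meter is the design decision: the tree decides what can be stopped together and what keeps running.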

The space bank provides means to reclaim space when the application that allocated it goes berserk.
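
A similar toy sketch shows the reclamation idea. Again this is not the Keykos interface: SpaceBank, allocate and destroy are names made up for the example, and pages are modelled as plain integers in a shared free pool. The assumption is that every allocation is recorded against the bank that made it, so destroying the bank returns all of that space without any cooperation from the application.

```python
class SpaceBank:
    """Toy space bank (hypothetical): tracks the pages it has handed out."""

    def __init__(self, pool, parent=None):
        self.pool = pool          # shared set of free page numbers
        self.owned = set()        # pages allocated through this bank
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def allocate(self):
        # Take a free page and record it against this bank.
        page = self.pool.pop()
        self.owned.add(page)
        return page

    def destroy(self):
        # Reclaim everything allocated from this bank and its sub-banks.
        reclaimed = 0
        for child in self.children:
            reclaimed += child.destroy()
        self.children.clear()
        reclaimed += len(self.owned)
        self.pool.update(self.owned)
        self.owned.clear()
        return reclaimed


free_pages = set(range(8))
system_bank = SpaceBank(free_pages)
app_bank = SpaceBank(free_pages, parent=system_bank)

for _ in range(5):                # the application allocates freely
    app_bank.allocate()
pages_left = len(free_pages)

reclaimed = app_bank.destroy()    # the holder of the bank takes it all back
pages_after = len(free_pages)
```

The application never needs to be asked to release anything; whoever holds the bank can recover the space.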

In practice we found our long-lived abstracted states quite robust and soon forgot our fear of state rot in old objects. We had the good fortune to be running on hardware with a mean time to failure of perhaps a year (it was a mainframe) and well-duplexed disk storage. We almost never resorted to a tape checkpoint except to assure ourselves that it was possible. IBM’s VSAM was indeed an abstracted file system that maintained order with early versions of balanced tree mechanisms. I never even heard misgivings that the data was not out in plain view.