Data Abstraction

Procedural abstraction is nearly as old as user-defined subroutines in programming languages. The procedure's implementation is hidden from its caller, while its behavior is accessible to the degree to which the caller probes it. Data abstraction is more recent; it occurs when the caller passes, as an argument, a data structure whose internal form is accessible only to the procedure.

Data abstraction is a platform architecture feature or language feature that limits the code that can directly access the representation of a particular collection of data. The code to which such access is limited is called the custodial code here. Static abstraction is generally (always?) a language feature and has little runtime cost. C++ is perhaps the best-known language providing static abstraction: a reference to an instance of a class may not suffice to access a field in that instance, except from code within the definition of the class. Of course C++ is derived from C, which fails to enforce the protection rules that the C and C++ types seem designed to provide.
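As a rough illustration of the paragraph above, here is a minimal C++ sketch. The class Counter and the function poke_representation are hypothetical names invented for this example. The member functions of Counter play the role of custodial code; the compiler rejects other code that names the private field, yet a cast inherited from C reaches the representation anyway.

```cpp
#include <cassert>

// Custodial code: only the member functions of Counter may name count_.
class Counter {
    int count_ = 0;  // the representation, private by default in a class
public:
    void increment() { ++count_; }
    int value() const { return count_; }
};

// Non-custodial code. Writing `c.count_ = n;` here would not compile,
// because count_ is private. The check is purely static, though:
// Counter is a standard-layout class, so a pointer to the object can be
// converted to a pointer to its first (and only) data member.
inline void poke_representation(Counter& c, int n) {
    *reinterpret_cast<int*>(&c) = n;  // bypasses the access rule
}

// Small demonstration of both behaviors.
inline void demo_static_abstraction() {
    Counter c;
    c.increment();
    assert(c.value() == 1);   // reached through the custodial code
    poke_representation(c, 7);
    assert(c.value() == 7);   // representation changed from outside
}
```

The access rule costs nothing at run time because nothing checks it at run time, which is also why the cast succeeds.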

Dynamic data abstraction is provided today mainly by kernels, using hardware features originally designed to allow kernels to protect clients from each other and to remain in control when client code misbehaves. With the ideas of Morris, languages can also provide dynamic abstraction. Just now I know of no languages that support Morris's ideas. Stiegler's mechanism serves here in some cases. The Keykos platform uses ideas closely related to those of Morris.
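One way to picture dynamic abstraction in a language is a seal and unseal pair created at run time, which is my reading of the kind of mechanism the Morris reference points to; this is an assumption, not something the text above spells out. The sketch below is only an illustration: Brand and Sealed are hypothetical names, the payload is fixed to a string for brevity, and since this is C++ the box is protected only by the compiler's access checks, not by any runtime enforcement.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>

class Brand;

// An opaque box. Holders can copy it and pass it around, but only a
// Brand can look inside (Brand is the sole friend).
class Sealed {
    friend class Brand;
    std::shared_ptr<int> tag_;   // identity of the brand that sealed it
    std::string payload_;
    Sealed(std::shared_ptr<int> tag, std::string payload)
        : tag_(std::move(tag)), payload_(std::move(payload)) {}
};

// Each Brand constructed at run time gets a fresh identity (a new heap
// cell). Copies of a Brand share that identity, so handing out a copy
// shares the custodial authority.
class Brand {
    std::shared_ptr<int> tag_ = std::make_shared<int>(0);
public:
    Sealed seal(std::string payload) const {
        return Sealed(tag_, std::move(payload));
    }
    // Yields the payload only if this brand sealed the box.
    std::optional<std::string> unseal(const Sealed& box) const {
        if (box.tag_ != tag_) return std::nullopt;
        return box.payload_;
    }
};

// Small demonstration: two abstractions that arise at run time.
inline void demo_dynamic_abstraction() {
    Brand mine, other;
    Sealed box = mine.seal("secret");
    assert(mine.unseal(box).has_value());
    assert(!other.unseal(box).has_value());
}
```

The comparison in unseal is the runtime cost: every access to the representation checks which brand sealed the box, and new brands can be created for as long as the program runs.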

Data abstraction is motivated on several grounds, usually just one at a time:

This feature requires formally delimiting the custodial code to the platform, and this designation must itself conform to the relevant protocols.

Dynamic data abstraction has runtime costs. It is dynamic in two senses:

The Keykos brand provides dynamic abstraction with new abstractions arising at run time, à la Morris. Some computer languages provide static abstractions fixed at compile time. Some hardware architectures have made dynamic abstraction quite low cost, whereas conventional hardware requires a trip through the privileged code. Integrated Development Environments, to my knowledge, do not support the second purpose of abstraction via language features.

Synergy goes beyond abstraction to protect data from access outside custodial code even when that data is held by agents outside that code. Abstraction goes beyond synergy by providing some verifiable type information.

Nexus on data abstraction