Efficiency of Hashed Orders

Efficiency of Dispatching

If the hash is moderately white then bits from the hash can index into an array of heads of chains of OCPs (Order Control Points). I imagine these OCP's in a memory array called the dispatch table, but do not require it here. The OCP might contain:

Next, Link to the next OCP in chain.
Code, Address of code that implements this order.
CHM, Mask for co hash,
CH, Co hash

The dispatch code might read as follows:OCP * ht[1 << hts]; ... OCP * ocpp = ht[(oc>>n)&((1 << hts) -1)]; while(ocpp){ui32 cv = oc ^ ocpp -> CH; if(! (cv & ocpp -> CHM)) {(*occp->Code)(cv); redispatch;} ocpp = ocpp ->Next; } order code is invalid.

This slightly obscure code requires some explanation. hts is a small integer known at compile time, KCPP time even. ht is the array of chain heads. ocpp is the cursor that walks down one of these chains. n is a tuning parameter, known at compile time and private to the object, that can be used to improve the hash efficiency in light of the symbolic order codes. hts may be tuned along with n. cv plays a dual role. If this OCP is the intended one, then cv will have just a few low order bits turned on which are passed to the routine to perform the order. If the order is "Fetch" for a node, then all bits to the left of the right 4 bits will be zero, and those four bits are passed to the routine to select the slot. In this case CHM will be 0xfffffff0. CHM = -1 for singleton orders. This makes the curious assumption that the order to fetch slot j for a node is H_n("Fetch") ^ j. I shall argue that this is a good plan.

I think that this might amount to about 10 instructions and 2 data cache misses in the commonest case where the first OCP is the correct one. If the dispatching code is in a page shared between objects then there are likely to be no cache misses for the instructions.

Second thoughts on this data structure

Efficient Type Checking

The subscript n in the previous equation was to note that the hash depends on the type of the invoked object. H_n("Fetch") might be taken as H("Node__Fetch") or as H(H("Node"), "Fetch"). I prefer the latter but cannot come up with a good reason. Perhaps a conventional call on an object can provide its type hash more conveniently.

Hash Collisions

Perhaps we should use a good hash like SHA1, just because it is so easy. Even a poor hash will serve pretty well however. A 32 bit hash can be had as merely the low 32 bits of a strong hash. If an object were to have 64K orders then occasional hash collisions would begin to happen. This is very bad design and even in this case modest renaming of orders solves the problem. If an object has 4K order clusters each of size 16 (Node__Fetch is such a cluster) then collisions begin to arise. I see ne reason to coerce all objects into this pattern. Indeed the conversion can be gradual.

Efficiency of Invocation Site

With a program such as KCPP going over the source before it gets to the compiler the array ht and OCP's can be automatically provided where the programmer merely specifies the symbolic order codes.

The call sites can be handled by a local transformation if we follow our current style of including the type of the invokee in symbolic form the order code for the invokee. A random 32 bit order code requires more infomarion than typical current order codes; the call site is likely to require about one extra instruction to hold the extra information. If something like header files are also available then compile time diagnostics are availbale. Lacking that only run time diagnostics are available. That is still much better than mysterious misbehavior due to misinterpretation of an order code delivered to an invokee of the wrong type. I think it is a good trade off.

The Command System

The command system can handle this easily. If any character string appears in the context of an order code, the hash function can be locally performed. Command files invoking these objects will certainly have symbolic order codes.

Really Polymorphic Orders

KT, KT+4, and perhaps a few more will remain unhashed, lest you need to know the type of an object to ask its type.