
Design Proposal

See (kernel-logic,scheduler) concerning current kernel behavior.

A Proposal

Assume that we associate a priority with a meter. A meter would have a manifest priority recorded as a data key in the meter {settable by anyone with a node key to the meter}. The real priority of a meter would be the lesser of its manifest priority and the real priority of the meter designated in slot 1. The real priority of the primordial meter is infinite.

There is a small set of priority values with a simple ordering among them. Perhaps two values suffice.

There is a CPU queue for each value. When time is sliced, the sliced domain is put on the queue indicated by the real priority of its meter. When a domain is to be selected to run, the first domain on the non-empty queue of highest priority is chosen. When a fork is done and the resulting new process has a different priority, the process with the higher priority is run and the other is put on the CPU queue indicated by its priority.

How it Solves the Problem

The Tymnet adaptor would have a meter of priority higher than most other programs. There is a set of domains P, called the performance kernel. If M is a meter of high real priority then either all keys to M are held in domains of P, or domains of P hold node keys to M {in order to throttle excessive use of M}.

Such a policy might be that no more than 50 ms be used on one of these meters in a 1 sec interval.

The programs of P have the authority necessary to ensure that the Tymnet adaptor gets sufficient time sufficiently often.

Bloom Style Charge Sets

This design is more like the simple charge sets (p2,soon-sets) than like the full charge sets (chrgset). The idea is to use Bloom filters to represent sets statistically.

The Scheme

Select a medium-quality hash function h from CDA numbers to integers between 0 and 2**15; i.e., from 24-bit numbers to 15-bit numbers.

Establish charge sets as they are now {for simple CS's} but allocate just one page to represent the set.

If a domain under set s references a page or node with CDA = c, then turn on bit h(c) in the page representing set s.

Otherwise the scheme is identical to the current simple charge sets.

The main advantages of Bloom filters here are:

They are even easier for the kernel to implement than simple charge sets {no page overflow!}.

They provide for high speed "set synergy tests".

The scheduler holds the charge set keys. Call the set of charge sets S.

The scheduler can read the set representations for these sets and, with a simple and fast routine, determine the size of the union of any of the various subsets of S. The scheduler can then select "compatible subsets" of S which will fit in core together {taking into consideration shared pages and nodes} and allocate a time for them to run.

Motivation for the Invalid Meter Trap

When the meter is invalid there is no source of resources for any activity; the one exception is that we cause the domain to jump to its domain keeper. Notifying the keeper that something is wrong is extremely useful. The alternative {which we once took} is to delete the process, which leads to something stopping and no one knowing why.

The keeper may also have an invalid meter, but the unchargeable activity cannot be infinite because of the following argument. If a keeper is not a gate, we don't call it. If it is a start key, the keeper becomes busy. If it is a resume key, the key is deleted. So with each jump, the number of available domains plus the number of resume keys decreases, but cannot go negative.

As an added precaution against unchargeable activities interfering with chargeable ones, the resourceless domain is put at the end of the CPU queue before it traps.

Problems with the Invalid Meter Trap

The kernel now causes a domain with a process and an invalid meter to trap. This is the only exception to the rule that a domain with an invalid meter can cause no actions.

Once the kernel caused no such traps. It was changed because invalid meters were a common symptom of programming bugs and the change made it easier to find such bugs.

One problem with the current design is described at (p3,badmet).

Another problem is that the current design interferes with the following strategy, which is otherwise the only solution to a problem that I describe here.

When a space bank guard is called to reclaim the material from which some object is built, the object may be running. As the guard deletes the pages and nodes of the object, the object may take actions that it would never have taken except for the interference of the guard.

If the design of the guard were changed slightly to include a meter node reference, the guard could "shut down power" by deleting the meter as its initial action and thus terminate the actions of the object before it had taken any undesigned actions.

This assumes that all of the domains of the object act under the indicated meter. This is at least frequently easy to arrange.

Possible Kernel Hooks for Scheduling

See (p3,admeter) for some older scheduler design.

The current kernel scheduler is extremely crude. Every 200 ms the running job is placed at the end of the CPU queue and then the first member of that queue is started {(slice)}. No domain is immune to time slicing.

The best that can be easily argued for this scheme is that it is easy to understand and that programs that use little time are unlikely to be sliced frequently. Its simplicity leads to low overhead.

The problem: Suppose domain D provides a quick public service and many domains call it frequently.

Is it worth paying some penalty in performance and complexity to avoid this case?

How serious is it that D will occasionally be sliced? Would the system indeed be better if D were not sliced?

If there were no external schedulers, CPU bound domains would share the CPU equally.

There are a number of considerations which militate toward other policies. Meters were invented, in part, to implement such policies. Can meters solve this problem also?

It seems most likely that the callers of D will be within some scheduling entities.

The mutual goals of the internal and external schedulers should be written down. They aren't yet {but see (scheduler,)}. I posit a few here for present purposes.

First Aspect: Use hardware efficiently. Arrange for maximal concurrency.

Adhere to externally specified policies on resource utilization by various applications.

Adhere to externally specified policies on responsiveness by certain applications.

Second Aspect: Support cohabitation of compatible applications.

We define "application" here:

A body of code and a set of domains and a requirement for CPU, core residence and disk I/O.

We define "compatible" here:

A set of applications is compatible if their total metered resources do not exceed those available on the entire system.

These ideas imply the scheduling of applications. We do not intend to deny the existence of decision support activity or problem solving activity {which includes program development}.

The Multiple Traps Caused by a Single Meter Exhaustion

The section (p3,meter-pain) describes the inconvenience associated with the fact that the exhaustion of a meter can cause several traps. I mention here the seed of an idea that might circumvent that problem.

Suppose that each of the three meter counters has a new state that is entered when exhaustion causes some domain to trap, and that a counter in that state will not cause further traps. This might take just a few commands in the kernel and save both complexity in schedulers and overhead. This state would be logical.

Releasing the Stall Queue

While a meter keeper is running due to exhausted resources, other domains under the same meter may try to run. These other domains will be placed on the stall queue of the keeper.

If the keeper replenishes the meter, then when the keeper returns to the resume key that he holds, the first member of this stall queue will be placed on the CPU queue. Two of the domains under this meter will thus be quickly restarted. The other members, however, will wait for the worrier. This seems to be an undesirable effect that is difficult to program around.

I propose a capability very much like the RETURNER, implemented in the kernel, which places all members of the jumper's stall queue on the CPU queue. This key would be useful whether called or returned to.

Segment keepers may have similar needs for such a key. In general any program that removes an impediment from other programs could use such a key.

Trojan Meters

See (p2,req-pit) concerning a subtle security problem introduced (historically) by explicit calls to meter keepers {(metercall)}.

There is another anomaly with meter keepers: they do not get to see order code kt. This is so that existing meters without a keeper will respond to the kt call.

One solution would be to define the "no-call" bit for the meter which would cause the meter key to reply with kt+2 to any order code but kt. Perhaps also M(kt;==>6;M1) would produce a meter key like M except with the "no-call" bit.

I have deinstalled the explicit meter call stuff until we arrive at and implement some such solution.