I read a report by Pentti Kanerva, written at Stanford’s Center for the Study of Language and Information, about a scheme for how memory might work. It is Report No. CSLI-84-7, March 1984, titled “Self-propagating Search: A Unified Theory of Memory”. I recorded some of his ideas here from memory when I could not find the report. I will probably warp them, so if you find this interesting you should track down the original. I do not pretend to add anything original here. Kanerva has more recently written “Fully Distributed Representation” (.pdf) and, more recently still, “Hyperdimensional Computing”.

The idea may be presented in analogy with computer memories. For those few who don’t know how computer memories work, I recapitulate here the facts relevant to the analogy. A location where data can be remembered in RAM is identified and located by an address, which is a set of bits. There is in the machine an address bus with a wire for each of these bits. In a computer there may be from about 16 to about 40 wires in the address bus. If there are n bits in an address then there are 2^n locations for data that can be addressed by that computer. For most modern computers each address is for one byte, or eight bits.

As the computer runs, electric circuits provide signals on the address bus, setting each wire to either 0 or 1. The address is determined by the logic of the program. These signals constitute the address of some data that the program wants to access. Sometimes the intent of the program is to modify what is at that location; usually it is to see what data is already there. Each location has just one address and one address accesses at most one location. Sometimes it is impossible to add more memory to a computer because all possible addresses are in use.
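To make the analogy concrete, here is a minimal Python sketch of such an addressed memory. The class, its names, and the 16-bit bus width are mine, chosen for illustration; they are not from Kanerva’s report.

```python
# Toy byte-addressable RAM: an n-bit address bus gives 2**n one-byte locations.
class ToyRAM:
    def __init__(self, address_bits):
        self.capacity = 2 ** address_bits   # number of addressable locations
        self.cells = {}                     # address -> byte, allocated lazily

    def write(self, address, byte):
        assert 0 <= address < self.capacity
        self.cells[address] = byte & 0xFF   # modify what is at that location

    def read(self, address):
        assert 0 <= address < self.capacity
        return self.cells.get(address, 0)   # see what data is already there

ram = ToyRAM(address_bits=16)               # 2**16 = 65,536 byte locations
ram.write(0x2A, 7)
print(ram.read(0x2A))                       # -> 7
```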

Kanerva’s idea proposes that there is one address bus (or perhaps several) in the brain. I recall that he is careful to say that such a bus is probably not localized, nor is the generation of the address localized. Unlike the computer, these busses have many more bits than a computer designer would consider necessary. Kanerva suggests between 100 and 1000 bits in an address. This is not to suggest that there are 2^100 bits that the brain remembers. Accessing a memory or memory component requires an address that is close to the right address. Two addresses are close if most of their bits are the same. Indeed there may be no right address, merely a cluster of similar patterns that stimulate the memory. Surprising to many is the large tolerance for error on such a redundant address bus while still getting the right data. Kanerva reports on these combinatorics extensively. There is no suggestion that this ‘address bus analog’ be localized in the brain; indeed that would be surprising. The various signals of the ‘address’ would be carried by distinct neurons and perhaps regenerated to reach a larger set of memory ‘cells’.
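Here is a rough Python sketch of what “a close enough address retrieves the right data” might look like. The 256-bit width and the tolerance of 60 wrong bits are numbers I made up for illustration; Kanerva’s report works out the real combinatorics.

```python
import random

# My toy numbers, not Kanerva's: addresses are long bit vectors, and a cue
# retrieves the stored pattern whose address is nearest in Hamming distance,
# provided the cue is within some tolerance of it.
N_BITS = 256          # Kanerva suggests between 100 and 1000 bits in an address
TOLERANCE = 60        # how many bits the cue may get wrong and still succeed

def random_address():
    return [random.randint(0, 1) for _ in range(N_BITS)]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def corrupt(address, n_flips):
    """Flip n_flips randomly chosen bits of an address."""
    flipped = list(address)
    for i in random.sample(range(N_BITS), n_flips):
        flipped[i] ^= 1
    return flipped

store = []                                   # list of (address, datum) pairs

def write(address, datum):
    store.append((address, datum))

def read(cue):
    nearest = min(store, key=lambda pair: hamming(cue, pair[0]))
    return nearest[1] if hamming(cue, nearest[0]) <= TOLERANCE else None

addr = random_address()
write(addr, "some remembered pattern")
print(read(corrupt(addr, 40)))               # 40 wrong bits out of 256: still retrieved
```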

A pattern (memory) that emerges when such a memory location is accessed may include further addresses, perhaps more than one per location. If such addresses are themselves put on the address bus then a sequence of memories replays.
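A sketch of that replay, again my own construction: each stored pattern carries the address of the next one, and feeding that address back onto the “bus” walks the sequence.

```python
# Each datum carries the address of a related memory, so retrieval chains.
memory = {"a3": ("first scene",  "b7"),
          "b7": ("second scene", "c1"),
          "c1": ("third scene",  None)}

address = "a3"
while address is not None:
    content, address = memory[address]   # the retrieved pattern supplies the next address
    print(content)
```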

In about 1956 the linked list data structure was discovered in computer programming. This is the idea that a piece of data will include the address of related data. Current students of computer science can scarcely conceive of programming without such structures. I can remember clearly the day and the book in which I became aware of the idea; but that’s another story.

The subjective experience that: {Given a tune fragment, one can recall what follows the fragment, but not what precedes the fragment} is naturally explained by this model. Computer memories are unable to quickly find the address of a location with given data. If we remember a tune by several notes forming an address to locate the next few notes, then it is easy to carry the tune forward but not backwards.
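A toy illustration of that asymmetry (my own example, not Kanerva’s): forward recall is a single lookup, while going backwards means scanning the whole store, because nothing maps data back to its address.

```python
# A tune stored as forward links from a fragment to the next notes.
tune = {
    ("C", "C", "G"): ("G", "A", "A"),     # "Twinkle, twinkle, little..."
    ("G", "A", "A"): ("G",),
}

print(tune[("C", "C", "G")])              # forward: one direct, fast lookup

def preceding(fragment):
    # backward: scan every entry for one whose data matches the fragment
    return [addr for addr, data in tune.items() if data == fragment]

print(preceding(("G", "A", "A")))         # slow in general: proportional to memory size
```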

Here is a tantalizing bit of evidence.

If each of the bit signals on an address bus were distributed widely throughout a large portion of the brain, then neurons with about an equal number of inhibitory and excitatory synapses could respond much as a computer memory location does. (Pardon the mangled terminology; I am not versed in neuroanatomy.) A major question for this model is whether new memories are formed by adjusting synapses (to be negative or positive) or instead by allocating new neurons. Computers allocate old unused locations to new data.
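For what it is worth, here is how I imagine such a “location neuron” in code. The width and the agreement threshold are arbitrary numbers of mine; the point is only that a balanced set of excitatory and inhibitory synapses makes the neuron fire for addresses near its preferred address.

```python
import random

# My toy, not anatomy: a "location neuron" has one synapse per address line,
# excitatory (+1) where its preferred address has a 1 and inhibitory (-1)
# where it has a 0.  It fires when the address on the bus agrees with its
# preferred address on enough lines.
N_BITS = 256
THRESHOLD = 200        # lines that must agree for the neuron to fire (arbitrary)

class LocationNeuron:
    def __init__(self):
        self.preferred = [random.randint(0, 1) for _ in range(N_BITS)]
        self.synapses = [1 if b else -1 for b in self.preferred]

    def fires(self, address_bits):
        # a line carrying 1 contributes its synapse weight; a line carrying 0
        # contributes the opposite, so drive = agreements - disagreements
        drive = sum(w if bit else -w for w, bit in zip(self.synapses, address_bits))
        agreements = (drive + N_BITS) // 2
        return agreements >= THRESHOLD

neuron = LocationNeuron()
print(neuron.fires(neuron.preferred))        # True: the exact address certainly fires it
```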

This model thus permits a degree of physical distribution. Related ideas need not be stored nearby. Indeed this was the great charm when the idea was discovered in computers. It meant that you didn’t have to plan where you put your data so as to make room for related data. Some programs could not anticipate such future requirements in time to choose locations for the early data.

As applied in computers these structures are brittle; a single damaged bit can make them unusable. The corresponding Kanerva structure is not damaged by a single failed synapse, though perhaps a memory could still be lost with the loss of one axon. For a more robust programming scheme see Bloom filters.
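For readers who haven’t met them, here is a minimal Bloom filter sketch (a standard technique, not something from Kanerva’s report). Membership is spread across many bits chosen by several hash functions, so no single bit is critical and a flipped bit degrades accuracy rather than destroying the structure.

```python
import hashlib

M_BITS = 1024      # size of the bit array
K_HASHES = 4       # number of hash functions

bits = [0] * M_BITS

def _positions(item):
    # derive K_HASHES bit positions from the item
    for k in range(K_HASHES):
        digest = hashlib.sha256(f"{k}:{item}".encode()).hexdigest()
        yield int(digest, 16) % M_BITS

def add(item):
    for pos in _positions(item):
        bits[pos] = 1

def probably_contains(item):
    return all(bits[pos] for pos in _positions(item))

add("a remembered thing")
print(probably_contains("a remembered thing"))   # True
print(probably_contains("something else"))       # almost certainly False
```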

We might require an explanation of why the brain has more than the necessary number of address bits. Well, we don’t know that it does have more. I don’t recall whether Kanerva says why he thinks that it does. It is not easy, however, to write a program that quickly finds data similar to given data; redundant addresses make that kind of retrieval natural. My intuition is that a Kanerva-like structure might improve computer design for some problems. Computers seem awkward at solving some classes of clustering problems efficiently. Another reason is that evolution has not discovered, or has not needed, economy on the address bus.

I think that I can see how Kanerva-like structures can evolve from what the engineer would call random logic, i.e. logic with no such organizing principles as described above. General hardware to move data from the retrieved memory (the data bus) back onto the address bus is a fundamental invention that the earliest computers lacked. Such a structure may have evolved just to support sequential replay of memories. (Left brain stuff.) Ironically, 1954 computers had much more efficient sequential replay mechanisms, but those mechanisms lacked the power of linked lists. (In 1954 you incremented a binary number to form the address of the next location in a sequence.)
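The 1954 mechanism in miniature (my toy code): the next memory is found by incrementing the current address, with no stored links at all.

```python
# Sequential replay, 1954 style: related data must sit at consecutive addresses.
memory = {100: "scene 1", 101: "scene 2", 102: "scene 3"}

address = 100
while address in memory:
    print(memory[address])
    address += 1          # incrementing a binary number forms the next address
```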

Kanerva’s Fully Distributed Representation
My Notes on a Kanerva Paper
My riff on these ideas
Tie-in with Kahneman’s “associative activation”
Searle’s Perspective


We must relate these ideas to what is known about long-term potentiation.

Kanerva’s recent survey of similar ideas


Kanerva’s ideas should influence neural net hardware. Perhaps they do. I wonder.