Whatever Next

Andy Clark’s Whatever Next?

Predictive brains, situated agents, and the future of cognitive science

As I begin to read “Whatever Next?” by Andy Clark, I see from the first paragraph that I already have some ideas on the subject. I set them down here so that I will know later what I thought as I began.

I think that the brain at the most basic level notes correlations in the information (signals) that it gets. That information is already organized by how it arrived at the brain and when it arrived. At this level there is no concept, or even a preconcept, of causality. Brains had no adaptive advantage without motor signals going out to parts of the body including muscles. This much brain function can already cause the organism to move in response to ‘bad’ signals. No causality concept need inhabit such a brain for adaptive advantage. Much much later the brain began to have models of the world, and the models were useful only because they included causality. Some signals, or simple functions thereof, have a valence (good or bad), and the organism would adjudicate between choices of action depending on the valence of the likely outcomes according to the causation model. Statisticians sometimes say that causality cannot be established without intervention. This is sort of a converse: causation is useless (to us critters) without intervention.

Having read beyond the first few paragraphs, I realize that Clark is speculating on the interval over which I jumped in the “much much later” above.

That way (borrowing from work in linear predictive coding, see below) depicts the top-down flow as attempting to predict and fully “explain away” the driving sensory signal, leaving only any residual “prediction errors” to propagate information forward within the system ... This is highly analogous to what video coders do to compress visual data in order to conserve bandwidth.

OK that was Clark’s next observation. Just this morning I began to delve into VP9.
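
To make the compression analogy concrete for myself, here is a minimal sketch of predictive (residual) coding. It assumes the simplest possible predictor, "the next sample equals the previous one", which is far cruder than anything in VP9 and is not Clark's model. The names encode and decode are mine, chosen only for illustration.

```python
def encode(signal):
    """Send only the residual: each sample minus the predicted sample."""
    residuals = []
    prediction = 0  # assumption: both ends agree to start predicting from 0
    for sample in signal:
        residuals.append(sample - prediction)
        prediction = sample  # predict "same as last time" for the next sample
    return residuals


def decode(residuals):
    """Rebuild the signal by adding each residual to the running prediction."""
    signal = []
    prediction = 0
    for residual in residuals:
        sample = prediction + residual
        signal.append(sample)
        prediction = sample
    return signal


signal = [10, 10, 10, 11, 11, 11, 11, 30, 30, 30]
residuals = encode(signal)
print(residuals)                    # mostly zeros wherever the prediction held
print(decode(residuals) == signal)  # the receiver recovers the signal exactly
```

Where the prediction holds, the residual is zero and costs almost nothing to send; only the surprises carry information forward.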

Page 3: “Hierarchical approaches in which ...”
Prediction is about the future. Each prediction I have seen mentioned so far falls in a slightly larger category: noticing that if such-and-such holds here, then most likely something else holds elsewhere, or even that something particular happened in the past, which is relevant because it bears on the future. In short, implications. If I see this, I expect that, all to the end of thriving. Anticipating the future is the end goal of seeing patterns; extending patterns to the past in support of foreseeing the future is an important tactic.

The backward connections allow the activity at one stage of the processing to return as another input at the previous stage. So long as this successfully predicts the lower level activity, all is well, and no further action needs to ensue. But where there is a mismatch, “prediction error” occurs and the ensuing (error-indicating) activity is propagated to the higher level. But see some related ideas. I think that the evident differences in the style of description of what is going on are unimportant. In either style effects flow in both directions. The two styles consider opposite directions to be primary.

Talk of ‘free energy’ reminds me of Lagrangian methods of describing the mechanical world. I miss talk of ‘frames’ introduced in Inside Jokes. Frames are memories of past situations, and generalizations thereof, that guide expectations. Perhaps “hierarchical generative models” means about the same as ‘frames’.

What is most distinctive about this duplex architectural proposal ... is that it depicts the forward flow of information as solely conveying error, and the backward flow as solely conveying predictions. This is like video compression that omits key frames—not good engineering! Perhaps this difference in explanation is mere relabeling of the parts of the mechanism. In any case the errors must identify features in the model responsible for the prediction and such means of identification are obscure. It is important to know whether the paw or the mane of the lion is off-color. The model that produces the prediction must include orientation of the modeled beast including where its various parts are in the raw image. It is like a graphics program trying to figure out what the user clicks on when the program has generated a 2D image from a 3D model. The mouse information provides only two coördinates!

I think that there is another important mechanism in place that is closely related to these and might explain some of the conundrums. See this. We have the sensation that our eyes have transferred the entire image to us. It seems, however, that what we have gained is a map that supports quick and efficient access to the scene, and the scene is not in our head. Such access requires coding of “where in the scene” information, and such coding bears on the format of error signals.

Experimental tests have also recently been proposed (Maloney & Mamassian 2009) which aim to “operationalize” the claim that a target system is (genuinely) computing its outputs using a Bayesian scheme, rather than merely behaving “as if” it did so. This is a most peculiar distinction! The IBM 1620 did multiplication by table lookup. Does that bring into question whether it could multiply?
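
To illustrate the point about table lookup, here is a rough sketch of multi-digit multiplication that never multiplies two numbers directly: it uses only a 10 by 10 table of single-digit products, addition, and carries. It is an assumption-laden toy, not a model of the 1620's actual memory layout or instruction set; TABLE and multiply are names of my own.

```python
# Single-digit product table, filled in by repeated addition only.
TABLE = [[0] * 10 for _ in range(10)]
for a in range(10):
    for b in range(1, 10):
        TABLE[a][b] = TABLE[a][b - 1] + a


def multiply(x, y):
    """Multiply non-negative integers using table lookups, addition and carries."""
    xd = [int(d) for d in reversed(str(x))]  # least significant digit first
    yd = [int(d) for d in reversed(str(y))]
    acc = [0] * (len(xd) + len(yd))          # room for every digit of the product
    for i, a in enumerate(xd):
        for j, b in enumerate(yd):
            acc[i + j] += TABLE[a][b]        # looked up, never computed here
    carry = 0
    for k in range(len(acc)):                # propagate carries into base-10 digits
        carry, acc[k] = divmod(acc[k] + carry, 10)
    return int("".join(str(d) for d in reversed(acc)))


print(multiply(1620, 1959) == 1620 * 1959)
```

From the outside the results are indistinguishable from "real" multiplication, which is why the distinction strikes me as peculiar.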

This mechanism differs from what I describe and is more complex, I think. Both schemes rely on feedback. The problem is that the higher level mechanisms probably had to contend with too many possibilities for what the next word will be.

As thought, sensing, and movement here unfold, we discover no stable or well-specified interface or interfaces between cognition and perception. I agree that this is surprising and counter to early plans to understand the brain. In retrospect and in light of evolution it is much less surprising. Early organisms needed simple analogs to all of these functions and in retrospect there was no reason for evolution to evolve a boundary. The integrated solution espoused here has advantages that a boundary would impede. It would be good to study the software that drives cars to see whether there is such a boundary and if so whether it would be good to engineer it away.

How can a neural imperative to minimize prediction error by enslaving perception, action, and attention accommodate the obvious fact that animals don’t simply seek a nice dark room and stay in it? Surely staying still inside a darkened room would afford easy and nigh-perfect prediction of our own unfolding neural states? A delicious question. A simple answer is hunger. Perhaps there is more than one valence mechanism. Then there are ‘goals’ that must, of course, be fit into this framework.

P 15.5 I pursue a fanciful evolutionary path to systems such as suggested in the paper. When there were few neurons some simple constant boolean functions could relate motor output to sensory input. Evolution stretched this so as to have many stages and indeed loops in the information paths. I envision a diagram with causes at the left, and neuron signals flowing to the right with motor output at the right. The loops are absolutely critical but I ignore them in this simple description. Servo feedback is a very simple mechanism that nature must have found as soon as there were more than a few neurons. In important and simple cases the effect is to achieve some particular useful motion in the presence of varying resistance. Once it was discovered, nature extended this invention from right to left in our diagram, and at the left end we have the notion that the sense organs are “trying to inform downstream agents about relevant facts-of-the-matter.” The agent tells the sense organ what it presumes and the sense organ tells the agent how it is wrong.
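
The servo idea above can be sketched in a few lines. This is a minimal proportional feedback loop under assumptions of my own: the effort is proportional to the remaining error, and the resistance simply divides the motion that a given effort produces. The function name servo and its parameters are hypothetical; nothing here is meant as a model of real neurons or muscles.

```python
def servo(target, resistances, gain=0.5, position=0.0):
    """Drive position toward target, one step per resistance value."""
    for resistance in resistances:
        error = target - position        # how wrong the current position is
        effort = gain * error            # push harder when further off
        position += effort / resistance  # more resistance, less motion per push
    return position


# The resistance changes from step to step, yet the loop still homes in.
varying = [1.0, 2.0, 4.0] * 70
print(abs(servo(5.0, varying) - 5.0) < 1e-6)
```

The loop never needs to know the resistance. It only needs the error, which is the same economy as a sense organ that reports only how the agent's presumption is wrong.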