Prob Aprox Correct

Probably Approximately Correct

Leslie Valiant

Valiant gives names to familiar processes to help us think and theorize about learning. He outlines desiderata of a theory of learning which is very near to what it means to be intelligent. I mostly agree with them. The perspective is more nearly mechanical than psychological.

As I read

L 304: The first chapter tries to convince the reader that the goal is revolutionary. I don’t think so, but I do look forward to the Author’s proposals. I think that he has a novel and profitable perspective. Valiant turns analytic ability to mundane epistemology—about time!

L 412: Valiant wants a more quantitative theory of evolution. Sounds like a good idea.

L 420: I am getting a whiff of ‘learning to learn’.

L 588: “Prior to Turing, mathematics was dominated by the continuous mathematics used to describe physics, in which (classically, anyway) changes are thought of as taking place in arbitrarily small, infinitesimal increments.”
Don’t forget number theory. Euclid proved the infinitude of primes. 19th century number theory remains graduate fodder today. Not to mention ‘combinatorics’.

L 643: Don’t take the lower bound on multiply complexity too seriously because it is actually better than Valiant says for numbers of several thousand bits. (Indeed see Karatsuba result later.) Few results are proven for lower bounds.

L 725:

Prove some impossibility result that shows, for example, that for the model defined no algorithm exists that takes fewer than so many steps. This is seldom possible. Sometimes it is and then it is useful.

L 745:

Is multiplication an inherently harder task than addition, or does it just appear to be so? The practical question is how much does it take to multiply two 15 digit numbers, not to multiply two million digit numbers. Someone said: “We do not live in asymptopia.”.

L 777: The public key presentation is very good—Collatz’s problem too.

L 999: It is indeed noteworthy that the Turing tape and DNA are one dimensional and digital. I had not noticed this. Turing’s machines are pathologically inefficient and I hope that Valiant does not try too hard at Turing tape algorithms.

L 1004: I think that ‘protein expression circuit’ needs a long clear definition. I have a long muddy and probably wrong impression of it: Some times, (as contrasted with other times in the same cell), the concentrations of various proteins control what other proteins the DNA produces by blocking or facilitating some particular gene for that latter protein. This influence is in part by the controlling proteins sticking to introns near the gene.

Valiant proceeds to describe the notion further, somewhat complementing my description.

I am quite sure that the first introns said something like: “Don’t do the next exon if there is much niacin about.” Some proteins will evolve only to become part of a circuit and thus support more indirect rules while the primitive rules remain simple. This is a far easier space to explore than the function class described by Valiant.

L 1069: Concerning the connection between cognition and computing:

Because of subsequent slow progress in making this connection concrete or useful, some have despaired that it can never be made into more than metaphor, and that for fundamental reasons it cannot be made into a science. I disagree. I believe that developing any new science is fraught with challenges, and that we are making progress in this area at about the pace that might be reasonable to expect. I agree with valiant.

L 1179: The URN: Valiant should say “draw without replacement” to make sense of the rest of his text.

L 1341: Valiant makes points much like this and this. This chapter could be titled “Math Envy” of philosophers who try to make ordinary knowledge and logic meet mathematical standards. Induction works just fine for purposes of survival. This is a complementary conundrum to the observation of the unreasonable effectiveness of mathematics.

L 1550:

Rather, one expects to see a sequence of concepts, such that the later ones become accessible to the learner only after the earlier ones have been mastered. There is an important difference between learning a concept and mastering it. You learn it in the class that teaches it and you master it in a subsequent class that relies heavily on the concept. 50 years later you still have those that you mastered. Those that you merely learned can at least be refreshed more easily.

L 1558:

When programming a computer, the programmer needs to understand what exactly the existing programs already implemented on the machine do. How old school. I wish it were so. There is, however, an extant culture that believes this. I do too. Valiant goes on with his talk about the dog concept to explain how the new programmer is expected to learn a language starting with “Hello World”.

L 1571: I think that abstraction and ‘clumping’ are necessary concepts here in an adequate theory of learning.

L 1592: I think that Valiant’s ‘teacher’ concept does not exclude a book, especially a text book. Interactive teachers, human or computer, have advantages.

L 1637: Valiant asks how intelligence might have evolved. Bravo! Many biologists won’t even admit that intelligence is an important human characteristic.

L 1679:

The genome further encodes a network, one that describes the conditions, in terms of the concentrations of proteins present in a cell at one time, that are necessary for a particular other protein to be expressed. That is a strong statement. It is often assumed as a starting point. The only obvious additional input data that occur to me is temperature and light.

L 1723:

Choosing the class of functions with which to explore these questions presents a stark dilemma, the predicament between Scylla and Charybdis. Indeed. It may be necessary to explore intron logic more closely in order to choose which functions to explore.

L 1728:

an actual evolution algorithm This may be a bad notion. We indeed need to know quantitatively how genotype and phenotype navigate the space of possibilities. The ‘algorithm’ metaphor sucks for a lack of a site for it to run. Algorithms are executed by machinery. From the little reading I have done the gamut of functional hacks found in introns defies any simple characterization. Thus I don’t especially object to Valiants broad set.

L 1770: In Valiant’s conception of ‘ideal function’ makes it important that the class of functions not include too many functions that cannot be expressed in DNA. It may be 100 years too soon to achieve Valiant’s program.

L 1844:

Biological systems are also believed to be highly modular. Indeed, that would seem to offer the only chance for us to ever understand them. However, it is not quite as obvious what advantages modularity offers in the case of biology.] It would be good to examine some such biological modules to compare sorts of ‘modulism’. Most of our organs have each some particular function. ‘Division of labor’ is good in biology as well as economics.

At this point in the book I record the following. I think it may be fruitful to speak of evolution as a form of learning. There is a sense in which even plants understand things about their environment. I think it is useful to speak of algorithms the higher animals employ to learn. I think it is not useful, however, to seek the ‘algorithm’ by which evolution learns. When animals learn there are things happening in the brain which is the site of the algorithm. There is no such site, nor algorithm, in the case of the process by which evolution learns.

L 1981: I am queasy about the notion that the genome must approach the ‘ideal function f’ in order for the species to thrive. It must soon avoid fatal consequences and thereafter merely meander uphill, ‘up’ being a multidimensional vector. What bothers me is that each individual contributes only about one bit of in this learning. Of course there are only about 10¹⁰ bits to be learned which I find a surprisingly small number. We need several times that number of ancestors, but not that many generations.

L 1996:

Hence, PAC learning is at least as powerful as SQ learning. Indeed, it turns out that some PAC learnable classes are not SQ learnable, and hence that SQ is more constrained than PAC. PAC may be faster, but it comes to the wrong conclusion more often too.

L 2018: I think that valiant assumes that evolution is working on only one problem at a time and that one SNP is on trial for the solution of just one problem. He still seems to think that we need to find nature’s algorithm. There is no such algorithm for there is no site at which to express it. The mechanisms are already all in plain sight but we may not have noticed all of the various ‘failure modes’ of DNA copying which are in fact what progress is due to. Valiant seems to argue for the existence of something more obscure or even mystical. Here is my take.

Did we evolve to evolve? Well, the invention of DNA was a such a sort of thing. DNA copy correction mechanisms may have improved evolution by slowing it down.

L 2047:

For both problems there is no chance of testing exhaustively all the 2n possibilities if n is large, such as 20,000. This is the problem adequately addressed in “Climbing Mount Improbable” by Dawkins. These functions that have sufficed to define Life, are gradual, not precipitous. The exclusive or of several proteins as a production condition, never occurs, I would wager.

L 2087: As I read the book I find Valiant relying too heavily on aspects of computing theory that I found questionable as he introduced them. Computers do all sorts of useful exponential things but are yet totally incompetent of solving some polynomial problems. We do not live in asymptopia, and neither does nature.

L 2123: I still object to reasoning about the ‘ideal solution’ f. It smells of teleology. It might be rescued if one could show that choice of f is immaterial, but I can’t see how to do that.

L 2228: I would accuse Valiant of “syllogistic” reasoning in place of “inductive” reasoning in this context, where he praises inductive reasoning, but take that with a grain of salt. At least he has noticed that some computers induce. Incidentally doing math requires much induction.

L 2280: I like section 7.2 and have some quibbles. Valiant contrasts reflexes and reasoning. ((My insight is that reasoning is itself a form of reflex, but this is not to denigrate Valiant’s points. It is an evolutionary explanation.)) We acquire reflexes and also patterns: “There are no rabbits to be found when it is raining.”. These are stored in different parts of the head. The latter patterns are rather like the rules that DNA conveys on when to produce some protein. The logic of higher animals is analogous to but not a descendent of these intron rules. It started out simple and largely remains simple today. (Wrong context.)

L 2325:

Turing’s proof that the Halting Problem is not computable can be viewed as an early warning of what has been called the doom of formalism. It is doomed more simply: too complex in most cases. Valiant makes nearly the same point next.

L 2436: I love the Galton quote.

L 2476: In contrast to Valiants use of computability theory, I think his analogies between brain and computer hardware are good and highly relevant.

L 2689: Mostly we do not live in a physical world, we live mainly in a cultural world. Our heads mainly process expressed thoughts of others, or wonder what others are thinking. Maybe Eagles live in a more physical world.

L 2876:

One of the chief goals of this chapter is to explain how the ecorithmic view gives a persuasive explanation of why artificial intelligence is proving so much more difficult to achieve than expected. This may sound like an excuse, but no one can deny that a good excuse is badly needed. Touché.

L 2892: Valiant keeps saying that computer learning is already common place. He should enumerate a few concrete successes. Valiant goes on to mention a few later. Google’s machine translation is now mostly learned in Valiant’s sense and people can sample its worth. (It is useful.)

L 2984:

There is no reason to believe that the states of babies are easy to describe. The algorithms of evolution may be theoryful, but their results remain, from our perspective, theoryless. Well said.

L 3009:

Leaving aside learning processes after birth, what can we say about the problem of characterizing babies at birth? I think that not all of the DNA brain heritage has been delivered to the brain at birth. A year or so must pass during which both nature and nurture operate. Morphogenesis goes on after birth due in part to the small birth canal.

L 3214:

The learning and reasoning carried out by an organism during its life had limited impact that outlived the individual. … except for the progeny of that individual—thus the evolutionary pressure on reasoning.

L 3263:

An individual could for the first time acquire circuits that represented the knowledge gained from experience by thousands of different people, without having to go through their experiences. Better yet different people can learn different specialities from the past and collectively apply this heritage—‘division of labor’. The ‘last universalist’ was at least a century ago and when you include the crafts probably much further back.

New Scientist: 2016 March 26 issue: page 33: “Intelligent Evolution” is on this subject.

Turing’s computing theories are very poor quantitatively. Distinguishing between P and NP won’t do it. I agree that it needs to be done. People working with neural nets might come up with something sometime, but not soon I think.