Tuesday, August 28, 2018

Real and Artificial Intelligence


BOOK REVIEW: Common Sense, the Turing Test, and the Quest for Real AI by Hector Levesque (MIT Press, 2017)

Levesque’s lucid and brief (156 pages) book is an elegant and timely antidote to the overblown hype, snaky language, and grandiloquent assumptions lacing the popular writing on AI.  It’s surprising that much of this overwrought writing comes from bona fide computer scientists and other prominent professionals.  Levesque uses very little jargon and carefully defines for the general reader what specialized terminology he does introduce.  Going flat out on accessibility, he even makes a point of using almost no math in the entire book, though he can’t resist a brief description, with simple algebra examples, of how math actually works in computing machines, demystifying it a little.

A major theme is that the current exclusive focus on what Levesque calls adaptive machine learning (AML), the approach typically using neural networks in image recognition, self-driving cars, medical diagnoses, and similar applications, runs into severe limitations.  AML is basically “training on massive amounts of data,” a radically different approach from the “good old-fashioned AI” (GOFAI) of the past several decades, which attempted to emulate thinking.  GOFAI sought “common sense,” a phrase that echoes an oft-cited paper by AI pioneer John McCarthy, “Programs with Common Sense” (1958), written two years after the famous 1956 meeting at Dartmouth of some foundational early thinkers about AI (McCarthy, Allen Newell, Herbert Simon, Marvin Minsky, et al.).  GOFAI was much more concerned with using language, symbols, and knowledge to compute solutions than the current emphasis on machine learning from training on big data sets.

Levesque pays great attention to the question of what we mean by intelligence in ourselves and in machines, which distinguishes him from the run of AI savants these days.  Unlike almost all other AI writers, he actually gives a definition of intelligence: “People are behaving intelligently when they are making effective use of what they know to get what they want” (p. 40).  This puts the emphasis on knowledge, stressing that the ability to draw on knowledge not directly related to the subject under consideration is a critical component of intelligence.  Intelligence handles the unexpected.  It does so through its ability to bring a wide array of knowledge to any problem.  This is “common sense.”  The “knowledge representation hypothesis” is derived from Leibniz and explicated by philosopher Brian Smith (pp. 119–122).  Its basic implication for AI is summarized in three lengthy bullets, which I with temerity abbreviate here (with a toy sketch after the list) as:
- An intelligent system must have an extensive knowledge base, stored symbolically.
- The system processes the knowledge base using logical rules to derive new symbolic representations that go beyond what was explicitly represented in the knowledge base.
- Conclusions derived from the above drive actions.
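
To make the three points concrete, here is a minimal sketch, not taken from the book, of the architecture they describe: a handful of symbolic facts, a few if-then rules that derive new facts by simple forward chaining, and an action chosen from the derived conclusions.  The specific facts, rules, and planning step are my own invented illustration, not Levesque’s.

```python
# A toy sketch (not from the book) of the knowledge-representation hypothesis.

# 1. Knowledge stored symbolically, as (predicate, individual) facts.
facts = {("bird", "tweety"), ("penguin", "opus")}

# 2. If-then rules that derive new symbolic facts beyond what was explicitly stored.
rules = [
    ("penguin", "bird"),        # every penguin is a bird
    ("penguin", "cannot_fly"),  # penguins cannot fly
    ("bird", "can_fly"),        # birds (by default) can fly
]

def forward_chain(facts, rules):
    """Apply the rules until no new facts are produced (simple forward chaining)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            for pred, name in list(derived):
                if pred == antecedent and (consequent, name) not in derived:
                    derived.add((consequent, name))
                    changed = True
    return derived

# 3. Conclusions derived from the knowledge base drive action.
conclusions = forward_chain(facts, rules)
for name in {n for _, n in conclusions}:
    if ("cannot_fly", name) in conclusions:
        print(f"Plan: arrange ground transport for {name}.")
    elif ("can_fly", name) in conclusions:
        print(f"Plan: let {name} travel by flying.")
```

The point of the sketch is only that every step is explicit and inspectable: you can read off exactly which facts and rules led to each action, which is the reliability and predictability Levesque cares about.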

Levesque shows how AI systems designed around this kind of representation will be more likely than AML systems to be both reliable and predictable.  To Levesque, reliability and predictability are essential, and the AML approach de-emphasizes them in favor of achieving a preponderance of positive results.

But Levesque doesn’t say that knowledge representation is all we need, or that AML doesn’t have a place in AI.  He introduces the concept of “The Big Puzzle”: intelligence has many aspects that we don’t completely understand, like different parts of a jigsaw puzzle that we have to solve separately and then bring together.  Parts of the big puzzle include language (symbolic representation), psychology, neuroscience, and evolution, among others.  He notes that one large difficulty in solving the big puzzle is reconstructing a process from its output.  This is inherently difficult, and he demonstrates the point brilliantly with a description of a very simple computer program (neatly introducing some basic algorithmic concepts in the process).  When you see the output, you get his point about how difficult it would be to determine how the program worked just by examining the output.  Similarly, watching a bird or an airplane fly gives little clue how to design a flying machine.  His approach is to take “the design stance,” which he illustrates with the Wright brothers’ approach of understanding in general the processes needed to lift an object moving through air and designing the most practical way to achieve it, rather than learning to fly by imitating birds, which was tried and tried and never worked.  He’s saying that neural nets are like trying to fly by imitating birds.  The analogy is limited because neural nets clearly have achieved impressive results, but the results are more like effective data processing than what we would call intelligence in a person.

The Turing test has come to be seen as a valid way to judge whether a computer exhibits human-level intelligence.  Its basis is that intelligence can only be judged by “externally observable behavior,” i.e., its results.  Levesque has an issue with the Turing test: it requires that the computer be judged on its ability to “fake” human reasoning and common sense, rather than on meeting a more objective standard.
As a way to overcome the shortcomings of the Turing test, Levesque points out the value of (and has done a lot of research on) Winograd schemas in testing AI systems.  Winograd schemas have simple right and wrong answers.  They are statements containing an ambiguous pronoun whose referent can only be resolved with background knowledge, or “common sense,” beyond the facts stated in the sentence itself, forcing a choice between two specific alternative readings.  Example (a toy encoding is sketched after the example):
“The trophy would not fit in the brown suitcase because it was too small.”  What was too small?
- the suitcase?
- the trophy?
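
Purely as an illustration of why such items are easy to score, here is my own toy encoding of a Winograd schema item, not anything from the book: the sentence, the pronoun, the two candidate referents, and the single correct answer, so a system’s response can be checked right or wrong with no judging panel.

```python
# A toy encoding (my own, not Levesque's) of a Winograd schema test item.
from dataclasses import dataclass

@dataclass
class WinogradItem:
    sentence: str    # statement containing an ambiguous pronoun
    pronoun: str     # the pronoun whose referent must be resolved
    candidates: tuple  # the two possible referents
    answer: str      # the single correct referent

item = WinogradItem(
    sentence="The trophy would not fit in the brown suitcase because it was too small.",
    pronoun="it",
    candidates=("the trophy", "the suitcase"),
    answer="the suitcase",
)

def score(item: WinogradItem, system_answer: str) -> bool:
    """Unlike the Turing test, scoring is a simple right/wrong check."""
    return system_answer == item.answer

print(score(item, "the suitcase"))  # True
print(score(item, "the trophy"))    # False
```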

Levesque questions whether general-intelligence AI is even a worthwhile goal, and notes that the trend in actual AI development is toward specialized intelligent systems that assist with specific tasks rather than human-level general AI.  He notes that even chess-playing programs are not treated as competition by human players, who today use them for practice and advice in human-to-human competition, which is what they are interested in.  He’s doubtful that anyone will find artificial general intelligence worth paying for.  He also thinks the danger of an autonomous “singularity” taking control is overblown, and shouldn’t be considered on a par with other societal dangers like pollution and overpopulation.  He gives little credence to the idea of superintelligence happening spontaneously or by accident: “Inadvertently producing a superintelligent machine would be like inadvertently putting a man on the moon!”  Autonomy is the real risk, not superintelligence, and autonomy can be carefully controlled.  (Don’t let cars fully self-drive, for example.  Airline pilots use the autopilot, but they monitor it and maintain control at all times, never ceding autonomy.)

I’ve been surprised that most of the writing on AI over the past decade is, unlike Levesque’s, really fuzzy about what we mean by “artificial” intelligence and “general” intelligence, yet goes on to use those terms extensively and as central topics.  If we are going after human-level or “general” intelligence, which these authors commonly assume, or if we’re afraid it may arise spontaneously from our machines, then it seems important to have at least a working definition of intelligence to focus on.  Defining the goal, or the threat, seems an essential first step.  Human-level intelligence is usually taken as the model for “general” intelligence, implying that human-level intelligence is something we understand well.  Sometimes there is a discussion of why this is assumed, often touching on the difficulty of defining intelligence and/or general intelligence, but it concludes that human intelligence is the one we know, so it’s the best reference we have.  But how well do we really know it?  The idea is then extended to the human brain being the only example, and thus the model, of a general intelligence in nature, eliding any discussion of what it is that makes the human brain uniquely intelligent.  And how?  Exactly how?

It appears the target for general intelligence keeps moving.  When I was a teenager in the 1960s, it was said that if a machine could be taught to play chess well enough to beat humans, that would be convincing evidence it had achieved a human level of intelligence.  That goal has been exceeded, but no one is saying that Deep Blue thinks, like a human or otherwise.  Things touted as intelligent behavior at one time are called something else after a machine does them.

It seems like what is called AI these days by people who are actually making it, rather than writing about it (sometimes the same people in different roles), doesn’t aspire to general intelligence at all.  Instead, its goals are lesser things like greater efficiency in image and speech recognition, automatic car navigation, unbiased medical diagnostics, and similar goals for specific products.  The “intelligence” is usually that the algorithm is deeply heuristic and “learns,” or is programmed to try different approaches to the specific problem, in order to optimize a solution.  Yet many of the books and commentary on AI (for example, Superintelligence, 2014, by Nick Bostrom, and The Master Algorithm, 2015, by Pedro Domingos, both valuable reading) seem to assume that AI is leading to something akin to “general” human-type intelligence, either by intention or by accident.  Early this year, “Deep Learning: A Critical Appraisal” (https://arxiv.org/abs/1801.00631) by NYU professor of cognitive science Gary Marcus drew much attention by pointing out that the neural net-based deep learning approach appears to be “approaching a wall” and “must be supplemented by other techniques if we are to reach artificial general intelligence.”  The analysis is impressive, identifying ten specific shortcomings, all of which seem worth considering in conceiving of what “general intelligence” might be.  But the ten things don’t add up to anything like a complete description of what general intelligence is.  There were ten blind men and an elephant…

There seems to be a widening gap between what practitioners are actually pursuing and a fuzzier but widely assumed notion of general intelligence as not only worthwhile but the ultimate goal.  One could conclude from this that the goal (or threat) of artificial “general” intelligence at “human-level” or above is proving to be a chimera (or a false threat).  But it could also be true that more clarity about what we mean by “general” and “human-level” intelligence in a machine would go a long way toward helping us see the real value of deliberately pursuing a more general artificial intelligence, and how to achieve it, as well as the reality of any threat it poses.