On Intelligence is a book written by Jeff Hawkins[1] on the nature of intelligence, both artificial and natural. It’s an expansion of, and an attempted answer to, the age-old questions I’ve referred to before: how do we think? What does it mean to be intelligent? What is consciousness? Might it be possible to create an intelligent machine? And if we could–what would that mean for society? In this essay I’ll attempt to describe his theory, its motivations, and how it can be a powerful and useful tool in any field–even if you’re not very interested in AI.
Humankind has always been ambivalent towards other intelligent beings. In some of our stories we romanticize them, in others we deify them, and in still others they are the villains. Throughout history, our encounters with each other’s cultures have been marked by equal parts fear, wonder, and disdain; the idea of human-made intelligence only sharpens these emotions. The prospect of “playing god” triggers deeply ingrained social and religious warnings against seeking knowledge we “weren’t meant to have”. These fears have been reinforced by literature and art throughout the ages–everything from the cautionary tales of Icarus and the Fall from Eden to more modern, literal stories such as Frankenstein and The Terminator. As our understanding of the world has increased, our fear of its unexplained mysteries has lessened–yet the mind seems uniquely inexplicable, an island of mystery in a sea of certainty, the last remnant of “magic” in nature. Even many self-professed materialists approach the problem of creating intelligent machines with startling indifference towards how the human mind actually works. Instead of drawing inspiration from the workings of our brains, the more common tendency is to treat human intelligence as a “black box” whose workings are not only unexplainable, but irrelevant.
The height of this deliberate “function, not method” blindness is the famed Turing intelligence test. The idea behind the test is quite simple: if a machine can act like a human so well that another human can’t tell the difference, it is said to have passed the test and demonstrated genuine intelligence. Easy enough, right? Unfortunately, it’s not at all clear that intelligence is what’s really being measured here. Telemarketer robots, for example, use a combination of ambiguous, pre-recorded phrases and basic sound detection to give listeners the impression of a responsive (if somewhat pushy) human salesperson on the other end of the line. In many cases, the listener’s interaction with the ‘bot is brief enough that they never realize it was a phony, so the ‘bot could then be said to have passed the Turing test (or at least, a limited version of it). Yet the program underlying such a ‘bot is ridiculously simple, many orders of magnitude simpler than the web browser you are currently using to read this essay. Just because the telemarketing robot impersonates a human more effectively, does that really make it more “intelligent” than your browser? Or Google’s translator service? Or ASIMO?
Strictly speaking, the Turing test as originally proposed eliminates certain crutches that the telemarketer ‘bot relies on (like limited social context, meaning the tiny set of expected interactions is easier to brute-force; and limited time, meaning less opportunity to notice cracks in the AI’s facade) by taking place in the context of a free-form conversation, held over a text-based interface such as a keyboard and screen, over an indefinite amount of time. The bigger problem, however, is not the test’s lack of formality or its possible loopholes; it is that it isn’t even clear what we mean when we say an AI must act “like” a human. Should there be any limits to the content of the conversation? Should our prospective AI be expected to be able to improvise a poem or short story on demand? Or speak multiple languages? Or tell jokes? Should it be expected to exhibit human prejudices, political affiliations, or religious beliefs? Should it replicate human qualities like subjectivity, irrationality, and emotion? Most of these traits are irrelevant to what we really mean when we talk about intelligence, yet the Turing test lumps them all together under the same functional definition. By defining intelligence as a “black box” that depends only on outputs, not processes, it unashamedly side-steps the more difficult but infinitely more relevant question: what exactly do we mean when we talk about “intelligence”?
In principle, there’s nothing wrong with a functional approach–if the Wright brothers, for instance, had tried to duplicate the method birds use to fly, they would never have made it off the ground. (Literally!) We still don’t have the technology to reproduce a bird’s flight with a machine–the mechanics and feedback involved are simply too sophisticated. Yet airplanes are not worse off for their difference in method–in many ways, they are better! They can reach greater speeds, they are more energy-efficient at high speed, and unlike birds they retain full control and stability even without power. Many of the greatest inventions in history–the wheel, for instance–have had no relation to their functional counterparts in nature. Yet there is a crucial difference in our search for artificial intelligence that makes this approach unfeasible: the Wright brothers may not have known exactly how birds flew, but they at least had a clear idea of what flight itself was. The same is not true of intelligence–the Turing test amounts to nothing more than a cop-out. “What is intelligence? Gee, I have no idea–but I know that humans have it. So if we can build a machine that acts human, it must be intelligent!” This is like trying to discover fire by building a volcano: incredibly difficult, completely unnecessary, and useless even if it succeeds. If the Wright brothers had followed a similar line of thinking, they would have ended up with a convincing but flightless clockwork bird instead of an airplane. No wonder AI’s progress has been so excruciatingly slow: if our definition of an intelligent machine is “something that acts like a human,” of course we’re going to spend all our time trying to make clever automatons that do a good job at fooling people, but aren’t capable of much else–which is exactly what most attempts at passing the Turing test have amounted to.
It’s tempting to try to “fix” the Turing test to eliminate ambiguities such as emotion, opinion, and belief, but this approach is doomed to failure. A functional test of any kind will never work, because the fact is that intelligence is entirely independent of function! To illustrate, consider the thought experiment of the Chinese room, where a man who speaks no Chinese sits inside a room and takes slips of paper from a Chinese-speaking person outside the room, without the two ever seeing each other. The man inside the room has a book of instructions (the “program”) telling him exactly what characters to write in response to any question he receives, so that the person outside the room soon becomes convinced that they are conversing with an intelligent entity that does indeed speak and understand Chinese. Yet what exactly is doing the “understanding”? It is not the man writing the responses–he neither reads nor writes Chinese, but is simply following instructions mindlessly. Neither is it the book–books, being inanimate, can no more be intelligent than can a rock. Similarly, it does not seem reasonable to say that the room understands what is being written independent of the book or the man, and claiming that intelligence is an “emergent” property of all three is conveniently circular (i.e., the room is intelligent because the room is intelligent). So although an entity such as the Chinese room may demonstrate intelligent functions, that does not necessarily imply that it is intelligent in and of itself.
Not only that, but the converse is also true: an entity may exhibit no outward, behavioral signs of intelligence, yet still be actively engaging in intelligent thought and understanding. You, at this very moment, are engaged in an unambiguously intelligent activity (reading) without showing any external signs that you are doing so. Someone standing behind you would have no way of knowing whether you are actually reading this essay or just staring blankly at the screen. Your brain is definitely doing something while you read–neural scans would show tremendous activity–yet that activity produces no functional result at all. The truth of the matter is that a functional definition of intelligence will never be enough, because intelligence itself is not a function–it is a method, an algorithm, a way of processing the world that is independent not only of our interactions with that world, but even of the type of information being processed.
The “black box” mentality, reinforced by the apparent functional divisions of the different brain regions discovered in MRI scans, has encouraged AI researchers to customize their solutions to their specific problem domains (ironically, largely ignoring the Turing test in the process). The motivation is obvious: when working with different functions (identifying images vs. proving theorems, for instance), a functional mindset suggests that different methods for different types of input and behavior will be more efficient. And indeed, this approach often is more efficient–many types of image-processing, calculation, and analysis programs are far faster and more precise at what they do than the human brain could ever be. Yet what they do is not intelligence. In fact, while the brain appears to associate different regions with different types of processing tasks, the underlying cellular structure of the neocortex (the part of the brain most crucial for all our “higher-level” intelligent abilities) is regular and completely uniform. Every section of neocortex has the same readily observable neural structure and organization–meaning that whatever those neurons are doing, they’re doing the exact same thing everywhere in the brain. Whatever method our brain uses to process its inputs, that method is exactly the same for all inputs: a picture of a chair, the sentence “this is a chair”, the smell of sawdust, and the feel of wood beneath our fingers are all processed using the same fundamental algorithm, regardless of our behaviors. Even the motor cortex–the part of our brain that controls our conscious actions, the part responsible for producing our behavioral “output”–has the exact same structure and uses the exact same processing method as the “input” of all our senses! And as countless studies have shown, that method is capable of adapting to an astounding variety of inputs without any fundamental change in structure. For example, persons born blind can be taught how to see using a special camera that stimulates their tongue, an organ that seemingly could not be further removed from vision. Far from a “black box” where only the inputs and outputs are important, intelligence is in fact the opposite: the same universal method can serve for any type of input or output.
The natural question then is: what is this miracle method? How does the human brain manage to perform such an incredible variety of difficult tasks using only a single, uniform algorithm? The key, as Hawkins argues, is memory. Unlike a computer, the human brain does not calculate what it is seeing–instead, it remembers patterns of input it has seen in the past, and makes predictions based on those memories. However, the way in which our neocortex stores memories is very different from the way in which computer memory works:
- First, the neocortex recalls patterns auto-associatively. This simply means that a part or a distorted version of a pattern will recall the original, whole pattern–patterns are associated with themselves.
- Second, the neocortex stores patterns in an invariant form. This means that the aspects of the patterns that get remembered are the ones that do not change between instances, rather than the minute details that are different every time–so while a computer would store an image of a face with a pixel-by-pixel level of precision and detail, a human brain will store a memory of that same face by reducing it to a detail-independent abstraction, making special note only of features that deviate from the norm (such as a scar or tattoo, for instance).
- Third, the neocortex stores patterns in a hierarchy. This means that lower hierarchical levels of the neocortex store smaller, more detailed aspects of the incoming pattern, and they pass their invariant representations on to higher regions. A small detail will only move up in the hierarchy if lower regions are unable to match that detail to a known pattern (see next point).
- Finally, the neocortex associates patterns with each other. This means that patterns that frequently accompany each other are remembered together, and are expected to occur together again in the future. Usually, the order of these patterns is irrelevant (such as the “eye, eye, nose, mouth” pattern group representing the “face” invariant), but in some cases, such as hearing or touch, sequence matters a great deal. (A rough code sketch of these four properties follows just after this list.)
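To make these four properties a bit more concrete, here is a minimal sketch in Python. To be clear, this is my own toy illustration, not Hawkins’ actual model or anything a real cortex does; the Region class, its match_threshold, and the feature lists are all invented for demonstration. Patterns are stored only as feature sets (invariant form), a partial or distorted pattern retrieves the best-matching whole (auto-association), and anything a region cannot match is handed up to its parent (hierarchy):

```python
# Toy sketch of auto-associative, invariant, hierarchical recall.
# Invented for illustration only; not Hawkins' actual model.

class Region:
    def __init__(self, parent=None, match_threshold=0.6):
        self.memory = {}                  # invariant name -> remembered feature set
        self.parent = parent              # next region up in the hierarchy
        self.match_threshold = match_threshold

    def learn(self, name, features):
        # Invariant form: keep only the features themselves, discard other detail.
        self.memory[name] = set(features)

    def recall(self, features):
        # Auto-association: a partial or distorted pattern retrieves the
        # best-matching whole pattern stored at this level.
        features = set(features)
        best_name, best_score = None, 0.0
        for name, stored in self.memory.items():
            score = len(features & stored) / len(stored)
            if score > best_score:
                best_name, best_score = name, score
        if best_score >= self.match_threshold:
            return best_name                     # recognized at this level
        if self.parent is not None:
            return self.parent.recall(features)  # unmatched input moves up the hierarchy
        return None                              # genuinely novel pattern


# The same structure serves any modality; only the features change.
vision = Region(parent=Region())
vision.learn("face", ["left eye", "right eye", "nose", "mouth"])
print(vision.recall(["left eye", "nose", "mouth"]))   # partial input still recalls "face"
```

The fourth property, remembering which patterns accompany or follow one another, is where Hawkins’ argument goes next.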
This last point is the crux of Hawkins’ argument: patterns that have occurred together in the past are expected to occur together again in the future. That expectation of future inputs–in other words, prediction–turns out to be the distinguishing feature of intelligence! “Understanding” something, by Hawkins’ theory, simply means being able to make predictions about it. We say we “understand” a puzzle if we can predict the steps necessary to solve it. We say we “understand” a natural phenomenon if we have a theory that predicts how it will behave–and we say that theory (and our understanding) is incomplete if there are some situations it can’t predict. We even say we “understand” our friends or family when we’re able to predict how they will behave–and if we are lied to or cheated by someone we thought we knew, violating our predictions of their behavior, we say we no longer understand them at all.
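Stated this way, the claim is almost embarrassingly simple to write down. The sketch below is again my own toy, not anything taken from the book: a SequenceMemory (an invented name) remembers which patterns have followed which, and “understands” a pattern exactly to the extent that it can say what should come next.

```python
from collections import defaultdict, Counter

class SequenceMemory:
    """Toy illustration of understanding-as-prediction; invented for this
    essay, not Hawkins' actual algorithm."""

    def __init__(self):
        self.followers = defaultdict(Counter)   # pattern -> what has followed it

    def learn(self, sequence):
        # Remember every observed transition between consecutive patterns.
        for current, nxt in zip(sequence, sequence[1:]):
            self.followers[current][nxt] += 1

    def predict(self, pattern):
        # "Understanding" a pattern here just means being able to name its successor.
        seen = self.followers.get(pattern)
        if not seen:
            return None       # never encountered: no prediction, no understanding
        return seen.most_common(1)[0][0]


memory = SequenceMemory()
memory.learn(["do", "re", "mi", "fa"])                  # a melody
memory.learn(["wake", "shower", "coffee", "commute"])   # a morning routine
print(memory.predict("re"))          # "mi"  -- the melody is understood
print(memory.predict("xylophone"))   # None  -- no memory, no prediction
```

Whether the patterns are puzzle moves, physical phenomena, or a friend’s habits changes nothing about the operation itself; only the inputs differ.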
Our brains are so massive and we are able to predict things at levels of such complexity that the process of understanding simpler objects seems qualitatively different from understanding a person–yet due to the uniform structure of the neocortex, we know that it is not. Our ability to “understand” a pencil lying on a table is no different from our ability to predict a friend’s behavior, with the sole exception that predicting the pencil’s behavior is a whole lot easier. If you leave it on a table, it will stay there until moved. If you drop it, gravity will cause it to fall. If you pick it up, it will produce certain patterns of sensation in your fingertips consistent with its size, shape, orientation, and texture; and it will resist the movement of your fingers in a way that’s consistent with its mass and hardness. Prediction is still crucial even to this basic type of understanding, but unlike our interactions with people that prediction is almost entirely below the conscious level. This is easy to see with a little thought experiment: what if the pencil was sitting on a desk in front of you, with you looking at something else off to the side, and the pencil suddenly moved? What if you were holding it absentmindedly, focusing on something else, and it abruptly changed size or texture? You would notice immediately, even if you weren’t paying any conscious attention to it. Why? Because even though it wasn’t part of your conscious attention, your brain was still making continuous predictions about it. At every instant, your brain is sending tremendous amounts of feedback data to your senses and lower regions in the memory hierarchy–even more, in fact, than it receives as feedforward input! Your brain is constantly and unilaterally predicting everything it expects to see, hear, smell, taste, and feel. Prediction is the bedrock, and indeed, the very nature of intelligence and understanding–at all levels of complexity and for all problem domains. An AI that fails to predict its environment is not exhibiting intelligence at all.
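The pencil experiment is just this same prediction running continuously. In the toy monitor below (mine, not the book’s; the one-entry expectations table simply stands in for a learned memory like the sketch above), each new input is checked against what was predicted from the last one, and nothing is reported until the two disagree:

```python
def monitor(predict, sensory_stream):
    """Yield an alert only when an observation contradicts the prediction
    made from the previous one. A toy sketch, not a model of real attention."""
    previous = None
    for observation in sensory_stream:
        if previous is not None:
            expected = predict(previous)
            if expected is not None and expected != observation:
                yield f"surprise: expected {expected!r}, got {observation!r}"
        previous = observation


# Stand-in for learned memory: a pencil left on a desk stays on the desk.
expectations = {"pencil on desk": "pencil on desk"}

# Unchanging input never rises to conscious attention...
print(list(monitor(expectations.get, ["pencil on desk"] * 5)))
# []

# ...but the instant a prediction fails, attention is demanded.
print(list(monitor(expectations.get,
                   ["pencil on desk", "pencil on desk", "pencil rolling"])))
# ["surprise: expected 'pencil on desk', got 'pencil rolling'"]
```

Silence is the normal case; surprise is the exception that gets escalated, which is exactly why the constant stream of feedback predictions goes unnoticed until one of them fails.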
It should be clear now how the Turing test muddies the issue of designing and building intelligent machines: while clever programs can indeed exhibit intelligent behavior, the machine itself is not the source of this intelligence. Think back to the Chinese room for a moment: the person standing outside the room is convinced they’re conversing with an intelligent entity. And indeed, it seems that something, somewhere must understand what is being written in order to *ahem* predict the correct responses. We can say, then, that the room exhibits intelligent behavior. But where is the intelligent method? Where are the predictions coming from? It is not the man inside the room–if his book gives him the wrong instructions, he will follow them mindlessly and never know the difference. Neither is it the book itself–it’s just an inert hunk of paper, and if a page fell out or an ink blot rendered some text illegible it would never “know” the difference either. In fact, it is the book’s author–the programmer, not the program–from whom these predictions originate. The instructions the programmer sets down are the embodiment of their own understanding, and the Chinese room is merely “borrowing” that understanding to produce intelligent-seeming behavior. Without any mechanism to learn from its environment and make its own predictions, the Chinese room (and by analogy, any computer program) is no more intelligent than a wind-up toy is alive. What would happen, for instance, if we asked the person outside the room to start using a made-up word in the conversation? Unless the program included instructions on how to remember and make sense of (in other words, learn and predict) new words, the room would not be able to give correct responses to the new word and would cease to seem intelligent. On the other hand, if the book did contain such instructions, I think we could feel confident in saying that the man inside would indeed have actually learned the new word, and would himself be exhibiting intelligence in its use from then on out.
In addition to clarifying the distinction between intelligent behavior and intelligence per se, Hawkins’ theory also helps clear up the confusion between intelligence and humanity. Qualities such as humor, creativity, and emotion–qualities that the Turing test casually lumps under the umbrella of “intelligence”–are distinct from the act of understanding itself. (In the case of humor and creativity, intelligence may certainly be necessary, but it is by no means sufficient.) Not only that, it makes clear we have nothing to fear from our intelligent creations growing resentful and rising up against us–resentment, jealousy, self-awareness, independence, and even self-preservation are not necessary features of an intelligent machine. In fact, these traits would have to be deliberately designed and built in; they do not come free with intelligence, and the easiest thing to do is simply to leave them out. Just as the robots that clean our houses, serve our coffee and make our cars don’t look anything like we thought they would, so will intelligent machines largely do without human emotions and behaviors.
Not only could this theory be considered a paradigm shift in the field of AI, it also proves to be a valuable tool in a myriad of other fields. Having studied some psychology in college, for instance, I’ve been able to use Hawkins’ predictive theory of intelligence to help make sense of everything from cognitive dissonance and optical illusions to stereotyping, prejudice, and even the phantom limb phenomenon. It has obvious and non-obvious applications in everything from economics and education to philosophy and politics, or even biology–after all, DNA is just another type of memory organisms use to make predictions about their environment, with natural selection as the method of learning. All life exhibits intelligence! Even consciousness itself, perhaps the most infamous unsolved problem in human history, can be understood in terms of making predictions about the self.[2] Once you start thinking of intelligence and understanding as prediction, you start seeing it crop up everywhere both in and around us. It’s such a vastly different approach that I hesitate to call it AI at all–the “artificial” in “artificial intelligence” now seems ironically appropriate, like how a stuffed bird hung by a string could be said to exhibit “artificial flight”. In truth, a machine built using the principle of prediction as its bedrock would exhibit genuine intelligence–of the same kind as our own, though perhaps of a different degree.
These true machine intelligences–call them “MI”–will in theory be capable of everything the human mind can do. But, as is so often the case with new technologies, their most useful applications will be things humans can’t do, or find boring–and the most exciting things we can’t do tend to be things we haven’t even thought of yet. MIs will not be limited to human perception or speed–they will be able to operate on senses and scales as exotic and varied as we can imagine. It would be trivial, for instance, to imagine an MI that gets its input from infrared or radio waves in addition to the visible spectrum–an astronomical machine, perhaps, tirelessly looking for signs of life elsewhere in the universe; or a security camera system watching for intruders. But it is also possible to imagine MIs that process data from microscopic sensors, or from complex weather stations spaced miles apart, or from periodic chemical analysis of water or mineral samples. Even purely abstract or digital data would just be another sense to such machines–everything from utility usage to U.S. census data to tweets and Facebook posts could be intelligently analyzed, and used to make sophisticated predictions about everything from global climate change and the price of bread in developing nations, to which new movie you’re most likely to enjoy or what thermostat setting will save you the most energy. The possibilities–like intelligence itself–will be limited only by our imaginations.
Notes:
[1] Inventor of the Palm Pilot, among other things.
[2] I’ve since written another essay that explores this idea in much more depth.