CHAPTER 11 OBJECTIONS

If a machine can prove indistinguishable from a human, we should award it the respect we would to a human—we should accept that it has a mind.

Stevan Harnad

The most significant source of objection to my thesis on the law of accelerating returns and its application to the amplification of human intelligence stems from the linear nature of human intuition. As I described earlier, each of the several hundred million pattern recognizers in the neocortex processes information sequentially. One of the implications of this organization is that we have linear expectations about the future, so critics apply their linear intuition to information phenomena that are fundamentally exponential.
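
To make the contrast concrete, here is a minimal sketch (with made-up figures, not data from the book) of how a linear extrapolation of an exponentially improving quantity falls progressively further behind:

```python
# A minimal sketch (made-up figures) of how a linear extrapolation of an
# exponentially improving quantity falls progressively further behind.
# Assume a quantity that doubles every two years, starting at 1 unit.

def exponential(years, doubling_time=2.0, start=1.0):
    return start * 2 ** (years / doubling_time)

def linear_guess(years, window=2.0):
    # Extend the most recently observed rate of change as a straight line.
    rate = (exponential(0) - exponential(-window)) / window
    return exponential(0) + rate * years

for y in (5, 10, 20, 30):
    actual, guess = exponential(y), linear_guess(y)
    print(f"year {y:2d}: exponential = {actual:10.1f}, "
          f"linear guess = {guess:5.1f}, shortfall = {actual / guess:8.1f}x")
```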

I call objections along these lines “criticism from incredulity,” in that exponential projections seem incredible given our linear predilection, and they take a variety of forms. Microsoft cofounder Paul Allen (born in 1953) and his colleague Mark Greaves recently articulated several of them in an essay titled “The Singularity Isn’t Near” published in Technology Review magazine.1 While my response here is to Allen’s particular critiques, they represent a typical range of objections to the arguments I’ve made, especially with regard to the brain. Although Allen references The Singularity Is Near in the title of his essay, his only citation in the piece is to an essay I wrote in 2001 (“The Law of Accelerating Returns”). Moreover, his article does not acknowledge or respond to arguments I actually make in the book. Unfortunately, I find this often to be the case with critics of my work.

When The Age of Spiritual Machines was published in 1999, augmented later by the 2001 essay, it generated several lines of criticism, such as: Moore’s law will come to an end; hardware capability may be expanding exponentially but software is stuck in the mud; the brain is too complicated; there are capabilities in the brain that inherently cannot be replicated in software; and several others. One of the reasons I wrote The Singularity Is Near was to respond to those critiques.

I cannot say that Allen and similar critics would necessarily have been convinced by the arguments I made in that book, but at least he and others could have responded to what I actually wrote. Allen argues that “the Law of Accelerating Returns (LOAR)…is not a physical law.” I would point out that most scientific laws are not physical laws, but result from the emergent properties of a large number of events at a lower level. A classic example is the laws of thermodynamics (LOT). If you look at the mathematics underlying the LOT, it models each particle as following a random walk, so by definition we cannot predict where any particular particle will be at any future time. Yet the overall properties of the gas are quite predictable to a high degree of precision, according to the laws of thermodynamics. So it is with the law of accelerating returns: Each technology project and contributor is unpredictable, yet the overall trajectory, as quantified by basic measures of price/performance and capacity, nonetheless follows a remarkably predictable path.
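
As a rough illustration of this analogy (my own toy simulation, not drawn from the book), the following sketch shows that while no individual random walker is predictable, the statistics of the ensemble are:

```python
# A toy simulation (my own illustration, not from the book): each
# "particle" follows an unpredictable random walk, yet aggregate
# statistics of the whole ensemble are highly predictable.
import random
import statistics

random.seed(0)
num_particles = 5_000
num_steps = 400

final_positions = [
    sum(random.choice((-1, 1)) for _ in range(num_steps))
    for _ in range(num_particles)
]

# Any single particle could end up almost anywhere in a wide range...
print("one particle ended at:", final_positions[0])
# ...but the ensemble is lawful: mean near 0, standard deviation near
# sqrt(num_steps) = 20, as random-walk theory predicts.
print("ensemble mean:", round(statistics.fmean(final_positions), 2))
print("ensemble std dev:", round(statistics.stdev(final_positions), 2))
```

The claim about the law of accelerating returns is analogous: individual projects and contributors play the role of the particles, and aggregate price/performance plays the role of the ensemble statistic.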

If computer technology were being pursued by only a handful of researchers, it would indeed be unpredictable. But it’s the product of a sufficiently dynamic system of competitive projects that a basic measure of its price/performance, such as calculations per second per constant dollar, follows a very smooth exponential path, dating back to the 1890 American census as I noted in the previous chapter. While the theoretical basis for the LOAR is presented extensively in The Singularity Is Near, the strongest case for it is made by the extensive empirical evidence that I and others present.

Allen writes that “these ‘laws’ work until they don’t.” Here he is confusing paradigms with the ongoing trajectory of a basic area of information technology. If we were examining, for example, the trend of creating ever smaller vacuum tubes—the paradigm for improving computation in the 1950s—it’s true that it continued until it didn’t. But as the end of this particular paradigm became clear, research pressure grew for the next paradigm. The technology of transistors kept the underlying trend of the exponential growth of price/performance of computation going, and that led to the fifth paradigm (Moore’s law) and the continual compression of features on integrated circuits. There have been regular predictions that Moore’s law will come to an end. The semiconductor industry’s “International Technology Roadmap for Semiconductors” projects seven-nanometer features by the early 2020s.2 At that point key features will be the width of thirty-five carbon atoms, and it will be difficult to continue shrinking them any farther. However, Intel and other chip makers are already taking the first steps toward the sixth paradigm, computing in three dimensions, to continue exponential improvement in price/performance. Intel projects that three-dimensional chips will be mainstream by the teen years; three-dimensional transistors and 3-D memory chips have already been introduced. This sixth paradigm will keep the LOAR going with regard to computer price/performance to a time later in this century when a thousand dollars’ worth of computation will be trillions of times more powerful than the human brain.3 (It appears that Allen and I are at least in agreement on what level of computation is required to functionally simulate the human brain.)4
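
As a quick sanity check on the "thirty-five carbon atoms" figure (using an approximate carbon-atom diameter of 0.2 nanometers, which is an assumption on my part):

```python
# A quick check of the "thirty-five carbon atoms" figure, assuming a
# carbon-atom diameter of roughly 0.2 nanometers (an approximation; the
# exact value depends on how the atomic radius is defined).
feature_size_nm = 7.0      # projected feature size in nanometers
carbon_diameter_nm = 0.2   # approximate diameter of a carbon atom
print(feature_size_nm / carbon_diameter_nm, "carbon atoms wide")  # 35.0
```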

Allen then goes on to give the standard argument that software is not progressing in the same exponential manner as hardware. In The Singularity Is Near I addressed this issue at length, citing different methods of measuring complexity and capability in software that do demonstrate a similar exponential growth.5 One recent study (“Report to the President and Congress, Designing a Digital Future: Federally Funded Research and Development in Networking and Information Technology,” by the President’s Council of Advisors on Science and Technology) states the following:

Even more remarkable—and even less widely understood—is that in many areas, performance gains due to improvements in algorithms have vastly exceeded even the dramatic performance gains due to increased processor speed. The algorithms that we use today for speech recognition, for natural language translation, for chess playing, for logistics planning, have evolved remarkably in the past decade…. Here is just one example, provided by Professor Martin Grötschel of Konrad-Zuse-Zentrum für Informationstechnik Berlin. Grötschel, an expert in optimization, observes that a benchmark production planning model solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later—in 2003—this same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million. Of this, a factor of roughly 1,000 was due to increased processor speed, whereas a factor of roughly 43,000 was due to improvements in algorithms! Grötschel also cites an algorithmic improvement of roughly 30,000 for mixed integer programming between 1991 and 2008. The design and analysis of algorithms, and the study of the inherent computational complexity of problems, are fundamental subfields of computer science.
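
The figures in this passage multiply out as stated; here is the arithmetic, spelled out as a quick check (not part of the report itself):

```python
# The report's figures restated as arithmetic: an 82-year solve time in
# 1988 versus roughly 1 minute in 2003, and the split between hardware
# and algorithmic gains.
minutes_1988 = 82 * 365.25 * 24 * 60   # 82 years expressed in minutes
minutes_2003 = 1
total_speedup = minutes_1988 / minutes_2003
print(f"total speedup: about {total_speedup:,.0f}x")        # ~43 million
hardware_factor = 1_000
algorithm_factor = total_speedup / hardware_factor
print(f"implied algorithmic factor: about {algorithm_factor:,.0f}x")  # ~43,000
```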

Note that the linear programming that Grötschel cites above, which benefited from a 43-million-to-1 improvement in performance, is the mathematical technique used to optimally assign resources in a hierarchical memory system such as the HHMM I discussed earlier. I cite many other examples like this in The Singularity Is Near.6
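
For readers unfamiliar with the technique, here is a minimal, generic linear-programming example with made-up numbers, using SciPy's linprog; it is not Grötschel's benchmark model or Watson's internals, just an illustration of the kind of resource-allocation problem the method solves:

```python
# A minimal, generic linear-programming example (made-up numbers, using
# SciPy's linprog), illustrating the kind of resource-allocation problem
# the technique solves; it is not Grötschel's benchmark model.
from scipy.optimize import linprog

# Maximize 3*x0 + 2*x1 (linprog minimizes, so negate the objective)
# subject to: x0 + x1 <= 10   (total resource budget)
#             2*x0 + x1 <= 15 (a second, shared constraint)
#             x0, x1 >= 0
result = linprog(
    c=[-3, -2],
    A_ub=[[1, 1], [2, 1]],
    b_ub=[10, 15],
    bounds=[(0, None), (0, None)],
    method="highs",
)
print("optimal allocation:", result.x)   # [5., 5.]
print("maximum value:", -result.fun)     # 25.0
```

Grötschel's benchmark involves vastly larger models, but the structure (a linear objective optimized under linear constraints) is the same.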

Regarding AI, Allen is quick to dismiss IBM’s Watson, an opinion shared by many other critics. Many of these detractors don’t know anything about Watson other than the fact that it is software running on a computer (albeit a parallel one with 720 processor cores). Allen writes that systems such as Watson “remain brittle, their performance boundaries are rigidly set by their internal assumptions and defining algorithms, they cannot generalize, and they frequently give nonsensical answers outside of their specific areas.”

First of all, we could make a similar observation about humans. I would also point out that Watson’s “specific areas” include all of Wikipedia plus many other knowledge bases, which hardly constitute a narrow focus. Watson deals with a vast range of human knowledge and is capable of dealing with subtle forms of language, including puns, similes, and metaphors in virtually all fields of human endeavor. It’s not perfect, but neither are humans, and it was good enough to be victorious on Jeopardy! over the best human players.

Allen argues that Watson was assembled by the scientists themselves, building each link of narrow knowledge in specific areas. This is simply not true. Although a few areas of Watson’s data were programmed directly, Watson acquired the significant majority of its knowledge on its own by reading natural-language documents such as Wikipedia. That represents its key strength, as does its ability to understand the convoluted language in Jeopardy! queries (answers in search of a question).

As I mentioned earlier, much of the criticism of Watson is that it works through statistical probabilities rather than “true” understanding. Many readers interpret this to mean that Watson is merely gathering statistics on word sequences. The term “statistical information” in the case of Watson actually refers to distributed coefficients and symbolic connections in self-organizing methods such as hierarchical hidden Markov models. One could just as easily dismiss the distributed neurotransmitter concentrations and redundant connection patterns in the human cortex as “statistical information.” Indeed we resolve ambiguities in much the same way that Watson does—by considering the likelihood of different interpretations of a phrase.
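
A toy example of likelihood-based disambiguation (with invented probabilities, and not a description of Watson's actual pipeline) makes the point:

```python
# A toy illustration (not Watson's actual pipeline) of resolving an
# ambiguous phrase by comparing the likelihood of competing
# interpretations, with made-up probabilities.
interpretations = {
    # P(interpretation) * P(observed clue wording | interpretation)
    "river bank": 0.4 * 0.10,
    "financial bank": 0.6 * 0.55,
}
for name, score in interpretations.items():
    print(f"{name}: unnormalized probability {score:.3f}")
print("chosen interpretation:", max(interpretations, key=interpretations.get))
```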

Allen continues, “Every structure [in the brain] has been precisely shaped by millions of years of evolution to do a particular thing, whatever it might be. It is not like a computer, with billions of identical transistors in regular memory arrays that are controlled by a CPU with a few different elements. In the brain every individual structure and neural circuit has been individually refined by evolution and environmental factors.”

This contention that every structure and neural circuit in the brain is unique and there by design is simply impossible, for it would mean that the blueprint of the brain would require hundreds of trillions of bytes of information. The brain’s structural plan (like that of the rest of the body) is contained in the genome, and the brain itself cannot contain more design information than the genome. Note that epigenetic information (such as the peptides controlling gene expression) does not appreciably add to the amount of information in the genome. Experience and learning do add significantly to the amount of information contained in the brain, but the same can be said of AI systems like Watson. I show in The Singularity Is Near that, after lossless compression (due to massive redundancy in the genome), the amount of design information in the genome is about 50 million bytes, roughly half of which (that is, about 25 million bytes) pertains to the brain.7 That’s not simple, but it is a level of complexity we can deal with and represents less complexity than many software systems in the modern world. Moreover much of the brain’s 25 million bytes of genetic design information pertain to the biological requirements of neurons, not to their information-processing algorithms.
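
Restated as arithmetic (using the standard figure of roughly 3 billion base pairs at 2 bits each for the raw size, and the book's estimate for the compressed size):

```python
# The genome estimate restated as arithmetic. The raw size uses the
# standard figures of roughly 3 billion base pairs at 2 bits each; the
# ~50-million-byte compressed size is the book's estimate of what remains
# after removing the genome's massive redundancy.
base_pairs = 3_000_000_000
raw_bytes = base_pairs * 2 / 8           # 2 bits per base pair
print(f"uncompressed genome: ~{raw_bytes / 1e6:.0f} million bytes")   # ~750
compressed_bytes = 50_000_000            # after lossless compression (book's figure)
brain_bytes = compressed_bytes / 2       # roughly half pertains to the brain
print(f"brain design information: ~{brain_bytes / 1e6:.0f} million bytes")  # ~25
```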

How do we arrive at on the order of 100 to 1,000 trillion connections in the brain from only tens of millions of bytes of design information? Obviously, the answer is through massive redundancy. Dharmendra Modha, manager of Cognitive Computing for IBM Research, writes that “neuroanatomists have not found a hopelessly tangled, arbitrarily connected network, completely idiosyncratic to the brain of each individual, but instead a great deal of repeating structure within an individual brain and a great deal of homology across species…. The astonishing natural reconfigurability gives hope that the core algorithms of neurocomputation are independent of the specific sensory or motor modalities and that much of the observed variation in cortical structure across areas represents a refinement of a canonical circuit; it is indeed this canonical circuit we wish to reverse engineer.”8
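
The ratio behind that redundancy argument is easy to spell out (a back-of-the-envelope sketch using the figures above):

```python
# A back-of-the-envelope ratio behind the redundancy argument: on the
# order of 100 to 1,000 trillion connections specified by only about
# 25 million bytes of brain-related design information.
design_bytes = 25_000_000
for connections in (100e12, 1000e12):
    ratio = connections / design_bytes
    print(f"{connections:.0e} connections from {design_bytes:,} bytes "
          f"= about {ratio:,.0f} connections per byte of design")
```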

Allen argues in favor of an inherent “complexity brake that would necessarily limit progress in understanding the human brain and replicating its capabilities,” based on his notion that each of the approximately 100 to 1,000 trillion connections in the human brain is there by explicit design. His “complexity brake” confuses the forest with the trees. If you want to understand, model, simulate, and re-create a pancreas, you don’t need to re-create or simulate every organelle in every pancreatic islet cell. You would want instead to understand one islet cell, then abstract its basic functionality as it pertains to insulin control, and then extend that to a large group of such cells. This algorithm is well understood with regard to islet cells. Artificial pancreases that utilize this functional model are now being tested. Although there is certainly far more intricacy and variation in the brain than in the massively repeated islet cells of the pancreas, there is nonetheless massive repetition of functions, as I have described repeatedly in this book.

Critiques along the lines of Allen’s also articulate what I call the “scientist’s pessimism.” Researchers working on the next generation of a technology, or on modeling a scientific area, are invariably struggling with that immediate set of challenges, so if someone describes what the technology will look like in ten generations, their eyes glaze over. One of the pioneers of integrated circuits recently recalled for me the struggles to go from 10-micron (10,000 nanometers) feature sizes to 5-micron (5,000 nanometers) features over thirty years ago. The scientists were cautiously confident of reaching this goal, but when people predicted that someday we would actually have circuitry with feature sizes under 1 micron (1,000 nanometers), most of them, focused on their own goal, thought that was too wild to contemplate. Objections were made regarding the fragility of circuitry at that level of precision, thermal effects, and so on. Today Intel is starting to use chips with 22-nanometer gate lengths.

We witnessed the same sort of pessimism with respect to the Human Genome Project. Halfway through the fifteen-year effort, only 1 percent of the genome had been collected, and critics were proposing basic limits on how quickly it could be sequenced without destroying the delicate genetic structures. But thanks to the exponential growth in both capacity and price/performance, the project was finished seven years later. The project to reverse-engineer the human brain is making similar progress. It is only recently, for example, that we have reached a threshold with noninvasive scanning techniques so that we can see individual interneuronal connections forming and firing in real time. Much of the evidence I have presented in this book was dependent on such developments and has only recently been available.

Allen describes my proposal about reverse-engineering the human brain as simply scanning the brain to understand its fine structure and then simulating an entire brain “bottom up” without comprehending its information-processing methods. This is not my proposition. We do need to understand in detail how individual types of neurons work, and then gather information about how functional modules are connected. The functional methods that are derived from this type of analysis can then guide the development of intelligent systems. Basically, we are looking for biologically inspired methods that can accelerate work in AI, much of which has progressed without significant insight as to how the brain performs similar functions. From my own work in speech recognition, I know that our work was greatly accelerated when we gained insights as to how the brain prepares and transforms auditory information.

The way that the massively redundant structures in the brain differentiate is through learning and experience. The current state of the art in AI does in fact enable systems to also learn from their own experience. The Google self-driving cars learn from their own driving experience as well as from data from Google cars driven by human drivers; Watson learned most of its knowledge by reading on its own. It is interesting to note that the methods deployed today in AI have evolved to be mathematically very similar to the mechanisms in the neocortex.

Another objection to the feasibility of “strong AI” (artificial intelligence at human levels and beyond) that is often raised is that the human brain makes extensive use of analog computing, whereas digital methods inherently cannot replicate the gradations of value that analog representations can embody. It is true that one bit is either on or off, but multiple-bit words easily represent multiple gradations and can do so to any desired degree of accuracy. This is, of course, done all the time in digital computers. As it is, the accuracy of analog information in the brain (synaptic strength, for example) is only about one part in 256, a level of precision that can be represented by eight bits.
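
A minimal sketch of the point about gradations, assuming nothing beyond ordinary fixed-point quantization:

```python
# A minimal sketch of the point about gradations: an 8-bit word already
# distinguishes 256 levels, and wider words give any desired precision.
def quantize(value, bits=8, low=0.0, high=1.0):
    """Map an analog value in [low, high] to one of 2**bits discrete levels."""
    levels = 2 ** bits
    index = round((value - low) / (high - low) * (levels - 1))
    return index, index / (levels - 1) * (high - low) + low

for bits in (8, 16, 32):
    index, approx = quantize(0.123456789, bits=bits)
    print(f"{bits:2d} bits ({2**bits:,} levels): "
          f"level {index}, reconstructed value {approx:.9f}")
```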

In chapter 9 I cited Roger Penrose and Stuart Hameroff’s objection, which concerned microtubules and quantum computing. Recall that they claim that the microtubule structures in neurons are doing quantum computing, and since it is not possible to achieve that in computers, the human brain is fundamentally different and presumably better. As I argued earlier, there is no evidence that neuronal microtubules are carrying out quantum computation. Humans in fact do a very poor job of solving the kinds of problems that a quantum computer would excel at (such as factoring large numbers). And if any of this proved to be true, there would be nothing barring quantum computing from also being used in our computers.

John Searle is famous for introducing a thought experiment he calls “the Chinese room,” an argument I discuss in detail in The Singularity Is Near.9 In short, it involves a man who takes in written questions in Chinese and then answers them. In order to do this, he uses an elaborate rulebook. Searle claims that the man has no true understanding of Chinese and is not “conscious” of the language (as he does not understand the questions or the answers) despite his apparent ability to answer questions in Chinese. Searle compares this to a computer and concludes that a computer that could answer questions in Chinese (essentially passing a Chinese Turing test) would, like the man in the Chinese room, have no real understanding of the language and no consciousness of what it was doing.

There are a few philosophical sleights of hand in Searle’s argument. For one thing, the man in this thought experiment is comparable only to the central processing unit (CPU) of a computer. One could say that a CPU has no true understanding of what it is doing, but the CPU is only part of the structure. In Searle’s Chinese room, it is the man together with his rulebook that constitutes the whole system. That system does have an understanding of Chinese; otherwise it would not be capable of convincingly answering questions in Chinese, which would violate Searle’s assumption for this thought experiment.

The attractiveness of Searle’s argument stems from the fact that it is difficult today to infer true understanding and consciousness in a computer program. The problem with his argument, however, is that you can apply his own line of reasoning to the human brain itself. Each neocortical pattern recognizer—indeed, each neuron and each neuronal component—is following an algorithm. (After all, these are molecular mechanisms that follow natural law.) If we conclude that following an algorithm is inconsistent with true understanding and consciousness, then we would have to also conclude that the human brain does not exhibit these qualities either. You can take John Searle’s Chinese room argument and simply substitute “manipulating interneuronal connections and synaptic strengths” for his words “manipulating symbols” and you will have a convincing argument to the effect that human brains cannot truly understand anything.

Another line of argument comes from the nature of nature, which has become a new sacred ground for many observers. For example, New Zealand biologist Michael Denton (born in 1943) sees a profound difference between the design principles of machines and those of biology. Denton writes that natural entities are “self-organizing,…self-referential,…self-replicating,…reciprocal,…self-formative, and…holistic.”10 He claims that such biological forms can only be created through biological processes and that these forms are thereby “immutable,…impenetrable, and…fundamental” realities of existence, and are therefore basically a different philosophical category from machines.

The reality, as we have seen, is that machines can be designed using these same principles. Learning the specific design paradigms of nature’s most intelligent entity—the human brain—is precisely the purpose of the brain reverse-engineering project. It is also not true that biological systems are completely “holistic,” as Denton puts it, nor, conversely, do machines need to be completely modular. We have clearly identified hierarchies of units of functionality in natural systems, especially the brain, and AI systems are using comparable methods.

It appears to me that many critics will not be satisfied until computers routinely pass the Turing test, but even that threshold will not be clear-cut. Undoubtedly, there will be controversy as to whether claimed Turing tests that have been administered are valid. Indeed, I will probably be among those critics disparaging early claims along these lines. By the time the arguments about the validity of a computer passing the Turing test do settle down, computers will have long since surpassed unenhanced human intelligence.

My emphasis here is on the word “unenhanced,” because enhancement is precisely the reason that we are creating these “mind children,” as Hans Moravec calls them.11 Combining human-level pattern recognition with the inherent speed and accuracy of computers will result in very powerful abilities. But this is not an alien invasion of intelligent machines from Mars—we are creating these tools to make ourselves smarter. I believe that most observers will agree with me that this is what is unique about the human species: We build these tools to extend our own reach.
