1. Here is one sentence from One Hundred Years of Solitude by Gabriel García Márquez:
Aureliano Segundo was not aware of the singsong until the following day after breakfast when he felt himself being bothered by a buzzing that was by then more fluid and louder than the sound of the rain, and it was Fernanda, who was walking throughout the house complaining that they had raised her to be a queen only to have her end up as a servant in a madhouse, with a lazy, idolatrous, libertine husband who lay on his back waiting for bread to rain down from heaven while she was straining her kidneys trying to keep afloat a home held together with pins where there was so much to do, so much to bear up under and repair from the time God gave his morning sunlight until it was time to go to bed that when she got there her eyes were full of ground glass, and yet no one ever said to her, “Good morning, Fernanda, did you sleep well?,” nor had they asked her, even out of courtesy, why she was so pale or why she awoke with purple rings under her eyes in spite of the fact that she expected it, of course, from a family that had always considered her a nuisance, an old rag, a booby painted on the wall, and who were always going around saying things against her behind her back, calling her churchmouse, calling her Pharisee, calling her crafty, and even Amaranta, may she rest in peace, had said aloud that she was one of those people who could not tell their rectums from their ashes, God have mercy, such words, and she had tolerated everything with resignation because of the Holy Father, but she had not been able to tolerate it any more when that evil José Arcadio Segundo said that the damnation of the family had come when it opened its doors to a stuck-up highlander, just imagine, a bossy highlander, Lord save us, a highlander daughter of evil spit of the same stripe as the highlanders the government sent to kill workers, you tell me, and he was referring to no one but her, the godchild of the Duke of Alba, a lady of such lineage that she made the liver of presidents’ wives quiver, a noble dame of fine blood like her, who had the right to sign eleven peninsular names and who was the only mortal creature in that town full of bastards who did not feel all confused at the sight of sixteen pieces of silverware, so that her adulterous husband could die of laughter afterward and say that so many knives and forks and spoons were not meant for a human being but for a centipede, and the only one who could tell with her eyes closed when the white wine was served and on what side and in which glass and when the red wine and on what side and in which glass and not like that peasant of an Amaranta, may she rest in peace, who thought that white wine was served in the daytime and red wine at night, and the only one on the whole coast who could take pride in the fact that she took care of her bodily needs only in golden chamberpots, so that Colonel Aureliano Buendía, may he rest in peace, could have the effrontery to ask her with his Masonic ill humor where she had received that privilege and whether she did not shit shit but shat sweet basil, just imagine, with those very words, and so that Renata, her own daughter, who through an oversight had seen her stool in the bedroom, had answered that even if the pot was all gold and with a coat of arms, what was inside was pure shit, physical shit, and worse even than any other kind because it was stuck-up highland shit, just imagine, her own daughter, so that she never had any illusions about the rest of the family, but in any case she had the right to expect a little more 
consideration from her husband because, for better or for worse, he was her consecrated spouse, her helpmate, her legal despoiler, who took upon himself of his own free and sovereign will the grave responsibility of taking her away from her paternal home, where she never wanted for or suffered from anything, where she wove funeral wreaths as a pastime, since her godfather had sent a letter with his signature and the stamp of his ring on the sealing wax simply to say that the hands of his goddaughter were not meant for tasks of this world except to play the clavichord, and, nevertheless, her insane husband had taken her from her home with all manner of admonitions and warnings and had brought her to that frying pan of hell where a person could not breathe because of the heat, and before she had completed her Pentecostal fast he had gone off with his wandering trunks and his wastrel’s accordion to loaf in adultery with a wretch of whom it was only enough to see her behind, well, that’s been said, to see her wiggle her mare’s behind in order to guess that she was a, that she was a, just the opposite of her, who was a lady in a palace or a pigsty, at the table or in bed, a lady of breeding, God-fearing, obeying His laws and submissive to His wishes, and with whom he could not perform, naturally, the acrobatics and trampish antics that he did with the other one, who, of course, was ready for anything, like the French matrons, and even worse, if one considers well, because they at least had the honesty to put a red light at their door, swinishness like that, just imagine, and that was all that was needed by the only and beloved daughter of Doña Renata Argote and Don Fernando del Carpio, and especially the latter, an upright man, a fine Christian, a Knight of the Order of the Holy Sepulcher, those who receive direct from God the privilege of remaining intact in their graves with their skin smooth like the cheeks of a bride and their eyes alive and clear like emeralds.
2. See the graph “Growth in Genbank DNA Sequence Data” in chapter 10.
3. Cheng Zhang and Jianpeng Ma, “Enhanced Sampling and Applications in Protein Folding in Explicit Solvent,” Journal of Chemical Physics 132, no. 24 (2010): 244101. See also http://folding.stanford.edu/English/About for the Folding@home project, which has harnessed over five million computers around the world to simulate protein folding.
4. For a more complete description of this argument, see the section “[The Impact…] on the Intelligent Destiny of the Cosmos: Why We Are Probably Alone in the Universe” in chapter 6 of The Singularity Is Near by Ray Kurzweil (New York: Viking, 2005).
5. Sandra Ackerman, Discovering the Brain (Washington, DC: National Academies Press, 1992).
6. Sebastian Seung, Connectome: How the Brain’s Wiring Makes Us Who We Are (New York: Houghton Mifflin Harcourt, 2012).
7. “Mandelbrot Zoom,” http://www.youtube.com/watch?v=gEw8xpb1aRA; “Fractal Zoom Mandelbrot Corner,” http://www.youtube.com/watch?v=G_GBwuYuOOs.
1. Charles Darwin, The Origin of Species (P. F. Collier & Son, 1909), 185/95–96.
2. Darwin, On the Origin of Species, 751 (206.1.1–6), in Morse Peckham, ed., The Origin of Species by Charles Darwin: A Variorum Text (Philadelphia: University of Pennsylvania Press, 1959).
3. R. Dahm, “Discovering DNA: Friedrich Miescher and the Early Years of Nucleic Acid Research,” Human Genetics 122, no. 6 (2008): 565–81, doi:10.1007/s00439-007-0433-0; PMID 17901982.
4. Valery N. Soyfer, “The Consequences of Political Dictatorship for Russian Science,” Nature Reviews Genetics 2, no. 9 (2001): 723–29, doi:10.1038/35088598; PMID 11533721.
5. J. D. Watson and F. H. C. Crick, “A Structure for Deoxyribose Nucleic Acid,” Nature 171 (1953): 737–38, http://www.nature.com/nature/dna50/watsoncrick.pdf and “Double Helix: 50 Years of DNA,” Nature archive, http://www.nature.com/nature/dna50/archive.xhtml.
6. Franklin died in 1958, and the Nobel Prize for the discovery of the structure of DNA was awarded in 1962. There is controversy as to whether she would have shared in that prize had she been alive in 1962.
7. Albert Einstein, “On the Electrodynamics of Moving Bodies” (1905). This paper established the special theory of relativity. See Robert Bruce Lindsay and Henry Margenau, Foundations of Physics (Woodbridge, CT: Ox Bow Press, 1981), 330.
8. “Crookes radiometer,” Wikipedia, http://en.wikipedia.org/wiki/Crookes_radiometer.
9. Note that some of the momentum of the photons is transferred to the air molecules in the bulb (since it is not a perfect vacuum) and then transferred from the heated air molecules to the vane.
10. Albert Einstein, “Does the Inertia of a Body Depend Upon Its Energy Content?” (1905). This paper established Einstein’s famous formula E = mc².
11. “Albert Einstein’s Letters to President Franklin Delano Roosevelt,” http://hypertextbook.com/eworld/einstein.shtml.
1. Some nonmammals, such as crows, parrots, and octopuses, are reported to be capable of some level of reasoning; however, this is limited and has not been sufficient to create tools that have their own evolutionary course of development. These animals may have adapted other brain regions to perform a small number of levels of hierarchical thinking, but a neocortex is required for the relatively unrestricted hierarchical thinking that humans can perform.
2. V. B. Mountcastle, “An Organizing Principle for Cerebral Function: The Unit Model and the Distributed System” (1978), in Gerald M. Edelman and Vernon B. Mountcastle, The Mindful Brain: Cortical Organization and the Group-Selective Theory of Higher Brain Function (Cambridge, MA: MIT Press, 1982).
3. Herbert A. Simon, “The Organization of Complex Systems,” in Howard H. Pattee, ed., Hierarchy Theory: The Challenge of Complex Systems (New York: George Braziller, Inc., 1973), http://blog.santafe.edu/wp-content/uploads/2009/03/simon1973.pdf.
4. Marc D. Hauser, Noam Chomsky, and W. Tecumseh Fitch, “The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?” Science 298 (November 2002): 1569–79, http://www.sciencemag.org/content/298/5598/1569.short.
5. The following passage from the book Transcend: Nine Steps to Living Well Forever, by Ray Kurzweil and Terry Grossman (New York: Rodale, 2009), describes this lucid dreaming technique in more detail:
I’ve developed a method of solving problems while I sleep. I’ve perfected it for myself over several decades and have learned the subtle means by which this is likely to work better.
I start out by assigning myself a problem when I get into bed. This can be any kind of problem. It could be a math problem, an issue with one of my inventions, a business strategy question, or even an interpersonal problem.
I’ll think about the problem for a few minutes, but I try not to solve it. That would just cut off the creative problem solving to come. I do try to think about it. What do I know about this? What form could a solution take? And then I go to sleep. Doing this primes my subconscious mind to work on the problem.
Terry: Sigmund Freud pointed out that when we dream, many of the censors in our brain are relaxed, so that we might dream about things that are socially, culturally, or even sexually taboo. We can dream about weird things that we wouldn’t allow ourselves to think about during the day. That’s at least one reason why dreams are strange.
Ray: There are also professional blinders that prevent people from thinking creatively, many of which come from our professional training, mental blocks such as “you can’t solve a signal processing problem that way” or “linguistics is not supposed to use those rules.” These mental assumptions are also relaxed in our dream state, so I’ll dream about new ways of solving problems without being burdened by these daytime constraints.
Terry: There’s another part of our brain also not working when we dream, our rational faculties to evaluate whether an idea is reasonable. So that’s another reason that weird or fantastic things happen in our dreams. When the elephant walks through the wall, we aren’t shocked as to how the elephant could do this. We just say to our dream selves, “Okay, an elephant walked through the wall, no big deal.” Indeed, if I wake up in the middle of the night, I often find that I’ve been dreaming in strange and oblique ways about the problem that I assigned myself.
Ray: The next step occurs in the morning in the halfway state between dreaming and being awake, which is often called lucid dreaming. In this state, I still have the feelings and imagery from my dreams, but now I do have my rational faculties. I realize, for example, that I am in a bed. And I could formulate the rational thought that I have a lot to do so I had better get out of bed. But that would be a mistake. Whenever I can, I will stay in bed and continue in this lucid dream state because that is key to this creative problem-solving method. By the way, this doesn’t work if the alarm rings.
Reader: Sounds like the best of both worlds.
Ray: Exactly. I still have access to the dream thoughts about the problem I assigned myself the night before. But now I’m sufficiently conscious and rational to evaluate the new creative ideas that came to me during the night. I can determine which ones make sense. After perhaps 20 minutes of this, I invariably will have keen new insights into the problem.
I’ve come up with inventions this way (and spent the rest of the day writing a patent application), figured out how to organize material for a book such as this, and come up with useful ideas for a diverse set of problems. If I have a key decision to make, I will always go through this process, after which I am likely to have real confidence in my decision.
The key to the process is to let your mind go, to be nonjudgmental, and not to worry about how well the method is working. It is the opposite of a mental discipline. Think about the problem, but then let ideas wash over you as you fall asleep. Then in the morning, let your mind go again as you review the strange ideas that your dreams generated. I have found this to be an invaluable method for harnessing the natural creativity of my dreams.
Reader: Well, for the workaholics among us, we can now work in our dreams. Not sure my spouse is going to appreciate this.
Ray: Actually, you can think of it as getting your dreams to do your work for you.
1. Steven Pinker, How the Mind Works (New York: Norton, 1997), 152–53.
2. D. O. Hebb, The Organization of Behavior (New York: John Wiley & Sons, 1949).
3. Henry Markram and Rodrigo Perin, “Innate Neural Assemblies for Lego Memory,” Frontiers in Neural Circuits 5, no. 6 (2011).
4. E-mail communication from Henry Markram, February 19, 2012.
5. Van J. Wedeen et al., “The Geometric Structure of the Brain Fiber Pathways,” Science 335, no. 6076 (March 30, 2012).
6. Tai Sing Lee, “Computations in the Early Visual Cortex,” Journal of Physiology—Paris 97 (2003): 121–39.
7. A list of papers can be found at http://cbcl.mit.edu/people/poggio/tpcv_short_pubs.pdf.
8. Daniel J. Felleman and David C. Van Essen, “Distributed Hierarchical Processing in the Primate Cerebral Cortex,” Cerebral Cortex 1, no. 1 (January/February 1991): 1–47. A compelling analysis of the Bayesian mathematics of the top-down and bottom-up communication in the neocortex is provided by Tai Sing Lee in “Hierarchical Bayesian Inference in the Visual Cortex,” Journal of the Optical Society of America A 20, no. 7 (July 2003): 1434–48.
9. Uri Hasson et al., “A Hierarchy of Temporal Receptive Windows in Human Cortex,” Journal of Neuroscience 28, no. 10 (March 5, 2008): 2539–50.
10. Marina Bedny et al., “Language Processing in the Occipital Cortex of Congenitally Blind Adults,” Proceedings of the National Academy of Sciences 108, no. 11 (March 15, 2011): 4429–34.
11. Daniel E. Feldman, “Synaptic Mechanisms for Plasticity in Neocortex,” Annual Review of Neuroscience 32 (2009): 33–55.
12. Aaron C. Koralek et al., “Corticostriatal Plasticity Is Necessary for Learning Intentional Neuroprosthetic Skills,” Nature 483 (March 15, 2012): 331–35.
13. E-mail communication from Randal Koene, January 2012.
14. Min Fu, Xinzhu Yu, Ju Lu, and Yi Zuo, “Repetitive Motor Learning Induces Coordinated Formation of Clustered Dendritic Spines in Vivo,” Nature 483 (March 1, 2012): 92–95.
15. Dario Bonanomi et al., “Ret Is a Multifunctional Coreceptor That Integrates Diffusible- and Contact-Axon Guidance Signals,” Cell 148, no. 3 (February 2012): 568–82.
16. See endnote 7 in chapter 11.
1. Vernon B. Mountcastle, “The View from Within: Pathways to the Study of Perception,” Johns Hopkins Medical Journal 136 (1975): 109–31.
2. B. Roska and F. Werblin, “Vertical Interactions Across Ten Parallel, Stacked Representations in the Mammalian Retina,” Nature 410, no. 6828 (March 29, 2001): 583–87; “Eye Strips Images of All but Bare Essentials Before Sending Visual Information to Brain, UC Berkeley Research Shows,” University of California at Berkeley news release, March 28, 2001, www.berkeley.edu/news/media/releases/2001/03/28_wers1.xhtml.
3. Lloyd Watts, “Reverse-Engineering the Human Auditory Pathway,” in J. Liu et al., eds., WCCI 2012 (Berlin: Springer-Verlag, 2012), 47–59. Lloyd Watts, “Real-Time, High-Resolution Simulation of the Auditory Pathway, with Application to Cell-Phone Noise Reduction,” ISCAS (June 2, 2010): 3821–24. For other papers see http://www.lloydwatts.com/publications.xhtml.
4. See Sandra Blakeslee, “Humanity? Maybe It’s All in the Wiring,” New York Times, December 9, 2003, http://www.nytimes.com/2003/12/09/science/09BRAI.xhtml.
5. T. E. J. Behrens et al., “Non-Invasive Mapping of Connections between Human Thalamus and Cortex Using Diffusion Imaging,” Nature Neuroscience 6, no. 7 (July 2003): 750–57.
6. Timothy J. Buschman et al., “Neural Substrates of Cognitive Capacity Limitations,” Proceedings of the National Academy of Sciences 108, no. 27 (July 5, 2011): 11252–55, http://www.pnas.org/content/108/27/11252.long.
7. Theodore W. Berger et al., “A Cortical Neural Prosthesis for Restoring and Enhancing Memory,” Journal of Neural Engineering 8, no. 4 (August 2011).
8. Basis functions are nonlinear functions that can be combined linearly (by adding together multiple weighted basis functions) to approximate any nonlinear function. A. Pouget and L. H. Snyder, “Computational Approaches to Sensorimotor Transformations,” Nature Neuroscience 3, no. 11 Supplement (November 2000): 1192–98.
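In symbols (a standard formulation, stated here for convenience rather than drawn from the cited paper): a target function f is approximated as

f(x) ≈ w₁φ₁(x) + w₂φ₂(x) + … + wₙφₙ(x),

where the φᵢ are the fixed nonlinear basis functions and the wᵢ are the linear weights that are adjusted to fit the target.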
9. J. R. Bloedel, “Functional Heterogeneity with Structural Homogeneity: How Does the Cerebellum Operate?” Behavioral and Brain Sciences 15, no. 4 (1992): 666–78.
10. S. Grossberg and R. W. Paine, “A Neural Model of Cortico-Cerebellar Interactions during Attentive Imitation and Predictive Learning of Sequential Handwriting Movements,” Neural Networks 13, no. 8–9 (October–November 2000): 999–1046.
11. Javier F. Medina and Michael D. Mauk, “Computer Simulation of Cerebellar Information Processing,” Nature Neuroscience 3 (November 2000): 1205–11.
12. James Olds, “Pleasure Centers in the Brain,” Scientific American (October 1956): 105–16. Aryeh Routtenberg, “The Reward System of the Brain,” Scientific American 239 (November 1978): 154–64. K. C. Berridge and M. L. Kringelbach, “Affective Neuroscience of Pleasure: Reward in Humans and Other Animals,” Psychopharmacology 199 (2008): 457–80. Morten L. Kringelbach, The Pleasure Center: Trust Your Animal Instincts (New York: Oxford University Press, 2009). Michael R. Liebowitz, The Chemistry of Love (Boston: Little, Brown, 1983). W. L. Witters and P. Jones-Witters, Human Sexuality: A Biological Perspective (New York: Van Nostrand, 1980).
1. Michael Nielsen, Reinventing Discovery: The New Era of Networked Science (Princeton, NJ: Princeton University Press, 2012), 1–3. T. Gowers and M. Nielsen, “Massively Collaborative Mathematics,” Nature 461, no. 7266 (2009): 879–81. “A Combinatorial Approach to Density Hales-Jewett,” Gowers’s Weblog, http://gowers.wordpress.com/2009/02/01/a-combinatorial-approach-to-density-hales-jewett/. Michael Nielsen, “The Polymath Project: Scope of Participation,” March 20, 2009, http://michaelnielsen.org/blog/?p=584. Julie Rehmeyer, “SIAM: Massively Collaborative Mathematics,” Society for Industrial and Applied Mathematics, April 1, 2010, http://www.siam.org/news/news.php?id=1731.
2. P. Dayan and Q. J. M. Huys, “Serotonin, Inhibition, and Negative Mood,” PLoS Computational Biology 4, no. 1 (2008), http://compbiol.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pcbi.0040004.
1. Gary Cziko, Without Miracles: Universal Selection Theory and the Second Darwinian Revolution (Cambridge, MA: MIT Press, 1995).
2. David Dalrymple has been a mentee of mine since he was eight years old in 1999. You can read his background here: http://esp.mit.edu/learn/teachers/davidad/bio.xhtml, and http://www.brainsciences.org/Research-Team/mr-david-dalrymple.xhtml.
3. Jonathan Fildes, “Artificial Brain ‘10 Years Away,’” BBC News, July 22, 2009, http://news.bbc.co.uk/2/hi/8164060.stm. See also the video “Henry Markram on Simulating the Brain: The Next Decisive Years,” http://www.kurzweilai.net/henry-markram-simulating-the-brain-next-decisive-years.
4. M. Mitchell Waldrop, “Computer Modelling: Brain in a Box,” Nature News, February 22, 2012, http://www.nature.com/news/computer-modelling-brain-in-a-box-1.10066.
5. Jonah Lehrer, “Can a Thinking, Remembering, Decision-Making Biologically Accurate Brain Be Built from a Supercomputer?” Seed, http://seedmagazine.com/content/article/out_of_the_blue/.
6. Fildes, “Artificial Brain ‘10 Years Away.’”
7. See http://www.humanconnectomeproject.org/.
8. Anders Sandberg and Nick Bostrom, Whole Brain Emulation: A Roadmap, Technical Report #2008–3 (2008), Future of Humanity Institute, Oxford University, www.fhi.ox.ac.uk/reports/2008-3.pdf.
9. Here is the basic schema for a neural net algorithm. Many variations are possible, and the designer of the system needs to provide certain critical parameters and methods, detailed on the following pages.
Creating a neural net solution to a problem involves the following steps:
Define the input.
Define the topology of the neural net (i.e., the layers of neurons and the connections between the neurons).
Train the neural net on examples of the problem.
Run the trained neural net to solve new examples of the problem.
Take your neural net company public.
These steps (except for the last one) are detailed below:
The Problem Input
The problem input to the neural net consists of a series of numbers. This input can be:
In a visual pattern recognition system, a two-dimensional array of numbers representing the pixels of an image; or
In an auditory (e.g., speech) recognition system, a two-dimensional array of numbers representing a sound, in which the first dimension represents parameters of the sound (e.g., frequency components) and the second dimension represents different points in time; or
In an arbitrary pattern recognition system, an n-dimensional array of numbers representing the input pattern.
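For concreteness, here is a tiny illustrative example of the first case; the 2 × 4 image and the name problem_input are assumptions made for the sketches that follow, not values from the text:

```python
# A two-dimensional array of numbers representing the pixels of an image
image_input = [[0, 1, 1, 0],
               [1, 0, 0, 1]]

# Flattened into the series of numbers ("points") that the first-layer
# neurons will "connect" to
problem_input = [pixel for row in image_input for pixel in row]
```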
Defining the Topology
To set up the neural net, the architecture of each neuron consists of:
Multiple inputs in which each input is “connected” to either the output of another neuron or one of the input numbers.
Generally, a single output, which is connected to either the input of another neuron (which is usually in a higher layer) or the final output.
Set Up the First Layer of Neurons
Create N₀ neurons in the first layer. For each of these neurons, “connect” each of the multiple inputs of the neuron to “points” (i.e., numbers) in the problem input. These connections can be determined randomly or using an evolutionary algorithm (see below).
Assign an initial “synaptic strength” to each connection created. These weights can start out all the same, can be assigned randomly, or can be determined in another way (see below).
Set Up the Additional Layers of Neurons
Set up a total of M layers of neurons. For each layer, set up the neurons in that layer.
For layer i:
Create Nᵢ neurons in layer i. For each of these neurons, “connect” each of the multiple inputs of the neuron to the outputs of the neurons in layer i−1 (see variations below).
Assign an initial “synaptic strength” to each connection created. These weights can start out all the same, can be assigned randomly, or can be determined in another way (see below).
The outputs of the neurons in layer M are the outputs of the neural net (see variations below).
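To make the setup concrete, here is a minimal Python sketch of the topology step, assuming random wiring and random initial weights (two of the options discussed under “Key Design Decisions” below); build_net, the weight range, and the fixed threshold of 0.5 are illustrative choices, not prescriptions from the text:

```python
import random

def build_net(num_inputs, layer_sizes, inputs_per_neuron, seed=0):
    """Randomly wire a layered net following the schema above.

    Each neuron records the indices of its sources (points in the
    problem input for the first layer, neurons of the previous layer
    otherwise), one synaptic strength per connection, and a firing
    threshold.
    """
    rng = random.Random(seed)
    net = []
    prev_size = num_inputs
    for size in layer_sizes:  # N0, N1, ... neurons per layer, in the text's notation
        layer = []
        for _ in range(size):
            sources = [rng.randrange(prev_size) for _ in range(inputs_per_neuron)]
            weights = [rng.uniform(-1.0, 1.0) for _ in sources]
            layer.append({"sources": sources, "weights": weights, "threshold": 0.5})
        net.append(layer)
        prev_size = size
    return net
```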
The Recognition Trials
How Each Neuron Works
Once the neuron is set up, it does the following for each recognition trial:
Each weighted input to the neuron is computed by multiplying the output of the other neuron (or initial input) that the input to this neuron is connected to by the synaptic strength of that connection.
All of these weighted inputs to the neuron are summed.
If this sum is greater than the firing threshold of this neuron, then this neuron is considered to fire and its output is 1. Otherwise, its output is 0 (see variations below).
Do the Following for Each Recognition Trial
For each layer, from layer 0 to layer M:
For each neuron in the layer:
Sum its weighted inputs (each weighted input = the output of the other neuron [or initial input] that the input to this neuron is connected to, multiplied by the synaptic strength of that connection).
If this sum of weighted inputs is greater than the firing threshold for this neuron, set the output of this neuron = 1, otherwise set it to 0.
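A minimal sketch of one synchronous recognition trial, using the hypothetical build_net structure from the sketch above; the all-or-nothing firing rule is exactly the thresholded sum just described:

```python
def run_trial(net, problem_input):
    """Compute each layer in turn, layer 0 through layer M."""
    values = list(problem_input)  # outputs feeding the current layer
    for layer in net:
        outputs = []
        for neuron in layer:
            weighted_sum = sum(values[i] * w
                               for i, w in zip(neuron["sources"], neuron["weights"]))
            # Fire (output 1) only if the weighted sum exceeds the threshold
            outputs.append(1 if weighted_sum > neuron["threshold"] else 0)
        values = outputs
    return values  # the outputs of layer M are the outputs of the net
```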
To Train the Neural Net
Run repeated recognition trials on sample problems.
After each trial, adjust the synaptic strengths of all the interneuronal connections to improve the performance of the neural net on this trial (see the discussion below on how to do this).
Continue this training until the accuracy rate of the neural net is no longer improving (i.e., reaches an asymptote).
Key Design Decisions
In the simple schema above, the designer of this neural net algorithm needs to determine at the outset:
What the input numbers represent.
The number of layers of neurons.
The number of neurons in each layer. (Each layer does not necessarily need to have the same number of neurons.)
The number of inputs to each neuron in each layer. The number of inputs (i.e., interneuronal connections) can also vary from neuron to neuron and from layer to layer.
The actual “wiring” (i.e., the connections). For each neuron in each layer, this consists of a list of other neurons, the outputs of which constitute the inputs to this neuron. This represents a key design area. There are a number of possible ways to do this:
(1) Wire the neural net randomly; or
(2) Use an evolutionary algorithm (see below) to determine an optimal wiring; or
(3) Use the system designer’s best judgment in determining the wiring.
The initial synaptic strengths (i.e., weights) of each connection. There are a number of possible ways to do this:
(1) Set the synaptic strengths to the same value; or
(2) Set the synaptic strengths to different random values; or
(3) Use an evolutionary algorithm to determine an optimal set of initial values; or
(4) Use the system designer’s best judgment in determining the initial values.
The firing threshold of each neuron.
Determine the output. The output can be:
(1) the outputs of the neurons in layer M; or
(2) the output of a single output neuron, the inputs of which are the outputs of the neurons in layer M; or
(3) a function of (e.g., a sum of) the outputs of the neurons in layer M; or
(4) another function of neuron outputs in multiple layers.
Determine how the synaptic strengths of all the connections are adjusted during the training of this neural net. This is a key design decision and is the subject of a great deal of research and discussion. There are a number of possible ways to do this:
(1) For each recognition trial, increment or decrement each synaptic strength by a (generally small) fixed amount so that the neural net’s output more closely matches the correct answer. One way to do this is to try both incrementing and decrementing and see which has the more desirable effect. This can be time-consuming, so other methods exist for making local decisions on whether to increment or decrement each synaptic strength.
(2) Other statistical methods exist for modifying the synaptic strengths after each recognition trial so that the performance of the neural net on that trial more closely matches the correct answer.
Note that neural net training will work even if the answers to the training trials are not all correct. This allows using real-world training data that may have an inherent error rate. One key to the success of a neural net–based recognition system is the amount of data used for training. Usually a very substantial amount is needed to obtain satisfactory results. As with human students, the amount of time that a neural net spends learning its lessons is a key factor in its performance.
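As one concrete reading of method (1) above, the following hedged sketch tries a small increment and then a small decrement for each synaptic strength, keeping whichever change reduces the output error on the current sample; train, delta, and the squared-error measure are illustrative choices (and, as noted above, more efficient local methods exist):

```python
def train(net, samples, delta=0.01, max_epochs=100):
    """Adjust synaptic strengths by small fixed increments or decrements
    (method (1) above), stopping once no adjustment helps (an asymptote).
    Each sample is a pair (problem_input, correct_output)."""
    def error(sample):
        inputs, target = sample
        return sum((o - t) ** 2 for o, t in zip(run_trial(net, inputs), target))

    for _ in range(max_epochs):
        improved = False
        for sample in samples:
            for layer in net:
                for neuron in layer:
                    for k in range(len(neuron["weights"])):
                        base = error(sample)
                        for step in (delta, -2 * delta):  # try +delta, then -delta
                            neuron["weights"][k] += step
                            if error(sample) < base:
                                improved = True
                                break
                        else:
                            neuron["weights"][k] += delta  # neither helped: restore
        if not improved:
            break
```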
Variations
Many variations of the above are feasible. For example:
There are different ways of determining the topology. In particular, the interneuronal wiring can be set either randomly or using an evolutionary algorithm.
There are different ways of setting the initial synaptic strengths.
The inputs to the neurons in layer i do not necessarily need to come from the outputs of the neurons in layer i−1. Alternatively, the inputs to the neurons in each layer can come from any lower layer or any layer.
There are different ways to determine the final output.
The method described above results in an “all or nothing” (1 or 0) firing, which is one kind of nonlinearity. Other nonlinear functions can be used. Commonly a function is used that goes from 0 to 1 in a rapid but more gradual fashion, such as a sigmoid (see the sketch after this list). Also, the outputs can be numbers other than 0 and 1.
The different methods for adjusting the synaptic strengths during training represent key design decisions.
The above schema describes a “synchronous” neural net, in which each recognition trial proceeds by computing the outputs of each layer, starting with layer 0 and proceeding through layer M. In a true parallel system, in which each neuron is operating independently of the others, the neurons can operate “asynchronously” (i.e., independently). In an asynchronous approach, each neuron is constantly scanning its inputs and fires whenever the sum of its weighted inputs exceeds its threshold (or whatever its output function specifies).
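As an example of the smoother nonlinearity mentioned in the list above, a logistic (sigmoid) output function can replace the hard threshold used in the earlier run_trial sketch; the steepness parameter is an illustrative tuning knob, not something specified in the text:

```python
import math

def sigmoid_output(weighted_sum, threshold, steepness=10.0):
    """A graded alternative to all-or-nothing firing: the output rises
    from 0 to 1 rapidly but smoothly around the firing threshold."""
    return 1.0 / (1.0 + math.exp(-steepness * (weighted_sum - threshold)))
```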
10. Robert Mannell, “Acoustic Representations of Speech,” 2008, http://clas.mq.edu.au/acoustics/frequency/acoustic_speech.xhtml.
11. Here is the basic schema for a genetic (evolutionary) algorithm. Many variations are possible, and the designer of the system needs to provide certain critical parameters and methods, detailed below.
The Evolutionary Algorithm
Create N solution “creatures.” Each one has:
A genetic code: a sequence of numbers that characterize a possible solution to the problem. The numbers can represent critical parameters, steps to a solution, rules, etc.
For each generation of evolution, do the following:
Do the following for each of the N solution creatures:
Apply this solution creature’s solution (as represented by its genetic code) to the problem or simulated environment. Rate the solution.
Pick the L solution creatures with the highest ratings to survive into the next generation.
Eliminate the (N – L) nonsurviving solution creatures.
Create (N – L) new solution creatures from the L surviving solution creatures by:
(1) Making copies of the L surviving creatures. Introduce small random variations into each copy; or
(2) Create additional solution creatures by combining parts of the genetic code (using “sexual” reproduction, or otherwise combining portions of the chromosomes) from the L surviving creatures; or
(3) Do a combination of (1) and (2).
Determine whether or not to continue evolving:
Improvement = (highest rating in this generation) – (highest rating in the previous generation).
If Improvement < Improvement Threshold then we’re done.
The solution creature with the highest rating from the last generation of evolution has the best solution. Apply the solution defined by its genetic code to the problem.
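A minimal sketch of this schema, assuming each genetic code is an equal-length list of numbers; evolve, rate, random_creature, and the mutation scale are illustrative names and parameters, and new creatures are produced by recombination followed by small random variation, in the spirit of option (3) above:

```python
import random

def evolve(rate, random_creature, n=100, l=20, mutation=0.05,
           improvement_threshold=1e-6, max_generations=1000):
    """Evolutionary algorithm sketch (n = N, l = L in the schema above).
    rate() scores a genetic code; random_creature() returns a
    first-generation genetic code (a list of numbers)."""
    rng = random.Random(0)
    population = [random_creature() for _ in range(n)]
    best_prev = float("-inf")
    for _ in range(max_generations):
        population.sort(key=rate, reverse=True)
        survivors = population[:l]            # the L highest-rated survive
        best = rate(survivors[0])
        if best - best_prev < improvement_threshold:
            break                             # improvement below threshold: done
        best_prev = best
        children = []
        while len(children) < n - l:          # create (N - L) new creatures
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(a))    # combine parts of two parents' codes
            child = [g + rng.gauss(0, mutation)   # small random variations
                     for g in a[:cut] + b[cut:]]
            children.append(child)
        population = survivors + children
    return max(population, key=rate)
```

For example, evolve(rate=lambda code: -sum((g - 3) ** 2 for g in code), random_creature=lambda: [random.uniform(-10, 10) for _ in range(5)]) drives the population toward a code of all 3s.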
Key Design Decisions
In the simple schema above, the designer needs to determine at the outset:
Key parameters:
N
L
Improvement threshold.
What the numbers in the genetic code represent and how the solution is computed from the genetic code.
A method for determining the N solution creatures in the first generation. In general, these need only be “reasonable” attempts at a solution. If these first-generation solutions are too far afield, the evolutionary algorithm may have difficulty converging on a good solution. It is often worthwhile to create the initial solution creatures in such a way that they are reasonably diverse. This will help prevent the evolutionary process from just finding a “locally” optimal solution.
How the solutions are rated.
How the surviving solution creatures reproduce.
Variations
Many variations of the above are feasible. For example:
There does not need to be a fixed number of surviving solution creatures (L) from each generation. The survival rule(s) can allow for a variable number of survivors.
There does not need to be a fixed number of new solution creatures created in each generation (N – L). The procreation rules can be independent of the size of the population. Procreation can be related to survival, thereby allowing the fittest solution creatures to procreate the most.
The decision as to whether or not to continue evolving can be varied. It can consider more than just the highest-rated solution creature from the most recent generation(s). It can also consider a trend that goes beyond just the last two generations.
12. Dileep George, “How the Brain Might Work: A Hierarchical and Temporal Model for Learning and Recognition” (PhD dissertation, Stanford University, June 2008).
13. A. M. Turing, “Computing Machinery and Intelligence,” Mind 59, no. 236 (October 1950): 433–60.
14. Hugh Loebner runs the annual “Loebner Prize” competition. The Loebner silver medal will go to a computer that passes Turing’s original text-only test. The gold medal will go to a computer that can pass a version of the test that includes audio and video input and output. In my view, the inclusion of audio and video does not actually make the test more challenging.
15. “Cognitive Assistant That Learns and Organizes,” Artificial Intelligence Center, SRI International, http://www.ai.sri.com/project/CALO.
16. Dragon Go! Nuance Communications, Inc., http://www.nuance.com/products/dragon-go-in-action/index.htm.
17. “Overcoming Artificial Stupidity,” WolframAlpha Blog, April 17, 2012, http://blog.wolframalpha.com/author/stephenwolfram/.
1. Salomon Bochner, A Biographical Memoir of John von Neumann (Washington, DC: National Academy of Sciences, 1958).
2. A. M. Turing, “On Computable Numbers, with an Application to the Entscheidungsproblem,” Proceedings of the London Mathematical Society Series 2, vol. 42 (1936–37): 230–65, http://www.comlab.ox.ac.uk/activities/ieg/e-library/sources/tp2-ie.pdf. A. M. Turing, “On Computable Numbers, with an Application to the Entscheidungsproblem: A Correction,” Proceedings of the London Mathematical Society 43 (1938): 544–46.
3. John von Neumann, “First Draft of a Report on the EDVAC,” Moore School of Electrical Engineering, University of Pennsylvania, June 30, 1945. Claude E. Shannon, “A Mathematical Theory of Communication,” Bell System Technical Journal, July and October 1948.
4. Jeremy Bernstein, The Analytical Engine: Computers—Past, Present, and Future, rev. ed. (New York: William Morrow & Co., 1981).
5. “Japan’s K Computer Tops 10 Petaflop/s to Stay Atop TOP500 List,” Top 500, November 11, 2011, http://top500.org/lists/2011/11/press-release.
6. Carver Mead, Analog VLSI and Neural Systems (Reading, MA: Addison-Wesley, 1989).
7. “IBM Unveils Cognitive Computing Chips,” IBM news release, August 18, 2011, http://www-03.ibm.com/press/us/en/pressrelease/35251.wss.
8. “Japan’s K Computer Tops 10 Petaflop/s to Stay Atop TOP500 List.”
1. John R. Searle, “I Married a Computer,” in Jay W. Richards, ed., Are We Spiritual Machines? Ray Kurzweil vs. the Critics of Strong AI (Seattle: Discovery Institute, 2002).
2. Stuart Hameroff, Ultimate Computing: Biomolecular Consciousness and Nanotechnology (Amsterdam: Elsevier Science, 1987).
3. P. S. Sebel et al., “The Incidence of Awareness during Anesthesia: A Multicenter United States Study,” Anesthesia and Analgesia 99 (2004): 833–39.
4. Stuart Sutherland, The International Dictionary of Psychology (New York: Macmillan, 1990).
5. David Cockburn, “Human Beings and Giant Squids,” Philosophy 69, no. 268 (April 1994): 135–50.
6. Ivan Petrovich Pavlov, from a lecture given in 1913, published in Lectures on Conditioned Reflexes: Twenty-Five Years of Objective Study of the Higher Nervous Activity [Behavior] of Animals (London: Martin Lawrence, 1928), 222.
7. Roger W. Sperry, from the James Arthur Lecture on the Evolution of the Human Brain, 1964, p. 2.
8. Henry Maudsley, “The Double Brain,” Mind 14, no. 54 (1889): 161–87.
9. Susan Curtiss and Stella de Bode, “Language after Hemispherectomy,” Brain and Cognition 43, nos. 1–3 (June–August 2000): 135–38.
10. E. P. Vining et al., “Why Would You Remove Half a Brain? The Outcome of 58 Children after Hemispherectomy—the Johns Hopkins Experience: 1968 to 1996,” Pediatrics 100 (August 1997): 163–71. M. B. Pulsifer et al., “The Cognitive Outcome of Hemispherectomy in 71 Children,” Epilepsia 45, no. 3 (March 2004): 243–54.
11. S. McClelland III and R. E. Maxwell, “Hemispherectomy for Intractable Epilepsy in Adults: The First Reported Series,” Annals of Neurology 61, no. 4 (April 2007): 372–76.
12. Lars Muckli, Marcus J. Naumer, and Wolf Singer, “Bilateral Visual Field Maps in a Patient with Only One Hemisphere,” Proceedings of the National Academy of Sciences 106, no. 31 (August 4, 2009), http://dx.doi.org/10.1073/pnas.0809688106.
13. Marvin Minsky, The Society of Mind (New York: Simon and Schuster, 1988).
14. F. Fay Evans-Martin, The Nervous System (New York: Chelsea House, 2005), http://www.scribd.com/doc/5012597/The-Nervous-System.
15. Benjamin Libet, Mind Time: The Temporal Factor in Consciousness (Cambridge, MA: Harvard University Press, 2005).
16. Daniel C. Dennett, Freedom Evolves (New York: Viking, 2003).
17. Michael S. Gazzaniga, Who’s in Charge? Free Will and the Science of the Brain (New York: Ecco/HarperCollins, 2011).
18. David Hume, An Enquiry Concerning Human Understanding (1748), 2nd ed., edited by Eric Steinberg (Indianapolis: Hackett, 1993).
19. Arthur Schopenhauer, The Wisdom of Life.
20. Arthur Schopenhauer, On the Freedom of the Will (1839).
21. From Raymond Smullyan, 5000 B.C. and Other Philosophical Fantasies (New York: St. Martin’s Press, 1983).
22. For an insightful and entertaining examination of similar issues of identity and consciousness, see Martine Rothblatt, “The Terasem Mind Uploading Experiment,” International Journal of Machine Consciousness 4, no. 1 (2012): 141–58. In this paper, Rothblatt examines the issue of identity with regard to software that emulates a person based on “a database of video interviews and associated information about a predecessor person.” In this proposed future experiment, the software is successfully emulating the person it is based on.
23. “How Do You Persist When Your Molecules Don’t?” Science and Consciousness Review 1, no. 1 (June 2004), http://www.sci-con.org/articles/20040601.xhtml.
1. “DNA Sequencing Costs,” National Human Genome Research Institute, NIH, http://www.genome.gov/sequencingcosts/.
2. “Genetic Sequence Data Bank, Distribution Release Notes,” December 15, 2009, National Center for Biotechnology Information, National Library of Medicine, ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt.
3. “DNA Sequencing—The History of DNA Sequencing,” January 2, 2012, http://www.dnasequencing.org/history-of-dna.
4. “Cooper’s Law,” ArrayComm, http://www.arraycomm.com/technology/coopers-law.
5. “The Zettabyte Era,” Cisco, http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/VNI_Hyperconnectivity_WP.xhtml, and “Number of Internet Hosts,” Internet Systems Consortium, http://www.isc.org/solutions/survey/history.
6. TeleGeography © PriMetrica, Inc., 2012.
8. Dave Kristula, “The History of the Internet” (March 1997, update August 2001), http://www.davesite.com/webstation/net-history.shtml; Robert Zakon, “Hobbes’ Internet Timeline v8.0,” http://www.zakon.org/robert/internet/timeline; Qwest Communications, 8-K for 9/13/1998 EX-99.1; Converge! Network Digest, December 5, 2002, http://www.convergedigest.com/Daily/daily.asp?vn=v9n229&fecha=December%2005,%202002; Jim Duffy, “AT&T Plans Backbone Upgrade to 40G,” Computerworld, June 7, 2006, http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9001032; “40G: The Fastest Connection You Can Get?” InternetNews.com, November 2, 2007, http://www.internetnews.com/infra/article.php/3708936; “Verizon First Global Service Provider to Deploy 100G on U.S. Long-Haul Network,” news release, Verizon, http://newscenter.verizon.com/press-releases/verizon/2011/verizon-first-global-service.xhtml.
8. Facebook, “Key Facts,” http://newsroom.fb.com/content/default.aspx?NewsAreaId=22.
9. http://www.kurzweilai.net/how-my-predictions-are-faring.
10. Calculations per Second per $1,000
11. Top 500 Supercomputer Sites, http://top500.org/.
12. “Microprocessor Quick Reference Guide,” Intel Research, http://www.intel.com/pressroom/kits/quickreffam.htm.
13. 1971–2000: VLSI Research Inc.
2001–2006: The International Technology Roadmap for Semiconductors, 2002 Update and 2004 Update, Table 7a, “Cost—Near-term Years,” “DRAM cost/bit at (packaged microcents) at production.”
2007–2008: The International Technology Roadmap for Semiconductors, 2007, Tables 7a and 7b, “Cost—Near-term Years,” “Cost—Long-term Years,” http://www.itrs.net/Links/2007ITRS/ExecSum2007.pdf.
2009–2022: The International Technology Roadmap for Semiconductors, 2009, Tables 7a and 7b, “Cost—Near-term Years,” “Cost—Long-term Years,” http://www.itrs.net/Links/2009ITRS/Home2009.htm.
14. To make all dollar values comparable, computer prices for all years were converted to their year 2000 dollar equivalent using the Federal Reserve Board’s CPI data at http://minneapolisfed.org/research/data/us/calc/. For example, $1 million in 1960 is equivalent to $5.8 million in 2000, and $1 million in 2004 is equivalent to $0.91 million in 2000.
1949: http://www.cl.cam.ac.uk/UoCCL/misc/EDSAC99/statistics.xhtml, http://www.davros.org/misc/chronology.xhtml.
1951: Richard E. Matick, Computer Storage Systems and Technology (New York: John Wiley & Sons, 1977); http://inventors.about.com/library/weekly/aa062398.htm.
1955: Matick, Computer Storage Systems and Technology; OECD, 1968, http://members.iinet.net.au/~dgreen/timeline.xhtml.
1960: ftp://rtfm.mit.edu/pub/usenet/alt.sys.pdp8/PDP-8_Frequently_Asked_Questions_%28posted_every_other_month%29; http://www.dbit.com/~greeng3/pdp1/pdp1.xhtml#INTRODUCTION.
1962: ftp://rtfm.mit.edu/pub/usenet/alt.sys.pdp8/PDP-8_Frequently_Asked_Questions_%28posted_every_other_month%29.
1964: Matick, Computer Storage Systems and Technology; http://www.research.microsoft.com/users/gbell/craytalk; http://www.ddj.com/documents/s=1493/ddj0005hc/.
1965: Matick, Computer Storage Systems and Technology; http://www.fourmilab.ch/documents/univac/config1108.xhtml; http://www.frobenius.com/univac.htm.
1968: Data General.
1969, 1970: http://www.eetimes.com/special/special_issues/millennium/milestones/whittier.xhtml.
1974: Scientific Electronic Biological Computer Consulting (SCELBI).
1975–1996: Byte magazine advertisements.
1997–2000: PC Computing magazine advertisements.
2001: www.pricewatch.com (http://www.jc-news.com/parse.cgi?news/pricewatch/raw/pw-010702).
2002: www.pricewatch.com (http://www.jc-news.com/parse.cgi?news/pricewatch/raw/pw-020624).
2003: http://sharkyextreme.com/guides/WMPG/article.php/10706_2227191_2.
2004: http://www.pricewatch.com (11/17/04).
2008: http://www.pricewatch.com (10/02/08) ($16.61).
15. Dataquest/Intel and Pathfinder Research:
Year | $ | Log ($) |
---|---|---|
1968 | 1.00000000 | 0 |
1969 | 0.85000000 | −0.16252 |
1970 | 0.60000000 | −0.51083 |
1971 | 0.30000000 | −1.20397 |
1972 | 0.15000000 | −1.89712 |
1973 | 0.10000000 | −2.30259 |
1974 | 0.07000000 | −2.65926 |
1975 | 0.02800000 | −3.57555 |
1976 | 0.01500000 | −4.19971 |
1977 | 0.00800000 | −4.82831 |
1978 | 0.00500000 | −5.29832 |
1979 | 0.00200000 | −6.21461 |
1980 | 0.00130000 | −6.64539 |
1981 | 0.00082000 | −7.10621 |
1982 | 0.00040000 | −7.82405 |
1983 | 0.00032000 | −8.04719 |
1984 | 0.00032000 | −8.04719 |
1985 | 0.00015000 | −8.80488 |
1986 | 0.00009000 | −9.31570 |
1987 | 0.00008100 | −9.42106 |
1988 | 0.00006000 | −9.72117 |
1989 | 0.00003500 | −10.2602 |
1990 | 0.00002000 | −10.8198 |
1991 | 0.00001700 | −10.9823 |
1992 | 0.00001000 | −11.5129 |
1993 | 0.00000900 | −11.6183 |
1994 | 0.00000800 | −11.7361 |
1995 | 0.00000700 | −11.8696 |
1996 | 0.00000500 | −12.2061 |
1997 | 0.00000300 | −12.7169 |
1998 | 0.00000140 | −13.4790 |
1999 | 0.00000095 | −13.8668 |
2000 | 0.00000080 | −14.0387 |
2001 | 0.00000035 | −14.8653 |
2002 | 0.00000026 | −15.1626 |
2003 | 0.00000017 | −15.5875 |
2004 | 0.00000012 | −15.9358 |
2005 | 0.000000081 | −16.3288 |
2006 | 0.000000063 | −16.5801 |
2007 | 0.000000024 | −17.5452 |
2008 | 0.000000016 | −17.9507 |
16. Steve Cullen, In-Stat, September 2008, www.instat.com.
Year | Mbits | Bits |
---|---|---|
1971 | 921.6 | 9.216E+08 |
1972 | 3788.8 | 3.789E+09 |
1973 | 8294.4 | 8.294E+09 |
1974 | 19865.6 | 1.987E+10 |
1975 | 42700.8 | 4.270E+10 |
1976 | 130662.4 | 1.307E+11 |
1977 | 276070.4 | 2.761E+11 |
1978 | 663859.2 | 6.639E+11 |
1979 | 1438720.0 | 1.439E+12 |
1980 | 3172761.6 | 3.173E+12 |
1981 | 4512665.6 | 4.513E+12 |
1982 | 11520409.6 | 1.152E+13 |
1983 | 29648486.4 | 2.965E+13 |
1984 | 68418764.8 | 6.842E+13 |
1985 | 87518412.8 | 8.752E+13 |
1986 | 192407142.4 | 1.924E+14 |
1987 | 255608422.4 | 2.556E+14 |
1988 | 429404979.2 | 4.294E+14 |
1989 | 631957094.4 | 6.320E+14 |
1990 | 950593126.4 | 9.506E+14 |
1991 | 1546590618 | 1.547E+15 |
1992 | 2845638656 | 2.846E+15 |
1993 | 4177959322 | 4.178E+15 |
1994 | 7510805709 | 7.511E+15 |
1995 | 13010599936 | 1.301E+16 |
1996 | 23359078007 | 2.336E+16 |
1997 | 45653879161 | 4.565E+16 |
1998 | 85176878105 | 8.518E+16 |
1999 | 1.47327E+11 | 1.473E+17 |
2000 | 2.63636E+11 | 2.636E+17 |
2001 | 4.19672E+11 | 4.197E+17 |
2002 | 5.90009E+11 | 5.900E+17 |
2003 | 8.23015E+11 | 8.230E+17 |
2004 | 1.32133E+12 | 1.321E+18 |
2005 | 1.9946E+12 | 1.995E+18 |
2006 | 2.94507E+12 | 2.945E+18 |
2007 | 5.62814E+12 | 5.628E+18 |
17. “Historical Notes about the Cost of Hard Drive Storage Space,” http://www.littletechshoppe.com/ns1625/winchest.xhtml; Byte magazine advertisements, 1977–1998; PC Computing magazine advertisements, 3/1999; Understanding Computers: Memory and Storage (New York: Time Life, 1990); http://www.cedmagic.com/history/ibm-305-ramac.xhtml; John C. McCallum, “Disk Drive Prices (1955–2012),” http://www.jcmit.com/diskprice.htm; IBM, “Frequently Asked Questions,” http://www-03.ibm.com/ibm/history/documents/pdf/faq.pdf; IBM, “IBM 355 Disk Storage Unit,” http://www-03.ibm.com/ibm/history/exhibits/storage/storage_355.xhtml; IBM, “IBM 3380 Direct Access Storage Device,” http://www-03.ibm.com/ibm/history/exhibits/storage/storage_3380.xhtml.
18. “Without Driver or Map, Vans Go from Italy to China,” Sydney Morning Herald, October 29, 2010, http://www.smh.com.au/technology/technology-news/without-driver-or-map-vans-go-from-italy-to-china-20101029-176ja.xhtml.
19. KurzweilAI.net.
20. Adapted with permission from Amiram Grinvald and Rina Hildesheim, “VSDI: A New Era in Functional Imaging of Cortical Dynamics,” Nature Reviews Neuroscience 5 (November 2004): 874–85.
The main tools for imaging the brain are shown in this diagram. Their capabilities are depicted by the shaded rectangles.
Spatial resolution refers to the smallest dimension that a technique can measure; temporal resolution refers to the shortest event it can capture (its imaging time or duration). Each technique involves tradeoffs. For example, EEG (electroencephalography), which measures “brain waves” (electrical signals from neurons), can capture very rapid brain waves (occurring in short time intervals) but can only sense signals near the surface of the brain.
In contrast, fMRI (functional magnetic resonance imaging), which uses a special MRI machine to measure blood flow to neurons (indicating neuron activity), can sense a lot deeper in the brain (and spinal cord) and with higher resolution, down to tens of microns (millionths of a meter). However, fMRI operates very slowly compared with EEG.
Both are noninvasive techniques (no surgery or drugs are required). MEG (magnetoencephalography), another noninvasive technique, detects the magnetic fields generated by neurons. MEG and EEG can resolve events down to 1 millisecond, far better than fMRI, which can at best resolve events with a resolution of several hundred milliseconds. MEG also accurately pinpoints sources in the primary auditory, somatosensory, and motor areas.
Optical imaging covers almost the entire range of spatial and temporal resolutions, but is invasive. VSDI (voltage-sensitive dye imaging) is the most sensitive method of measuring brain activity, but is limited to measurements near the surface of the cortex of animals.
The exposed cortex is covered with a transparent sealed chamber; after the cortex is stained with a suitable voltage-sensitive dye, it is illuminated with light and a sequence of images is taken with a high-speed camera. Other optical techniques used in the lab include ion imaging (typically calcium or sodium ions) and fluorescence imaging systems (confocal imaging and multiphoton imaging).
Other lab techniques include PET (positron emission tomography, a nuclear medicine imaging technique that produces a 3-D image), 2DG (2-deoxyglucose postmortem histology, or tissue analysis), lesions (involves damaging neurons in an animal and observing the effects), patch clamping (to measure ion currents across biological membranes), and electron microscopy (using an electron beam to examine tissues or cells at a very fine scale). These techniques can also be integrated with optical imaging.
21. MRI spatial resolution in microns (μm), 1980–2012:
22. Spatial resolution in nanometers (nm) of destructive imaging techniques, 1983–2011:
23. Spatial resolution in microns (μm) of nondestructive imaging techniques in animals, 1985–2012:
2012: Resolution 0.07 μm
Citation: Sebastian Berning et al., “Nanoscopy in a Living Mouse Brain,” Science 335, no. 6068 (February 3, 2012): 551, http://dx.doi.org/10.1126/science.1215369.
Technique: Stimulated emission depletion (STED) fluorescence nanoscopy.
Notes: Highest resolution achieved in vivo so far.

2012: Resolution 0.25 μm
Citation: Sebastian Berning et al., “Nanoscopy in a Living Mouse Brain,” Science 335, no. 6068 (February 3, 2012): 551, http://dx.doi.org/10.1126/science.1215369.
Technique: Confocal and multiphoton microscopy.

2004: Resolution 50 μm
Citation: Amiram Grinvald and Rina Hildesheim, “VSDI: A New Era in Functional Imaging of Cortical Dynamics,” Nature Reviews Neuroscience 5 (November 2004): 874–85, http://dx.doi.org/10.1038/nrn1536.
Technique: Imaging based on voltage-sensitive dyes (VSDI).
Notes: “VSDI has provided high-resolution maps, which correspond to cortical columns in which spiking occurs, and offer a spatial resolution better than 50 μm.”

1996: Resolution 50 μm
Citation: Dov Malonek and Amiram Grinvald, “Interactions between Electrical Activity and Cortical Microcirculation Revealed by Imaging Spectroscopy: Implications for Functional Brain Mapping,” Science 272, no. 5261 (April 26, 1996): 551–54, http://dx.doi.org/10.1126/science.272.5261.551.
Technique: Imaging spectroscopy.
Notes: “The study of spatial relationships between individual cortical columns within a given brain area has become feasible with optical imaging based on intrinsic signals, at a spatial resolution of about 50 μm.”

1995: Resolution 50 μm
Citation: D. H. Turnbull et al., “Ultrasound Backscatter Microscope Analysis of Early Mouse Embryonic Brain Development,” Proceedings of the National Academy of Sciences 92, no. 6 (March 14, 1995): 2239–43, http://www.pnas.org/content/92/6/2239.short.
Technique: Ultrasound backscatter microscopy.
Notes: “We demonstrate application of a real-time imaging method called ultrasound backscatter microscopy for visualizing mouse early embryonic neural tubes and hearts. This method was used to study live embryos in utero between 9.5 and 11.5 days of embryogenesis, with a spatial resolution close to 50 μm.”

1985: Resolution 500 μm
Citation: H. S. Orbach, L. B. Cohen, and A. Grinvald, “Optical Mapping of Electrical Activity in Rat Somatosensory and Visual Cortex,” Journal of Neuroscience 5, no. 7 (July 1, 1985): 1886–95, http://www.jneurosci.org/content/5/7/1886.short.
Technique: Optical methods.
1. Paul G. Allen and Mark Greaves, “Paul Allen: The Singularity Isn’t Near,” Technology Review, October 12, 2011, http://www.technologyreview.com/blog/guest/27206/.
2. ITRS, “International Technology Roadmap for Semiconductors,” http://www.itrs.net/Links/2011ITRS/Home2011.htm.
3. Ray Kurzweil, The Singularity Is Near (New York: Viking, 2005), chapter 2.
4. Endnote 2 in Allen and Greaves, “The Singularity Isn’t Near,” reads as follows: “We are beginning to get within range of the computer power we might need to support this kind of massive brain simulation. Petaflop-class computers (such as IBM’s BlueGene/P that was used in the Watson system) are now available commercially. Exaflop-class computers are currently on the drawing boards. These systems could probably deploy the raw computational capability needed to simulate the firing patterns for all of a brain’s neurons, though currently it happens many times more slowly than would happen in an actual brain.”
5. Kurzweil, The Singularity Is Near, chapter 9, section titled “The Criticism from Software” (pp. 435–42).
6. Ibid., chapter 9.
7. Although it is not possible to precisely determine the information content in the genome, because of the repeated base pairs it is clearly much less than the total uncompressed data. Here are two approaches to estimating the compressed information content of the genome, both of which demonstrate that a range of 30 to 100 million bytes is conservatively high.
1. In terms of the uncompressed data, there are 3 billion DNA rungs in the human genetic code, each coding 2 bits (since there are four possibilities for each DNA base pair). Thus the human genome is about 800 million bytes uncompressed. The noncoding DNA used to be called “junk DNA,” but it is now clear that it plays an important role in gene expression. However, it is very inefficiently coded. For one thing, there are massive redundancies (for example, the sequence called “ALU” is repeated hundreds of thousands of times), which compression algorithms can take advantage of.
With the recent explosion of genetic data banks, there is a great deal of interest in compressing genetic data. Recent work on applying standard data compression algorithms to genetic data indicates that reducing the data by 90 percent (for bit perfect compression) is feasible: Hisahiko Sato et al., “DNA Data Compression in the Post Genome Era,” Genome Informatics 12 (2001): 512–14, http://www.jsbi.org/journal/GIW01/GIW01P130.pdf.
Thus we can compress the genome to about 80 million bytes without loss of information (meaning we can perfectly reconstruct the full 800-million-byte uncompressed genome).
Now consider that more than 98 percent of the genome does not code for proteins. Even after standard data compression (which eliminates redundancies and uses a dictionary lookup for common sequences), the algorithmic content of the noncoding regions appears to be rather low, meaning that it is likely that we could code an algorithm that would perform the same function with fewer bits. However, since we are still early in the process of reverse-engineering the genome, we cannot make a reliable estimate of this further decrease based on a functionally equivalent algorithm. I am using, therefore, a range of 30 to 100 million bytes of compressed information in the genome. The top part of this range assumes only data compression and no algorithmic simplification.
Only a portion (although the majority) of this information characterizes the design of the brain.
2. Another line of reasoning is as follows. Though the human genome contains around 3 billion bases, only a small percentage, as mentioned above, codes for proteins. By current estimates, there are 26,000 genes that code for proteins. If we assume those genes average 3,000 bases of useful data, those equal only approximately 78 million bases. A base of DNA requires only 2 bits, which translates to about 20 million bytes (78 million bases divided by four). In the protein-coding sequence of a gene, each “word” (codon) of three DNA bases translates into one amino acid. There are, therefore, 4³ (64) possible codon codes, each consisting of three DNA bases. There are, however, only 20 amino acids used plus a stop codon (null amino acid) out of the 64. The remaining 43 codes are used as synonyms of the 21 useful ones. Whereas 6 bits are required to code for 64 possible combinations, only about 4.4 (log₂ 21) bits are required to code for 21 possibilities, a savings of 1.6 out of 6 bits (about 27 percent), bringing us down to about 15 million bytes. In addition, some standard compression based on repeating sequences is feasible here, although much less compression is possible on this protein-coding portion of the DNA than in the so-called junk DNA, which has massive redundancies. So this will bring the figure probably below 12 million bytes. However, now we have to add information for the noncoding portion of the DNA that controls gene expression. Although this portion of the DNA constitutes the bulk of the genome, it appears to have a low level of information content and is replete with massive redundancies. Estimating that it matches the approximately 12 million bytes of protein-coding DNA, we again come to approximately 24 million bytes. From this perspective, an estimate of 30 to 100 million bytes is conservatively high.
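The round-number arithmetic behind these two estimates is easy to check; here is a short script using the text’s approximations (all figures are the text’s round numbers, not measured values):

```python
import math

bases = 3_000_000_000                    # base pairs in the human genome
uncompressed = bases * 2 / 8             # 2 bits per base: 750 million bytes
                                         # (the text rounds this to ~800 million)

lossless = 800_000_000 * 0.10            # ~90% bit-perfect compression: ~80 million bytes

genes = 26_000
coding_bases = genes * 3_000             # ~78 million bases of useful data
coding_bytes = coding_bases / 4          # 4 bases per byte: ~20 million bytes
codon_bits = math.log2(21)               # ~4.4 bits for 20 amino acids + stop codon
reduced = coding_bytes * codon_bits / 6  # ~27% savings: ~15 million bytes

print(uncompressed, lossless, coding_bytes, round(reduced))
```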
8. Dharmendra S. Modha et al., “Cognitive Computing,” Communications of the ACM 54, no. 8 (2011): 62–71, http://cacm.acm.org/magazines/2011/8/114944-cognitive-computing/fulltext.
9. Kurzweil, The Singularity Is Near, chapter 9, section titled “The Criticism from Ontology: Can a Computer Be Conscious?” (pp. 458–69).
10. Michael Denton, “Organism and Machine: The Flawed Analogy,” in Are We Spiritual Machines? Ray Kurzweil vs. the Critics of Strong AI (Seattle: Discovery Institute, 2002).
11. Hans Moravec, Mind Children (Cambridge, MA: Harvard University Press, 1988).
1. “In U.S., Optimism about Future for Youth Reaches All-Time Low,” Gallup Politics, May 2, 2011, http://www.gallup.com/poll/147350/optimism-future-youth-reaches-time-low.aspx.
2. James C. Riley, Rising Life Expectancy: A Global History (Cambridge: Cambridge University Press, 2001).
3. J. Bradford DeLong, “Estimating World GDP, One Million B.C.—Present,” May 24, 1998, http://econ161.berkeley.edu/TCEH/1998_Draft/World_GDP/Estimating_World_GDP.xhtml, and http://futurist.typepad.com/my_weblog/2007/07/economic-growth.xhtml. See also Peter H. Diamandis and Steven Kotler, Abundance: The Future Is Better Than You Think (New York: Free Press, 2012).
4. Martine Rothblatt, Transgender to Transhuman (privately printed, 2011). She explains how a similarly rapid trajectory of acceptance is most likely to occur for “transhumans,” for example, nonbiological but convincingly conscious minds as discussed in chapter 9.
5. The following excerpt from The Singularity Is Near, chapter 3 (pp. 133–35), by Ray Kurzweil (New York: Viking, 2005), discusses the limits of computation based on the laws of physics:
The ultimate limits of computers are profoundly high. Building on work by University of California at Berkeley Professor Hans Bremermann and nanotechnology theorist Robert Freitas, MIT Professor Seth Lloyd has estimated the maximum computational capacity, according to the known laws of physics, of a computer weighing one kilogram and occupying one liter of volume—about the size and weight of a small laptop computer—what he calls the “ultimate laptop.”
[Note: Seth Lloyd, “Ultimate Physical Limits to Computation,” Nature 406 (2000): 1047–54.
[Early work on the limits of computation was done by Hans J. Bremermann in 1962: Hans J. Bremermann, “Optimization Through Evolution and Recombination,” in M. C. Yovits, C. T. Jacobi, and C. D. Goldstein, eds., Self-Organizing Systems (Washington, D.C.: Spartan Books, 1962), pp. 93–106.
[In 1984 Robert A. Freitas Jr. built on Bremermann’s work in Robert A. Freitas Jr., “Xenopsychology,” Analog 104 (April 1984): 41–53, http://www.rfreitas.com/Astro/Xenopsychology.htm#SentienceQuotient.]
The potential amount of computation rises with the available energy. We can understand the link between energy and computational capacity as follows. The energy in a quantity of matter is the energy associated with each atom (and subatomic particle). So the more atoms, the more energy. As discussed above, each atom can potentially be used for computation. So the more atoms, the more computation. The energy of each atom or particle grows with the frequency of its movement: the more movement, the more energy. The same relationship exists for potential computation: the higher the frequency of movement, the more computation each component (which can be an atom) can perform. (We see this in contemporary chips: the higher the frequency of the chip, the greater its computational speed.)
So there is a direct proportional relationship between the energy of an object and its potential to perform computation. The potential energy in a kilogram of matter is very large, as we know from Einstein’s equation E = mc^2. The speed of light squared is a very large number: approximately 10^17 meter^2/second^2. The potential of matter to compute is also governed by a very small number, Planck’s constant: 6.6 × 10^−34 joule-seconds (a joule is a measure of energy). This is the smallest scale at which we can apply energy for computation. We obtain the theoretical limit of an object to perform computation by dividing the total energy (the average energy of each atom or particle times the number of such particles) by Planck’s constant.
Lloyd shows how the potential computing capacity of a kilogram of matter equals pi times energy divided by Planck’s constant. Since the energy is such a large number and Planck’s constant is so small, this equation generates an extremely large number: about 5 × 10^50 operations per second.
[Note: π × maximum energy (10^17 kg × meter^2/second^2) / (6.6 × 10^−34 joule-seconds) ≈ 5 × 10^50 operations/second.]
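As a sanity check on the formula as the excerpt states it (π × E / h, with E = mc^2 for one kilogram), the following sketch reproduces the ~5 × 10^50 figure. The constants are the rounded values used in the note, not high-precision ones.

```python
import math

# Evaluate the bound as stated in the excerpt: pi * E / h for 1 kg.
# Constants are the note's rounded values.

m = 1.0          # mass in kilograms
c = 3.0e8        # speed of light in m/s
h = 6.6e-34      # Planck's constant in joule-seconds

E = m * c**2                        # ~9 x 10^16 joules, i.e. about 10^17
ops_per_sec = math.pi * E / h
print(f"{ops_per_sec:.1e} operations per second")  # ~4.3e50, i.e. ~5 x 10^50
```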
If we relate that figure to the most conservative estimate of human brain capacity (10^19 cps and 10^10 humans), it represents the equivalent of about 5 billion trillion human civilizations.
[Note: 5 × 10^50 cps is equivalent to 5 × 10^21 (5 billion trillion) human civilizations (each requiring 10^29 cps).]
If we use the figure of 10^16 cps that I believe will be sufficient for functional emulation of human intelligence, the ultimate laptop would function at the equivalent brain power of 5 trillion trillion human civilizations.
[Note: Ten billion (10^10) humans at 10^16 cps each is 10^26 cps for human civilization. So 5 × 10^50 cps is equivalent to 5 × 10^24 (5 trillion trillion) human civilizations.]
Such a laptop could perform the equivalent of all human thought over the last ten thousand years (that is, ten billion human brains operating for ten thousand years) in one ten-thousandth of a nanosecond.
[Note: This estimate makes the conservative assumption that we’ve had ten billion humans for the past ten thousand years, which is obviously not the case. The actual number of humans has been increasing gradually over that period to reach about 6.1 billion in 2000. There are 3 × 10^7 seconds in a year, and 3 × 10^11 seconds in ten thousand years. So, using the estimate of 10^26 cps for human civilization, human thought over ten thousand years is equivalent to certainly no more than 3 × 10^37 calculations. The ultimate laptop performs 5 × 10^50 calculations in one second. So simulating ten thousand years of ten billion humans’ thoughts would take it about 10^−13 seconds, which is one ten-thousandth of a nanosecond.]
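The civilization-equivalence figures in the last few notes all follow from the same three round numbers. Here is a quick sketch assuming the excerpt’s estimates (10^19 cps and 10^16 cps per brain, 10^10 humans):

```python
# Check the civilization-equivalence arithmetic using the excerpt's
# round numbers; nothing here is measured data.

ULTIMATE = 5e50                    # ops/sec for the "ultimate laptop"

civ_conservative = 1e19 * 1e10     # conservative brain estimate x 10^10 humans = 10^29 cps
civ_functional   = 1e16 * 1e10     # functional-emulation estimate              = 10^26 cps

print(f"{ULTIMATE / civ_conservative:.0e} civilizations")  # 5e+21 (5 billion trillion)
print(f"{ULTIMATE / civ_functional:.0e} civilizations")    # 5e+24 (5 trillion trillion)

# Ten thousand years of civilization-level thought, replayed on the laptop
seconds = 3e7 * 1e4                        # ~3 x 10^11 seconds in 10,000 years
total_thought = civ_functional * seconds   # ~3 x 10^37 calculations
print(f"{total_thought / ULTIMATE:.0e} seconds to replay")  # ~6e-14, about 10^-13
```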
Again, a few caveats are in order. Converting all of the mass of our 2.2-pound laptop into energy is essentially what happens in a thermonuclear explosion. Of course, we don’t want the laptop to explode but to stay within its one-liter dimension. So this will require some careful packaging, to say the least. By analyzing the maximum entropy (degrees of freedom represented by the state of all the particles) in such a device, Lloyd shows that such a computer would have a theoretical memory capacity of 10^31 bits. It’s difficult to imagine technologies that would go all the way in achieving these limits. But we can readily envision technologies that come reasonably close to doing so. As the University of Oklahoma project shows, we have already demonstrated the ability to store at least fifty bits of information per atom (although only on a small number of atoms, so far). Storing 10^27 bits of memory in the 10^25 atoms in a kilogram of matter should therefore be eventually achievable.
But because many properties of each atom could be exploited to store information—such as the precise position, spin, and quantum state of all of its particles—we can probably do somewhat better than 10^27 bits. Neuroscientist Anders Sandberg estimates the potential storage capacity of a hydrogen atom at about four million bits. These densities have not yet been demonstrated, however, so we’ll use the more conservative estimate.
[Note: Anders Sandberg, “The Physics of the Information Processing Superobjects: Daily Life Among the Jupiter Brains,” Journal of Evolution and Technology 5 (December 22, 1999), http://www.transhumanist.com/volume5/Brains2.pdf.]
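The memory figures hang together the same way. A sketch using the excerpt’s numbers (10^25 atoms per kilogram, 50 demonstrated bits per atom, Lloyd’s 10^31-bit entropy bound):

```python
# Memory-density arithmetic from the excerpt; all inputs are its
# round figures, not experimental data.

atoms_per_kg = 1e25

# Demonstrated density: ~50 bits per atom
print(f"{50 * atoms_per_kg:.0e} bits at 50 bits/atom")  # 5e+26, ~10^27 as in the text

# Lloyd's entropy bound implies a much higher per-atom density
print(f"{1e31 / atoms_per_kg:.0e} bits/atom at the 1e31 limit")  # 1e+06, the order of Sandberg's estimate
```

Note that the second figure (about a million bits per atom) is the same order of magnitude as Sandberg’s four-million-bit estimate for a hydrogen atom, which is why the excerpt treats the two as broadly consistent.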
As discussed above, 10^42 calculations per second could be achieved without producing significant heat. By fully deploying reversible computing techniques, using designs that generate low levels of errors, and allowing for reasonable amounts of energy dissipation, we should end up somewhere between 10^42 and 10^50 calculations per second.
The design terrain between these two limits is complex. Examining the technical issues that arise as we advance from 10^42 to 10^50 is beyond the scope of this chapter. We should keep in mind, however, that the way this will play out is not by starting with the ultimate limit of 10^50 and working backward based on various practical considerations. Rather, technology will continue to ramp up, always using its latest prowess to progress to the next level. So once we get to a civilization with 10^42 cps (for every 2.2 pounds), the scientists and engineers of that day will use their essentially vast nonbiological intelligence to figure out how to get 10^43, then 10^44, and so on. My expectation is that we will get very close to the ultimate limits.
Even at 10^42 cps, a 2.2-pound “ultimate portable computer” would be able to perform the equivalent of all human thought over the last ten thousand years (assumed at ten billion human brains for ten thousand years) in ten microseconds.
[Note: See the note above. 10^42 cps is a factor of 10^8 less than 10^50 cps, so one ten-thousandth of a nanosecond (10^−13 seconds) becomes 10 microseconds (10^−5 seconds).]
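The ten-microsecond figure is simply the 10^−13-second replay time scaled by the gap between 10^42 and 10^50 cps; a one-line check under those assumptions:

```python
# Scale the replay time from the 10^50 limit down to the 10^42 figure.

t_at_limit = 1e-13        # seconds at 5 x 10^50 cps (one ten-thousandth of a nanosecond)
slowdown = 1e50 / 1e42    # the 10^42-cps machine is 10^8 times slower
print(f"{t_at_limit * slowdown:.0e} seconds")  # 1e-05 s = 10 microseconds
```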
If we examine the Exponential Growth of Computing chart (chapter 2), we see that this amount of computing is estimated to be available for one thousand dollars by 2080.