CHAPTER TEN: This Is the World on Machine Learning

Now that you’ve toured the machine learning wonderland, let’s switch gears and see what it all means to you. Like the red pill in The Matrix, the Master Algorithm is the gateway to a different reality: the one you already live in but didn’t know it yet. From dating to work, from self-knowledge to the future of society, from data sharing to war, and from the dangers of AI to the next step in evolution, a new world is taking shape, and machine learning is the key that unlocks it. This chapter will help you make the most of it in your life and be ready for what comes next. Machine learning will not single-handedly determine the future, any more than any other technology; it’s what we decide to do with it that counts, and now you have the tools to decide.

Chief among these tools is the Master Algorithm. Whether it arrives sooner or later, and whether or not it looks like Alchemy, is less important than what it encapsulates: the essential capabilities of a learning algorithm, and where they’ll take us. We can equally well think of the Master Algorithm as a composite picture of current and future learners, which we can conveniently use in our thought experiments in lieu of the specific algorithm inside product X or website Y, which the respective companies are unlikely to share with us anyway. Seen in this light, the learners we interact with every day are embryonic versions of the Master Algorithm, and our task is to understand them and shape their growth to better serve our needs.

In the coming decades, machine learning will affect such a broad swath of human life that one chapter of one book cannot possibly do it justice. Nevertheless, we can already see a number of recurring themes, and it’s those we’ll focus on, starting with what psychologists call theory of mind-the computer’s theory of your mind, that is.


Sex, lies, and machine learning

Your digital future begins with a realization: every time you interact with a computer-whether it’s your smart phone or a server thousands of miles away-you do so on two levels. The first one is getting what you want there and then: an answer to a question, a product you want to buy, a new credit card. The second level, and in the long run the most important one, is teaching the computer about you. The more you teach it, the better it can serve you-or manipulate you. Life is a game between you and the learners that surround you. You can refuse to play, but then you’ll have to live a twentieth-century life in the twenty-first. Or you can play to win. What model of you do you want the computer to have? And what data can you give it that will produce that model? Those two questions should always be in the back of your mind whenever you interact with a learning algorithm-as they are when you interact with other people. Alice knows that Bob has a mental model of her and seeks to shape it through her behavior. If Bob is her boss, she tries to come across as competent, loyal, and hardworking. If instead Bob is someone she’s trying to seduce, she’ll be at her most seductive. We could hardly function in society without this ability to intuit and respond to what’s on other people’s minds. The novelty in the world today is that computers, not just people, are starting to have theories of mind. Their theories are still primitive, but they’re evolving quickly, and they’re what we have to work with to get what we want-no less than with other people. And so you need a theory of the computer’s mind, and that’s what the Master Algorithm provides, after plugging in the score function (what you think the learner’s goals are, or more precisely its owner’s) and the data (what you think it knows).

Take online dating. When you use Match.com, eHarmony, or OkCupid (suspend your disbelief, if necessary), your goal is simple: to find the best possible date you can. But chances are it will take a lot of work and several disappointing dates before you meet someone you really like. One hardy geek extracted twenty thousand profiles from OkCupid, did his own data mining, found the woman of his dreams on the eighty-eighth date, and told his odyssey to Wired magazine. To succeed with fewer dates and less work, your two main tools are your profile and your responses to suggested matches. One popular option is to lie (about your age, for example). This may seem unethical, not to mention liable to blow up in your face when your date discovers the truth, but there’s a twist. Savvy online daters already know that people lie about their age on their profiles and adjust accordingly, so if you state your true age, you’re effectively telling them you’re older than you really are! In turn, the learner doing the matching thinks people prefer younger dates than they really do. The logical next step is for people to lie about their age by even more, ultimately rendering this attribute meaningless.

A better way for all concerned is to focus on the specific, unusual attributes of yours that are highly predictive of a match, in the sense that they pick out people you like whom not everyone else likes, and for whom you therefore face less competition. Your job (and your prospective date’s) is to provide these attributes. The matcher’s job is to learn from them, in the same way that an old-fashioned matchmaker would. Compared to a village matchmaker, Match.com’s algorithm has the advantage that it knows vastly more people, but the disadvantage is that it knows them much more superficially. A naïve learner, such as a perceptron, will be content with broad generalizations like “gentlemen prefer blondes.” A more sophisticated one will find patterns like “people with the same unusual musical tastes are often good matches.” If Alice and Bob both like Beyoncé, that alone hardly singles them out for each other. But if they both like Bishop Allen, that makes them at least a little bit more likely to be potential soul mates. If they’re both fans of a band the learner does not know about, that’s even better, but only a relational algorithm like Alchemy can pick it up. The better the learner, the more it’s worth your time to teach it about you. But as a rule of thumb, you want to differentiate yourself enough so that it won’t confuse you with the “average person” (remember Bob Burns from Chapter 8), but not be so unusual that it can’t fathom you.
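To make the idea concrete, here is a minimal sketch, not anything a real dating site is known to run, of rarity-weighted matching: shared tastes count for more when fewer people share them. The profiles, names, and weighting scheme below are invented purely for illustration.

```python
from collections import Counter
from math import log

# Toy profiles; attributes and names are made up for illustration.
profiles = {
    "Alice": {"Beyoncé", "Bishop Allen", "hiking"},
    "Bob":   {"Beyoncé", "Bishop Allen", "chess"},
    "Carol": {"Beyoncé", "running"},
}

# How many profiles mention each attribute?
counts = Counter(a for likes in profiles.values() for a in likes)
total = len(profiles)

def affinity(x, y):
    """Sum of rarity weights over shared attributes: rarer tastes weigh more."""
    shared = profiles[x] & profiles[y]
    return sum(log(total / counts[a]) for a in shared)

print(affinity("Alice", "Bob"))    # Bishop Allen, shared by few, dominates the score
print(affinity("Alice", "Carol"))  # Beyoncé, liked by everyone here, contributes zero
```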

Online dating is in fact a tough example because chemistry is hard to predict. Two people who hit it off on a date may wind up falling in love and believing passionately that they were made for each other, but if their initial conversation takes a different turn, they might instead find each other annoying and never want to meet again. What a really sophisticated learner would do is run a thousand Monte Carlo simulations of a date between each pair of plausible matches and rank the matches by the fraction of dates that turned out well. Short of that, dating sites can organize parties and invite people who are each a likely match for many of the others, letting them accomplish in a few hours what would otherwise take weeks.
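Short of a real simulator of human chemistry, which no one has, the ranking scheme just described is easy to sketch. In the toy version below, simulate_date is a stand-in biased coin flip and the chemistry numbers are invented; the point is only the Monte Carlo bookkeeping.

```python
import random

def simulate_date(pair, chemistry):
    """Stand-in date simulator: the date goes well with probability chemistry[pair]."""
    return random.random() < chemistry[pair]

def rank_matches(pairs, chemistry, n_sims=1000):
    """Rank candidate pairs by the fraction of simulated dates that went well."""
    scores = {pair: sum(simulate_date(pair, chemistry) for _ in range(n_sims)) / n_sims
              for pair in pairs}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

chemistry = {("you", "Alice"): 0.7, ("you", "Bob"): 0.4}   # made-up numbers
print(rank_matches(chemistry.keys(), chemistry))
```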

For those of us who are not keen on online dating, a more immediately useful idea is to be deliberate about which interactions you record and where. If you don’t want your Christmas shopping to leave Amazon confused about your tastes, do it on other sites. (Sorry, Amazon.) If you watch different kinds of videos at home and for work, keep two accounts on YouTube, one for each, and YouTube will learn to make the corresponding recommendations. And if you’re about to watch some videos of a kind that you ordinarily have no interest in, log out first. Use Chrome’s incognito mode not for guilty browsing (which you’d never do, of course) but for when you don’t want the current session to influence future personalization. On Netflix, adding profiles for the different people using your account will spare you R-rated recommendations on family movie night. If you don’t like a company, click on their ads: this will not only waste their money now but also teach Google to keep wasting it in the future by showing the ads to people who are unlikely to buy the products. And if you have very specific queries that you want Google to answer correctly in the future, take a moment to trawl through the later results pages for the relevant links and click on them. More generally, if a system keeps recommending the wrong things to you, try teaching it by finding and clicking on a bunch of the right ones, then come back later to see if the lesson took.

That could be a lot of work, though. What all of these illustrate, unfortunately, is how narrow the communication channel between you and the learner is today. You should be able to tell it as much as you want about yourself, not just have it learn indirectly from what you do. More than that, you should be able to inspect the learner’s model of you and correct it as desired. The learner can still decide to ignore you, if it thinks you’re lying or are low on self-knowledge, but at least it would be able to take your input into account. For this, the model needs to be in a form that humans can understand, such as a set of rules rather than a neural network, and it needs to accept general statements as input in addition to raw data, as Alchemy does. All of which brings us to the question of how good a model of you a learner can have and what you’d want to do with that model.


The digital mirror

Take a moment to consider all the data about you that’s recorded on all the world’s computers: your e-mails, Office docs, texts, tweets, and Facebook and LinkedIn accounts; your web searches, clicks, downloads, and purchases; your credit, tax, phone, and health records; your Fitbit statistics; your driving as recorded by your car’s microprocessors; your wanderings as recorded by your cell phone; all the pictures of you ever taken; brief cameos on security cameras; your Google Glass snippets-and so on and so forth. If a future biographer had access to nothing but this “data exhaust” of yours, what picture of you would he form? Probably a quite accurate and detailed one in many ways, but also one where some essential things would be missing. Why did you, one beautiful day, decide to change careers? Could the biographer have predicted it ahead of time? What about that person you met one day and secretly never forgot? Could the biographer wind back through the found footage and say “Ah, there”?

The sobering (or perhaps reassuring) thought is that no learner in the world today has access to all this data (not even the NSA), and even if it did, it wouldn’t know how to turn it into a real likeness of you. But suppose you took all your data and gave it to the real, future Master Algorithm, already seeded with everything we could teach it about human life. It would learn a model of you, and you could carry that model in a thumb drive in your pocket, inspect it at will, and use it for everything you pleased. It would surely be a wonderful tool for introspection, like looking at yourself in the mirror, but it would be a digital mirror that showed not just your looks but all things observable about you-a mirror that could come alive and converse with you. What would you ask it? Some of the answers you might not like, but that would be all the more reason to ponder them. And some would give you new ideas, new directions. The Master Algorithm’s model of you might even help you become a better person.

Self-improvement aside, probably the first thing you’d want your model to do is negotiate the world on your behalf: let it loose in cyberspace, looking for all sorts of things for you. From all the world’s books, it would suggest a dozen you might want to read next, with more insight than Amazon could dream of. Likewise for movies, music, games, clothes, electronics-you name it. It would keep your refrigerator stocked at all times, natch. It would filter your e-mail, voice mail, Facebook posts, and Twitter feed and, when appropriate, reply on your behalf. It would take care of all the little annoyances of modern life for you, like checking credit-card bills, disputing improper charges, making arrangements, renewing subscriptions, and filling out tax returns. It would find a remedy for your ailment, run it by your doctor, and order it from Walgreens. It would bring interesting job opportunities to your attention, propose vacation spots, suggest which candidates to vote for on the ballot, and screen potential dates. And, after the match was made, it would team up with your date’s model to pick some restaurants you might both like. Which is where things start to get really interesting.


A society of models

In this rapidly approaching future, you’re not going to be the only one with a “digital half” doing your bidding twenty-four hours a day. Everyone will have a detailed model of him- or herself, and these models will talk to each other all the time. If you’re looking for a job and company X is looking to hire, its model will interview your model. It will be a lot like a real, flesh-and-blood interview-your model will still be well advised to not volunteer negative information about you, and so on-but it will take only a fraction of a second. You’ll click on “Find Job” in your future LinkedIn account, and you’ll immediately interview for every job in the universe that remotely fits your parameters (profession, location, pay, etc.). LinkedIn will respond on the spot with a ranked list of the best prospects, and out of those, you’ll pick the first company that you want to have a chat with. Same with dating: your model will go on millions of dates so you don’t have to, and come Saturday, you’ll meet your top prospects at an OkCupid-organized party, knowing that you’re also one of their top prospects-and knowing, of course, that their other top prospects are also in the room. It’s sure to be an interesting night.

In the world of the Master Algorithm, “my people will call your people” becomes “my program will call your program.” Everyone has an entourage of bots, smoothing his or her way through the world. Deals get pitched, terms negotiated, arrangements made, all before you lift a finger. Today, drug companies target your doctor, because he decides what drugs to prescribe to you. Tomorrow, the purveyors of every product and service you use, or might use, will target your model, because your model will screen them for you. Their bots’ job is to get your bot to buy. Your bot’s job is to see through their claims, just as you see through TV commercials, but at a much finer level of detail, one that you’d never have the time or patience for. Before you buy a car, the digital you will go over every one of its specs, discuss them with the manufacturer, and study everything anyone in the world has said about that car and its alternatives. Your digital half will be like power steering for your life: it goes where you want to go but with less effort from you. This does not mean that you’ll end up in a “filter bubble,” seeing only what you reliably like, with no room for the unexpected; the digital you knows better than that. Part of its brief is to leave some things open to chance, to expose you to new experiences, and to look for serendipity.

Even more interesting, the process doesn’t end when you find a car, a house, a doctor, a date, or a job. Your digital half is continually learning from its experiences, just as you would. It figures out what works and doesn’t, whether it’s in job interviews, dating, or real-estate hunting. It learns about the people and organizations it interacts with on your behalf and then (even more important) from your real-world interactions with them. It predicted Alice would be a great date for you, but you had an awkward time, so it hypothesizes possible reasons, which it will test on your next round of dating. It shares its most important findings with you. (“You believe you like X, but in reality you tend to go for Y.”) Comparing your experiences of various hotels with their reviews on TripAdvisor, it figures out what the really telling tidbits are and looks for them in the future. It learns not just which online merchants are more trustworthy but how to decode what the less trustworthy ones say. Your digital half has a model of the world: not just of the world in general but of the world as it relates to you. At the same time, of course, everyone else also has a continually evolving model of his or her world. Every party to an interaction learns from it and applies what it’s learned to its next interactions. You have your model of every person and organization you interact with, and they each have their model of you. As the models improve, their interactions become more and more like the ones you would have in the real world-except millions of times faster and in silicon. Tomorrow’s cyberspace will be a vast parallel world that selects only the most promising things to try out in the real one. It will be like a new, global subconscious, the collective id of the human race.


To share or not to share, and how and where

Of course, learning about the world all by yourself is slow, even if your digital half does it orders of magnitude faster than the flesh-and-blood you. If others learn about you faster than you learn about them, you’re in trouble. The answer is to share: a million people learn about a company or product a lot faster than a single one does, provided they pool their experiences. But who should you share data with? That’s perhaps the most important question of the twenty-first century.

Today your data can be of four kinds: data you share with everyone, data you share with friends or coworkers, data you share with various companies (wittingly or not), and data you don’t share. The first type includes things like Yelp, Amazon, and TripAdvisor reviews, eBay feedback scores, LinkedIn résumés, blogs, tweets, and so on. This data is very valuable and is the least problematic of the four. You make it available to everyone because you want to, and everyone benefits. The only problem is that the companies hosting the data don’t necessarily allow it to be downloaded in bulk for building models. They should. Today you can go to TripAdvisor and see the reviews and star ratings of particular hotels you’re considering, but what about a model of what makes a hotel good or bad in general, which you could use to rate hotels that currently have few or no reliable reviews? TripAdvisor could learn it, but what about a model of what makes a hotel good or bad for you? This requires information about you that you may not want to share with TripAdvisor. What you’d like is a trusted party that combines the two types of data and gives you the results.

The second kind of data should also be unproblematic, but it isn’t because it overlaps with the third. You share updates and pictures with your friends on Facebook, and they with you. But everyone shares their updates and pictures with Facebook. Lucky Facebook: it has a billion friends. Day by day, it learns a lot more about the world than any one person does. It would learn even more if it had better algorithms, and they are getting better every day, courtesy of us data scientists. Facebook’s main use for all this knowledge is to target ads to you. In return, it provides the infrastructure for your sharing. That’s the bargain you make when you use Facebook. As its learning algorithms improve, it gets more and more value out of the data, and some of that value returns to you in the form of more relevant ads and better service. The only problem is that Facebook is also free to do things with the data and the models that are not in your interest, and you have no way to stop it.

This problem pops up across the board with data you share with companies, which these days includes pretty much everything you do online as well as a lot of what you do offline. In case you haven’t noticed, there’s a mad race to gather data about you. Everybody loves your data, and no wonder: it’s the gateway to your world, your money, your vote, even your heart. But everyone has only a sliver of it. Google sees your searches, Amazon your online purchases, AT&T your phone calls, Apple your music downloads, Safeway your groceries, Capital One your credit-card transactions. Companies like Acxiom collate and sell information about you, but if you inspect it (which in Acxiom’s case you can, at aboutthedata.com), it’s not much, and some of it is wrong. No one has anything even approaching a complete picture of you. That’s both good and bad. Good because if someone did, they’d have far too much power. Bad because as long as that’s the case there can be no 360-degree model of you. What you really want is a digital you that you’re the sole owner of and that others can access only on your terms.

The last type of data-data you don’t share-also has a problem, which is that maybe you should share it. Maybe it hasn’t occurred to you to do so, maybe there’s no easy way to, or maybe you just don’t want to. In the last case, you should consider whether you have an ethical responsibility to share. One example we’ve seen is cancer patients, who can contribute to curing cancer by sharing their tumors’ genomes and treatment histories. But it goes well beyond that. All sorts of questions about society and policy can potentially be answered by learning from the data we generate in our daily lives. Social science is entering a golden age, where it finally has data commensurate with the complexity of the phenomena it studies, and the benefits to all of us could be enormous-provided the data is accessible to researchers, policy makers, and citizens. This does not mean letting others peek into your private life; it means letting them see the learned models, which should contain only statistical information. So between you and them there needs to be an honest data broker that guarantees your data won’t be misused, but also that no free riders share the benefits without sharing the data.

In sum, all four kinds of data sharing have problems. These problems all have a common solution: a new type of company that is to your data what your bank is to your money. Banks don’t steal your money (with rare exceptions). They’re supposed to invest it wisely, and your deposits are FDIC-insured. Many companies today offer to consolidate your data somewhere in the cloud, but they’re still a far cry from your personal data bank. If they’re cloud providers, they try to lock you in-a big no-no. (Imagine depositing your money with Bank of America and not knowing if you’ll be able to transfer it to Wells Fargo somewhere down the line.) Some startups offer to hoard your data and then mete it out to advertisers in return for discounts, but to me that misses the point. Sometimes you want to give information to advertisers for free because it’s in your interests, sometimes you don’t want to give it at all, and what to share when is a problem that only a good model of you can solve.

The kind of company I’m envisaging would do several things in return for a subscription fee. It would anonymize your online interactions, routing them through its servers and aggregating them with its other users’. It would store all the data from all your life in one place-down to your 24/7 Google Glass video stream, if you ever get one. It would learn a complete model of you and your world and continually update it. And it would use the model on your behalf, always doing exactly what you would, to the best of the model’s ability. The company’s basic commitment to you is that your data and your model will never be used against your interests. Such a guarantee can never be foolproof-you yourself are not guaranteed to never do anything against your interests, after all. But the company’s life would depend on it as much as a bank’s depends on the guarantee that it won’t lose your money, so you should be able to trust it as much as you trust your bank.

A company like this could quickly become one of the most valuable in the world. As Alexis Madrigal of the Atlantic points out, today your profile can be bought for half a cent or less, but the value of a user to the Internet advertising industry is more like $1,200 per year. Google’s sliver of your data is worth about $20, Facebook’s $5, and so on. Add to that all the slivers that no one has yet, and the fact that the whole is more than the sum of the parts-a model of you based on all your data is much better than a thousand models based on a thousand slivers-and we’re looking at easily over a trillion dollars per year for an economy the size of the United States. It doesn’t take a large cut of that to make a Fortune 500 company. If you decide to take up the challenge and wind up becoming a billionaire, remember where you first got the idea.

Of course, some existing companies would love to host the digital you. Google, for example. Sergey Brin says that “we want Google to be the third half of your brain,” and some of Google’s acquisitions are probably not unrelated to how well their streams of user data complement its own. But, despite their head start, companies like Google and Facebook are not well suited to being your digital home because they have a conflict of interest. They earn a living by targeting ads, and so they have to balance your interests and the advertisers’. You wouldn’t let the first or second half of your brain have divided loyalties, so why would you let the third?

One possible showstopper is that the government may subpoena your data or even preventively jail you, Minority Report-style, if your model looks like a criminal’s. To forestall that, your data company can keep everything encrypted, with the key in your possession. (These days you can even compute over encrypted data without ever decrypting it.) Or you can keep it all in your hard disk at home, and the company just rents you the software.
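As a toy illustration of that parenthetical point, some encryption schemes let you operate on ciphertexts directly. The sketch below uses textbook RSA (the standard tiny example key, which is multiplicatively homomorphic and wildly insecure) only because the property is easy to show; real systems use purpose-built homomorphic encryption.

```python
# Toy demonstration only: textbook RSA is multiplicatively homomorphic.
# Tiny textbook key (p=61, q=53); never use anything like this in practice.
n, e, d = 3233, 17, 413

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

c1, c2 = encrypt(6), encrypt(7)
product_ciphertext = (c1 * c2) % n   # computed without ever decrypting
print(decrypt(product_ciphertext))   # 42
```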

If you don’t like the idea of a profit-making entity holding the keys to your kingdom, you can join a data union instead. (If there isn’t one in your neck of the cyberwoods yet, consider starting it.) The twentieth century needed labor unions to balance the power of workers and bosses. The twenty-first needs data unions for a similar reason. Corporations have a vastly greater ability to gather and use data than individuals. This leads to an asymmetry in power, and the more valuable the data-the better and more useful the models that can be learned from it-the greater the asymmetry. A data union lets its members bargain on equal terms with companies about the use of their data. Perhaps labor unions can get the ball rolling, and shore up their membership, by starting data unions for their members. But labor unions are organized by occupation and location; data unions can be more flexible. Join up with people you have a lot in common with; the models learned will be more useful to you that way. Notice that being in a data union does not mean letting other members see your data; it just means letting everyone use the models learned from the pooled data. Data unions can also be your vehicle for telling politicians what you want. Your data can influence the world as much as your vote-or more-because you only go to the polls on election day. On all other days, your data is your vote. Stand up and be counted!

So far I haven’t uttered the word privacy. That’s not by accident. Privacy is only one aspect of the larger issue of data sharing, and if we focus on it to the detriment of the whole, as much of the debate to date has, we risk reaching the wrong conclusions. For example, laws that forbid using data for any purpose other than the originally intended one are extremely myopic. (Not a single chapter of Freakonomics could have been written under such a law.) When people have to trade off privacy against other benefits, as when filling out a profile on a website, the implied value of privacy that comes out is much lower than if you ask them abstract questions like “Do you care about your privacy?” But privacy debates are more often framed in terms of the latter. The European Union’s Court of Justice has decreed that people have the right to be forgotten, but they also have the right to remember, whether it’s with their neurons or a hard disk. So do companies, and up to a point, the interests of users, data gatherers, and advertisers are aligned. Wasted attention benefits no one, and better data makes better products. Privacy is not a zero-sum game, even though it’s often treated like one.

Companies that host the digital you and data unions are what a mature future of data in society looks like to me. Whether we’ll get there is an open question. Today, most people are unaware of both how much data about them is being gathered and what the potential costs and benefits are. Companies seem content to continue doing it under the radar, terrified of a blowup. But sooner or later a blowup will happen, and in the ensuing fracas, draconian laws will be passed that in the end will serve no one. Better to foster awareness now and let everyone make their individual choices about what to share, what not, and how and where.


A neural network stole my job

How much of your brain does your job use? The more it does, the safer you are. In the early days of AI, the common view was that computers would replace blue-collar workers before white-collar ones, because white-collar work requires more brains. But that’s not quite how things turned out. Robots assemble cars, but they haven’t replaced construction workers. On the other hand, machine-learning algorithms have replaced credit analysts and direct marketers. As it turns out, evaluating credit applications is easier for machines than walking around a construction site without tripping, even though for humans it’s the other way around. The common theme is that narrowly defined tasks are easily learned from data, but tasks that require a broad combination of skills and knowledge aren’t. Most of your brain is devoted to vision and motion, which is a sign that walking around is much more complex than it seems; we just take it for granted because, having been honed to perfection by evolution, it’s mostly done subconsciously. The company Narrative Science has an AI system that can write pretty good summaries of baseball games, but not novels, because-pace George F. Will-there’s a lot more to life than to baseball games. Speech recognition is hard for computers because it’s hard to fill in the blanks-literally, the sounds speakers routinely elide-when you have no idea what the person is talking about. Algorithms can predict stock fluctuations but have no clue how they relate to politics. The more context a job requires, the less likely a computer will be able to do it soon. Common sense is important not just because your mom taught you so, but because computers don’t have it.

The best way to not lose your job is to automate it yourself. Then you’ll have time for all the parts of it that you didn’t before and that a computer won’t be able to do any time soon. (If there aren’t any, stay ahead of the curve and get a new job now.) If a computer has learned to do your job, don’t try to compete with it; harness it. H&R Block is still in business, but tax preparers’ jobs are much less dreary than they used to be, now that computers do most of the grunge work. (OK, perhaps this is not the best example, given that the tax code’s exponential growth is one of the few that can hold its own against computing power’s exponential growth.) Think of big data as an extension of your senses and learning algorithms as an extension of your brain. The best chess players these days are so-called centaurs, half-man and half-program. The same is true in many other occupations, from stock analyst to baseball scout. It’s not man versus machine; it’s man with machine versus man without. Data and intuition are like horse and rider, and you don’t try to outrun a horse; you ride it.

As technology progresses, an ever more intimate mix of human and machine takes shape. You’re hungry; Yelp suggests some good restaurants. You pick one; GPS gives you directions. You drive; car electronics does the low-level control. We are all cyborgs already. The real story of automation is not what it replaces but what it enables. Some professions disappear, but many more are born. Most of all, automation makes all sorts of things possible that would be way too expensive if done by humans. ATMs replaced some bank tellers, but mainly they let us withdraw money any time, anywhere. If pixels had to be colored one at a time by human animators, there would be no Toy Story and no video games.

Still, we can ask whether we’ll eventually run out of jobs for humans. I think not. Even if the day comes-and it won’t be soon-when computers and robots can do everything better, there will still be jobs for at least some of us. A robot may be able to do a perfect impersonation of a bartender, down to the small talk, but patrons may still prefer a bartender they know is human, just because he is. Restaurants with human waiters will have extra cachet, just as handmade goods already do. People still go to the theater, ride horses, and sail, even though we have movies, cars, and motorboats. More importantly, some professionals will be truly irreplaceable because their jobs require the one thing that computers and robots by definition cannot have: the human experience. By that I don’t mean touchy-feely jobs, because touchy-feely is not hard to fake; witness the success of robo-pets. I mean the humanities, whose domain is precisely everything you can’t understand without the experience of being human. We worry that the humanities are in a death spiral, but they’ll rise from the ashes once other professions have been automated. The more everything is done cheaply by machines, the more valuable the humanist’s contribution will be.

Conversely, the long-term prospects of scientists are not the brightest, sadly. In the future, the only scientists may well be computer scientists, meaning computers doing science. The people formerly known as scientists (like me) will devote their lives to understanding the scientific advances made by computers. They won’t be noticeably less happy than before; after all, science was always a hobby to them. And one very important job for the technically minded will remain: keeping an eye on the computers. In fact, this will require more than engineers; ultimately, it may be the full-time occupation of all mankind to figure out what we want from the machines and make sure we’re getting it-more on this later in this chapter.

In the meantime, as the boundary between automatable and non-automatable jobs advances across the economic landscape, what we’ll likely see is unemployment creeping up, downward pressure on the wages of more and more professions, and increasing rewards for the fewer and fewer that can’t yet be automated. This is what’s already happening, of course, but it has much further to run. The transition will be tumultuous, but thanks to democracy, it will have a happy ending. (Hold on to your vote-it may be the most valuable thing you have.) When the unemployment rate rises above 50 percent, or even before, attitudes about redistribution will radically change. The newly unemployed majority will vote for generous lifetime unemployment benefits and the sky-high taxes needed to fund them. These won’t break the bank because machines will do the necessary production. Eventually, we’ll start talking about the employment rate instead of the unemployment one and reducing it will be seen as a sign of progress. (“The US is falling behind. Our employment rate is still 23 percent.”) Unemployment benefits will be replaced by a basic income for everyone. Those of us who aren’t satisfied with it will be able to earn more, stupendously more, in the few remaining human occupations. Liberals and conservatives will still fight about the tax rate, but the goalposts will have permanently moved. With the total value of labor greatly reduced, the wealthiest nations will be those with the highest ratio of natural resources to population. (Move to Canada now.) For those of us not working, life will not be meaningless, any more than life on a tropical island where nature’s bounty meets all needs is meaningless. A gift economy will develop, of which the open-source software movement is a preview. People will seek meaning in human relationships, self-actualization, and spirituality, much as they do now. The need to earn a living will be a distant memory, another piece of humanity’s barbaric past that we rose above.


War is not for humans

Soldiering is harder to automate than science, but it will be automated as well. One of the prime uses of robots is to do things that are too dangerous for humans, and fighting wars is about as dangerous as it gets. Robots already defuse bombs, and drones allow a platoon to see over the hill. Self-driving supply trucks and robotic mules are on the way. Soon we will need to decide whether robots are allowed to pull the trigger on their own. The argument for doing this is that we want to get humans out of harm’s way, and remote control is not viable in fast-moving, shoot-or-be-shot situations. The argument against is that robots don’t understand ethics, and so can’t be entrusted with life-or-death decisions. But we can teach them. The deeper question is whether we’re ready to.

It’s not hard to state general principles like military necessity, proportionality, and sparing civilians. But there’s a gulf between them and concrete actions, which the soldier’s judgment has to bridge. Asimov’s three laws of robotics quickly run into trouble when robots try to apply them in practice, as his stories memorably illustrate. General principles are usually contradictory, if not self-contradictory, and they have to be lest they turn all shades of gray into black and white. When does military necessity outweigh sparing civilians? There is no universal answer and no way to program a computer with all the eventualities. Machine learning, however, provides an alternative. First, teach the robot to recognize the relevant concepts, for example with data sets of situations where civilians were and were not spared, armed response was and was not proportional, and so on. Then give it a code of conduct in the form of rules involving these concepts. Finally, let the robot learn how to apply the code by observing humans: the soldier opened fire in this case but not in that case. By generalizing from these examples, the robot can learn an end-to-end model of ethical decision making, in the form of, say, a large MLN. Once the robot’s decisions agree with a human’s as often as one human agrees with another, the training is complete, meaning the model is ready for download into thousands of robot brains. Unlike humans, robots don’t lose their heads in the heat of combat. If a robot malfunctions, the manufacturer is responsible. If it makes a wrong call, its teachers are.
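As a sanity check on what such a training pipeline might look like, here is a deliberately tiny sketch that substitutes a decision tree for the Markov logic network envisioned above. The concept features, the situations, and the human decisions are all invented; the point is only that the model generalizes from labeled examples of human judgment, and that sparse data leaves gaps.

```python
from sklearn.tree import DecisionTreeClassifier

# Each situation is reduced to concept features the robot has been taught to
# recognize: [target_is_armed, civilians_nearby, imminent_threat]. Invented data.
situations = [
    [1, 0, 1],   # armed, no civilians, imminent threat
    [1, 1, 1],   # armed, civilians nearby, imminent threat
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],   # armed, but no imminent threat
]
human_decisions = [1, 0, 0, 0, 0]    # 1 = the human soldier opened fire

model = DecisionTreeClassifier(random_state=0).fit(situations, human_decisions)

print(model.predict([[1, 1, 0]]))    # no imminent threat -> holds fire
print(model.predict([[0, 0, 1]]))    # unarmed but "imminent threat" -> fires:
                                     # nothing in this sparse data ever taught it
                                     # that the target must be armed
```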

The main problem with this scenario, as you may have already guessed, is that letting robots learn ethics by observing humans may not be such a good idea. The robot is liable to get seriously confused when it sees that humans’ actions often violate their ethical principles. We can clean up the training data by including only the examples where, say, a panel of ethicists agrees that the soldier made the right decision, and the panelists can also inspect and tweak the model post-learning to their satisfaction. Agreement may be hard to reach, however, particularly if the panel includes all the different kinds of people it should. Teaching ethics to robots, with their logical minds and lack of baggage, will force us to examine our assumptions and sort out our contradictions. In this, as in many other areas, the greatest benefit of machine learning may ultimately be not what the machines learn but what we learn by teaching them.

Another objection to robot armies is that they make war too easy. But if we unilaterally relinquish them, that could cost us the next war. The logical response, advocated by the United Nations and Human Rights Watch, is a treaty banning robot warfare, similar to the Geneva Protocol of 1925 banning chemical and biological warfare. This misses a crucial distinction, however. Chemical and biological warfare can only increase human suffering, but robot warfare can greatly decrease it. If a war is fought by machines, with humans only in command positions, no one is killed or wounded. Perhaps, then, what we should do, instead of outlawing robot soldiers, is-when we’re ready-outlaw human soldiers.

Robot armies may indeed make wars more likely, but they will also change the ethics of war. Shoot/don’t shoot dilemmas become much easier if the targets are other robots. The modern view of war as an unspeakable horror, to be engaged in only as a last resort, will give way to a more nuanced view of war as an orgy of destruction that leaves all sides impoverished and is best avoided but not at all costs. And if war is reduced to a competition to see who can destroy the most, then why not compete instead to create the most?

In any case, banning robot warfare may not be viable. Far from banning drones-the precursors of tomorrow’s warbots-countries large and small are busy developing them, presumably because in their estimation the benefits outweigh the risks. As with any weapon, it’s safer to have robots than to trust the other side not to. If, in future wars, millions of kamikaze drones can destroy conventional armies in minutes, they’d better be our drones. If World War III is going to be over in seconds, as one side takes control of the other’s systems, we’d better have the smarter, faster, more resilient network. (Off-grid systems are not the answer: systems that aren’t networked can’t be hacked, but they can’t compete with networked systems, either.) And, on balance, a robot arms race may be a good thing, if it hastens the day when the Fifth Geneva Convention bans humans in combat. War will always be with us, but the casualties of war need not be.


Google + Master Algorithm = Skynet?

Of course, robot armies also raise a whole different specter. According to Hollywood, the future of humanity is to be snuffed out by a gargantuan AI and its vast army of machine minions. (Unless, of course, a plucky hero saves the day in the last five minutes of the movie.) Google already has the gargantuan hardware such an AI would need, and it’s recently acquired an army of robotics startups to go with it. If we drop the Master Algorithm into its servers, is it game over for humanity? Why yes, of course. It’s time to reveal my true agenda, with apologies to Tolkien:

Three Algorithms for the Scientists under the sky,
Seven for the Engineers in their halls of servers,
Nine for Mortal Businesses doomed to die,
One for the Dark AI on its dark throne,
In the Land of Learning where the Data lies.
One Algorithm to rule them all, One Algorithm to find them,
One Algorithm to bring them all and in the darkness bind them,
In the Land of Learning where the Data lies.

Hahahaha! Seriously, though, should we worry that machines will take over? The signs seem ominous. With every passing year, computers don’t just do more of the world’s work; they make more of the decisions. Who gets credit, who buys what, who gets what job and what raise, which stocks will go up and down, how much insurance costs, where police officers patrol and therefore who gets arrested, how long their prison terms will be, who dates whom and therefore who will be born: machine-learned models already play a part in all of these. The point where we could turn off all our computers without causing the collapse of modern civilization has long passed. Machine learning is the last straw: if computers can start programming themselves, all hope of controlling them is surely lost. Distinguished scientists like Stephen Hawking have called for urgent research on this issue before it’s too late.

Relax. The chances that an AI equipped with the Master Algorithm will take over the world are zero. The reason is simple: unlike humans, computers don’t have a will of their own. They’re products of engineering, not evolution. Even an infinitely powerful computer would still be only an extension of our will and nothing to fear. Recall the three components of every learning algorithm: representation, evaluation, and optimization. The learner’s representation circumscribes what it can learn. Let’s make it a very powerful one, like Markov logic, so the learner can in principle learn anything. The optimizer then does everything in its power to maximize the evaluation function-no more and no less-and the evaluation function is determined by us. A more powerful computer will just optimize it better. There’s no risk of it getting out of control, even if it’s a genetic algorithm. A learned system that didn’t do what we want would be severely unfit and soon die out. In fact, it’s the systems that have even a slight edge in serving us better that will, generation after generation, multiply and take over the gene pool. Of course, if we’re so foolish as to deliberately program a computer to put itself above us, then maybe we’ll get what we deserve.
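To restate the argument in code-shaped form, here is a bare-bones sketch in which every name is illustrative: the representation is whatever set of candidate models you hand the learner, the evaluation function is chosen by us, and the optimizer does nothing but search for a high-scoring model.

```python
import random

def learn(candidate_models, evaluate, n_steps=10000):
    """Representation: candidate_models (what can be expressed).
    Evaluation: evaluate (what counts as good -- set by the designers).
    Optimization: this loop (here, crude random search)."""
    best, best_score = None, float("-inf")
    for _ in range(n_steps):
        model = random.choice(candidate_models)
        score = evaluate(model)          # the only goal the learner ever pursues
        if score > best_score:
            best, best_score = model, score
    return best

# Toy use: find a threshold that separates two classes of one-dimensional points.
data = [(0.2, 0), (0.3, 0), (0.7, 1), (0.9, 1)]
thresholds = [i / 10 for i in range(11)]
accuracy = lambda t: sum((x > t) == bool(y) for x, y in data) / len(data)
print(learn(thresholds, accuracy))       # prints a separating threshold (0.3-0.6)
```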

The same reasoning applies to all AI systems because they all-explicitly or implicitly-have the same three components. They can vary what they do, even come up with surprising plans, but only in service of the goals we set them. A robot whose programmed goal is “make a good dinner” may decide to cook a steak, a bouillabaisse, or even a delicious new dish of its own creation, but it can’t decide to murder its owner any more than a car can decide to fly away. The purpose of AI systems is to solve NP-complete problems, which, as you may recall from Chapter 2, may take exponential time, but the solutions can always be checked efficiently. We should therefore welcome with open arms computers that are vastly more powerful than our brains, safe in the knowledge that our job is exponentially easier than theirs. They have to solve the problems; we just have to check that they did so to our satisfaction. AIs will think fast what we think slow, and the world will be the better for it. I, for one, welcome our new robot underlings.

It’s natural to worry about intelligent machines taking over because the only intelligent entities we know are humans and other animals, and they definitely have a will of their own. But there is no necessary connection between intelligence and autonomous will; or rather, intelligence and will may not inhabit the same body, provided there is a line of control between them. In The Extended Phenotype, Richard Dawkins shows how nature is replete with examples of an animal’s genes controlling more than its own body, from cuckoo eggs to beaver dams. Technology is the extended phenotype of man. This means we can continue to control it even if it becomes far more complex than we can understand.

Picture two strands of DNA going for a swim in their private pool, aka a bacterium’s cytoplasm, two billion years ago. They’re pondering a momentous decision. “I’m worried, Diana,” says one. “If we start making multicellular creatures, will they take over?” Fast-forward to the twenty-first century, and DNA is still alive and well. Better than ever, in fact, with an increasing fraction living safely in bipedal organisms comprising trillions of cells. It’s been quite a ride for our tiny double-stranded friends since they made their momentous decision. Humans are their trickiest creation yet; we’ve invented things like contraception that let us have fun without spreading our DNA, and we have-or seem to have-free will. But it’s still DNA that shapes our notions of fun, and we use our free will to pursue pleasure and avoid pain, which, for the most part, still coincides with what’s best for our DNA’s survival. We may yet be DNA’s demise if we choose to transmute ourselves into silicon, but even then, it’s been a great two billion years. The decision we face today is similar: if we start making AIs-vast, interconnected, superhuman, unfathomable AIs-will they take over? Not any more than multicellular organisms took over from genes, vast and unfathomable as we may be to them. AIs are our survival machines, in the same way that we are our genes’.

This does not mean that there is nothing to worry about, however. The first big worry, as with any technology, is that AI could fall into the wrong hands. If a criminal or prankster programs an AI to take over the world, we’d better have an AI police capable of catching it and erasing it before it gets too far. The best insurance policy against vast AIs gone amok is vaster AIs keeping the peace.

The second worry is that humans will voluntarily surrender control. It starts with robot rights, which seem absurd to me but not to everyone. After all, we already give rights to animals, who never asked for them. Robot rights might seem like the logical next step in expanding the “circle of empathy.” Feeling empathy for robots is not hard, particularly if they’re designed to elicit it. Even Tamagotchi, Japanese “virtual pets” with all of three buttons and an LCD screen, do it quite successfully. The first humanoid consumer robot will set off a race to make more and more empathy-eliciting robots, because they’ll sell much better than the plain metal variety. Children raised by robot nannies will have a lifelong soft spot for kindly electronic friends. The “uncanny valley”-our discomfort with robots that are almost human but not quite-will be unknown to them because they grew up with robot mannerisms and maybe even adopted them as cool teenagers.

The next step in the insidious progression of AI control is letting them make all the decisions because they’re, well, so much smarter. Beware. They may be smarter, but they’re in the service of whoever designed their score functions. This is the “Wizard of Oz” problem. Your job in a world of intelligent machines is to keep making sure they do what you want, both at the input (setting the goals) and at the output (checking that you got what you asked for). If you don’t, somebody else will. Machines can help us figure out collectively what we want, but if you don’t participate, you lose out-just like democracy, only more so. Contrary to what we like to believe today, humans quite easily fall into obeying others, and any sufficiently advanced AI is indistinguishable from God. People won’t necessarily mind taking their marching orders from some vast oracular computer; the question is who oversees the overseer. Is AI the road to a more perfect democracy or to a more insidious dictatorship? The eternal vigil has just begun.

The third and perhaps biggest worry is that, like the proverbial genie, the machines will give us what we ask for instead of what we want. This is not a hypothetical scenario; learning algorithms do it all the time. We train a neural network to recognize horses, but it learns instead to recognize brown patches, because all the horses in its training set happened to be brown. You just bought a watch, so Amazon recommends similar items: other watches, which are now the last thing you want to buy. If you examine all the decisions that computers make today-who gets credit, for example-you’ll find that they’re often needlessly bad. Yours would be too, if your brain were a support vector machine and all your knowledge of credit scoring came from perusing one lousy database. People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world.
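A concrete miniature of the horse example: the simplistic learner below just picks the single feature that best separates its training data, and because every training horse happens to be brown, it settles on color. The features and examples are contrived, but the failure mode is the real one.

```python
# Features: (is_brown, has_four_legs, has_mane); label 1 = horse. Contrived data.
train = [
    ((1, 1, 1), 1), ((1, 1, 1), 1), ((1, 1, 1), 1),   # three brown horses
    ((0, 1, 1), 0),                                   # a grey lion
    ((0, 0, 0), 0),                                   # a white wall
]

def best_single_feature(data):
    """Pick the one feature whose value agrees most often with the label."""
    n_features = len(data[0][0])
    accuracy = lambda f: sum(x[f] == y for x, y in data) / len(data)
    return max(range(n_features), key=accuracy)

f = best_single_feature(train)
print("learned rule: horse iff feature", f, "is 1")        # feature 0: is_brown
print("grey horse (0,1,1) -> horse?", bool((0, 1, 1)[f]))  # False (missed)
print("brown sofa (1,0,0) -> horse?", bool((1, 0, 0)[f]))  # True (false alarm)
```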


Evolution, part 2

Even if computers today are still not terribly smart, there’s no doubt that their intelligence is rapidly increasing. As early as 1965, I. J. Good, a British statistician and Alan Turing’s sidekick on the World War II Enigma code-breaking project, speculated on a coming intelligence explosion. Good pointed out that if we can design machines that are more intelligent than us, they should in turn be able to design machines that are more intelligent than them, and so on ad infinitum, leaving human intelligence far behind. In a 1993 essay, Vernor Vinge christened this “the Singularity.” The concept has been popularized most of all by Ray Kurzweil, who argues in The Singularity Is Near that not only is the Singularity inevitable, but the point where machine intelligence exceeds human intelligence-let’s call it the Turing point-will arrive within the next few decades.

Clearly, without machine learning-programs that design programs-the Singularity cannot happen. We also need sufficiently powerful hardware, but that’s coming along nicely. We’ll reach the Turing point soon after we invent the Master Algorithm. (I’m willing to bet Kurzweil a bottle of Dom Pérignon that this will happen before we reverse engineer the brain, his method of choice for bringing about human-level AI.) Pace Kurzweil, this will not, however, lead to the Singularity. It will lead to something much more interesting.

The term singularity comes from mathematics, where it denotes a point at which a function becomes infinite. For example, the function 1/x has a singularity when x is 0, because 1 divided by 0 is infinity. In physics, the quintessential example of a singularity is a black hole: a point of infinite density, where a finite amount of matter is crammed into infinitesimal space. The only problem with singularities is that they don’t really exist. (When did you last divide a cake among zero people, and each one got an infinite slice?) In physics, if a theory predicts something is infinite, something’s wrong with the theory. Case in point, general relativity presumably predicts that black holes have infinite density because it ignores quantum effects. Likewise, intelligence cannot continue to increase forever. Kurzweil acknowledges this, but points to a series of exponential curves in technology improvement (processor speed, memory capacity, etc.) and argues that the limits to this growth are so far away that we need not concern ourselves with them.

Kurzweil is overfitting. He correctly faults other people for always extrapolating linearly-seeing straight lines instead of curves-but then falls prey to a more exotic malady: seeing exponentials everywhere. In curves that are flat-nothing happening-he sees exponentials that have not taken off yet. But technology improvement curves are not exponentials; they are S curves, our good friends from Chapter 4. The early part of an S curve is easy to mistake for an exponential, but then they quickly diverge. Most of Kurzweil’s curves are consequences of Moore’s law, which is on its last legs. Kurzweil argues that other technologies will take the place of semiconductors and S curve will pile on S curve, each steeper than the previous one, but this is speculation. He goes even further to claim that the entire history of life on Earth, not just human technology, shows exponentially accelerating progress, but this perception is at least partly due to a parallax effect: things that are closer seem to move faster. Trilobites in the heat of the Cambrian explosion could be forgiven for believing in exponentially accelerating progress, but then there was a big slowdown. A Tyrannosaurus Ray would probably have proposed a law of accelerating body size. Eukaryotes (us) evolve more slowly than prokaryotes (bacteria). Far from accelerating smoothly, evolution proceeds in fits and starts.
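The distinction is easy to see numerically. The sketch below, with an arbitrary growth rate and ceiling, prints an exponential next to a logistic S curve that grows at the same early rate: the two are nearly indistinguishable at first, then the exponential explodes while the S curve flattens out.

```python
import math

def exponential(t, r=0.1):
    return math.exp(r * t)

def s_curve(t, r=0.1, k=1000.0):
    """Logistic curve starting at 1: grows like e^(r*t) while far below ceiling k."""
    return k / (1.0 + (k - 1.0) * math.exp(-r * t))

for t in (0, 20, 40, 80, 160):
    print(t, round(exponential(t), 1), round(s_curve(t), 1))
# Early on the two track each other closely; later the exponential keeps
# exploding while the S curve saturates toward the ceiling k.
```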

To sidestep the problem that infinitely dense points don’t exist, Kurzweil proposes to instead equate the Singularity with a black hole’s event horizon, the region within which gravity is so strong that not even light can escape. Similarly, he says, the Singularity is the point beyond which technological evolution is so fast that humans cannot predict or understand what will happen. If that’s what the Singularity is, then we’re already inside it. We can’t predict in advance what a learner will come up with, and often we can’t even understand it in retrospect. As a matter of fact, we’ve always lived in a world that we only partly understood. The main difference is that our world is now partly created by us, which is surely an improvement. The world beyond the Turing point will not be incomprehensible to us, any more than the Pleistocene was. We’ll focus on what we can understand, as we always have, and call the rest random (or divine).

The trajectory we’re on is not a singularity but a phase transition. Its critical point-the Turing point-will come when machine learning overtakes the natural variety. Natural learning itself has gone through three phases: evolution, the brain, and culture. Each is a product of the previous one, and each learns faster. Machine learning is the logical next stage of this progression. Computer programs are the fastest replicators on Earth: copying them takes only a fraction of a second. But creating them is slow, if it has to be done by humans. Machine learning removes that bottleneck, leaving a final one: the speed at which humans can absorb change. This too will eventually be removed, but not because we’ll decide to hand things off to our “mind children,” as Hans Moravec calls them, and go gently into the good night. Humans are not a dying twig on the tree of life. On the contrary, we’re about to start branching.

In the same way that culture coevolved with larger brains, we will coevolve with our creations. We always have: humans would be physically different if we had not invented fire or spears. We are Homo technicus as much as Homo sapiens. But a model of the cell of the kind I envisaged in the last chapter will allow something entirely new: computers that design cells based on the parameters we give them, in the same way that silicon compilers design microchips based on their functional specifications. The corresponding DNA can then be synthesized and inserted into a “generic” cell, transforming it into the desired one. Craig Venter, the genome pioneer, has already taken the first steps in this direction. At first we will use this power to fight disease: a new pathogen is identified, the cure is immediately found, and your immune system downloads it from the Internet. Health problems become an oxymoron. Then DNA design will let people at last have the body they want, ushering in an age of affordable beauty, in William Gibson’s memorable words. And then Homo technicus will evolve into a myriad different intelligent species, each with its own niche, a whole new biosphere as different from today’s as today’s is from the primordial ocean.

Many people worry that human-directed evolution will permanently split the human race into a class of genetic haves and one of have-nots. This strikes me as a singular failure of imagination. Natural evolution did not result in just two species, one subservient to the other, but in an infinite variety of creatures and intricate ecosystems. Why would artificial evolution, building on it but less constrained, do so?

Like all phase transitions, this one will eventually taper off too. Overcoming a bottleneck does not mean the sky is the limit; it means the next bottleneck is the limit, even if we don’t see it yet. Other transitions will follow, some large, some small, some soon, some not for a long time. But the next thousand years could well be the most amazing in the life of planet Earth.
