DATA DISENFRANCHISEMENT

DAVID ROWAN

Editor, WIRED U.K.


In a big-data world, it takes an exponentially rising curve of statistics to bring home just how subjugated we now are to the data crunchers’ powers. Each day, according to IBM, we collectively generate 2.5 quintillion bytes—a tsunami of structured and unstructured data that’s growing, in the International Data Corporation’s reckoning, at 60 percent a year. Walmart drags a million hourly retail transactions into a database that long ago passed 2.5 petabytes; Facebook processes 2.5 billion pieces of content and 500 terabytes of data each day; and Google, whose YouTube division alone gains seventy-two hours of new video every minute, accumulates 24 petabytes of data in a single day. No wonder the rock star of Silicon Valley is no longer the genius software engineer but the analytically inclined, ever-more-venerated data scientist.

Certainly there are vast public benefits in the smart processing of these zetta- and yottabytes of previously unconstrained zeroes and ones. Low-cost genomics allows oncologists to target tumors ever more accurately, using the algorithmic magic of personalized medicine; real-time Bayesian analysis lets counterintelligence forces identify the bad guys, or at least attempt to, in new data-mining approaches to fighting terrorism. And let’s not forget the commercial advantages accruing to businesses that turn raw numbers into actionable information: According to the Economist Intelligence Unit, companies that use effective data analytics typically outperform their peers on stock markets by a factor of 250 percent.

Yet as our lives are swept unstoppably into the data-driven world, such benefits are being denied to a fast-emerging data underclass. Any citizen lacking a basic understanding of, and at least minimal access to, the new algorithmic tools will increasingly be disadvantaged in vast areas of economic, political, and social participation. The data-disenfranchised will find it harder to establish personal creditworthiness or political influence; they will be discriminated against by stock markets and social networks. We need to start seeing data literacy as a requisite fundamental skill in a 21st-century democracy, and to campaign—and perhaps even legislate—to protect the interests of those being left behind.

The data-disenfranchised suffer in two main ways. First, they face systemic disadvantages in markets that are nominally open to all. Take stock markets: Any human traders today bold enough to compete against the algorithms of high-frequency and low-latency traders should be made aware of how far the odds are stacked against them. As Andrei Kirilenko, the chief economist at the U.S. Commodity Futures Trading Commission, along with researchers from Princeton and the University of Washington found recently, the most aggressive high-frequency traders tend to make the greatest profits—which suggests that it would be wise for the small investor simply to leave the machines to it. It’s no coincidence that power in a swath of other sectors is accruing to those who control the algorithms—whether the Obama campaign’s electoral “microtargeters” or the yield-raising strategists of data-fueled precision agriculture.

Second, absolute power is accruing to a small number of data superminers whose influence is matched only by their lack of accountability. Your identity is increasingly what the data oligopolists say it is: Credit agencies, employers, prospective dates, even the U.S. National Security Agency have a fixed view of you based on your online data stream as channeled via search engines, social networks, and “influence” scoring sites, however inaccurate or outdated the results. And good luck trying to correct the errors or false impressions that are damaging your prospects. As disenfranchised users of services such as Instagram and Facebook have increasingly come to realize, it is those services, not you, that decide how your personal data will be used. The customer may indeed be the product, but such services should at least have a duty to inform and educate customers clearly about their lack of ownership of their digital output.

Data, as we know, is power—and as our personal metrics become ever easier to amass and store, that power needs rebalancing strongly toward us as individuals and citizens. We impeded medical progress by letting pharmaceutical companies selectively, and on occasion misleadingly, control the release of clinical trials data. In the emerging yottabyte age, let’s ensure the sovereignty of the people over the databases by holding to account those with the keys to the machine.
