The first time I heard the term “Information Age” I was tantalized. I knew about the Iron Age and the Bronze Age, periods of history named for the new materials men used to make their tools and weapons. Those were specific eras. Then I read academics predicting that countries would be fighting over the control of information, not natural resources. This sounded intriguing too, but what did they mean by information?
The claim that information would define the future reminded me of the famous party scene in the 1967 movie The Graduate. A businessman buttonholes Benjamin, the college graduate played by Dustin Hoffman, and offers him a single word of unsolicited career advice: “Plastics.” I wondered whether, if the scene had been written a few decades later, the businessman’s advice would have been: “One word, Benjamin. ‘Information.’”
I imagined nonsensical conversations around a future office watercooler: “How much information do you have?” “Switzerland is a great country because of all the information they have there!” “I hear the Information Price Index is going up!”
It sounds nonsensical because information isn’t as tangible or measurable as the materials that defined previous ages, but information has become increasingly important to us. The information revolution is just beginning. The cost of communications will drop as precipitously as the cost of computing already has. When it gets low enough and is combined with other advances in technology, “information highway” will no longer be just a phrase for eager executives and excited politicians. It will be as real and as far-reaching as “electricity.” To understand why information is going to be so central, it’s important to know how technology is changing the ways we handle information.
The majority of this chapter is devoted to such an explanation. It is meant to give readers who don’t have a background in computer principles and history enough information to enjoy the rest of the book. If you understand how digital computers work, you probably already know the material cold, so feel free to skip ahead to chapter 3.
The most fundamental difference we’ll see in future information is that almost all of it will be digital. Whole printed libraries are already being scanned and stored as electronic data on disks and CD-ROMs. Newspapers and magazines are now often completely composed in electronic form and printed on paper as a convenience for distribution. The electronic information is stored permanently—or for as long as anyone wants it—in computer databases: giant banks of journalistic data accessible through on-line services. Photographs, films, and videos are all being converted into digital information. Every year, better methods are being devised to quantify information and distill it into quadrillions of atomistic packets of data. Once digital information is stored, anyone with access and a personal computer can instantaneously recall, compare, and refashion it. What characterizes this period in history is the completely new ways in which information can be changed and manipulated, and the increasing speeds at which we can handle it. The computer’s abilities to provide low-cost, high-speed processing and transmission of digital data will transform the conventional communication devices in homes and offices.
The idea of using an instrument to manipulate numbers isn’t new. The abacus had been in use in Asia for thousands of years when, in 1642, the nineteen-year-old French scientist Blaise Pascal invented a mechanical calculator, a counting device that could add and subtract. Three decades later, the German mathematician Gottfried Wilhelm Leibniz improved on Pascal’s design. His “Stepped Reckoner” could multiply, divide, and calculate square roots. Reliable mechanical calculators, powered by rotating dials and gears, descendants of the Stepped Reckoner, were the mainstay of business until their electronic counterparts replaced them. When I was a boy, a cash register was essentially a mechanical calculator linked to a cash drawer.
More than a century and a half ago, a visionary British mathematician glimpsed the possibility of the computer and that glimpse made him famous even in his day. Charles Babbage was a professor of mathematics at Cambridge University who conceived the possibility of a mechanical device that would be able to perform a string of related calculations. As early as the 1830s, he was drawn to the idea that information could be manipulated by a machine if the information could be converted into numbers first. The steam-powered machine Babbage envisioned would use pegs, toothed wheels, cylinders, and other mechanical parts, the apparatus of the then-new Industrial Age. Babbage believed his “Analytical Engine” would be used to take the drudgery and inaccuracy out of calculating.
He lacked the terms we now use to refer to the parts of his machine. He called the central processor, or working guts of his machine, the “mill.” He referred to his machine’s memory as the “store.” Babbage imagined information being transformed the way cotton was—drawn from a store (warehouse) and milled into something new.
His Analytical Engine would be mechanical, but he foresaw how it would be able to follow changing sets of instructions and thus serve different functions. This is the essence of software. It is a comprehensive set of rules a machine can be given to “instruct” it how to perform particular tasks. Babbage realized that to create these instructions he would need an entirely new kind of language, and he devised one using numbers, letters, arrows, and other symbols. The language was designed to let Babbage “program” the Analytical Engine with a long series of conditional instructions, which would allow the machine to modify its actions in response to changing situations. He was the first to see that a single machine could serve a number of different purposes.
For the next century mathematicians worked with the ideas Babbage had outlined and finally, by the mid-1940s, an electronic computer was built based on the principles of his Analytical Engine. It is hard to sort out the paternity of the modern computer, because much of the thinking and work was done in the United States and Britain during World War II under the cloak of wartime secrecy. Three major contributors were Alan Turing, Claude Shannon, and John von Neumann.
In the mid-1930s, Alan Turing, like Babbage a superlative Cambridge-trained British mathematician, proposed what is known today as a Turing machine. It was his version of a completely general-purpose calculating machine that could be instructed to work with almost any kind of information.
In the late 1930s, when Claude Shannon was still a student, he demonstrated that a machine executing logical instructions could manipulate information. His insight, the subject of his master’s thesis, was about how computer circuits—closed for true and open for false—could perform logical operations, using the number 1 to represent “true” and 0 to represent “false.”
This is a binary system. It’s a code. Binary is the alphabet of electronic computers, the basis of the language into which all information is translated, stored, and used within a computer. It’s simple, but so vital to the understanding of the way computers work that it’s worth pausing here to explain it more fully.
Imagine you have a room that you want illuminated with as much as 250 watts of electric lighting, and you want the lighting to be adjustable, from 0 watts of illumination (total darkness) to the full 250 watts. One way to accomplish this is with a rotating dimmer switch hooked to a 250-watt bulb. To achieve complete darkness, turn the knob fully counterclockwise to Off for 0 watts of light. For maximum brightness, turn the knob fully clockwise for the entire 250 watts. For some illumination level in between, turn the knob to an intermediate position.
This system is easy to use but has limitations. If the knob is at an intermediate setting—if lighting is lowered for an intimate dinner, for example—you can only guess what the lighting level is. You don’t really know how many watts are in use, or how to describe the setting precisely. Your information is approximate, which makes it hard to store or reproduce.
What if you want to reproduce exactly the same level of lighting next week? You could make a mark on the switch plate so that you know how far to turn it, but this is hardly exact, and what happens when you want to reproduce a different setting? What if a friend wants to reproduce the same level of lighting? You can say, “Turn the knob about a fifth of the way clockwise,” or “Turn the knob until the arrow is at about two o’clock,” but your friend’s reproduction will only approximate your setting. What if your friend then passes the information on to another friend, who in turn passes it on again? Each time the information is handed on, the chances of its remaining accurate decrease.
That is an example of information stored in “analog” form. The dimmer’s knob provides an analogy to the bulb’s lighting level. If it’s turned halfway, presumably you have about half the total wattage. When you measure or describe how far the knob is turned, you’re actually storing information about the analogy (the knob) rather than about the lighting level. Analog information can be gathered, stored, and reproduced, but it tends to be imprecise—and runs the risk of becoming less precise each time it is transferred.
Now let’s look at an entirely different way of describing how to light the room, a digital rather than analog method of storing and transmitting information. Any kind of information can be converted into numbers that use only 0s and 1s. These are binary numbers, and each 0 or 1 in them is called a bit. Once the information has been converted, it can be fed to and stored in computers as long strings of bits. Those numbers are all that’s meant by “digital information.”
Instead of a single 250-watt bulb, let’s say you have eight bulbs, each with a wattage double that of the one before it, from 1 up to 128. Each of these bulbs is hooked to its own switch, with the lowest-wattage bulb on the right, so that reading from left to right the bulbs are rated 128, 64, 32, 16, 8, 4, 2, and 1 watts.
By turning these switches on and off, you can adjust the lighting level in 1-watt increments from 0 watts (all switches off) to 255 watts (all switches on). This gives you 256 possibilities. If you want 1 watt of light, you turn on only the rightmost switch, which turns on the 1-watt bulb. If you want 2 watts of light, you turn on only the 2-watt bulb. If you want 3 watts of light, you turn on both the 1-watt and 2-watt bulbs, because 1 plus 2 equals the desired 3 watts. If you want 4 watts of light, you turn on the 4-watt bulb. If you want 5 watts, you turn on just the 4-watt and 1-watt bulbs. If you want 250 watts of light, you turn on all but the 4-watt and 1-watt bulbs.
If you have decided the ideal illumination level for dining is 137 watts of light, you turn on the 128-watt, 8-watt, and 1-watt bulbs, because 128 plus 8 plus 1 equals 137, and leave the other five switched off.
This system makes it easy to record an exact lighting level for later use or to communicate it to others who have the same light-switch setup. Because the way we record binary information is universal—low number to the right, high number to the left, always doubling—you don’t have to write down the values of the bulbs. You simply record the pattern of switches: on, off, off, off, on, off, off, on. With that information a friend can faithfully reproduce the 137 watts of light in your room. In fact, as long as everyone involved double-checks the accuracy of what he does, the message can be passed through a million hands and at the end every person will have the same information and be able to achieve exactly 137 watts of light.
To shorten the notation further, you can record each “off” as 0 and each “on” as 1. This means that instead of writing down “on, off, off, off, on, off, off, on,” meaning turn on the first, the fifth, and the eighth of the eight bulbs, and leave the others off, you write the same information as 1, 0, 0, 0, 1, 0, 0, 1, or 10001001, a binary number. In this case it’s 137. You call your friend and say: “I’ve got the perfect lighting level! It’s 10001001. Try it.” Your friend gets it exactly right, by flipping a switch on for each 1 and off for each 0.
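For readers who like to tinker, here is a minimal sketch in Python of the bookkeeping the switches are doing. The bulb wattages and the 137-watt setting come from the example above; the function names are mine, chosen only for illustration.

    # Eight bulbs, highest wattage on the left, as in the example above.
    BULBS = [128, 64, 32, 16, 8, 4, 2, 1]

    def switches_for(watts):
        """Return the on/off pattern (1 = on, 0 = off) that adds up to the given wattage."""
        pattern = []
        for bulb in BULBS:
            if watts >= bulb:          # this bulb is needed to reach the total
                pattern.append(1)
                watts -= bulb
            else:
                pattern.append(0)
        return pattern

    def watts_for(pattern):
        """Add up the wattage of every bulb whose switch is on."""
        return sum(bulb for bulb, on in zip(BULBS, pattern) if on)

    setting = switches_for(137)
    print(setting)               # [1, 0, 0, 0, 1, 0, 0, 1] -- the pattern 10001001
    print(watts_for(setting))    # 137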
This may seem like a complicated way to describe the brightness of a light source, but it is an example of the theory behind binary expression, the basis of all modern computers.
Binary expression made it possible to take advantage of electric circuits to build calculators. This happened during World War II, when a group of mathematicians led by J. Presper Eckert and John Mauchly at the University of Pennsylvania’s Moore School of Electrical Engineering began developing an electronic computational machine, the Electronic Numerical Integrator And Computer, called ENIAC. Its purpose was to speed up the calculations for artillery-aiming tables. ENIAC was more like an electronic calculator than a computer, but instead of representing numbers with the positions of wheels and gears the way a mechanical calculator did, it used vacuum tube “switches.”
Soldiers assigned by the army to the huge machine wheeled around squeaking grocery carts filled with vacuum tubes. When one burned out, ENIAC shut down and the race began to locate and replace the burned-out tube. One explanation, perhaps somewhat apocryphal, for why the tubes had to be replaced so often was that their heat and light attracted moths, which would fly into the huge machine and cause short circuits. If this is true, it gives new meaning to the term “bugs” for the little glitches that can plague computer hardware or software.
When all the tubes were working, a staff of engineers could set up ENIAC to solve a problem by laboriously plugging in 6,000 cables by hand. To make it perform another function, the staff had to reconfigure the cabling—every time. John von Neumann, a brilliant Hungarian-born American, who is known for many things, including the development of game theory and his contributions to nuclear weaponry, is credited with the leading role in figuring out a way around this problem. He created the paradigm that all digital computers still follow. The “von Neumann architecture,” as it is known today, is based on principles he articulated in 1945—including the principle that a computer could avoid cabling changes by storing instructions in its memory. As soon as this idea was put into practice, the modern computer was born.
Today the brains of most computers are descendants of the microprocessor Paul Allen and I were so knocked out by in the seventies, and personal computers often are rated according to how many bits of information (one switch in the lighting example) their microprocessor can process at a time, or how many bytes (a cluster of eight bits) of memory or disk-based storage they have. ENIAC weighed 30 tons and filled a large room. Inside, the computational pulses raced among 1,500 electromechanical relays and flowed through 17,000 vacuum tubes. Switched on, it consumed 150,000 watts of power. But ENIAC stored only the equivalent of about 80 characters of information.
By the early 1960s, transistors had supplanted vacuum tubes in consumer electronics. This was more than a decade after the discovery at Bell Labs that a tiny sliver of semiconductor material could do the same job as a vacuum tube. Like vacuum tubes, transistors act as electrical switches, but they require significantly less power to operate and as a result generate much less heat and take up less space. Multiple transistor circuits could be combined onto a single chip, creating an integrated circuit. The computer chips we use today are integrated circuits containing the equivalent of millions of transistors packed onto less than a square inch of silicon.
In a 1977 Scientific American article, Bob Noyce, one of the founders of Intel, compared the $300 microprocessor to ENIAC, the moth-infested mastodon from the dawn of the computer age. The wee microprocessor was not only more powerful, but as Noyce noted, “It is twenty times faster, has a larger memory, is thousands of times more reliable, consumes the power of a lightbulb rather than that of a locomotive, occupies 1/30,000 the volume and costs 1/10,000 as much. It is available by mail order or at your local hobby shop.”
Of course, the 1977 microprocessor seems like a toy now. And, in fact, many inexpensive toys contain computer chips that are more powerful than the 1970s chips that started the microcomputer revolution. But all of today’s computers, whatever their size or power, manipulate information stored as binary numbers.
Binary numbers are used to store text in a personal computer, music on a compact disc, and money in a bank’s network of cash machines. Before information can go into a computer, it has to be converted into binary, and digital devices convert it back into its original, useful form on the way out. You can imagine each device throwing switches, controlling the flow of electrons. But the switches involved, which are usually made of silicon, are extremely small and can be thrown extraordinarily quickly by applying electrical charges—to produce text on the screen of a personal computer, music from a CD player, and the instructions to a cash machine to dispense currency.
The light-switch example demonstrated how any number can be represented in binary. Here’s how text can be expressed in binary. By convention, the number 65 represents a capital A, the number 66 represents a capital B, and so forth. On a computer each of these numbers is expressed in binary code: the capital letter A, 65, becomes 01000001. The capital B, 66, becomes 01000010. A space break is represented by the number 32, or 00100000. So the sentence “Socrates is a man” becomes this 136-digit string of 1s and 0s:
01010011 01101111 01100011 01110010 01100001 01110100 01100101 01110011 00100000 01101001 01110011 00100000 01100001 00100000 01101101 01100001 01101110
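The same conversion takes only a few lines of Python; the sentence and the eight-bits-per-character convention come from the example above, and Python’s built-in ord returns the same code numbers (65 for a capital A, 32 for a space, and so on).

    sentence = "Socrates is a man"

    # Convert each character to its code number, then to an 8-bit group of 0s and 1s.
    groups = [format(ord(ch), "08b") for ch in sentence]

    print(" ".join(groups))    # 01010011 01101111 01100011 ... 01101110
    print(len(sentence) * 8)   # 136 -- the number of bits in the whole sentence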
It’s easy to follow how a line of text can become a set of binary numbers. To understand how other kinds of information are digitized, let’s consider another example of analog information. A vinyl record is an analog representation of sound vibrations. It stores audio information in microscopic squiggles that line the record’s long, spiral groove. If the music has a loud passage, the squiggles are cut more deeply into the groove, and if there is a high note the squiggles are packed more tightly together. The groove’s squiggles are analogs of the original vibrations—sound waves captured by a microphone. When a turntable’s needle travels down the groove, it vibrates in response to the tiny squiggles. This vibration, still an analog representation of the original sound, is amplified and sent to loudspeakers as music.
Like any analog device for storing information, a record has drawbacks. Dust, fingerprints, or scratches on the record’s surface can cause the needle to vibrate inappropriately and create clicks or other noises. If the record is not turning at exactly the right speed, the pitch of the music won’t be accurate. Each time a record is played, the needle wears away some of the subtleties of the squiggles in the groove and the reproduction of the music deteriorates. If you record a song from a vinyl record onto a cassette tape, any of the record’s imperfections will be permanently transferred to the tape, and new imperfections will be added because conventional tape machines are themselves analog devices. The information loses quality with each generation of rerecording or retransmission.
On a compact disc, music is stored as a series of binary numbers, each bit (or switch) of which is represented by a microscopic pit on the surface of the disc. Today’s CDs have more than 5 billion pits. The reflected laser light inside the CD player—a digital device—reads each of the pits to determine if it is switched to the 0 or the 1 position, and then reassembles that information back into the original music by generating specified electrical signals that are converted by the speakers into sound waves. Each time the disc is played, the sounds are exactly the same.
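As a rough check on that pit count, you can tally the audio bits on a disc. The figures below are the usual CD-audio conventions (44,100 samples per second, 16 bits per sample, two channels, about 74 minutes of music), not numbers from the text, and the tally ignores the extra bits the format adds for error correction; it shows only that the total has to run into the billions.

    # Back-of-the-envelope count of the audio bits on a standard 74-minute compact disc.
    samples_per_second = 44_100    # assumed CD-audio sampling rate
    bits_per_sample = 16
    channels = 2
    playing_time = 74 * 60         # seconds

    audio_bits = samples_per_second * bits_per_sample * channels * playing_time
    print(f"{audio_bits:,} bits")  # 6,265,728,000 -- billions, before error-correction overhead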
It’s convenient to be able to convert everything into digital representations, but the number of bits can build up quite quickly. Too many bits of information can overflow the computer’s memory or take a long time to transmit between computers. This is why a computer’s capacity to compress digital data, store or transmit it, then expand it back into its original form is so useful and will become more so.
Quickly, here’s how the computer accomplishes these feats. It goes back to Claude Shannon, the mathematician who in the 1930s recognized how to express information in binary form. During World War II, he began developing a mathematical description of information and founded a field that later became known as information theory. Shannon defined information as the reduction of uncertainty. By this definition, if you already know it is Saturday and someone tells you it is Saturday, you haven’t been given any information. On the other hand, if you’re not sure of the day and someone tells you it is Saturday, you’ve been given information, because your uncertainty has been reduced.
Shannon’s information theory eventually led to other breakthroughs. One was effective data compression, vital to both computing and communications. On the face of it, what he said is obvious: those parts of data that don’t provide unique information are redundant and can be eliminated. Headline writers leave out nonessential words, as do people paying by the word to send a telegram or place a classified advertisement. One example Shannon gave was the letter u, which is redundant in English whenever it follows the letter q. You know a u will follow each q, so the u needn’t actually be included in the message.
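Shannon’s u-after-q observation can even be turned into a toy compressor. The Python sketch below is only an illustration of the redundancy principle, not any real compression scheme, and it assumes ordinary English text in which every q is followed by a u.

    def squeeze(text):
        """Drop every 'u' that immediately follows a 'q' -- it carries no information."""
        out = []
        for i, ch in enumerate(text):
            if ch == "u" and i > 0 and text[i - 1] == "q":
                continue               # redundant: the receiver knows it was there
            out.append(ch)
        return "".join(out)

    def unsqueeze(text):
        """Reinsert a 'u' after every 'q' to recover the original text exactly."""
        out = []
        for ch in text:
            out.append(ch)
            if ch == "q":
                out.append("u")
        return "".join(out)

    message = "the quick quiet queen"
    packed = squeeze(message)             # "the qick qiet qeen" -- three characters shorter
    print(unsqueeze(packed) == message)   # True: nothing was lost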
Shannon’s principles have been applied to the compression of both sound and pictures. There is a great deal of redundant information in the thirty frames that make up a second of video. The information can be compressed from about 27 million to about 1 million bits for transmission and still make sense and be pleasant to watch.
However, there are limits to compression, and in the near future we’ll be moving ever-increasing numbers of bits from place to place. The bits will travel through copper wires, through the air, and through the structure of the information highway, most of which will be fiber-optic cable (or just “fiber” for short). Fiber is cable made of glass or plastic so smooth and pure that if you looked through a wall of it 70 miles thick, you’d be able to see a candle burning on the other side. Binary signals, in the form of modulated light, travel long distances through these optical fibers. A signal doesn’t move any faster through fiber-optic cable than it does through copper wire; both travel at nearly the speed of light. The enormous advantage fiber-optic cable has over wire is the bandwidth it can carry. Bandwidth is a measure of the number of bits that can be moved through a circuit in a second. This really is like a highway. An eight-lane interstate has room for more vehicles than a narrow dirt road. The greater the bandwidth, the more lanes available, and the more cars, or bits of information, that can pass in a second. Cables with limited bandwidth, used for text or voice transmissions, are called narrowband circuits. Cables with more capacity, which carry images and limited animation, are “midband capable.” Those with high bandwidth, which can carry multiple video and audio signals, are said to have broadband capacity.
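To put sample numbers on the highway analogy, here is a small Python sketch of how long one second’s worth of the compressed video described above (about a million bits) would take to move across circuits of different capacities. The three bandwidth figures are illustrative assumptions, not figures from the text.

    # Time to move one second of compressed video (about 1,000,000 bits)
    # across circuits of different bandwidths. The rates below are illustrative only.
    video_bits = 1_000_000

    circuits = {
        "narrowband (28,800 bits per second)": 28_800,
        "midband (1.5 million bits per second)": 1_500_000,
        "broadband (45 million bits per second)": 45_000_000,
    }

    for name, bits_per_second in circuits.items():
        print(f"{name}: {video_bits / bits_per_second:.2f} seconds")

At the narrowband rate the video falls far behind real time, which is exactly the shortage of bandwidth described next.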
The information highway will use compression, but there will still have to be a great deal of bandwidth. One of the main reasons we don’t already have a working highway is that there isn’t sufficient bandwidth in today’s communications networks for all the new applications. And there won’t be until fiber-optic cable is brought into enough neighborhoods.
Fiber-optic cable is an example of technology that goes beyond what Babbage or even Eckert and Mauchly could have predicted. So is the speed at which the performance and capacity of chips have improved.
In 1965, Gordon Moore, who later cofounded Intel with Bob Noyce, predicted that the capacity of a computer chip would double every year. He said this on the basis of having examined the price/performance ratio of computer chips over the previous three years and projecting it forward. In truth, Moore didn’t believe that this rate of improvement would last long. But ten years later, his forecast proved true, and he then predicted the capacity would double every two years. To this day his predictions have held up, and the average—a doubling every eighteen months—is referred to among engineers as Moore’s Law.
No experience in our everyday life prepares us for the implications of a number that doubles a great number of times—exponential improvements. One way to understand it is with a fable.
King Shirham of India was so pleased when one of his ministers invented the game of chess that he asked the man to name any reward.
“Your Majesty,” said the minister, “I ask that you give me one grain of wheat for the first square of the chessboard, two grains for the second square, four grains for the third, and so on, doubling the number of grains each time until all sixty-four squares are accounted for.” The king was moved by the modesty of the request and called for a bag of wheat.
The king asked that the promised grains be counted out onto the chessboard. On the first square of the first row was placed one small grain. On the second square were two specks of wheat. On the third square there were 4 grains, and on the squares that followed 8, 16, 32, 64, and then 128. By square eight at the end of the first row, King Shirham’s supply master had counted out a total of 255 grains.
The king probably registered no concern. Maybe a little more wheat was on the board than he had expected, but nothing surprising had happened. Assuming it would take one second to count each grain, the counting so far had taken only about four minutes. If one row was done in four minutes, try to guess how long it would take to count out the wheat for all sixty-four squares of the board. Four hours? Four days? Four years?
By the time the second row was complete, the supply master had worked for about eighteen hours just counting out the 65,535 grains. By the end of the third of the eight rows, he had spent 194 days counting out the 16.8 million grains required through the twenty-fourth square. And there were still forty empty squares to go.
It is safe to say that the king broke his promise to the minister. Filling the final square would have brought the total on the board to 18,446,744,073,709,551,615 grains of wheat and required 584 billion years of counting. Current estimates of the age of the earth are around 4.5 billion years. According to most versions of the legend, King Shirham realized at some point in the counting that he had been tricked and had his clever minister beheaded.
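The fable’s arithmetic is easy to check with a short Python calculation, using the same one-grain-per-second counting rate assumed above.

    # Grains of wheat on the chessboard: each square holds double the one before.
    SECONDS_PER_DAY = 24 * 60 * 60
    SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

    total = sum(2 ** (square - 1) for square in range(1, 65))
    print(f"{total:,} grains in all")                        # 18,446,744,073,709,551,615
    print(f"{total / SECONDS_PER_YEAR:.0f} years to count")  # about 584 billion years

    # Checkpoints from the story, at one grain per second:
    print(2 ** 8 - 1, "grains after one row")                        # 255
    print((2 ** 16 - 1) / 3600, "hours after two rows")              # about 18 hours
    print((2 ** 24 - 1) / SECONDS_PER_DAY, "days after three rows")  # about 194 days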
Exponential growth, even when explained, seems like a trick.
Moore’s Law is likely to hold for another twenty years. If it does, a computation that now takes a day will be more than 10,000 times faster, and thus take fewer than ten seconds.
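The arithmetic behind that claim is a few lines of Python, using the doubling-every-eighteen-months average given earlier.

    # Twenty years of doubling every eighteen months.
    doublings = 20 * 12 / 18                  # a little more than 13 doublings
    speedup = 2 ** doublings
    print(round(speedup))                     # roughly 10,000-fold

    day = 24 * 60 * 60                        # a day's computation, in seconds
    print(f"{day / speedup:.1f} seconds")     # well under ten seconds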
Laboratories are already operating “ballistic” transistors that have switching times on the order of a femtosecond. That is 1/1,000,000,000,000,000 of a second, which is about 10 million times faster than the transistors in today’s microprocessors. The trick is to reduce the size of the chip circuitry and the current flow so that moving electrons don’t bump into anything, including each other. The next stage is the “single-electron transistor,” in which a single bit of information is represented by a lone electron. This will be the ultimate in low-power computing, at least according to our current understanding of physics. In order to make use of the incredible speed advantages at the molecular level, computers will have to be very small, even microscopic. We already understand the science that would allow us to build these superfast computers. What we need is an engineering breakthrough, and these are often quick in coming.
By the time we have the speed, storing all those bits won’t be a problem. In the spring of 1983, IBM released its PC/XT, the company’s first personal computer with an interior hard disk. The disk served as a built-in storage device and held 10 megabytes, or “megs,” of information, about 10 million characters or 80 million bits. Existing customers who wanted to add these 10 megs to their original computers could, for a price. IBM offered a $3,000 kit, complete with separate power supply, to expand the computer’s storage. That’s $300 per megabyte. Today, thanks to the exponential growth described by Moore’s Law, personal-computer hard drives that can hold 1.2 gigabytes—1.2 billion characters of information—are priced at $250. That’s 21 cents per megabyte! And we look toward an exotic improvement called a holographic memory, which can hold terabytes of characters in less than a cubic inch of volume. With such capability, a holographic memory the size of your fist could hold the contents of the Library of Congress.
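The cost-per-megabyte comparison is simple division; here it is as a quick Python check, using the prices and capacities quoted above.

    # Dollars per megabyte of hard-disk storage, then and now, from the figures above.
    xt_price, xt_megabytes = 3_000, 10        # IBM PC/XT add-on kit, 1983
    new_price, new_megabytes = 250, 1_200     # 1.2-gigabyte drive today

    then = xt_price / xt_megabytes
    now = new_price / new_megabytes
    print(then)                    # 300.0 dollars per megabyte
    print(round(now, 2))           # 0.21 -- about 21 cents per megabyte
    print(round(then / now))       # roughly a 1,400-fold drop in price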
As communications technology goes digital, it becomes subject to the same exponential improvements that have made today’s $2,000 laptop computer more powerful than a $10 million IBM mainframe computer of twenty years ago.
At some point not far in the future, a single wire running into each home will be able to deliver all of a household’s digital data. The wire will either be fiber, which is what long-distance telephone calls are carried on now, or coaxial cable, which currently brings us cable television signals. If the bits are interpreted as voice calls, the phone will ring. If there are video images, they will show up on the television set. If they are on-line news services, they will arrive as written text and pictures on a computer screen.
That single wire bringing the network into the home will certainly carry much more than phone calls, movies, and news. But we can no more imagine what the information highway will carry in twenty-five years than a Stone Age man using a crude knife could have envisioned Ghiberti’s Baptistery doors in Florence. Only when the highway arrives will all its possibilities be understood. However, the last twenty years of experience with digital breakthroughs allow us to understand some of the key principles and possibilities for the future.