I Am a Strange Loop - читать бесплатно онлайн полную версию книги автора Douglas R. Hofstadter (ч. 18)

CHAPTER 10

Gödel’s Quintessential Strange Loop

Gödel Encounters Fibonacci

BY HIS early twenties, the boy from Brünn was already a superb mathematician and, like all mathematicians, he knew whole numbers come in limitless varieties. Aside from squares, cubes, primes, powers of ten, sums of two squares, and all the other usual suspects, he was familiar with many other types of integers. Most crucially for his future, young Kurt knew, thanks to Leonardo di Pisa (more often known as “Fibonacci”), that one could define classes of integers recursively.

In the 1300’s, Fibonacci had concocted and explored what are now known as the “Fibonacci numbers”:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, . . .

In this rapidly growing infinite sequence, whose members I will henceforth refer to as the F numbers, each new element is created by summing the two previous ones (except for the first pair, 1 and 2, which we simply declare by fiat to be F numbers).

This almost-but-not-quite-circular fashion of defining a sequence of numbers in terms of itself is called a “recursive definition”. This means there is some kind of precise calculational rule for making new elements out of previous ones. The rule might involve adding, multiplying, dividing, whatever — as long as it’s well-defined. The opening gambit of a recursive sequence (in this case, the numbers 1 and 2) can be thought of as a packet of seeds from which a gigantic plant — all of its branches and leaves, infinite in number — grows in a predetermined manner, based on the fixed rule.

The Caspian Gemstones: An Allegory

Leonardo di Pisa’s sequence is brimming with amazing patterns, but unfortunately going into that would throw us far off course. Still, I cannot resist mentioning that 144 jumps out in this list of the first few F numbers because it is a salient perfect square. Aside from 8, which is a cube, and 1, which is a rather degenerate case, no other perfect square, cube, or any other exact power appears in the first few hundred terms of the F sequence.

Several decades ago, people started wondering if the presence of 8 and 144 in the F sequence was due to a reason, or if it was just a “random accident”. Therefore, as computational tools started becoming more and more powerful, they undertook searches. Curiously enough, even with the advent of supercomputers, allowing millions and even billions of F numbers to be churned out, no one ever came across any other perfect powers in Fibonacci’s sequence. The chance of a power turning up very soon in the F sequence was looking slim, but why would a perfect mutual avoidance occur? What do nth powers for arbitrary n have to do with adding up pairs of numbers in Fibonacci’s peculiar recursive fashion? Couldn’t 8 and 144 just be little random glitches? Why couldn’t other little glitches take place?

To cast allegorical light on this, imagine someone chanced one day to fish up a giant diamond, a magnificent ruby, and a tiny pearl at the bottom of the great green Caspian Sea in central Asia, and other seekers of fortune, spurred on by these stunning finds, then started madly dredging the bottom of the world’s largest lake to seek more diamonds, rubies, pearls, emeralds, topazes, etc., but none was found, no matter how much dredging was done. One would naturally wonder if more gems might be hidden down there, but how could one ever know? (Caveat: my allegory is slightly flawed, because we can imagine, at least in principle, a richly financed scientific team someday dredging the lake’s bottom completely, since, though huge, it is finite. For my analogy to be “perfect”, we would have to conceive of the Caspian Sea as infinite. Just stretch your imagination a bit, reader!)

Now the twist. Suppose some mathematically-minded geologist set out to prove that the two exquisite Caspian gems, plus the tiny round pearl, were sui generis — in other words, that there was a precise reason that no other gemstone or pearl of any type or size would ever again, or could ever again, be found in the Caspian Sea. Does seeking such a proof make any sense? How could there be a watertight scientific reason absolutely forbidding any gems — except for one pearl, one ruby, and one diamond — from ever being found on the floor of the Caspian Sea? It sounds absurd.

This is typical of how we think about the physical world — we think of it as being filled with contingent events, facts that could be otherwise, situations that have no fundamental reason for their being as they are. But let me remind you that mathematicians see their pristine, abstract world as the antithesis to the random, accident-filled physical world we all inhabit. Things that happen in the mathematical world strike mathematicians as happening, without any exceptions, for statable, understandable reasons.

This — the Mathematician’s Credo — is the mindset that you have to adopt and embrace if you wish to understand how mathematicians think. And in this particular case, the mystery of the lack of Fibonacci powers, although just a tiny one in most mathematicians’ eyes, was a particularly baffling one, because it seemed to offer no natural route of access. The two phenomena involved — integer powers with arbitrarily large exponents, on the one hand, and Fibonacci numbers on the other — simply seemed (like gemstones and the Caspian Sea) to be too conceptually remote from each other to have any deep, systematic, inevitable interrelationship.

And then along came a vast team of mathematicians who had set their collective bead on the “big game” of Fermat’s Last Theorem (the notorious claim, originally made by Pierre de Fermat in the middle of the seventeenth century, that no positive integers a, b, c exist such that aⁿ + bⁿ equals c ⁿ, with the exponent n being an integer greater than 2). This great international relay team, whose final victorious lap was magnificently sprinted by Andrew Wiles (his sprint took him about eight years), was at last able to prove Fermat’s centuries-old claim by using amazing techniques that combined ideas from all over the vast map of contemporary mathematics.

In the wake of this team’s revolutionary work, new paths were opened up that seemed to leave cracks in many famous old doors, including the tightly-closed door of the small but alluring Fibonacci power mystery. And indeed, roughly ten years after the proof of Fermat’s Last Theorem, a trio of mathematicians, exploiting the techniques of Wiles and others, were able to pinpoint the exact reason for which cubic 8 and square 144 will never have any perfect-power mates in Leonardo di Pisa’s recursive sequence (except for 1). Though extremely recondite, the reason behind the infinite mutual-avoidance dance had been found. This is just one more triumph of the Mathematician’s Credo — one more reason to buy a lot of stock in the idea that in mathematics, where there’s a pattern, there’s a reason.

A Tiny Spark in Gödel’s Brain

We now return to the story of Kurt Gödel and his encounter with the powerful idea that all sorts of infinite classes of numbers can be defined through various kinds of recursive rules. The image of the organic growth of an infinite structure or pattern, all springing out of a finite set of initial seeds, struck Gödel as much more than a mere curiosity; in fact, it reminded him of the fact that theorems in PM (like theorems in Euclid’s Elements) always spring (by formal rules of inference) from earlier theorems in PM, with the exception of the first few theorems, which are declared by fiat to be theorems, and thus are called “axioms” (analogues to the seeds).

In other words, in the careful analogy sparked in Gödel’s mind by this initially vague connection, the axioms of PM would play the role of Fibonacci’s seeds 1 and 2, and the rules of inference of PM would play the role of adding the two most recent numbers. The main difference is that in PM there are several rules of inference, not just one, so at any stage you have a choice of what to do, and moreover, you don’t have to apply your chosen rule to the most recently generated theorem(s), so that gives you even more choice. But aside from these extra degrees of freedom, Gödel’s analogy was very tight, and it turned out to be immensely fruitful.

Clever Rules Imbue Inert Symbols with Meaning

I must stress here that each rule of inference in a formal system like PM not only leads from one or more input formulas to an output formula, but it does so by purely typographical means — that is, via purely mechanical symbol-shunting that doesn’t require any thought about the meanings of symbols. From the viewpoint of a person (or machine) following the rules to produce theorems, the symbols might as well be totally devoid of meaning.

On the other hand, each rule has to be very carefully designed so that, given input formulas that express truths, the output formula will also express a truth. The rule’s designer (Russell and Whitehead, in this case) therefore has to think about the symbols’ intended meanings in order to be sure that the rule will work exactly right for a manipulator (human or otherwise) who is not thinking about the symbols’ intended meanings.

To give a trivial example, suppose the symbol “∨” were intended to stand for the concept “or”. Then a possible rule of inference would be:

From any formula “P ∨ Q” one can derive the reversed formula “Q ∨ P”.

This rule of inference is reasonable because whenever an or-statement (such as “You’re crazy or I’m crazy”) is true, then so is the flipped-around or-statement (“I’m crazy or you’re crazy”).

This particular ∨-flipping rule happens not to be one of PM’s rules of inference, but it could have been one. The point is just that this rule shows how one can mechanically shunt symbols and ignore their meanings, and yet preserve truth while doing so. This rule is rather trivial, but there are subtler ones that do real work. That, indeed, is the whole idea of symbolic logic, first suggested by Aristotle and then developed piecemeal over many centuries by such thinkers as Blaise Pascal, Gottfried Wilhelm von Leibniz, George Boole, Augustus De Morgan, Gottlob Frege, Giuseppe Peano, David Hilbert, and many others. Russell and Whitehead were simply developing the ancient dream of totally mechanizing reasoning in a more ambitious fashion than any of their predecessors had.

Mechanizing the Mathematician’s Credo

If you apply PM’s rules of inference to its axioms (the seeds that constitute the “zeroth generation” of theorems), you will produce some “progeny” — theorems of the “first generation”. Then apply the rules once again to the first-generation theorems (as well as to the axioms) in all the different ways you can; you will thereby produce a new batch of theorems — the second generation. Then from that whole brew comes a third batch of theorems, and so on, ad infinitum, constantly snowballing. The infinite body of theorems of PM is fully determined by the initial seeds and by the typographical “growth rules” that allow one to make new theorems out of old ones.

Needless to say, the hope here is that all of these mechanically generated theorems of PM are true statements of number theory (i.e., no false statement is ever generated), and conversely, it is hoped that all true statements of number theory are mechanically generated as theorems of PM (i.e., no true statement is left ungenerated forever). The first of these hopes is called consistency, and the second one is called completeness.

In a nutshell, we want the entire infinite body of theorems of PM to coincide exactly with the infinite body of true statements in number theory — we want perfect, flawless alignment. At least that’s what Russell and Whitehead wanted, and they believed that with PM they had attained this goal (after all, “s0 + s0 = ss0” was a theorem, wasn’t it?).

Let us recall the Mathematician’s Credo, which in some form or other had existed for many centuries before Russell and Whitehead came along:

X is true because there is a proof of X;

X is true and so there is a proof of X.

The first line expresses the first hope expressed above — consistency. The second line expresses the second hope expressed above — completeness. We thus see that the Mathematician’s Credo is very closely related to what Russell and Whitehead were aiming for. Their goal, however, was to set the Credo on a new and rigorous basis, with PM serving as its pedestal. In other words, where the Mathematician’s Credo merely speaks of “a proof ” without saying what is meant by the term, Russell and Whitehead wanted people to think of it as meaning a proof within PM.

Gödel himself had great respect for the power of PM, as is shown by the opening sentences of his 1931 article:

The development of mathematics in the direction of greater exactness has — as is well known — led to large tracts of it becoming formalized, so that proofs can be carried out according to a few mechanical rules. The most comprehensive formal systems yet set up are, on the one hand, the system of Principia Mathematica (PM) and, on the other, the axiom system for set theory of Zermelo-Fraenkel (later extended by J. v. Neumann). These two systems are so extensive that all methods of proof used in mathematics today have been formalized in them, i. e., reduced to a few axioms and rules of inference.

And yet, despite his generous hat-tip to Russell and Whitehead’s opus, Gödel did not actually believe that a perfect alignment between truths and PM theorems had been attained, nor indeed that such a thing could ever be attained, and his deep skepticism came from having smelled an extremely strange loop lurking inside the labyrinthine palace of mindless, mechanical, symbol-churning, meaning-lacking mathematical reasoning.

Miraculous Lockstep Synchrony

The conceptual parallel between recursively defined sequences of integers and the leapfrogging set of theorems of PM (or, for that matter, of any formal system whatever, as long as it had axioms acting as seeds and rules of inference acting as growth mechanisms) suggested to Gödel that the typographical patterns of symbols on the pages of Principia Mathematica — that is, the rigorous logical derivations of new theorems from previous ones — could somehow be “mirrored” in an exact manner inside the world of numbers. An inner voice told him that this connection was not just a vague resemblance but could in all likelihood be turned into an absolutely precise correspondence.

More specifically, Gödel envisioned a set of whole numbers that would organically grow out of each other via arithmetical calculations much as Fibonacci’s F numbers did, but that would also correspond in an exact oneto-one way with the set of theorems of PM. For instance, if you made theorem Z out of theorems X and Y by using typographical rule R5, and if you made the number z out of numbers x and y using computational rule r5, then everything would match up. That is to say, if x were the number corresponding to theorem X and y were the number corresponding to theorem Υ, then z would “miraculously” turn out to be the number corresponding to theorem Z. There would be perfect synchrony; the two sides (typographical and numerical) would move together in lock-step. At first this vision of miraculous synchrony was just a little spark, but Gödel quickly realized that his inchoate dream might be made so precise that it could be spelled out to others, so he started pursuing it in a dogged fashion.

Flipping between Formulas and Very Big Integers

In order to convert his intuitive hunch into a serious, precise, and respectable idea, Gödel first had to figure out how any string of PM symbols (irrespective of whether it asserted a truth or a falsity, or even was just a random jumble of symbols haphazardly thrown together) could be systematically converted into a positive integer, and conversely, how such an integer could be “decoded” to give back the string from which it had come. This first stage of Gödel’s dream, a systematic mapping by which every formula would receive a numerical “name”, came about as follows.

The basic alphabet of PM consisted of only about a dozen symbols (other symbols were introduced later but they were all defined in terms of the original few, so they were not conceptually necessary), and to each of these symbols Gödel assigned a different small integer (these initial few choices were quite arbitrary — it really didn’t matter what number was associated with an isolated symbol).

For multi-symbol formulas (by the way, in this book the terms “string of symbols” — “string” for short — and “formula” are synonymous), the idea was to replace the symbols, one by one, moving left to right, by their code numbers, and then to combine all of those individual code numbers (by using them as exponents to which successive prime numbers are raised) into one unique big integer. Thus, once isolated symbols had been assigned numbers, the numbers assigned to strings of symbols were not arbitrary.

For instance, suppose that the (arbitrary) code number for the symbol “0” is 2, and the code number for the symbol “=” is 6. Then for the three symbols in the very simple formula “0=0”, the code numbers are 2, 6, 2, and these three numbers are used as exponents for the first three prime numbers (2, 3, and 5) as follows:

2² · 3⁶ · 5² = 72900

So we see that 72900 is the single number that corresponds to the formula “0=0”. Of course this is a rather large integer for such a short formula, and you can easily imagine that the integer corresponding to a fifty-symbol formula is astronomical, since it involves putting the first fifty prime numbers to various powers and then multiplying all those big numbers together, to make a true colossus. But no matter — numbers are just numbers, no matter how big they are. (Luckily for Gödel, there are infinitely many primes, since if there had been merely, say, one billion of them, then his method would only have let him encode formulas made of a billion symbols or fewer. Now that would be a crying shame!)

The decoding process works by finding the prime factorization of 72900 (which is unique), and reading off the exponents that the ascending primes are raised to, one by one — 2, 6, 2 in this case.

To summarize, then, in this non-obvious but simple manner, Gödel had found a way to replace any given formula of PM by an equivalent number (which other people soon would dub its Gödel number). He then extended this idea of “arithmetization” to cover arbitrary sequences of formulas, since proofs in PM are sequences of formulas, and he wanted to be able to deal with proofs, not just isolated formulas. Thus an arbitrarily long sequence of formulas could be converted into one large integer via essentially the same technique, using primes and exponents. You can imagine that we’re talking really big numbers here.

In short, Gödel showed how any visual symbol-pattern whatsoever in the idiosyncratic notation of Principia Mathematica could be assigned a unique number, which could easily be decoded to give back the visual pattern (i.e., sequence of symbols) to which it corresponded. Conceiving of and polishing this precise two-way mapping, now universally called “Gödel numbering”, constituted the first key step of Gödel’s work.

Very Big Integers Moving in Lock-step with Formulas

The next key step was to make Fibonacci-like recursive definitions of special sets of integers — integers that would organically grow out of previously generated ones by addition or multiplication or more complex computations. One example would be the wff numbers, which are those integers that, via Gödel’s code, represent “well-formed” or “meaningful” formulas of PM, as opposed to those that represent meaningless or ungrammatical strings. (A sample well-formed formula, or “wff ” for short, would be “0+0=sss0”. Though it asserts a falsity, it’s still a meaningful statement. On the other hand, “=)0(=” and “00==0+=” are not wffs. Like the arbitrary sequence of pseudo-words “zzip dubbiwubbi pizz”, they don’t assert anything.) Since, as it happens, longer wffs are built up in PM from shorter wffs by just a few simple and standard rules of typographical juxtaposition, their larger code numbers can likewise be built up from the smaller code numbers of shorter ones by just a few simple and standard rules of numerical calculation.

I’ve said the foregoing rather casually, but in fact this step was perhaps the deepest of Gödel’s key insights — namely, that once strings of symbols had been “arithmetized” (given numerical counterparts), then any kind of rule-based typographical shunting-around of strings on paper could be perfectly paralleled by some kind of purely arithmetical calculation involving their numerical proxies — which were huge numbers, to be sure, but still just numbers. What to Russell and Whitehead looked like elaborate symbol-shunting looked like a lot of straightforward number-crunching to Kurt Gödel (although of course he didn’t use that colorful modern term, since this was all taking place back in the prehistoric days when computers didn’t yet exist). These were simply two different views of what was going on — views that were 100 percent equivalent and interchangeable.

Glimmerings of How PM Can Twist Around and See Itself

Gödel saw that the game of building up an infinite class of numbers, such as wff numbers, through recursion — that is, making new “members of the club” by combining older, established members via some number-crunching rule — is essentially the same idea as Fibonacci’s recursive game of building up the class of F numbers by taking sums of earlier members. Of course recursive processes can be far more complicated than just taking the sum of the latest two members of the club.

What a recursive definition does, albeit implicitly, is to divide the entire set of integers into members and non-members of the club — that is, those numbers that are reachable, sooner or later, via the recursive building-up process, and those that are never reachable, no matter how long one waits. Thus 34 is a member of the F club, whereas 35 is a non-member. How do we know 35 is not an F number? That’s very easy — the rule that makes new F numbers always makes larger ones from smaller ones, and so once we’ve passed a certain size, there’s no chance we’ll be returning to “pick up” other numbers in that vicinity later. In other words, once we’ve made the F numbers 1, 2, 3, 5, 8, 13, 21, 34, 55, we know they are the only ones in that range, so obviously 35, 36, and so on, up to 54, are not F numbers.

If, however, some other club of numbers is defined by a recursive rule whose outputs are sometimes bigger than its inputs and other times are smaller than its inputs, then, in contrast to the simple case of the F club, you can’t be so sure that you won’t ever be coming back and picking up smaller integers that were missed in earlier passes.

Let’s think a little bit more about the recursively defined club of numbers that we called “wff numbers”. We’ve seen that the number 72900 possesses “wff-ness”, and if you think about it, you can see that 576 and 2916 lack that quality. (Why? Well, if you factor them and look at the exponents of 2 and 3, you will see that these numbers are the numerical encodings of the strings “0=” and “=0”, respectively, neither of which makes sense, whence they are not well-formed formulas.) In other words, despite its odd definition, wff-ness, no more and no less than squareness or primeness or Fibonacci’s F-ness, is a valid object of study in the world of pure number. The distinction between members and non-members of the “wff club” is every bit as genuine a number-theoretical distinction as that between members and non-members of the club of squares, the club of prime numbers, or the club of F numbers, for wff numbers are definable in a recursive arithmetical (i.e., computational) fashion. Moreover, it happens that the recursive rules defining wff-ness always produce outputs that are bigger than their inputs, so that wff-ness shares with F-ness the simple property that once you’ve exceeded a certain magnitude, you know you’ll never be back visiting that zone again.

Just as some people’s curiosity was fired by the fact of seeing a square in Fibonacci’s recursively defined sequence, so some people might become interested in the question as to whether there are any squares (or cubes, etc.) in the recursively defined sequence of wff numbers. They could spend a lot of time investigating such purely number-theoretical questions, never thinking at all about the corresponding formulas of Principia Mathematica.

One could be completely ignorant of the fact that Gödel’s wff numbers had their origin in Russell and Whitehead’s rules defining well-formedness in Principia Mathematica, just as one can study the laws of probability without ever suspecting that this deep branch of mathematics was originally developed to analyze gambling. What long ago inspired someone to dream up a particular recursive definition obviously doesn’t affect the numbers it defines; all that matters is that there should be a purely computational way of making any member of the club grow out of the initial seeds by applying the rules some finite number of times.

Now wff numbers are, as it happens, relatively easy to define in a recursive fashion, and for that reason wff-ness (exactly like F-ness) is just the kind of mathematical notion that Principia Mathematica was designed to study. To be sure, Whitehead and Russell had never dreamed that their mechanical reasoning system might be put to such a curious use, in which its own properties as a machine were essentially placed under observation by itself, rather like using a microscope to examine some of its own lenses for possible defects. But then, inventions often do surprise their inventors.

Prim Numbers

Having realized that some hypothetical volume of the series by Whitehead and Russell could define and systematically explore the various numerical properties of wff numbers, Gödel pushed his analogy further and showed, with a good deal of fancy machinery but actually not very much conceptual difficulty, that there was an infinitely more interesting recursively defined class of whole numbers, which I shall here call prim numbers (whimsically saluting the title of the famous three tomes), and which are the numbers belonging to provable formulas of PM (i.e., theorems).

A PM proof, of course, is a series of formulas leading from the axioms of PM all the way to the formula in question, each step being allowed by some rule of reasoning, which in PM became a formal typographical rule of inference. To every typographical rule of inference acting on strings of PM, Gödel exhibited a perfectly matching computational rule that acted on numbers. Numerical computation was effectively thumbing its nose at typographical manipulation, sassily saying, “Anything you can do, I can do better!” Well, not really better — but the key point, as Gödel showed beyond any doubt, was that a computational rule would always be able to mimic perfectly — to keep in perfect synchrony with — any formal typographical rule, and so numerical rules were just as good.

The upshot was that to every provable string of Russell and Whitehead’s formal system, there was a counterpart prim number. Any integer that was prim could be decoded into symbols, and the string you got would be a provable-in-PM formula. Likewise, any provable-in-PM formula could be encoded as one whopping huge integer, and by God, with enough calculation, you could show that that number was a prim number. A simple example of a prim number is, once again, our friend 72900, since the formula “0=0”, over and above being a well-formed formula, is also, and not too surprisingly, derivable in PM. (Indeed, if it weren’t, PM would be absolutely pathetic as a mechanical model of mathematical reasoning!)

There is a crucial difference between wff numbers and prim numbers, which comes from the fact that the rules of inference of PM sometimes produce output strings that are shorter than their input strings. This means that the corresponding arithmetical rules defining prim numbers will sometimes take large prim numbers as input and make from them a smaller prim number as output. Therefore, stretches of the number line that have been visited once can always be revisited later, and this fact makes it much, much harder to determine about a given integer whether it is prim or not. This is a central and very deep fact about prim numbers.

Just as with squares, primes, F numbers, or wff numbers, there could once again be a hypothetical volume of the series of tomes by Whitehead and Russell in which prim numbers were defined and their mathematical properties studied. For example, such a volume might contain a proof of the formula of PM that (when examined carefully) asserts “72900 is a prim number”, and it might also discuss another formula that could be seen to assert the opposite (“72900 is not a prim number”), and so on. This latter statement is false, of course, while the former one is true. And even more complex number-theoretical ideas could be expressed using the PM notation and discussed in the hypothetical volume, such as “There are infinitely many prim numbers” — which would be tantamount to asserting (via a code), “There are infinitely many formulas that are provable in PM ”.

Although it might seem an odd thing to do, one could certainly pose eighteenth-century–style number-theory questions such as, “Which integers are expressible as the sum of two prim numbers, and which integers are not?” Probably nobody would ever seriously ask such an oddball question, but the point is that the property of being a prim number, although it’s a rather arcane “modern” property, is no more and no less a genuinely number-theoretical property of an integer than is a “classical” property, such as being square or being prime or being a Fibonacci number.

The Uncanny Power of Prim Numbers

Suppose someone told you that they had built a machine — I’ll dub it “Guru” — that would always correctly answer any question of the form “Is n a prime number?”, with n being any integer that you wish. When asked, “Is 641 prime?”, Guru would spin its wheels for a bit and then say “yes”. As for 642, Guru would “think” a little while and then say “no”. I suppose you would not be terribly surprised by such a machine. That such a machine can be realized, either in silicon circuitry on in domino-chain technology, is not anything to boggle anyone’s mind in this day and age.

But suppose someone told you that they had built an analogous machine — I’ll dub it “Göru” — that would always correctly answer any question of the form “Is n a prim number?” Would this claim — strictly analogous to the previous one — strike you as equally ho-hum? If so, then I respectfully submit that you’ve got another think coming.

The reason is this. If you believed Göru to be reliable and you also believed in the Mathematician’s Credo (Principia Mathematica version), then you could conclude that your little Göru, working all by itself, could answer any number-theoretical question that you were interested in, just like a genie conjured from a magic lamp. How so? What makes Göru a magic genie?

Well, suppose you wanted to know if statement X is true or false (for instance, the famous claim “Every even number greater than 2 is the sum of two primes” — which, as I stated above, remains unsettled even today, after nearly three centuries of work). You would just write X down in the formal notation of PM, then convert that formula mechanically into its Gödel number x, and feed that number into Göru (thus asking if x is prim or not). Of course x will be a huge integer, so it would probably take Göru a good while to give you an answer, but (assuming that Göru is not a hoax) sooner or later it would spit out either a “yes” or a “no”. In case Göru said “yes”, you would know that x is a prim number, which tells you that the formula it encodes is a provable formula, which means that statement X is true. Conversely, were Göru to tell you “no”, then you would know that the statement X is not provable, and so, believing in the Mathematician’s Credo (Principia Mathematica version), you would conclude it is false.

In other words, if we only had a machine that could infallibly tell apart prim numbers and “saucy” (non-prim) numbers, and taking for granted that the Principia Mathematica version of the Mathematician’s Credo is valid, then we could infallibly tell true statements from false ones. In short, having a Göru would give us a royal key to all of mathematical knowledge.

The prim numbers alone would therefore seem to contain, in a cloaked fashion, all of mathematical knowledge wrapped up inside them! No other sequence of numbers ever dreamt up by anyone before Gödel had anything like this kind of magically oracular quality. These amazing numbers seem to be worth their weight in gold! But as I told you, the prim numbers are elusive, because small ones sometimes wind up being added to the club at very late stages, so it won’t be easy to tell prim numbers from saucy ones, nor to build a Göru. (This is meant as a premonition of things to come.)

Gödelian Strangeness

Finally, Gödel carried his analogy to its inevitable, momentous conclusion, which was to spell out for his readers (not symbol by symbol, of course, but via a precise set of “assembly instructions”) an astronomically long formula of PM that made the seemingly innocent assertion, “A certain integer g is not a prim number.” However, that “certain integer g ” about which this formula spoke happened, by a most unaccidental (some might say diabolical) coincidence, to be the number associated with (i.e., coding for) this very formula (and so it was necessarily a gargantuan integer). As we are about to see, Gödel’s odd formula can be interpreted on two different levels, and it has two very different meanings, depending on how one interprets it.

On its more straightforward level, Gödel’s formula merely asserts that this gargantuan integer g lacks the number-theoretical property called primness. This claim is very similar to the assertion “72900 is not a prime number”, although, to be sure, g is a lot larger than 72900, and primness is a far pricklier property than is primeness. However, since primness was defined by Gödel in such a way that it numerically mirrored the provability of strings via the rules of the PM system, the formula also claims:

The formula that happens to have the code number g

is not provable via the rules of Principia Mathematica.

Now as I already said, the formula that “just happens” to have the code number g is the formula making the above claim. In short, Gödel’s formula is making a claim about itself — namely, the following claim:

This very formula is not provable via the rules of PM.

Sometimes this second phraseology is pointedly rendered as “I am not a theorem” or, even more tersely, as

I am unprovable

(where “in the PM system” is tacitly understood).

Gödel further showed that his formula, though very strange and discombobulating at first sight, was not all that unusual; indeed, it was merely one member of an infinite family of formulas that made claims about the system PM, many of which asserted (some truthfully, others falsely) similarly weird and twisty things about themselves (e.g., “Neither I nor my negation is a theorem of PM ”, “If I have a proof inside PM, then my negation has an even shorter proof than I do”, and so forth and so on).

Young Kurt Gödel — he was only 25 in 1931 — had discovered a vast sea of amazingly unsuspected, bizarrely twisty formulas hidden inside the austere, formal, type-theory-protected and therefore supposedly paradoxfree world defined by Russell and Whitehead in their grandiose threevolume œuvre Principia Mathematica, and the many counterintuitive properties of Gödel’s original formula and its countless cousins have occupied mathematicians, logicians, and philosophers ever since.

How to Stick a Formula’s Gödel Number inside the Formula

I cannot leave the topic of Gödel’s magnificent achievement without going into one slightly technical issue, because if I failed to do so, some readers would surely be left with a feeling of confusion and perhaps even skepticism about a key aspect of Gödel’s work. Moreover, this idea is actually rather magical, so it’s worth mentioning briefly.

The nagging question is this: How on earth could Gödel fit a formula’s Gödel number into the formula itself? When you think about it at first, it seems like trying to squeeze an elephant into a matchbox — and in a way, that’s exactly right. No formula can literally contain the numeral for its own Gödel number, because that numeral will contain many more symbols than the formula does! It seems at first as if this might be a fatal stumbling block, but it turns out not to be — and if you think back to our discussion of G. G. Berry’s paradox, perhaps you can see why.

The trick involves the simple fact that some huge numbers have very short descriptions (387420489, for instance, can be described in just four syllables: “nine to the ninth”). If you have a very short recipe for calculating a very long formula’s Gödel number, then instead of describing that huge number in the most plodding, clunky way (“the successor of the successor of the successor of …… the successor of the successor of zero”), you can describe it via your computational shortcut, and if you express your shortcut in symbols (rather than inserting the numeral itself) inside the formula, then you can make the formula talk about itself without squeezing an elephant into a matchbox. I won’t try to explain this in a mathematical fashion, but instead I’ll give an elegant linguistic analogy, due to the philosopher W. V. O. Quine, which gets the gist of it across.

Gödel’s Elephant-in-Matchbox Trick via Quine’s Analogy

Suppose you wanted to write a sentence in English that talks about itself without using the phrase “this sentence”. You would probably find the challenge pretty tricky, because you’d have to actually describe the sentence inside itself, using quoted words and phrases. For example, consider this first (somewhat feeble) attempt:

The sentence “This sentence has five words” has five words.

Now what I’ve just written (and you’ve just read) is a sentence that is true, but unfortunately it’s not about itself. After all, the full thing contains ten words, as well as some quotation marks. This sentence is about a shorter sentence embedded inside it, in quote marks. And changing “five” to “ten” still won’t make it refer to itself; all that this simple act does is to turn my sentence, which was true, into a false one. Take a look:

The sentence “This sentence has ten words” has ten words.

This sentence is false. And more importantly, it’s still merely about a shorter sentence embedded inside itself. As you see, so far we are not yet very close to having devised a sentence that talks about itself.

The problem is that anything I put inside quote marks will necessarily be shorter than the entire sentence of which it is a part. This is trivially obvious, and in fact it is an exact linguistic analogue to the stumbling block of trying to stick a formula’s own Gödel number directly inside the formula itself. An elephant will not fit inside a matchbox! On the other hand, an elephant’s DNA will easily fit inside a matchbox…

And indeed, just as DNA is a description of an elephant rather than the elephant itself, so there is a way of getting around the obstacle by using a description of the huge number rather than the huge number itself. (To be slightly more precise, we can use a concise symbolic description instead of using a huge numeral.) Gödel discovered this trick, and although it is quite subtle, Quine’s analogy makes it fairly easy to understand. Look at the following sentence fragment, which I’ll call “Quine’s Quasi-Quip”:

preceded by itself in quote marks yields a full sentence.

As you will note, Quine’s Quasi-Quip is certainly not a full sentence, for it has no grammatical subject (that is, “yields” has no subject); that’s why I gave it the prefix “Quasi”. But what if we were to put a noun at the head of the Quasi-Quip — say, the title “Professor Quine”? Then Quine’s Quasi-Quip will turn into a full sentence, so I’ll call it “Quine’s Quip”:

“Professor Quine” preceded by itself in quote marks yields a full sentence.

Here, the verb “yields” does have a subject — namely, Professor Quine’s title, modified by a trailing adjectival phrase that is six words long.

But what does Quine’s Quip mean? In order to figure this out, we have to actually construct the entity that it’s talking about, which means we have to precede Professor Quine’s title by itself in quote marks. This gives us:

“Professor Quine” Professor Quine

The Quine’s Quip that we created a moment ago merely asserts (or rather, claims) that this somewhat silly phrase is a full sentence. Well, that claim is obviously false. The above phrase is not a full sentence; it doesn’t even contain a verb.

However, we arbitrarily used Professor Quine’s title when we could have used a million different things. Is there some other noun that we might place at the head of Quine’s Quasi-Quip that will make Quine’s Quip come out true? What Gödel realized, and what Quine’s analogy helps to make clear, is that for this to happen, you have to use, as your subject of the verb “yields”, a subjectless sentence fragment.

What is an example of a subjectless sentence fragment? Well, just take any old sentence such as “Snow is white”, and cut off its subject. What you get is a subjectless sentence fragment: “is white”. So let’s use this as the noun to place in front of Quine’s Quasi-Quip:

“is white” preceded by itself in quote marks yields a full sentence.

This medium-sized mouthful makes a claim about a construction that we have yet to exhibit, and so let’s do so without further ado:

“is white” is white.

(I threw in the period for good measure, but let’s not quibble.)

Now what we have just produced certainly is a full sentence, because it has a verb (“is”), and that verb has a subject (the quoted phrase), and the whole thing makes sense. I’m not saying that it is true, mind you, for indeed it is blatantly false: “is white” is in fact black (although, to be fair, letters and words do contain some white space along with their black ink, otherwise we couldn’t read them). In any case, Quine’s Quasi-Quip when fed “is white” as its input yielded a full sentence, and that’s exactly what Quine’s Quip claimed. We’re definitely making headway.

The Trickiest Step

Our last devilish trick will be to use Quine’s Quasi-Quip itself as the noun to place at its head. Here, then, is Quine’s Quasi-Quip with a quoted copy of itself installed in front:

“preceded by itself in quote marks yields a full sentence”

preceded by itself in quote marks yields a full sentence.

What does this Quip claim? Well, first we have to determine what entity it is talking about, and that means we have to construct the analogue to “ ‘is white’ is white”. Well, in this case, the analogue is the following:

“preceded by itself in quote marks yields a full sentence”

preceded by itself in quote marks yields a full sentence.

I hope you are not lost at this point, for we really have hit the crux of the matter. Quine’s Quip turns out to be talking about a phrase that is identical to the Quip itself! It is claiming that something is a full sentence, and when you go about constructing that thing, it turns out to be Quine’s Quip itself. So Quine’s Quip talks about itself, claiming of itself that it is a full sentence (which it surely is, even though it is built out of two subjectless sentence fragments, one in quote marks and one not).

While you are pondering this, I will jump back to the source of it all, which was Gödel’s PM formula that talked about itself. The point is that Gödel numbers, since they can be used as names for formulas and can be inserted into formulas, are precisely analogous to quoted phrases. Now we have just seen that there is a way to use quotation marks and sentence fragments to make a full sentence that talks about itself (or if you prefer, a sentence that talks about another sentence, but one that is a clone to it, so that whatever is true of the one is true of the other).

Gödel, analogously, created a “subjectless formula fragment” (by which I mean a PM formula that is not about any specific integer, but just about some unspecified variable number x). And then, making a move analogous to that of feeding Quine’s Quasi-Quip into itself (but in quotes), he took that formula fragment’s Gödel number k (which is a specific number, not a variable) and replaced the variable x by it, thus producing a formula (not just a fragment) that made a claim about a much larger integer, g. And g is the Gödel number of that very claim. And last but not least, the claim was not about whether the entity in question was a full sentence or not, but about whether the entity in question was a provable formula or not.

An Elephant in a Matchbox is Neither Fish Nor Fowl

I know this is a lot to swallow in one gulp, and so if it takes you several gulps (careful rereadings), please don’t feel discouraged. I’ve met quite a few sophisticated mathematicians who admit that they never understood this argument totally!

I think it would be helpful at this juncture to exhibit a kind of hybrid sentence that gets across the essential flavor of Gödel’s self-referential construction but that does so in Quinean terms — that is, using the ideas we’ve just been discussing. The hybrid sentence looks like this:

“when fed its own Gödel number yields a non-prim number”

when fed its own Gödel number yields a non-prim number.

The above sentence is neither fish nor fowl, for it is not a formula of Principia Mathematica but an English sentence, so of course it doesn’t have a Gödel number and it couldn’t possibly be a theorem (or a nontheorem) of PM. What a mixed metaphor!

And yet, mixed metaphor though it is, it still does a pretty decent job of getting across the flavor of the PM formula that Gödel actually concocted. You just have to keep in mind that using quote marks is a metaphor for taking Gödel numbers, so the upper line should be thought of as being a Gödel number (k) rather than as being a sentence fragment in quote marks. This means that metaphorically, the lower line (an English sentence fragment) has been fed its own Gödel number as its subject. Very cute!

I know that this is very tricky, so let me state it once again, slightly differently. Gödel asks you to imagine the formula that k stands for (that formula happens to contain the variable x), and then to feed k into it (this means to replace the single letter x by the extremely long numeral k, thus giving you a much bigger formula than you started with), and to take the Gödel number of the result. That will be the number g, huger far than k — and lastly, Gödel asserts that this walloping number is not a prim number. If you’ve followed my hand-waving argument, you will agree that the full formula’s Gödel number (g) is not found explicitly inside the formula, but instead is very subtly described by the formula. The elephant’s DNA has been used to get a description of the entire elephant into the matchbox.

Sluggo and the Morton Salt Girl

Well, I don’t want to stress the technical points here. The main thing to remember is that Gödel devised a very clever number-description trick — a recipe for making a very huge number g out of a less huge number k — in order to get a formula of PM to make a claim about its own Gödel number’s non-primness (which means that the formula is actually making a claim of its own nontheoremhood). And you might also try to remember that the “little” number k is the Gödel number of a “formula fragment” containing a variable x, analogous to a subjectless sentence fragment in quote marks, while the larger number g is the Gödel number of a complete sentence in PM notation, analogous to a complete sentence in English.

Popular culture is by no means immune to the delights of self-reference, and it happens that the two ideas we have been contrasting here — having a formula contain its own Gödel number directly (which would necessitate an infinite regress) and having a formula contain a description of its Gödel number (which beautifully bypasses the infinite regress) — are charmingly illustrated by two images with which readers may be familiar.

In this first image, Ernie Bushmiller’s character Sluggo (from his classic strip Nancy) is dreaming of himself dreaming of himself dreaming of himself, without end. It is clearly a case of self-reference, but it involves an infinite regress, analogous to a PM formula that contained its own Gödel number directly. Such a formula, unfortunately, would have to be infinitely long!

Our second image, in contrast, is the famous label of a Morton Salt box, which shows a girl holding a box of Morton Salt. You may think you smell infinite regress once again, but if so, you are fooling yourself! The girl’s arm is covering up the critical spot where the regress would occur. If you were to ask the girl to (please) hand you her salt box so that you could actually see the infinite regress on its label, you would wind up disappointed, for the label on that box would show her holding a yet smaller box with her arm once again blocking the regress.

And yet we still have a self-referential picture, because customers in the grocery store understand that the little box shown on the label is the same as the big box they are holding. How do they arrive at this conclusion? By using analogy. To be specific, not only do they have the large box in their own hands, but they can see the little box the girl is holding, and the two boxes have a lot in common (their cylindrical shape, their dark-blue color, their white caps at both ends); and in case that’s not enough, they can also see salt spilling out of the little one. These pieces of evidence suffice to convince everyone that the little box and the large box are identical, and there you have it: self-reference without infinite regress!

In closing this chapter, I wish to point out explicitly that the most concise English translations of Gödel’s formula and its cousins employ the word “I” (“I am not provable in PM ”; “I am not a PM theorem”). This is not a coincidence. Indeed, this informal, almost sloppy-seeming use of the singular first-person pronoun affords us our first glimpse of the profound connection between Gödel’s austere mathematical strange loop and the very human notion of a conscious self.