TWENTY-FOUR A Fish in Your Ear: The Short History of Simultaneous Interpreting

Speech predates writing by eons, and oral translation is far, far older than the written kind. Because speech is such an ephemeral thing—it’s gone in a puff of warm air, which is all it is in the material sense—nothing can be known directly about speech translation for almost the entire duration of its history. Two things caused a huge change in the twentieth century: the invention of the telephone by Alexander Graham Bell in 1876, and a political need of the most pressing kind.

The Nuremberg Trials of Nazi war criminals in 1945 was one of the most important courts of law in modern history and also an unprecedented event in the history of translation. The panel of judges and the prosecuting teams came from the four Allied powers—the United States, Great Britain, France, and the Soviet Union—speaking three different languages, and the defendants spoke a fourth language, German. Nothing like this had ever happened before. In courts located in a national jurisdiction, interpreters work consecutively, repeating in the language of the court what the foreign defendant has just said, and then repeating what the court says to the defendant (when the client is not being addressed directly, it may be done at low volume in a “whisper translation,” or chuchotage). Two-way oral translation of this normal kind obviously slows down the proceedings. But four-way translation? In twelve directions? Consecutive interpreting would have so lengthened the International Military Tribunal’s case that everyone might have lost the thread. For the Nuremberg Trials, something new was needed.

Technology for speeding up multilingual interaction already existed. The Filene-Finlay Speech Translator had been tried out a few times in the 1920s by the International Labour Organization in Geneva. Users of the system had a telephone in front of them, and when a delegate could not understand what was being said she picked up the handset, dialed in to the exchange, and heard the speech in a different language (only two—French and English—were involved at that time). The translators sat at the back listening to the speech and speaking their translation of it into a soundproof awning called a Hushaphone, connected directly to the telephone exchange. The original Speech Translator was also used in 1934 for Adolf Hitler’s address to a Nazi Party rally in Nuremberg for live broadcast on French radio.[151]

The Speech Translator was designed and promoted not for rapid two-way interaction in multiple languages but for speeches read aloud from prepared written text—what Germans call gesprochene Schreibsprache, “spoken written language,” the standard genre of politicians and public figures the world over. The Filene-Finlay device was acquired by IBM in the 1930s, and the company offered a complete set of partly secondhand but much enhanced and extended equipment for free to the International Military Tribunal in Nuremberg. This act of generosity was to prove an epochal event in the way in which we now conceive the possibility of international communication.

Members of the court, including the defendants, were equipped with headphones and microphones, from which wires trailed over the courtroom floor to the exchange. Wires ran from the exchange to four separate translation teams in different compartments. That made for a lot of complicated wiring, but the real magic was what happened in the interpreters’ booths.

Members of the court had switch dials to select which language channel they wished to listen to. The output was produced by four booths of three interpreters each. The English booth had a German interpreter, a Russian interpreter, and a French interpreter sitting side by side, listening on headphones, and repeating in English what was said in the other languages; the setup was the same in the three other booths. Altogether, thirty-six interpreters were recruited from among the three hundred language professionals hired by the court and the prosecution and defense teams to work at this brand-new and not obviously manageable task of instantaneous oral translation. Each of the three twelve-strong teams, one full complement for the four booths, worked eighty-five-minute shifts on two days out of three and was expected to rest in between. From the very start of the new profession, simultaneous interpreting was recognized as being one of the most exhausting things you can do with a human brain.

The difficulty is not only high-speed language transfer. The difficulty is that the sound of your own voice diminishes your ability to hear what the other person is saying. That’s why we take turns in conversation and speak over someone else only when we really do not want to hear what he has to say. A simultaneous interpreter must learn to overrule the natural tendency not to listen when talking, and not to talk when listening. Simultaneous interpreting exists only because some very adept people can train themselves to do such an unnatural thing. Try it yourself: switch on a TV news broadcast and repeat at your own normal speaking volume exactly what the newscaster says. If you can keep that up without losing a sentence for ten minutes or more, then maybe you, too, could be a simultaneous interpreter—provided you know another two languages extremely well. Millions of people know three languages well enough to be interpreters, but only a small proportion of them can manage the exhausting trick of dividing attention between what you are saying and what you are hearing—without missing a word.

The trickiest part of high-speed language transfer is that politicians and diplomats do not characteristically use short, simple sentences without subordinate clauses, or leave long gaps between them. They tend to drone on with sausagelike strings of evasive circumlocutions: “I am instructed by my ambassador to inform this august assembly that contrary to rumors reported in one of the organs of the capitalist press no authorized agent of the state has knowingly exported to any other country any materials covered by the international convention on …” Unfortunately, there is no convention on the export of long-windedness, and so interpreters have to begin reformulating sentences of this kind without knowing for sure where they will go, what their real point is, or what alteration to the structure of the starting point the end of the sentence will bring. Extremely sophisticated mental skills are required to “hold” features of meaning in provisional formulations until the real topic of the sentence is finally let out of the bag. An interpreter who has to repair a sentence after it has begun (as we all do in normal speech) loses valuable time. The ability to pick the right formulae in a flash and to keep the sentence loose enough to cope with what may crop up next is acquired by experience and practice—together with an uncommonly developed capacity for finding instant matches between sentence patterns that are grammatically and stylistically far apart.

Most of the people involved in preparing the Nuremberg Trials doubted this newfangled setup would work. We owe the modern world of conference interpreting more to the can-do attitude of the victorious U.S. Army than to the considered judgment of prosecutors, judges, and language professionals. Chief doubter among them was Richard Sonnenfeldt, the head of the U.S. prosecution team’s translation service. He’d been picked from a motor pool in Salzburg by General “Wild Bill” Donovan to serve as translator in the long interrogations of the defendants that preceded the trials. He’d interrogated the Nazi top brass on behalf of four-star generals and was asked to take charge of the simultaneous-interpreting team during the trials. Sonnenfeldt turned the job down because he was intimidated by the speed requirement and by his own lack of familiarity with legal terminology. But the main reason he backed off from running the world’s first simultaneous-interpreting service was his professional opinion that either the people or the system, or both, would break down.[152]

He was right about the glitches. Microphones and headsets went on the blink; lawyers and witnesses (including the chief U.S. attorney, Robert H. Jackson) spoke too fast; on more than one occasion, an interpreter burst into tears on hearing testimony from Rudolf Höss, the ice-cold commandant of Auschwitz. But, despite the obstacles, the system worked. Hermann Göring is said to have remarked to Stefan Hörn, one of the court translators, “Your system is very efficient, but it will also shorten my life!”[153]

The speech-translation system inaugurated at the Nuremberg Trials launched a new era in international communication. The interpreters’ achievements not only created a new skill and a new profession but had an immediate and far-reaching effect on world affairs. First of all, every new international agency wanted a simultaneous-translation system straightaway and thought it could just be bought at the store. In February 1946, when the Nuremberg speech-translation system was barely run in, the first General Assembly of the newborn United Nations Organization adopted as its second resolution that “speeches made in any of the six languages of the Security Council shall be interpreted into the other five languages.”[154] Thereafter all the dependent agencies—from the International Labour Organization to the Food and Agriculture Organization, from UNESCO to the World Bank—acquired the equipment and sought to recruit the personnel to produce the magical illusion that every delegate would always be able to understand what any other delegate was saying as he or she was in the process of saying it.

This led outsiders to take for granted that the diversity of languages was no longer an impediment to collective international action and world harmony. Insiders—diplomats and negotiators in all the new bodies set up by the UN—were under no such illusion. As one student of international law points out, texts and speeches produced in multilingual form at high speed may be grammatically correct, but they are never quite coherent. The small deviations that arise, over which delegates argue for hours on end, “intensify the collective awareness of the importance of translation.”[155] But the early years of simultaneous interpreting were also years of great hope for a new world order ruled by “jaw-jaw” in place of the preceding decades of “war-war.” In those circumstances, the general public easily forgot just what a fragile and mysterious feat was being accomplished by a very small group of language gymnasts in the glass boxes in the rear of the assembly hall.

It hardly needs explaining why simultaneity in translation is an illusion. You cannot translate anything until you have heard what it is: translation is always a “speaking after.” The impression of simultaneity is created by a bag of impressive language tricks. First, many speeches are read out from a prepared text. Diplomats sometimes provide the translation teams with the text in advance of the meeting—often only just in advance, but even a few minutes’ head start takes away a lot of stress. Second, international meetings are dominated by speeches of a fairly predictable kind. Once you acquire experience of the kind of business being conducted and of the formulaic language it uses, you can run ahead of what is actually said and give yourself a little brain space to listen for the all-important variations that the speaker might introduce. Contraction and change of orientation are also used for nonformulaic digressions: “The Soviet delegate has just made a joke” can replace the telling of a long Russian shaggy-dog tale. But, even so, the skill of the “conference interpreter” (the term that has come to replace oral translator, simultaneous translator, and speech translator) calls for high levels of concentration and mental agility. There are few people who can do it at all, and even fewer who want to do it day in and day out.

Sixty years of experience have not made it any easier to predict whether an individual can be turned into a conference interpreter or not. Even now, between half and three quarters of all students admitted to interpreter training courses fail to enter the profession.[156] At the beginning, in the aftermath of the Second World War, the disastrous history of the twentieth century had produced many thousands of people with outstanding language skills in several of the six official international languages (Spanish, English, French, Chinese, Russian, and Arabic)—children of refugees from the Russian Revolution brought up in Shanghai and educated at the Lycée Français, where they learned English, young refugees from German-occupied France who had spent months or years in Cuba or Mexico awaiting a U.S. visa before going to college in New York, and so on. The first generation of the elite of the translating professions consisted mostly of young people from backgrounds of that kind, who remained in post for thirty years and more. These founding mothers and fathers of the conference-interpreting community have now retired, and it has proved difficult to replace them. The lack of personnel is particularly acute for the two most-needed languages in world affairs today—Arabic and Chinese. Even the Russian- and French-into-English booths are getting harder to fill.

The structure of conference interpreting at the UN and its agencies and at most other international gatherings that can afford it is not now quite as it was at the Nuremberg Trials. The rules invented for that first experiment were that all interpreters should work only into their “native” language (now called their A language, “A” standing for “active”), and that all interpreting should be done from the “original.” With six UN languages currently in operation, that would require six teams of five translators, or thirty people in all, to service a single meeting. The job is now reckoned to be as stressful as the work of air traffic controllers; the eighty-five-minute slots used at Nuremberg have been replaced with a routine of alternating thirty-minute shifts (the Chinese and Arabic booths change over every twenty minutes) through a normal (short) working day—so that in fact you would need sixty people, not thirty, to service an international meeting if the original rules were still applied. There just aren’t sixty people with those high-level and variegated skills that can be gathered at any one time in any one place in the world, not even in New York City. The following schema allows the illusion of seamless language transfer to be achieved with a team of just fourteen members:


In the French booth: two interpreters, one listening in Spanish and English, the other listening in Russian and English, and giving out in French

In the English booth: two interpreters, one listening in French and Russian, the other listening in Spanish and French, and giving out in English

In the Spanish booth: two interpreters, both listening in English and French, and giving out in Spanish

In the Russian booth: two interpreters, both listening in either Spanish or French as well as English, and giving out in Russian

In the Chinese booth: three interpreters working shifts, taking in English and Chinese and giving out in Chinese and English

In the Arabic booth: three interpreters working shifts, taking in French or English and Arabic and giving out in Arabic and English or French


In other words, Chinese gets into Spanish, French, and Russian by relay from the English channel, and Arabic gets into Spanish and Russian by relay either from English or, most often, from French; Spanish and Russian get into Chinese by relay from the English channel, and into Arabic by relay from French. If the Russian interpreter in the English booth has gone to the bathroom, then the Russian channel also gets into English by relay from the French booth; similarly, if the Spanish interpreter in the French booth has a nosebleed, Spanish gets into French by relay from English.
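The relay pattern just described can be checked mechanically. What follows is only a minimal sketch in Python, not a description of how the UN actually schedules its booths. The table `DIRECT` and the functions `pivots` and `route` are names invented for this illustration. Two entries rest on assumptions rather than on the schema itself: that the Russian booth covers both Spanish and French between its two interpreters, and that the Arabic booth's retour is available into both English and French at the same meeting.

```python
from itertools import permutations

# Direct links read off the booth schema: target language -> source languages
# that some interpreter in (or working retour for) that channel hears directly.
DIRECT = {
    # Arabic into French is retour from the Arabic booth (assumed available).
    "French": {"Spanish", "English", "Russian", "Arabic"},
    # Chinese and Arabic reach English by retour from their own booths.
    "English": {"French", "Russian", "Spanish", "Chinese", "Arabic"},
    "Spanish": {"English", "French"},
    # Assumption: one Russian-booth interpreter takes Spanish, the other French.
    "Russian": {"English", "French", "Spanish"},
    "Chinese": {"English"},
    "Arabic": {"French", "English"},
}
LANGUAGES = sorted(DIRECT)


def pivots(source, target):
    """Languages that could serve as the relay channel from source to target."""
    return sorted(
        p for p in LANGUAGES
        if p not in (source, target)
        and source in DIRECT[p]      # source is interpreted directly into p
        and p in DIRECT[target]      # p is interpreted directly into target
    )


def route(source, target):
    """Return 'direct', or the list of possible pivot languages for a relay."""
    if source in DIRECT[target]:
        return "direct"
    return pivots(source, target)


if __name__ == "__main__":
    for source, target in permutations(LANGUAGES, 2):
        print(f"{source:8} -> {target:8} {route(source, target)}")
```

Under these assumptions every one of the thirty directions is served either directly or through a single pivot, and English is the only route into and out of Chinese, as the paragraph above says.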

Relay, or double translation, is in principle a bad idea, as the possibility of error is increased, as is the time lag between the delegate’s speech and the output in listeners’ headphones. Also, the fact that Chinese and Arabic interpreters work both into their A language and from it into English is not a good idea—working both ways at once more than doubles the mental stress involved. But the devices of relay (double translation) and retour (one interpreter working in two directions) are godsends for the UN officials whose task is to ensure the smooth running of the meetings. Without relay and retour the whole system would be vastly more expensive—and it’s not exactly cheap as it is.

In the European Union, further refinements are used to ensure that meetings of a body with twenty-four official languages can be coped with. Full symmetrical interpreting under Nuremberg rules—that’s to say, each translation direction being supplied by a single dedicated interpreter—would require a team of 552 interpreters, exceeding by far the number of delegates taking part in any meeting, and that’s clearly not feasible. The system works like this:

When all participants in a meeting understand at least one of the EU’s working languages (English, French, German, and Italian)—and this is nearly always the case—then an asymmetrical language regime is used. “Asymmetry” means that participants may speak in any of the official languages (as long as they let the interpreting service know which one ahead of time), but may listen in only one of the four working languages. Such a meeting would be said to have a “24:4” language regime. If each translation direction were served by a dedicated individual, that would require up to eighty interpreters per session, which is still far too many.
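The head counts quoted in this chapter follow from simple arithmetic, sketched here in Python. The function names are invented for the illustration. The formula for the asymmetric case is one reading that reproduces the figure of eighty: it counts only the directions from each non-working language into each working language, which is an assumption and not something the text spells out.

```python
def symmetric_directions(languages: int) -> int:
    """Every language interpreted directly into every other language."""
    return languages * (languages - 1)


def asymmetric_directions(official: int, working: int) -> int:
    """Each non-working language into each working language (assumed reading)."""
    return (official - working) * working


# Nuremberg: four languages, twelve directions.
print(symmetric_directions(4))        # 12
# UN under the original rules: six booths of five interpreters.
print(symmetric_directions(6))        # 30
# EU with twenty-four official languages under "Nuremberg rules".
print(symmetric_directions(24))       # 552
# EU under a 24:4 regime, one dedicated interpreter per direction.
print(asymmetric_directions(24, 4))   # 80
# UN booth schema in practice: 2 + 2 + 2 + 2 + 3 + 3 interpreters.
print(sum([2, 2, 2, 2, 3, 3]))        # 14
```

The symmetric count grows roughly with the square of the number of languages, which is why a body with twenty-four languages cannot be served the way a four-language tribunal was.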

The number is further reduced by interpreters with two A languages who can work into both, a device called cheval, but also, most crucially, by retour—interpreters who work into their B language as well. The greatest economy of all is of course made by relay. When the Lithuanian delegate speaks, an interpreter with Lithuanian B provides a simultaneous German translation, which the German–English, German–French, and German–Italian interpreters use for their versions in the working languages (and in a 24:4 regime, no further language versions are required). In this example, the hub or pivot language is German; for other languages at the same imaginary meeting, the hub may be English, French, or Italian, bringing the total number of actual bodies needed to service a meeting under 24:4 to a maximum of twenty-eight, and quite a lot fewer if (for example) the Portuguese–French interpreter also does Spanish when French is the hub language, or the Swedish–German interpreter also does Danish when German is the pivot. Because all EU interpreters must have two B languages, the use of asymmetric regimes together with cheval, retour, and relay suffices to provide just about affordable simultaneous interpreting in Brussels and Luxembourg, and at the European Parliament in Strasbourg.[157]

At the UN, the system is often invisible to users. Interpreters are placed at the rear or the side of the assembly hall behind soundproofed and tinted glass screens. You can attend a dozen meetings without even realizing the interpreters are physically present—so it’s only natural they should get taken for granted. What’s more insidious than the occlusion of the interpreting magic, however, is the impression that anything you say can be simultaneously heard in all other tongues. Conference interpreting, glamorous though it is, buries the real difficulties—and the real interest—of language transfer beneath sophisticated, almost circuslike tricks of the language trade. It makes people think that it’s only a matter of time before we can all have a device to stick in our ear—the “Babel fish” of The Hitchhiker’s Guide to the Galaxy—to provide us with instant communication with all the peoples on earth.

Unlike most translators in written mode and a high proportion of consecutive interpreters, conference interpreters are rarely specialists in any particular field and come closest to being pure language professionals. Few domain-specific organizations are sufficiently large to justify having salaried interpreters on their books: only sixty-seven organizations in the world employ members of AIIC (the interpreters’ professional body) as full-time staff, and only four (the UN in Geneva and New York, and two of the International Criminal Tribunals in The Hague) employ more than ten. As a result, most of the three thousand members of AIIC (and a roughly equal number of nonmembers) work freelance and travel from conference to conference, dealing with all sorts and kinds of topics. Fast-talking yet good listeners, interpreters must be both alert and relaxed, able to tolerate unspeakably boring harangues but also quick to pick up the gist when something entirely new comes on the agenda. They belong to a rare breed.

They might become even rarer, because there are several threats to the survival of the species. First, the precipitous decline in the teaching of foreign languages in the English-speaking world in the last fifty years means that there are ever fewer entrants to the profession with English A. If you prevented boys from having bicycles, then the Tour de France would become a celebration of geriatric fitness in a decade or two, and then stop. If you don’t teach native English speakers two languages out of Spanish, Russian, Chinese, Arabic, and French intensively to high levels while they are young, you will not have candidates for interpreter training within ten or fifteen years. There are many English–Spanish bilinguals, of course, but very few of them have another UN language to the requisite degree of fluency. If the requirement were lowered from two to one foreign language for English A, then the system could be run on relay and retour, and staffing problems would be less acute. However, because ten applicants to a translators’ school produce no more than five entrants, and because barely one third of those graduating will be found good enough to enter the profession, large investment in language education throughout the English-speaking world is urgently needed. Without it, the next cohort of our politicians and diplomats, businessmen and consultants, human rights campaigners, international lawyers, and policy wonks may well be reduced to stuffing fish in both ears.

A second threat to maintaining current language practice in international organizations is that some states may become unwilling to finance simultaneous interpretation into languages that are ceasing to be global vehicular tongues—but the replacement of Russian (for example) may prove politically impossible for many decades yet, and nobody has a clear idea of what might replace French.

But the bigger threat looming on the horizon is something that’s going on right now in research labs in New Jersey and elsewhere. Using the technology of speech recognition that allows a widely available word processor to generate text from speech, alongside the speech synthesis systems that power today’s automated answering machines, the FAHQT target that current U.S. science policy encourages could well become FAHQST—fully automated, high-quality speech translation. Experimental systems not very far from commercial release already produce running English text from Spanish speech. I may not live to see or hear it, but many of you probably will: automated interpreting for the secondary orality of predictable international diplomatic prose, for tourist inquiries at hotel reception desks, and maybe for other uses as well.

You will then enter the era of tertiary orality. It will be another world.
