Chapter 4

Musical Movement


Serious Music

This dictionary of musical themes, by Harold Barlow and Sam Morgenstern, supplies an aid which students of music have long needed . . . We should now have something in musical literature to parallel Bartlett’s Familiar Quotations. Whenever a musical theme haunted us, but refused to identify itself no matter how much we scraped our memory, all we should have to do would be to look up the tune in Barlow and Morgenstern, where those ingenious dictionary-makers would assemble some ten thousand musical themes, with a notation-index or theme-finder, to locate the name of the composition from which the haunting fragment came, and the name of the composer.

– John Erskine, 1948, in the preface to Barlow and Morgenstern’s A Dictionary of Musical Themes.

In the 1940s it must have been laborious to construct a dictionary of musical themes, but that’s what Barlow and Morgenstern went ahead and did. It is unclear whether anyone ever actually used it to identify the tunes that were haunting them, and, at any rate, it is obsolete today, given that our iPhones can tell us the name and composer of a song if we merely let them listen to a few bars. The iPhone software is called “Shazam,” a great advance over locutions such as, “Hey, can you Barlow-and-Morgenstern this song for me?” Now, in defense of Barlow and Morgenstern, Shazam does not recognize much classical music, which makes me the life of the party when someone’s Shazam comes up empty-handed in the attempt to identify what the pianist is playing, and I pull out my 642-page Barlow and Morgenstern and tell them it is Chopin’s Concerto No. 1 in E minor. And, I add, it is the third theme occurring within the second movement . . . because that’s how I roll.

The other great use I have found for Barlow and Morgenstern’s dictionary is as a test bed for the movement theory of music. Each of its 10,000 themes nicely encapsulates the fundamental part of a tune—no chords, no harmony, no flourishes. Most themes have around one to two dozen notes, and so, in movement terms, they correspond to short bouts of behavior. (Figure 18 shows three examples of themes from Barlow and Morgenstern.) There are at least two good reasons for concentrating my efforts on this data set.

Figure 18. Example themes from the Barlow & Morgenstern dictionary. Top: A theme from Bach’s Partita No. 1 in B minor. Middle: A theme from Beethoven’s Sonata No. 7 in D. Bottom: A theme from Sibelius’s Quartet Op. 56, “Voces Intimae.”

First, the dictionary possesses a lot of themes—10,000 of them. This is crucial for our purposes because we’re studying messy music, not clean physics. One can often get good estimates of physical regularities from a small number of measurements, but even though (according to the music-is-movement theory) music’s structure has the signature of the physical regularities of human movement, music is one giant leap away from physics. Music is the product of cultural selection among billions of people, thousands of years, and hundreds of cultures, and so we can only expect to see a blurry signature of human movement inside any given piece or genre of music. On top of that, we have the wayward ways of composers, who are often bent on marching to their own drum and not fitting any pattern they might notice in the works of others. Music thus is inherently even messier than speech, and that’s why we need a lot of tunes for our data. With enough tunes, we’ll be able to see the moving humans through the fog.

The Dictionary of Musical Themes is also perfect for our purposes here because it is a dictionary of classical music. “What’s so great about classical music?” you might ask. Nothing, is the answer. Or, at least, there is nothing about the category of classical music that makes it more worthy of study than other categories of music. But it is nevertheless perfect for our purposes, and for an “evolutionary” reason. We are interested in analyzing not just any old tune someone can dream up, but the tunes that actually get selected. We want our data set to have the “melodic animals” that have fared well in the ecology of minds they inhabit. Classical music is great for this because it has existed as a category of music for several centuries. The classical music that survives to be played today is just a tiny fraction of all the compositions written over the centuries, with most composers long dead—and even longer obscure.

Ultimately, the theory developed here will have to be tested on the broad spectrum of music styles found across humankind, but, for the reasons I just mentioned, Western classical music is a natural place to begin. And who is going to be motivated to analyze broad swaths of music for signs of human movement if their curiosity is not at least piqued by the success of the theory on a data set closer to home? As it happens, for many of the analyses carried out in the following chapters, we did also analyze a database of approximately 10,000 Finnish folk songs. The results were always qualitatively the same, and I won’t discuss them much here. At any rate, Finnish folk are universally agreed to be a strange and taciturn people, and they are (if just barely) in the West, so they don’t really broaden the range of our musical data.

With the Barlow and Morgenstern app installed in our toolkit, and with good Finns slandered without reason, we are ready to embark on a quest for the signature of expressive human movers in music.

In this chapter we will successively take on rhythm, pitch, and loudness. As we will see, when we humans move, we have our own signature rhythm, pitch modulations, and loudness fluctuations. I will introduce these fingerprints of human movement, and provide evidence that music has the same fingerprints. I have at this point accumulated more evidence than can be reasonably included in this chapter, and so I have added an “Encore” chapter at the end of the book that takes up many other converging lines of evidence for human movement hidden inside music.


Drum Core

When most people think about the auditory features peculiar to music, they are likely to focus on melody, and in particular upon the melodic contours, or the pattern of pitch rises and falls. Perhaps this bias toward melody is because the most salient visible feature of written music is that the notes go up and down on the staff. Or maybe it is because our fingers go up and down our instruments, pressing different buttons for different pitches; or because much of the difficulty in playing an instrument is learning to move quickly from pitch to pitch. Whatever the reason, the pitch modulations of the melody get a perceived prominence in music. This is an eternal thorn in the side of percussionists, often charged with not really playing an instrument, and of rappers, dismissed as not really being musicians.

But in reality, the chief feature of music is not the pitch contours of melody at all, but rhythm and beat, which concern the timing, emphasis, and duration of the notes. Whereas nearly all music has a rhythm and a beat, music can get by without melodic pitch modulations. I just came back from a street fair, for example, where I heard a rock band, an acoustic guitarist, and a drum group. All three had a rhythm and beat, but only two of the three had a melody. The drum group had no melody, but its rhythm and beat made it music—the best music at the fair, in fact.

The rhythm-and-beat property is the hard nugget at the core of music. And the diamond at the very center of that nugget is the beat, all by itself. Let’s begin our examination of musical structure, then, with the beat.

We humans make a variety of beatlike sounds, including heartbeats, sexual gyrations, breathing, and certain vocalizations like laughing and sobbing. But one of the most salient beatlike sounds we make is when we walk, and our feet hit the ground over and over again in a regular repeating pattern. Hit-ring, hit-ring, hit-ring, or boom, boom, boom. Such beatlike gaits resounding from a mover are among the most important sound patterns in our lives, because they are the centerpiece of the auditory signature of a human in our vicinity, maybe a potential lover, murderer, or mailman. This is why the beat is so fundamental to music: natural human movement has a beat, and so music must have a beat. That is, from the music-is-movement theory’s point of view, a beat must be as integral to music as footstep sounds are to human movement. And because most actions we carry out have regularly repeating footsteps, most music will have a beat.

And music is not merely expected to have a regularly repeating beat, but to have a human steplike beat. Consider the following three prima facie similarities between musical beat and footsteps. First, note that the rate of musical beats tends to be around one to two beats per second, consistent with human footstep rates. Second, also like human footsteps, the beat need not be metronome-like in its regularity; rather, the beat can have irregularities and still be heard as a beat, because our auditory footstep-recognition mechanisms don’t expect perfectly metronome-like human movers. In fact, musical performers are known to sometimes purposely add irregularities to the beat’s timing, with the idea that it sounds better. And a third initial similarity between footsteps and musical beats is that when people go from moving to not moving, the rate of their footsteps slows down, consistent with the tendency toward a slowing of the beat (ritardando) at the end of pieces of music (a topic studied over the years by researchers such as Henkjan Honing and Jacob Feldman). Not all objects stop in this fashion: recall from Chapter 2, on solid-object physical events, that a dropped ball bounces with ever greater frequency as it comes to a stop. If musical beat were trying to mimic simple solid-object sounds instead of human movers, then musical endings would undergo accelerando rather than ritardando. But that’s not how humans slow down, and it’s not how music slows down.
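The bouncing-ball contrast can be made concrete with a little arithmetic. Each impact scales a ball’s rebound speed, and hence its next flight time, by the coefficient of restitution, so the gaps between bounces shrink geometrically and the bounce rate speeds up as the ball dies out. Here is a minimal sketch of that reasoning; the restitution value of 0.7 and the one-second initial flight time are assumptions for illustration, not figures from the text:

```python
# Sketch: a dropped ball's successive flight times shrink geometrically,
# because each impact scales the rebound speed by the coefficient of
# restitution. Ever-shorter gaps between bounces mean the bounce rate
# accelerates as the ball comes to rest.

def bounce_intervals(first_interval=1.0, restitution=0.7, n=6):
    """Return the time gaps between successive bounces, in seconds."""
    return [first_interval * restitution ** k for k in range(n)]

intervals = bounce_intervals()
# Every gap is shorter than the one before it: an accelerando,
# the opposite of the ritardando of a human mover slowing to a stop.
assert all(later < earlier
           for earlier, later in zip(intervals, intervals[1:]))
```

A human slowing to a stop does the reverse: the gaps between footfalls stretch out, which is the pattern ritardando mimics.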

In addition to beats being footsteplike in their rate, regularity, and deceleration, beats are footsteplike in the way they are danced to. Remember those babies shaking their stinky bottoms that we discussed in the previous chapter’s section titled “Motionally Moving”? They dance, indeed, but one might suspect that they aren’t very good at it. After all, these are babies who can barely walk. But baby dancers are better than you may have realized. While they’re missing out on the moves that make me a sensation at office parties, they get a lot right. To illustrate how good babies are at dancing, consider one fundamental thing you do not have to tell them: dance to the beat. Babies gyrate so that their body weight tends to be lower to the ground on, and only on, every beat. They somehow “realize” that to dance means not merely to be time-locked to the music, but to give special footstep status to the beat. Babies don’t, for example, bounce to every other beat, nor do they bounce twice per beat. And dancing to the beat is something we adults do without ever realizing that there are other possibilities. MCs never yell, “Three steps to the beat!” or “Step in between the beat!” or “Step on the sixteenth note just after the beat, and then again on the subsequent thirty-second note!” Instead, MCs shout out what every toddler implicitly knows: “Dance to the beat!” The very fact that we step to the beat, rather than stepping in the many other time-locked ways we could, is itself a fundamental observation about the relationship between movement and music, one that is difficult to notice because of the almost tautological ring to the phrase, “Step to the beat.” Why we tend to step to the beat should now be obvious, given our earlier discussion about the footsteplike meaning of the beat, and (in the previous chapter) about dance music sounding like contagious expressive human behaviors. We step to the beat because our brain thinks we are matching the gait of a human mover in our midst.

Recall the drum group at the street fair I mentioned near the start of this section. There is something I didn’t mention: there was no group. More exactly, there was a tent exhibition with a large variety of percussion instruments, and the players were children and adults who, upon seeing and hearing the drums, joined in the spontaneous jam sessions. These random passersby were able to, and wanted to, make rhythms matching those around them. Watching this spectacle, it almost seems as if we humans are born to drum. But is it so surprising that we’re able to drum to the beat if our actions are the origins of the very notion of the beat?

Before discussing further similarities between beats and footsteps, we need to ask about all the notes occurring in music that are not on the beat. Beat may be fundamental to music, but I doubt I’d be bothering to write this—or you to read it—if music were always a simple, boring, one-note-per-beat affair. It is the total pattern of on-beat and off-beat notes that determines a piece of music’s rhythm, and we must address the question: if on-beat notes are footsteps, then what human-movement-related sounds might the off-beat notes sound like?


Gangly Notes

The repetitive nature of our footsteps is the most fundamental regularity found in our gait, explaining the fundamental status of the beat in music. But we humans make a greater racket than we are typically consciously aware of. Much of our body weight consists of four heavy, gangly parts—our limbs—and when we are on the move, these ganglies are rattling about, bumping into all sorts of things. When our feet swing forward in a stride, they float barely above the ground, and very often shuffle on their way to landing. In natural terrain, the grass, rocks, dirt, and leaves can get smacked or brushed in between the beat. Sometimes one’s own body hits itself—legs hitting each other as they pass, or arms hitting the body as they swing. And often we are carrying things, like a quiver of arrows, a spear, a keychain, or a sack of wet scalps of the neighboring villagers, and these will clatter and splat about as we move.

Not only do our clattering ganglies clang in between our footsteps, they make their sounds in a time-locked fashion to the footsteps. This is because when we take a step, we initiate a “launch” of our limbs (and any other objects carried on our bodies) into a behavior-specific “orbit,” an orbit that will be repeated on the next step if the same behavior is repeated. In some cases the footstep causes the gangly hit outright, as when our step launches our backpack a bit into the air and it then thuds onto our back. But in other cases the step doesn’t directly cause the between-the-beat gangly hit so much as it triggers a sequence of motor events, such as our arms brushing against our body, which will recur at the same time delay after the next step. Exactly what the time delay will be after the step depends on the specific manner in which any given gangly part (appendage, carried object, or carried appendage) swings and bounces, which in turn depends on its physical dimensions, how it hangs, where on the body it lies, and how it participates in the behavior.

From the auditory pattern of these footstep-time-locked clattering ganglies, we are able to discern what people are doing. Walking, jogging, and running sound different in their patterns of hits. A sharp turn sounds different from a mover going straight. Jumping leads to a different pattern, as does skipping or trotting. Going up the stairs sounds distinct from going down. Sidestepping and backing up sound different from forward movement. Happy, angry, and sad gaits sound different. Even the special case we discussed in the previous chapter—sex—has its own banging ganglies. Close your eyes while watching a basketball game on television, and you’ll easily be able to distinguish times when the players are crossing the court from times when they are clustered on one team’s side; and you will often be able to make a good guess as to what kind of behavior, more specifically, is being displayed at any time. You can distinguish the pattern of hits made by a locomoting dog from that of a cat, a cow from a horse. And you can tell via audition whether your dog is walking, pawing, or merely scratching himself. It should come as no surprise that you have fine-grained discrimination capabilities for sensing with your ears the varieties of movements we humans make, movements we hear in the pattern of gangly bangings.

If the pattern of our clanging limbs is the cue our auditory system uses to discern a person’s type of behavior, then music that has culturally evolved to sound like human movement should have gangly-banging-like sounds in it. And just as gangly bangings are time-locked to the steps, music’s analog of these should be time-locked to the beat. And, furthermore, musical banging ganglies should be crucial to the identity of a song, just as the pattern of a mover’s banging ganglies is crucial to identifying the type of behavior.

Where are these banging ganglies in music? Right in front of our ears! Musical banging ganglies are simply notes. The notes on the beat sound like footsteps (and are typically given greater emphasis, just as footsteps are more energetic than between-the-steps body hits), and the notes occurring between the beats are like the other body hits characterizing a mover’s behavior. Beats are footsteps, and rhythm (more generally) is the pattern of a mover’s banging ganglies. Just as between-the-steps body-hit sounds are time-locked to footsteps, notes are time-locked to the beat. And, also like our gait, pieces of music that have the same sequence of pitches but differ considerably in rhythm are perceived to be different tunes. If we randomly change the note durations found in “Twinkle, Twinkle Little Star,” thereby obliterating the original rhythm, it will no longer be “Twinkle, Twinkle Little Star.” Similarly, if we randomly change the timing of the pattern of banging ganglies for a basketball player going up for a layup, it will no longer be the sound of a layup.

Rhythm and beat have, then, some similarities to the structure of our banging ganglies. We will discuss more similarities in the upcoming sections and in the Encore chapter. But there is one important similarity that might appear to be missing: musical notes usually come with a pitch, and yet our footsteps and gangly hits are not particularly pitchy. How can the dull thuds of our bodies possibly be pitchy enough to explain the central role of pitch in music?

If you have already read the earlier chapter on speech, then you may have begun to have an appreciation for the rings occurring when any solid-object physical event occurs. As we discussed, we are typically not consciously aware of the rings, but our auditory system hears them and utilizes them to determine the identity of the objects involved in events (e.g., to tell the difference between a pencil and a paper clip hitting a desk). Although the pitch of a typical solid object may not be particularly salient, it can become much more salient when contrasted with the distinct pitches of other objects’ rings. For example, a single drum in a set of drums doesn’t sound pitchy, but when played in combination with larger and smaller drumheads, each drum’s pitch becomes easy to hear. The same is true for percussionists who use everyday objects for their drums—in such performances one is always surprised to hear the wide range of pitches occurring among all the usually pitchless-seeming everyday objects. Our footsteps and banging ganglies do have pitches, consistent with the hypothesis that they are the fundamental source of musical notes. (As we will see, these gangly pitches are analogous to chords, not to melody—which, I will argue later, is driven by the Doppler effect.)

If I am right that musical notes have their origin in the sounds that humans make when moving, then notes should come in human-gait-like patterns. In the next section, we’ll take up a simple question in this regard: does the number of notes found between the beats of music match the number of gangly bangs between footsteps?


The Length of Your Gangly

Every 17 years, cicadas emerge in droves out of the ground in Virginia, where I grew up. They climb the nearest tree, molt, and emerge looking a bit like a winged tank, big enough to fill your palm. Since they’re barely able to fly, we used to set them on our shoulders on the way to school, and they’d often not bother to fly away before we got there. And if they did fly, it wasn’t really flying at all. More of an extended hop, with an exoskeleton-shaking, tumble-prone landing. With only a few days to live, and with billions of others of their kind having emerged at the same time, all of them screeching mind-numbingly away, they didn’t need to go far to find a mate, and graceful flight did not seem to be something the females rewarded.

Cicadas have, then, a distinctively cicada-like sound when they move: a leap, a clunky clatter of wings, and a heavy landing (often with further hits and skids afterward). The closest thing to a footstep in this kind of movement is the landing thud, and thus the cicada manages to fit dozens of banging ganglies—its wings flapping—in between its landings. If cicadas were someday to develop culture and invent music that tapped into their auditory movement-recognition mechanisms, then their music might have dozens of notes between each beat. With Boooom as their beat and da as their wing-flap inter-beat note, their music might be something like “Boooom-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-Boooom-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da-da,” and so on. Perhaps their ear-shattering, incessant mating call is this sound!

Whereas cicadas liberally dole out notes in between the beats, Frankenstein’s monster in the movies is a miser with his banging ganglies, walking so stiffly that his only gait sounds are his footsteps. Zombies, too, tend to be low on the scale of banging-gangly complexity (although high on their intake of basal ganglia).

When we walk, our ganglies are more complex than those of Frankenstein’s monster and his zombie dance buddies, but ours are doled out much more sparingly than the cicadas’. During a step, your leg swings forward just once, and so it can typically only get one really good bang on something. More complex behaviors can lead to more bangs per step, but most commonly, our movements have just one between-the-footsteps bang—or none. Our movements tend to sound more like the following, where “Boooom” is the regularly repeating footstep sound and “da” is the between-the-steps sound: “Boooom-Boooom-Boooom-da-Boooom-Boooom-da-Boooom-da-Boooom-da-da-Boooom-da-Boooom-da-Boooom.” (Remember to do the “Boooom” on the beat, and cram the “da”s in between the beats.)

Given our human tendency to make roughly zero to one gangly bang between our steps, our human music should pack notes similarly sparsely between the beats. Music is thus predicted to tend to have around zero to one between-the-beats note. To test for this, we can look at the distribution of time gaps between musical notes. Zero between-the-beats notes leave a full-beat gap between successive on-beat notes, while one between-the-beats note splits that gap into two half-beat gaps. So if music most commonly has about zero to one note between the beats—along with notes usually on the beat—then the most common note-to-note time gap should be in the range of a half beat to a beat.

To test this, Sean Barnett, then a graduate student at RPI, analyzed an electronic database of Barlow and Morgenstern’s 10,000 classical themes, the ones we mentioned at the start of this chapter. For every adjacent pair of notes in the database, Sean recorded the duration between their onsets (i.e., the time from the start of the first note to the start of the second note). Figure 19 shows the distribution of note-to-note time gaps in this database—which time intervals occur most commonly, and which are more rare. The peak occurs at ½ on the x-axis, meaning that the most common time gap is a half beat in length (an eighth note). In other words, there is about one note between the beats on average, which is broadly consistent with expectation.

Figure 19. The distribution of durations between notes (measured in beats), for the roughly 10,000 classical themes. One can see that the most common time gap between notes is a half beat long, meaning on average about one between-the-beat note. This is similar to human gait, typically having around zero to one between-the-step “gangly” body hit.
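The gap-counting procedure just described can be sketched in a few lines. The toy themes below are invented for illustration (the actual Barlow and Morgenstern data is not reproduced here); each theme is simply a list of note onset times measured in beats:

```python
# Hypothetical sketch of the onset-gap analysis: for every adjacent
# pair of note onsets within a theme, tally the time gap between them.
from collections import Counter

def gap_distribution(themes):
    """Count the onset-to-onset time gap for each adjacent note pair."""
    gaps = Counter()
    for onsets in themes:
        for first, second in zip(onsets, onsets[1:]):
            gaps[second - first] += 1
    return gaps

# Two toy "themes": onsets in beats (a 0.5 gap is an eighth note here).
themes = [[0, 0.5, 1, 1.5, 2], [0, 1, 1.5, 2, 3]]
dist = gap_distribution(themes)
# In this toy data the most common gap is a half beat, echoing the
# peak at 1/2 in Figure 19.
assert dist.most_common(1)[0][0] == 0.5
```

On the real database, a histogram of these tallies is exactly the distribution plotted in Figure 19.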

We see, then, that music tends to have the number of notes per beat one would expect if notes are the sounds of the ganglies of a human—not a cicada, not a Frankenzombie—mover. Musical notes are gangly hits. And the beat is that special gangly hit called the footstep. In the next section we will discuss some of what makes the beat special, and see if footsteps are similarly special (relative to other kinds of gangly hits).


Backbone

My family and I just moved into a new house. Knowing that my wife was unhappy with the carpet in the family room, and knowing how much she fancies tiled floor, I took the day off and prepared a surprise for her. I cut tile-size squares from the carpet, so that what remained was a checkerboard pattern, with hardwood floors as the black squares and carpet as the white squares.

I couldn’t sleep very well that night on the couch, and so I headed into the kitchen for a bite. As I pondered how my plan had gone so horribly wrong, I began to notice the sounds of my gait. Walking on my newly checkered floor, my heels occasionally banged loudly on hard wood, and other times landed silently on soft carpet. Although some of my between-step intervals were silent, between many of my steps was a strong bump or shuffle sound when my foot banged into the edge of the two-inch-raised carpet. The overall pattern of my sounds made it clear when my footsteps must be occurring, even when they weren’t audible.

Luckily for my wife—and even more so for me—I never actually checkered my living room carpet. But our world is itself checkered: it is filled with terrain of varying hardness, so that footstep loudness can vary considerably as a mover moves. In addition to soft terrain, another potential source of a silent step is the modulation of a mover’s step, perhaps purposely stepping lightly in order to not sprain an ankle on a crooked spot of ground, or perhaps adapting to the demands of a particular behavioral movement. Given the importance of human footstep sounds, we should expect that our auditory systems were selected to possess mechanisms capable of recognizing human gait sounds even when some footsteps are missing, and to “fill in” where the missing footsteps are, so that the footsteps are perceptually “felt” even if they are not heard.

If our auditory system can handle missed footsteps, then we should expect music—if it is “about” human movement—to tap into this ability with some frequency. Music should be able to “tell stories” of human movement in which some footsteps are inaudible, and be confident that the brain can handle it. Does music ever skip a beat? That is, does music ever not put a note on a beat?

Of course. The simplest cases occur when a sequence of notes on the beat suddenly fails to continue at the next beat. This happens, for example, in “Row, Row, Row Your Boat,” when each “row” is on the beat, and then the beat just after “stream” does not get a note. But music is happy to skip beats in more complex ways. For example, in a rhythm like that shown in Figure 20, the first beat gets a note, but all the subsequent beats do not. In spite of the fact that only the first beat gets a note, you feel the beat occurring on all the subsequent skipped beats. The subsequent notes are instead perceived as off-beat notes, not as notes on the beat. Music skips beats and humans miss footsteps—and in each case our auditory system is able to perceptually insert the missing beat or footstep where it belongs. That’s what we expect from music if beats are footsteps.

Figure 20. The first note is on the beat, but because it is an eighth note (lasting only half a beat), all the subsequent quarter notes (which are a beat in length) are struck on the off beat. You feel the beat occurring between each subsequent note, despite there being no note on the beat.
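A quick arithmetic check shows why every note after the first in Figure 20 lands off the beat. The sketch below simply accumulates onset times from the durations the figure describes (an eighth note followed by quarter notes); the function and its defaults are mine, for illustration:

```python
# Onset times for Figure 20's rhythm: an eighth note (half a beat) on
# beat 0, followed by quarter notes (one beat each). Accumulating the
# durations gives each note's onset, measured in beats.

def onsets(first_duration=0.5, note_duration=1.0, n=5):
    """Onset times, in beats, of the first note and n following notes."""
    times = [0.0, first_duration]
    for _ in range(n - 1):
        times.append(times[-1] + note_duration)
    return times

times = onsets()  # [0.0, 0.5, 1.5, 2.5, 3.5, 4.5]
# Every note after the first falls exactly halfway between two beats:
# the off beat. The beats themselves (1, 2, 3, ...) get no note at all.
assert all(t % 1.0 == 0.5 for t in times[1:])
```

The beats at 1, 2, 3, and so on are silent, yet the listener feels them anyway, just as the brain fills in a mover’s inaudible footsteps.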

The beat is the solid backbone of music, so strong it makes itself felt even when not heard. And the beat is special in other ways. To illustrate this, let’s suppose you hear something strange approaching in the park. What you find unusual about the sound of the thing approaching is that each step is quickly followed by some other sound, with a long gap before the next step. Step-bang . . . . . . Step-bang . . . . . . “What on Earth is that?” you wonder. Maybe someone limping? Someone walking with a stick? Is it human at all?! The strange mover is about to emerge on the path from behind the bushes, and you look up to see. To your surprise, it is simply a lady out for a stroll. How could you not have recognized that?

You then notice that she has a lilting gait in which her forward-swinging foot strikes the ground before rising briefly once again for its proper footstep landing. She makes a hit sound immediately before her footstep, not immediately after as you had incorrectly interpreted. Step . . . . . . bang-Step . . . . . . bang-Step . . . . . . Her gait does indeed, then, have a pair of hit sounds occurring close together in time, but your brain had mistakenly judged the first of the pair of sounds to be the footstep, when in reality the second in the pair was the footstep. The first was a mere shuffle-like floor-strike during a leg stride. Once your brain got its interpretation off-kilter, the perceptual result was utterly different: lilting lady became mysterious monster.

The moral of this lilting-lady story is that to make sense of the gait sounds from a human mover, it is not enough to know the temporal pattern of gait-related hit sounds. The lilting lady and mysterious monster have the same temporal pattern, and yet they sound very different. What differs is which hits within the pattern are deemed to be the footstep sounds. Footsteps are the backbone of the gait pattern; they are the pillars holding up and giving structure to the other banging gangly sounds. If you keep the temporal pattern of body hits but shift the backbone, it means something very different about the mover’s gait (and possibly about the mover’s identity). And this meaning is reflected in our perception.

If musical rhythm is like gait, then the feel of a song’s rhythm should depend not merely on the temporal pattern of notes, but also on where the beat is within the pattern. This is, in fact, a well-known feature of music. For example, consider the pattern of notes in Figure 21.

Figure 21. An endlessly repeating rhythm of long, short, long, short, etc., but with neither “long” nor “short” indicated as being on the beat. One might have thought that such a pattern should have a unique perceptual feel. But as we will see in the following figure, the pattern’s feel depends on where the beat-backbone is placed onto it. Human gait is also like this.

One might think that such a never-ending sequence of long-short note pairs should have a single perceptual feel to it. But that same pattern sounds very different in the two cases shown in Figure 22, which differ only in whether the short or the long note marks the beat. The first of these sounds jarring and inelegant compared to the second. The first of these is, in fact, like the mysterious monster we imagined approaching a moment ago, and the second is like the lilting lady the mover turned out to be.

Figure 22. (a) A short-long rhythm, which sounds very different from (and less natural than) the long-short rhythm in (b).

Music, like human gait-related sounds, cannot have its beat shifted willy-nilly. The identity of a gait depends on which hits are the footsteps, and, accordingly, the identity of a song depends on which notes are on the beat. And when a beat is not heard, the brain infers its presence, something the brain also does when a mover’s footstep is inaudible.

There are, then, a variety of suspicious similarities between human gait and the properties of musical rhythm. In the upcoming section, we begin to move beyond rhythm toward melody and pitch. We’ll get there by way of discussing how chords may fit within this movement framework, and how choreography depends on more than just the rhythm.

Although we’re moving on from rhythm now, there are further lines of evidence that I have included in the Encore, which I will only provide teasers for here:

Encore 1: “The Long and Short of Hit” Earlier in this section I mentioned that the short-long rhythm of the mysterious monster sounds less natural than the long-short rhythm of the lilting lady. In this part of the Encore, I will explain why this might be the case.

Encore 2: “Measure of What?” I will discuss why changing the measure, or time signature, in music modulates our perception of music.

Encore 3: “Fancy Footwork” When people change direction while on the move, their gait can often become more complex. I show that the same thing occurs in music: when pitch changes (indicative, as we will see, of a turning mover), rhythmic complexity rises.

Encore 4: “Distant Beat” The nearer movers are, the more of their gait sounds are audible. I will discuss how this is also found in music: louder portions of music tend to have more notes per beat.


Gangly Chords

Earlier in this chapter, we discussed how footsteps and gangly bangs ring, and how these rings tend to have pitches. I hinted then that it is the Doppler shifting of these pitches that is the source of melody, something we will get to soon in this chapter. But we have yet to talk about the other principal role of pitch in music—harmony and chords.

When pitches combine in close temporal proximity, the result is a distinct kind of musical sound called the chord. For example, C, E, and G pitches combine to make the C major chord. Where do chords fit within the music-is-movement theory? To begin to see what aspect of human movement chords might echo, consider what happens when a pianist wants to get a rhythm going. He or she could just start tapping the rhythm on the wood of the piano top, but what the pianist actually does is play the rhythm via the piano keys. The rhythm is implemented with pitches. And furthermore, the pianist doesn’t just bang out the rhythm with any old pitches. Instead, the pianist picks a chord in which to establish the rhythm and beat. What the pianist is doing is analogous to what a guitarist does with a strum. Strums, whether on a guitar or a piano, are both rhythm and chord.

My suspicion is that rhythm and chords are two distinct kinds of information that come from the gangly banging sounds of human movers. I have suggested in this chapter that rhythm comes from the temporal pattern of human banging ganglies. And now I am suggesting that chords come from the combinations (or perhaps the constituents) of pitches that occur among the banging gangly rings. Gait sounds have temporal patterns and pitch patterns, and these underlie rhythm and chords, respectively. And these two auditory facets of gait are informative in different ways, but both broadly within the realm of “attitude” or “mood” or “intention,” as opposed to being informative about the direction or distance of the mover—topics that will come up later in regard to melody and loudness, respectively.

If rhythm and chords are each aspects of the sounds of our ganglies, then we should expect chords to cycle through their pitches on a time scale similar to that of the rhythm, and time-locked to the rhythm; the rhythm and chord should have the same time signature. For example, in an Alberti bass pattern, one's left hand on the piano might play the notes [CGEG][CGEG][CGEG], where each set of square brackets shows a two-beat interval and the first note of each bracket falls on the beat. One can see that the same two-beat pitch pattern and rhythm repeats over and over again. The pitch sequence and the rhythm have the same 2/4 time signature. It is much rarer to find chords expressed in a way that mismatches the rhythm, as in the following case, where the chord is expressed as a repeated pattern of three pitches—C-G-E—so that the two-beat rhythm cycles look like [CGEC][GECG][ECGE]. Notice that the first two-beat interval—the rhythm's cycle—has the pitch sequence CGEC, but the second has, instead, GECG. The pitch cycle for the chord is not matched to the rhythm's cycle. In real music, if the rhythm is in 2/4 time, then the chord will typically not express itself in 3/4 time. Rhythm and chords tend to be locked together in a way that suggests they are coming from the same worldly source, and therefore the arguments in this chapter lead one to speculate that both rhythm and chords come from, or are about, our gangly banging sounds.
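
The matched and mismatched cases can be made concrete with a toy calculation, a minimal sketch using only the example patterns from the text (the function name and the notes-per-cycle encoding are mine): a pitch cycle whose length matches the rhythm's cycle repeats identically every rhythmic cycle, while a mismatched one drifts for several cycles before lining up again.

```python
from math import lcm

def cycles_until_realigned(pitch_cycle_len, rhythm_cycle_len):
    # Number of rhythmic cycles that pass before the pitch pattern
    # lines up with the start of a rhythmic cycle again.
    return lcm(pitch_cycle_len, rhythm_cycle_len) // rhythm_cycle_len

# Alberti-style pattern from the text: 4 notes per two-beat rhythmic cycle.
print(cycles_until_realigned(len("CGEG"), 4))  # 1: locked to the rhythm
# Mismatched three-pitch cycle C-G-E against the same 4-note rhythmic cycle:
print(cycles_until_realigned(len("CGE"), 4))   # 3: [CGEC][GECG][ECGE], then repeat
```

The three-against-four drift is exactly why the bracketed sequence above takes three two-beat intervals before its pitch content recurs.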

We can also ask which pitch within the expressed chord is most likely to be the one played on the beat. For human movers, the lowest-pitched gangly bangs we make are usually our footsteps. For music and the rhythmic expression of chords, then, we expect that the pitch played on the beat will tend to be lower than that played between the beats. Indeed, chords are usually strummed starting on the lowest expressed pitch (and often on the chord’s tonic, which in a C major chord would be the C pitch). Chords are, again, like gangly rings, with the lowest pitch ringing on the beat.

Consider yet another attribute of human gait: our gangly bangings can occur simultaneously. Multiple parts of a mover’s body can be clattering at the same time, and even a single bang will cause a ring on both the banger and the banged. So we should expect that the auditory mechanisms evolved for sensing gait would be able to process gait from the input of multiple simultaneous pitches. Consistent with this, the pitches within a chord are commonly played simultaneously, and our brains can make perfect sense of the simultaneously occurring notes. Pitch modulations that are part of the melody, on the other hand, almost never occur simultaneously (as we will discuss later).

The idea that musical chords have their foundation in the pitch combinations heard in the banging gangly sounds of human movers is worth investigating further. However, there is a wide variety of phenomena concerning chords that one would hope to explain, but that I currently have no idea how to explain from the raw materials of our ganglies. The laboratory of Dale Purves at Duke University has carried out exciting research suggesting that the human voice may explain the signature properties of the diatonic scale, and one might imagine persuasive explanations for chords emerging from his work. People do, in fact, often vocalize while they move and carry out behaviors, and one possibility is that chords are not about gangly bangs at all, but about the quality of our vocalizations. The advantage of looking to gangly bangs as the foundation for chords, however, is that banging ganglies are time-locked to footsteps, and thus intrinsically note-like. Human vocalizations are not time-locked to our footsteps, and they also lack a clear connection to the between-steps movements of our banging ganglies. If chords were driven by vocalizations, we could not explain why chords are so wedded to the rhythm, as demonstrated above. If one can find chords in our ganglies, then a unified account is possible: our banging ganglies would explain both rhythm and chords—and the tight fit between them.

Chords, I have suggested, may have their origins in the pitches of the complex rings given off by gangly human movers. Later in the chapter, I will suggest that the pitch modulations in melody, in contrast, come from the Doppler shifting of the envelope of those gangly pitches.


Choreographed for You

Choreography is all about finding the right match between human movement and music. I had always figured it wasn’t the music as a whole that must match people’s movement so much as it was just the rhythm and beat. Get the music’s bangs in line with the people’s bangs—that’s all choreographers needed to care about. But I now realize there’s a great deal more to it. A lot of what matters in good choreography is not the rhythm and beat at all. The melodic contour matters, too, and so does the loudness. (To all you choreographers who already know this, please bear with me!)

Why should musical qualities beyond rhythm and beat matter to choreography? Because there are sound qualities beyond our intrinsic banging gangly sounds that also matter for sensing human movers. For example, suppose you and I are waiting for an approaching train, but you are 100 yards farther up the tracks (toward the approaching train) than I am. You and I will hear the same train “gait” sounds—the chugs, the rhythmic clattering of steel, and so on—but you will hear the train’s pitch fall (due to the Doppler effect) before I hear it fall. Now imagine that I am wearing headphones connected to a microphone on your lapel, so that at my position along the tracks I am listening to the sounds you are hearing at your position along the tracks. The intrinsic gait sounds of the train would be choreographed appropriately with my visual perception of the train, because those gait sounds don’t depend on the location of the listener. But my headphone experience of the pitch and loudness contours would no longer fit my visual experience. The train’s pitch now begins falling too early, and will have nearly reached its lowest going-away-from-me pitch before the train even gets to me. The train’s loudness is also now incorrect, peaking when the train is still 100 yards away from me. This would be a deeply ecologically incoherent audiovisual experience; the auditory stream from the headphones would not be choreographed with the train’s visible movements, even though the temporal properties—the beat and rhythm—of the train’s trangly bangings are just as they should be.

Real-world choreography pays attention to pitch and loudness contours as well as gait sounds; and, crucially, which pitch and loudness contour matches a movement depends on where the listener is. Choreography for pitch and loudness contours is listener-centric.

The implication for musical choreography is this: in matching music to movement, the choreographer must make sure that the viewpoint is the same point in space as the listening point. Good choreography must not merely “know its audience,” but know where they are. Music choreographed for you, where you’re sitting, may not be music choreographed for me, where I’m sitting. In television, choreographers play to the camera’s position. If movers are seen in a video to veer toward the camera, melody’s pitch must rise to fit the video, for example. In live shows, choreographers play to the audience, although this gets increasingly difficult the more widely the audience is distributed around the stage. (This is one of many reasons why most Super Bowl halftime shows suck.)

Whereas our discussion so far has concerned rhythm and beat, which do not depend on the listener’s position, the upcoming sections concern pitch and loudness, each of which depends crucially on the location of the listener. Music with only a beat and a rhythm is a story of human behavior, but without any particular viewpoint. In contrast, music with pitch and loudness modulations puts the listener at a fixed viewpoint (or listening point) in the story, as the fictional mover changes direction and proximity to the listener. These are the mover’s kinematics, and the rest of this chapter examines how music tells stories about the kinematics.


Motorcycle Music

Next time you’re on the highway at 70 mph next to a roaring Harley, roll down your window and listen (but do not breathe!). I did this just the other day, and was struck by something strange about how the chopper sounded. The motorcycle’s “footsteps” were there, namely the sounds made by the bike’s impacts directly on the asphalt as it barreled over crevices, crags, and cracks. The motorcycle’s “banging gangly” sounds were also present—the sounds made by the bike’s parts interacting with one another, be they moving parts in the engine or body parts rattling due to engine or road vibrations. And the bike’s exhaust pipe also made its high-frequency vroom (not quite analogous to a sound made by human movers). These motorcycle sounds I heard were characterized not only by their rhythm, but also by the suite of pitches among the rings of these physical interactions: the bike’s “chords.” These rhythm and chord sounds informed me of the motorcycle’s “state”: it is a motorcycle; it is a Harley; it is going over uneven ground; it is powerful and rugged; it needs a bath; and so on. Rhythm and beat (and the chords with which they seem inextricably linked), the topics of much of this chapter thus far, are all about the state of the mover—the nature of the mover’s gait, and the emotion or attitude expressed by that manner of moving about.

What, then, was so strange about the motorcycle sounds I heard while driving alongside? It was that the motorcycle’s overall pitch and loudness were constant. In most of my experiences with motorcycles, their pitch and loudness vary dynamically. This is because motorcycles are typically moving relative to me (I never ride them myself), and consequently they are undergoing changes in pitch due to the Doppler effect, and changes in loudness due to changing proximity. These pitch and loudness modulations give away the action, and that was what was missing: the motorcycle had attitude but no action.

Music gets its attitude from the rhythm and beat, but when music wants to tell a story about the mover in motion—the mover’s kinematics—it breaks out the pitches of melody and modulates the loudness. The rest of this chapter is about the ecological origins of melody and loudness. We will begin with melody, but before I defend what I think musical melodic pitch means, we need to overcome a commonly held bias—encoded in the expressions “high” and “low” notes—that musical pitch equates with spatial position.


Why Pitch Seems Spatial

Something is falling from the sky! Quick, what sound is it making? You won’t be alone if you feel that the appropriate sound is one with a falling pitch (possibly also with a crescendoing loudness). That’s the sound cartoons use to depict objects falling from overhead. But is that the sound falling objects really make, or might it be just a myth?

No, it’s not a myth. It’s true. If a falling object above you is making audible sounds at all (either intrinsically or due to air resistance), then its pitch will be falling as it physically falls, for the same reason a passing train has falling pitch: falling objects (unless headed directly toward the top of your head) are passing you, and so the Doppler effect takes hold. Falling objects happen to pass you in a vertical direction rather than along the ground like a train, but that makes no difference to the Doppler effect. Because falling objects have falling pitch, we end up associating greater height with greater pitch. That’s why, despite greater sound frequencies not being “higher” in any real sense, it feels natural to call greater frequencies “higher.” Pitch and physical height are, then, truly associated with one another in the world.

But the association between pitch and physical height is a misleading association, not indicative of an underlying natural regularity. To understand why it is misleading, let’s now imagine, instead, that an object resting on the ground in front of you suddenly launches upward into the sky. How does its pitch change? If it really were a natural regularity that higher in the sky associates with higher pitch, then pitch should rise as the object rises. But that is not what happens. The Doppler effect ensures that its pitch actually falls as it rises into the sky. To understand why, consider the passing train again, and ask what happens to its pitch once it has already reached its nearest point to you and is beginning to move away. At this point, the train’s pitch has already decreased from its maximum, when it was far away and approaching you, to an intermediate value, and it will continue to decrease in pitch as it moves away from you. The pitch “falls” or “drops,” as we say, because the train is directing itself more and more away from you as it continues straight, and so the waves reaching your ears are more and more spread out in space, and thus lower in frequency. (In the upcoming section, we will discuss the Doppler effect in more detail.) An object leaping upward toward the sky from the ground is, then, in the same situation as the train that has just reached its nearest point to you and is beginning to go away. The pitch therefore drops for the upward-launching object. If rocket launches were our most common experience with height and pitch, then we would come to associate greater physical height with lower pitch, contrary to the association people have now. But because of gravity, objects don’t tend to launch upward (at least they didn’t for most of our evolutionary history), and so the association between physical height and low pitch doesn’t take hold. Objects do fall, however (and a falling object is an especially dangerous scenario to boot), and so the association between physical height and “high” pitch wins. Thus, greater height only associates with “higher” pitch because of the gravitational asymmetry; the fundamental reason for the pitch falling as the object falls is the Doppler effect, not physical height at all. Pitch falls for falling objects because the falling object is rushing by the listener, something that also occurs as the train comes close and then passes.

Falling objects are not the only reason we’re biased toward a spatial interpretation of pitch (i.e., an interpretation that pitch encodes spatial position or distance). Our music technology—our instruments and musical notation system—accentuates the bias. On most instruments, to change pitch requires changing the position in space of one’s hands or fingers, whether horizontally over the keys of a piano, along the neck of a violin, or down the length of a clarinet. And our Western musical notation system codes for pitch using the vertical spatial dimension on the staff—and, consistent with the gravitational asymmetry we just discussed, greater frequencies are higher on the page. The spatial modulations for pitch in instrument design and musical notation are very useful for performing and reading music, but they further bang us over the head with the idea that pitch has a spatial interpretation.

There is yet another reason why people are prone to give a spatial interpretation to melody’s “rising and falling” pitch, and that is that melody’s pitch tends to change in a continuous manner: it is more likely to move to a nearby pitch than to discontinuously “teleport” to a faraway pitch. This has been recognized since the early twentieth century, and in Sweet Anticipation Professor David Huron of Ohio State University summarizes the evidence for it. Isn’t this pitch continuity conducive to a spatial interpretation? Pitch continuity is at least consistent with a spatial interpretation. (But, then again, continuity is consistent with most possible physical parameters, including the direction of a mover.)

We see, then, that gravity, musical instruments, musical notation, and the pitch continuity of most melodies conspire to bias us to interpret musical pitch in a spatial manner (i.e., where pitch represents spatial position or distance). But like any good conspiracy, it gets people believing something false. Pitch is not spatial in the natural world. It doesn’t indicate distance or measure spatial position. How “high” or “low” a sound is doesn’t tell us how near or far away its source is. I will argue that pitch is not spatial in music, either. But then what is spatial in music? If music is about movement, it would be bizarre if it didn’t have the ability to tell your auditory brain where in space the mover is. As we will see later in this chapter, music does have the ability to tell us about spatial location—that’s the meaning of loudness.

But we’re not ready for that yet, for we must still decode the meaning of melodic pitch. My hypothesis is that hiding underneath those false spatial clues lies the true meaning of melodic pitch: the direction of the mover (relative to the listener’s position). It is that fundamental effect in physics, the Doppler effect, that transforms the directions of a mover into a range of pitches. In order to comprehend musical pitch, and the melody that pitches combine to make, we must learn what the Doppler effect is. We take that up next.


Doppler Dictionary

In the summer months, our neighborhood is regularly trawled by an ice cream truck, loudly blaring music to announce its arrival. When the kids hear the song, they’re up and running, asking for money. My strategy is to stall, suggesting, for example, that the truck only sells dog treats, or that it is that very ice cream truck that took away their older sister whom we never talk about. But soon they’re out the door, listening intently for it. “It’s through the woods behind the Johnsons’,” my daughter yells. “No, it’s at the park playground,” my son responds. As the ice cream truck navigates the maze of streets, the kids can hear that it is sometimes headed toward them, only to turn at a cross street, and the kids’ hearts drop. I try to allay their heartache by telling them they weren’t getting ice cream even if the truck had come, but then they perk up, hearing it headed this way yet again.

The moral of this story about my forlorn kids is not just how to be a good parent, but how kids can hear the comings and goings of ice cream trucks. There are a variety of cues they could be using for their ice cream–truck sense, but one of the best is the truck’s pitch: the entire envelope of pitches that modulates, via the Doppler effect, as the truck changes direction relative to my kids’ location.

What exactly is the Doppler effect? To understand it, we must begin with a special speed: 768 miles per hour. That’s the speed of sound in the Earth’s atmosphere, a speed Superman must keep in mind because passing through that speed leads to a sonic boom, something sure to flatten the soufflé he baked for the Christmas party. We, on the other hand, move so slowly that we can carry soufflés with ever so slightly less fear. But even though the speed of sound is not something we need to worry about, it nevertheless has important consequences for our lives. In particular, the speed of sound is crucial for comprehending the Doppler effect, wherein moving objects have different pitches depending on their direction of movement relative to the listener.

Let’s imagine a much slower speed of sound: say, two meters per second. Now let’s suppose I stand still and clap 10 times in one second. What will you hear (supposing you also are standing still)? You will hear 10 claps in a second, the wave fronts of a 10 Hz sound. It also helps to think about how the waves from the 10 claps are spread out over space. Because I’m pretending that the speed of sound is two meters per second, the first clap’s wave has moved two meters by the time the final clap occurs, and so the 10 claps are spread out over two meters of space. (See Figure 23a.)

Figure 23. (a) A stationary speaker is shown making 10 clap sounds in a second. The top indicates that the wave from the clap has just occurred, not having moved beyond the speaker. In the lower part of the panel, the speaker is in the same location, but one second of time has transpired. The first wave has moved two meters to the right, and the final wave has just left the speaker. A listener on the right will hear a 10 Hz sound. (b) Now the speaker is moving in the same direction as the waves and has moved one meter to the right after one second. The 10 claps are thus spread over one meter of space, not two meters as in (a). All 10 waves wash over the listener’s ears in half a second, or at 20 Hz, twice as fast as in (a). (c) In this case the speaker is moving away from the listener, or leftward. By the time the tenth clap occurs, the speaker has moved one meter leftward, and so the 10 claps are spread over three meters, not two as in (a). Their frequency is thus lower, or 6.7 Hz rather than the 10 Hz in (a).

Now suppose that, instead of standing still, I am moving toward you at one meter per second. That doesn’t sound fast, but remember that the speed of sound in this pretend example is two meters per second, so I’m now moving at half the speed of sound! By the time my first clap has gone two meters toward you, my body and hands have moved one meter toward you, and so my final clap occurs one meter closer to you than my first clap. Whereas my 10 claps were spread over two meters when I was stationary, in this moving-toward-you scenario my 10 claps are spread over only one meter of space. These claps will thus wash over your ears in only half a second, rather than a second, and so you will hear a pitch of 20 Hz, twice what it was before. (See Figure 23b.) If I were moving away from you instead, then rather than my 10 claps being spread over two meters as in the stationary scenario, they would be spread over three meters. The 10 claps would thus take 1½ seconds to wash over you, and be heard as a 6.7 Hz pitch—a lower pitch than in the baseline case. (See Figure 23c.)

The speed of sound is a couple hundred times faster than the two-meter-per-second speed I just pretended it was, but the same principles apply: when I move toward you my pitches are upshifted, and when I move away from you my pitches are downshifted. The shifts in pitch will be much smaller than those in my pretend example, but in real life they are often large enough to be detectable by the auditory system, as we will discuss later. The Doppler effect is just the kind of strong ecological universal one expects the auditory system to have been selected to latch onto, because from it a listener’s brain can infer the direction of motion of a mover, such as an ice cream truck.
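
The clap arithmetic above follows a single formula for a stationary listener: the observed frequency is the source frequency scaled by the speed of sound divided by the speed of sound minus the source's speed toward the listener. Here is a minimal sketch (the function name is mine) that reproduces the toy numbers from the text and then plugs in realistic values:

```python
def doppler_pitch(f_source, v_sound, v_toward):
    # Observed frequency for a stationary listener, where v_toward is the
    # source's speed toward the listener (negative when receding).
    return f_source * v_sound / (v_sound - v_toward)

# Toy world from the text: sound at 2 m/s, 10 claps per second, mover at 1 m/s.
print(round(doppler_pitch(10, 2, +1), 2))  # 20.0 Hz: claps squeezed into one meter
print(round(doppler_pitch(10, 2, -1), 2))  # 6.67 Hz: claps stretched over three meters

# Real world: sound at ~343 m/s, a 100 Hz source moving at 30 m/s (~67 mph).
print(round(doppler_pitch(100, 343, +30), 1))  # ~109.6 Hz approaching
print(round(doppler_pitch(100, 343, -30), 1))  # ~92.0 Hz receding
```

The realistic case shows why the real-life shifts are "much smaller" than in the toy example, yet still comfortably within the range the auditory system can detect.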

To illustrate the connection between pitch and directedness toward you, let’s go back to our generic train example and assume the track is straight. When a train is far away but approaching the station platform where you are standing, it is going almost directly toward you, as illustrated in Figure 24i. This is when its pitch will be Doppler shifted upward the most. (High and constant pitch is, by the way, the signature of an impending collision.) As the train nears, it gets less and less directed toward you, eventually to pass you by. Its pitch thus drops to an intermediate, or baseline, value when it reaches its nearest point to you and is momentarily moving neither toward nor away from you (see Figure 24ii). As the train begins to move away from you, its pitch falls below its intermediate value and continues to go lower and lower until it reaches its minimum, when headed directly away (see Figure 24iii). (If, by the way, you were unwisely standing on the tracks instead of on the platform, then the train’s pitch would have remained at its maximum the entire period of time it approached. Then, just after the sound of your body splatting, the train’s pitch would instantaneously drop to its lowest pitch. Of course, you would be in no condition to hear this pitch drop.)

Figure 24. Illustration that the pitch of a mover (relative to the baseline pitch) indicates the mover’s directedness toward you. When the train is headed directly toward the observer, pitch is at its maximum (i), and is at its lowest when headed directly away (iii); in between the pitch is in between (ii).

As a further illustration of the relationship between pitch and mover direction, suppose that a mover is going around in a circle out in front of you (not around you). At (a) in Figure 25 the mover is headed directly away, and so has minimum pitch. The mover begins to turn around for a return, and pitch accordingly rises to a baseline, or intermediate, level at (b). The mover now begins veering toward you, raising the pitch higher, until the mover is headed directly toward you at (c), at which point the pitch is at its maximum. Now the mover begins veering away from you so as not to collide, and pitch falls back to baseline at position (d), only to fall further as the mover moves away to (a) again.

Figure 25. The upper section shows a mover moving in a circle out in front of the listener (the ear), indicating four specific spots along the path. The lower part of the figure shows the pitch at these four spots on the path. (a) When moving directly away, pitch is at its minimum. (b) Pitch rises to baseline when at the greatest distance and moving neither toward nor away. (c) Pitch rises further to its maximum when headed directly toward the listener. (d) Pitch then falls back to baseline when passing tangentially nearby. The pitch then falls back to its minimum again at (a), completing the circle.

From our experience with the train and looping-mover illustrations, we can now build the simple “dictionary” of pitches shown in Figure 26. Given a pitch within a range of pitches, the figure tells us the pitch’s meaning: a direction of the mover relative to the listener. In the dictionary of nature, pitch means degree of directedness toward you.

Figure 26. Summary of the “meaning” of pitch, relative to baseline pitch. (The actual mapping from direction to pitch is non-linear, something we discuss in an upcoming section.)

This pitch dictionary is useful, but only to a limited extent. Doppler pitches tend to be fluctuating when you hear them, whether because movers are merely going straight past you (as in Figure 24), or because movers are turning (as in Figure 25). These dynamic pitch changes, in combination with the pitch dictionary, are a source of rich information for a listener. Whereas pitches above and below baseline mean an approaching or receding mover, respectively, changing pitch tells us about the mover’s turning and veering behavior. A rising pitch means that the mover is becoming increasingly directed toward the listener; the mover is veering more toward you. And falling pitch means that the mover is becoming decreasingly directed toward the listener; the mover is veering more away from you. One can see this in Figure 25. From (a) through (c) the mover is veering more toward the listener, and the pitch is rising throughout. In the other portion of the circular path, from (c) to (a) via (d), the mover is veering away from the listener, and the pitch is falling.

To summarize, pitch informs us of the mover’s direction relative to us, and pitch change informs us of change of direction—the mover’s veering behavior. High and low pitches mean an approaching and a receding mover, respectively; rising and falling pitches mean a mover who is veering toward or away from the listener, respectively. We have, then, the following two fundamental pitch-related meanings:

Pitch: Low pitch means a receding mover. High pitch means an approaching mover.

Pitch change: Falling pitch means a mover veering more away. Rising pitch means a mover veering more toward.

Because movers can be approaching or receding and at the same time veering toward or away, there are 2 × 2 = 4 qualitatively distinct cases, each defining a distinct signature of the mover’s behavior, as enumerated below and summarized in Figure 27.

(A) Moving away, veering toward.

(B) Moving toward, veering toward.

(C) Moving toward, veering away.

(D) Moving away, veering away.

Figure 27. Four qualitatively distinct categories of movement given that a mover may move toward or away, and may veer toward or away. (I have given them alphabet labels starting at the bottom right and moving counterclockwise to the other three squares of the table, although my reason for ordering them in this way won’t be apparent until later in the chapter. I will suggest later that the sequence A-B-C-D is a generic, or most common, kind of encounter.)

These four directional arcs can be thought of as the fundamental “atoms” of movement out of which more complex trajectories are built. The straight-moving train of Figure 24, for example, can be described as C followed by D, that is, veering away over the entire encounter, but first nearing, followed by receding. (As I will discuss in more detail in the Encore section titled “Newton’s First Law of Music,” straight-moving movers passing by a listener are effectively veering away from the listener.)

These four fundamental cases of movement have their own pitch signatures, enumerated below and summarized in Figure 28.

(A) Low, rising pitch means moving away, veering toward.

(B) High, rising pitch means moving toward, veering toward.

(C) High, falling pitch means moving toward, veering away.

(D) Low, falling pitch means moving away, veering away.

Figure 28. Summary of the movement meaning of pitch, for low and high pitch, and rising and falling pitch. (Note that I am not claiming people move in circles as shown in the figure. The figure is useful because all movements fall into one of these four categories, which I am illustrating via the circular case.)
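
The four pitch signatures amount to a two-bit lookup table: pitch relative to baseline settles toward versus away, and the sign of the pitch change settles the veering. A minimal sketch of that decoding (the function name and numeric conventions are mine, not the author’s):

```python
def movement_atom(pitch, baseline, pitch_slope):
    """Classify a moment of a mover's trajectory into one of the four
    movement "atoms" (A-D), from its Doppler pitch relative to baseline
    and whether that pitch is currently rising or falling."""
    toward = pitch > baseline          # high pitch: directed toward the listener
    veering_toward = pitch_slope > 0   # rising pitch: veering more toward
    if toward:
        return "B" if veering_toward else "C"
    return "A" if veering_toward else "D"
```

So a low, rising pitch maps to atom A (an away-moving mover veering back toward you), and a high, falling pitch maps to atom C (a mover arriving).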

These four pitch categories amount to the auditory atoms of a mover’s trajectory. Given the sequence of Doppler pitches of a mover, it is easy to decompose it into the fundamental atoms of movement the mover is engaged in. Let’s walk through these four kinds of pitch profiles, and the four respective kinds of movement they indicate, keeping our eye on Figure 28.

(A) The bottom right square in Figure 28 shows a situation where the pitch is low and rising. Low pitch means my neighborhood ice cream truck is directed away from me and the kids, but the fact that the pitch is rising means the truck is turning and directing itself more toward us. Intuitively, then, a low and rising pitch is the signature of an away-moving mover noticing you and deciding to begin to turn around and come see you. To my snack-happy children, it means hope—the ice cream truck might be coming back!

(B) The upper right square concerns cases where the pitch is higher than baseline and is rising. The high pitch means the truck is directed at least somewhat toward us, and the fact that the pitch is rising means the truck is further directing itself toward us. Intuitively, the truck has seen my kids and is homing in on them. My kids are ecstatic now, screaming, “It’s coming! It sees us!”

(C) The top left square is where the pitch is still high, but now falling. That the pitch is high means the truck is headed in our direction; but the pitch is falling, meaning it is directing itself less and less toward us. “Hurry! It’s here!” my kids cry. This is the signature of a mover arriving, because when movers arrive at your destination, they either veer away so as not to hit you, or come to a stop; in each case, it causes a lowering pitch, moving toward baseline.

(D) The bottom left, and final, square of the matrix is where the pitch is low and falling. This means the truck is now directed away, and is directing itself even farther away. Now my kids’ faces are purple and drenched with tears, and I am preparing a plate of carrots.

Figure 28 amounts to a second kind of ecological pitch-movement dictionary (in addition to Figure 26). Now, if melodic pitch contours have been culturally selected to mimic Doppler shifts, then the dictionary categorizes four fundamentally different meanings for melody. For example, when a melody begins at the bottom of the pitch range of a piece and rises, it is interpreted by your auditory system as an away-moving mover veering back toward the listener (bottom right of Figure 28). And if the melody is high in pitch and falling, it means the fictional mover is arriving (upper left of Figure 28). At least, that’s what these melodic contours mean if melody has been selected over time to mimic Doppler shifts of movers. With some grounding in the ecological meaning of pitch, we are ready to begin asking whether signatures of the Doppler effect are actually found in the contours of melody. We begin by asking how many fingers one needs to play a melody.


Only One Finger Needed

Piano recitals for six-year-olds tend to be one-finger events, each child wielding his or her favorite finger to poke out the melody of some nursery rhyme. If one didn’t know much about human music and had only been to a kiddie recital, one might suspect that this is because kids are given especially simple melodies that they can eke out with only one finger. But it is not just kindergarten-recital melodies that can be played one note at a time, but nearly all melodies. It appears to be part of the very nature of melody that it is a strictly sequential stream of pitches. That’s why, even though most instruments (including voice, for the most part) are capable of only one note at a time, they are perfectly able to play nearly any melody. And that’s also why virtually every classical theme in Barlow and Morgenstern’s Dictionary of Musical Themes has just one pitch at a time.

Counterexamples to this strong sequential tendency of melody are those pieces of music having two overlapping melodies, or one melody overlapping itself, as in a round or fugue. But such cases are the exceptions that prove the rule: they are not cases of a single melody relying on multiple simultaneous notes, but, rather, cases of two simultaneously played single melodies, like the sounds of two people moving in your vicinity.

Could it be that melodies are one note at a time simply because it is physically difficult to implement multiple pitches simultaneously? Not at all! Music revels in having multiple notes at a time. You’d be hard put to find music that does not liberally pour pitches on top of one another—but not for the melody.

Why is melody like this? If chords can be richly complex, having many simultaneous pitches, why can melodic contour have only one pitch at a time? There is a straightforward answer if melodic contour is about the Doppler pitch modulations due to a mover’s direction relative to the listener. A mover can only possibly be moving in a single direction at any given time, and therefore can have only a single Doppler shift relative to baseline. Melodic contour, I submit, is one pitch at a time because movers can only go in one direction at a time. In contrast, the short-time-scale pitch modulations of the chords are, I suggested earlier in the chapter, due to the pitch constituents found in the gangly bangs of human gait, which can occur at the same time. Melodic contour, I am suggesting, is the Doppler shifting of this envelope of gangly pitches.


Human Curves

Melodic contours are, in the sights of this movement theory of music, about the sequence of movement directions of a fictional mover. When the melody’s pitch changes, the music is narrating to your auditory system that the depicted mover is changing his or her direction of movement. If this really is what melody means, then melody and people should have similar turning behavior.

How quickly do people turn when moving? Get on up and let’s see. Walk around and make a turn or two. Notice that when you turn 90 degrees, you don’t usually take 10 steps to do so, and you also don’t typically turn on a dime. In order to get a better idea of how quickly people tend to change direction, I set out to find videos of people moving and changing direction. After some thought, undergraduate RPI student Eric Jordan and I eventually settled on videos of soccer players. Soccer was perfect because players commonly alter their direction of movement as the ball’s location on the field rapidly changes. Soccer players also exhibit the full range of human speeds, allowing us to check whether turning rate depends on speed. Eric measured 126 instances of approximately right-angle turns, and in each case, recorded the number of steps the player took to make the turn. Figure 29 shows the distribution for the number of steps taken. As can be seen in the figure, these soccer players typically took two steps to turn 90 degrees, and this was the case whether they were walking, jogging, or running. Casual observation of movers outside of soccer games—such as in coffee shops—suggests that this is not a result peculiar to soccer.

If music sounds like human movers, then it should be the case that the depicted mover in music turns at rates typical for humans. Specifically, then, we expect the musical mover to take a right-angle turn in about two steps on average. What does this mean musically? A step in music is a beat, and so the expectation is that music will turn 90 degrees in about two beats. But what does it mean to “turn 90 degrees” in music?

Recall that the maximum pitch in a song means the mover is headed directly toward the listener, and the lowest pitch means the mover is headed directly away. That is a 180-degree difference in mover direction. Therefore, when a melody moves over the entirety of the tessitura (the melody’s pitch range), it means that the depicted mover is changing direction by 180 degrees (either from toward you to away from you, or vice versa). And if the melody spans just the top or bottom half of the tessitura, it means the mover has turned 90 degrees. Because human movers take about two steps to turn 90 degrees—as we just saw—we expect that melodies tend to take about two beats to cross the upper or lower half of the tessitura.

Figure 29. Distribution of the number of footsteps soccer players take to turn 90 degrees, for walkers, joggers, and runners. The average number of footsteps for a right-angle turn for walkers is 2.16 (SE 0.17, n=22); for joggers, 2.21 (SE 0.11, n=45); and for runners, 2.23 (SE 0.13, n=59).

To test this, we measured, from the Dictionary of Musical Themes, melodic “runs” (i.e., strictly rising or strictly falling sequences of notes) having at least three notes within the upper or lower half of the tessitura (and filling at least 80 percent of the width of that half of the tessitura). These are among the clearest potential candidates for 90-degree turns in music. Figure 30 shows how many beats music typically takes to do its 90-degree turns. The peak is at two beats, consistent with the two footsteps of people making 90-degree turns while moving. Music turns as quickly as people do!

Figure 30. Distribution of the number of beats for a theme “run” to cover the top or bottom half of the tessitura. The most common number of beats to cross half the tessitura is two, consistent with how quickly people turn 90 degrees. (Runs had three or more notes moving in the same direction, and all within the top or bottom half of the tessitura, and filling at least 80% of its half tessitura.)
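
The run criterion just described can be made concrete. The sketch below is a simplified reconstruction of the measurement under my own assumptions about representation (a run as a list of (pitch, onset-in-beats) pairs, with the tessitura bounds given); it is not the actual analysis code.

```python
def quarter_turn_beats(run, tess_lo, tess_hi, coverage=0.8):
    """If a run (list of (pitch, onset_beat) pairs) is strictly rising or
    strictly falling, stays within the top or bottom half of the tessitura,
    and spans at least `coverage` of that half's width, return how many
    beats the run takes; otherwise return None."""
    pitches = [p for p, _ in run]
    steps = [b - a for a, b in zip(pitches, pitches[1:])]
    monotonic = all(s > 0 for s in steps) or all(s < 0 for s in steps)
    mid = (tess_lo + tess_hi) / 2.0
    half_width = (tess_hi - tess_lo) / 2.0
    in_one_half = all(p >= mid for p in pitches) or all(p <= mid for p in pitches)
    wide_enough = (max(pitches) - min(pitches)) >= coverage * half_width
    if len(run) >= 3 and monotonic and in_one_half and wide_enough:
        return run[-1][1] - run[0][1]
    return None
```

For example, a three-note rising run spanning the whole upper half of a 60-to-72 tessitura, with one note per beat, qualifies and takes two beats, matching the peak in Figure 30.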

This close fit between human and melodic turning rates is, as we will see, helpful in trying to understand regularities in the overall structure of melody, which we delve into next.


Musical Encounters

In light of the theory that music is a story about a fictional mover in our midst, is it possible to address what a typical melody is like? How do melodies usually begin? What is the typical melodic contour shape? How many beats are in a typical melody? Is there even such a thing as a “typical” melody?

To answer whether there is a typical melody, we must ask if there is a typical way in which a mover moves in our midst. Is there such a thing as a typical story of human movement?

Yes, in fact, there is. In a nutshell, the most generic possible story of a human mover consists of the mover noticing you and veering toward you, interacting with you in some way, and then scampering off. The plot is: “Hello. How are you? Good-bye.” Let’s call this an “encounter.” Stories told by composers by no means must be encounters, of course. One can expect to find tremendous variability in the stories told by music. But there is no getting around the fact that Hello–HowAreYou–Good-bye is a common story, whereas, say, Good-bye–Hello–HowAreYou is a strange story. People in stories usually arrive and then depart, not vice versa.

If “encounters” are indeed the most typical story of human movement in our midst, then let’s be more specific about the kinds of movement involved. Figure 31 shows the four qualitatively distinct kinds of movement we discussed earlier in the chapter. Can we say what an encounter is in terms of these movement categories? Yes. In the “Hello” part of the encounter, the mover suddenly notices you and begins veering toward you. This is case A or B in Figure 31, depending on whether the mover was receding or nearing, respectively, when he first noticed you. If A is the start of the story, then B occurs next. By the end of B, the mover has gotten near enough that he must begin veering away, lest he bump into you. This is the segment of the mover’s path that brings him closest to you—where he says, “How are you?”—and is case C in Figure 31. Finally, in the “Good-bye” part of the encounter, the mover veers away, which is case D.

Figure 31. Summary of the movement meaning of pitch for low and high, rising and falling pitch, as shown earlier in Figure 28. The sequence A-B-C-D is a plausible “most generic” movement, which I call an “encounter,” or Hello–HowAreYou–Good-bye movement.

The generic encounter, then, has one of two sequences, B-C-D or A-B-C-D, with the movement meanings of A through D defined in Figure 31. Furthermore, I suggest that the “full” A-B-C-D movement is the more common of the two, because stories of human movement often consist of multiple, repeated encounters, and in these cases A will be the first segment of any encounter after the first. That is to say, if a mover goes away, then turns to come back for another encounter, the sequence must begin at A, not B. We conclude, then, that the generic Hello–HowAreYou–Good-bye movement is A-B-C-D. A-B is the “Hello.” C is the “How are you?” And D is the “Good-bye.”

We now have some idea of what a generic movement is, and so the next question is, “What does it sound like?” With Figure 31 in hand, we can immediately say what the Doppler pitches are for this movement. Figure 32 illustrates the overhead view and the pitch contour of the generic A-B-C-D encounter. This generic pitch contour in Figure 32 has two distinctive features that we expect, no matter the specific shape of the mover’s encounter path. First and foremost, the generic pitch contour goes up, and then down. Second, it dwells longer (has a flatter slope) at the minimum and maximum of its pitch range (something I will discuss in the Encore section titled “Home Pitch”). In essence, this pitch contour shares two qualitative features that are crucial to hills: hills are not just lumps of earth on the ground—not domes, not mounds, not pyramids, not wedges—but have gentle slopes toward their bottoms and a gentle flattening at the top.

Figure 32. The most generic sequence of movement, and its pitch over time. One can see it is a “hill.” Furthermore, this path tends to be traversed in around eight steps (because people tend to take about two steps to turn 90 degrees).

We are nearly ready to ask whether melodies tend to look like the “hill” we see in the pitch contour of the generic encounter, but first we must gather one further piece of information about encounters. We need to know how long an encounter lasts. When movers do a Hello–HowAreYou–Good-bye to us, do they do it over 100 footsteps, or four? Our results from the previous section (“Human Curves”) can help us answer. We saw then that human movers tend to take about two steps to turn 90 degrees. Each of the four segments of the generic encounter—A, B, C, and D—is roughly a 90-degree turn, and thus the entire encounter can be carried out in about eight steps. Although Hello–HowAreYou–Good-bye behaviors can take fewer or more than eight footsteps, a plausible baseline expectation is that generic encounters will occupy around eight steps—not two steps, and not 80. The generic story of human movement in our midst—the “encounter”—is not just the sequence of movements A-B-C-D, but also, these movements being enacted in eight or so steps. Accordingly, the generic Doppler pitch contour is not just a hill, but a hill implemented in about eight footsteps. The sound of a generic human mover in our midst is an eight-step pitch hill.

These eight-step-hill encounters are, I claim, a fundamental intermediate-level structure found in the pattern of Doppler pitches from human movers in our midst. Real movers will, of course, often diverge from this, but such instances should be viewed as deviations from this generic baseline. I call the generic encounter an intermediate-level structure because it is hierarchically above the individual “atoms” of movement (A, B, C, and D), and because full stories of movers in our midst may involve hundreds or thousands of footsteps—full stories of human movement are built by combining many short bouts of movement. Because the encounter is the generic short bout of movement, the generic long story of human movement is many eight-step hills—many encounters.

And now we are in a position to figure out whether these generic stories of human movement are found in music. In particular, we want to know if melodies are built out of encounter-like structures. And because generic encounters sound like eight-step pitch hills, we wish to see if melodies have any tendency to be built out of eight-beat pitch hills. Of course, we expect that any such tendency should be weak: the eight-step pitch hill is the expectation for the generic encounter, but you can be sure that composers like to tell nongeneric stories as well. Nevertheless, we hope to see the signs of the generic melody by looking across a great many melodies.

Eric Jordan and I set out to measure the average pitch contour for themes, following the lead of Professor David Huron, who first carried out measurements of this kind and found arches in average pitch contours. Themes were put into groups with other themes having the same number of notes; each theme’s pitches were normalized (so that the bottom and top of the tessitura were 0 and 1, respectively); and the average normalized pitch was computed across the group. Classical themes tend to have fewer than 25 notes, and in order to sample longer melodies, allowing us to better discern signs of eight-beat hills, we also measured from a set of 10,000 Finnish folk songs, which have themes with longer lengths of 25 to 40 notes. Figure 33 shows the average pitch contours for each group. There are, for example, 83 themes having exactly eight notes (the average of their contours is shown at the upper left in Figure 33). If melodies are built from eight-beat hills, as predicted, then we expect to see such hills in these averaged melodies. A casual glance across the average pitch contours for melodies having the same number of notes reveals a multiple-hill pattern in longer melodies. For themes with eight to 13 notes, only one hill is apparent in the average contour, but from 14 through about 19 notes there is an apparent bimodality to the plots. Among the plots with 30 or more notes, a multiple-hill contour is strongly apparent. And the hills are very roughly eight notes in length. (These eight-note hills are due to themes with a predominance of quarter notes. Themes with a preponderance of eighth notes lead to 16-note hills.)

Figure 33. Average melodic contour for melodies with the same number of notes. Thirty-two different plots are shown, for melodies having eight notes through 40 notes, shown by the number label along each x-axis. The “n” values show the number of melodies over which the average is calculated. If melodies are built with eight-beat hills, then because a sizable fraction of melodies consist mostly of beat-long notes, there should be a strong eight-note-hill tendency in these plots. There will also be a sizable fraction of melodies consisting mostly of half-beat-long notes, and these will tend to have 16-note hills. The smallest hills we expect to see, then, are eight-note ones.
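
The group-normalize-average procedure behind Figure 33 can be sketched in a few lines. This is my own minimal reconstruction, not the authors’ analysis code; themes are assumed to be plain lists of pitch values.

```python
from collections import defaultdict

def average_contours(themes):
    """Group themes (lists of pitches) by note count, rescale each theme
    so its tessitura spans [0, 1], and average position by position."""
    groups = defaultdict(list)
    for theme in themes:
        lo, hi = min(theme), max(theme)
        if hi == lo:
            continue  # a flat theme has a zero-width tessitura; skip it
        groups[len(theme)].append([(p - lo) / (hi - lo) for p in theme])
    return {n: [sum(col) / len(col) for col in zip(*grp)]
            for n, grp in groups.items()}
```

Averaging position by position within a group is what lets idiosyncratic contours wash out, leaving any shared hill structure visible.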

To get a more quantitative estimate of the typical number of notes per hill in these average-melodic-contour data, Figure 34 plots the approximate number of hills as a function of the number of notes in the average melody. One can see that the number of hills is approximately 1/8 the number of notes, and so there are about eight notes per hill.

Figure 34. Number of hills in the average melodic contour as a function of the number of notes in the melody. The approximate number of arches from each plot in Figure 33 having eight notes through 36 notes (after which the number of hills is less clear) was recorded. The number of hills rises as 0.1256 times the number of notes in the melody. The number of notes per hill is thus 1/0.1256, which is 7.96, or approximately 8, consistent with our expectation from generic human encounters.

The data we have just discussed provide evidence of the “eightness” of melody’s hills. But recall that in order to show that they are hills, and not some other protuberance shape, we must show that the note durations at the start, peak, and end of the protuberance tend to have longer duration, just as hills are flatter on their ends and on top (see Figure 32 again). Focusing now just on classical themes with eight, nine, and ten notes, Eric Jordan and I determined the average duration (in beats) of each note. To generate a hill-shaped pitch contour, we would expect a “W”-shaped plot of how note durations vary over the course of a melodic theme, with longer durations at the start, middle, and end. Indeed, that’s what we found, as shown in Figure 35. Melodies tend to have longer-duration notes at the start and end (when the fictional mover is headed away from the listener), and also in the middle (when the mover is headed directly toward the listener).

Melodies appear to have a tendency to be built out of eight-beat hills, which is just what is expected if melody’s stories are about a fictional mover’s multiple encounters with the listener. Like arrival, interaction, and departure in eight steps, melodies tend to rise and fall over eight beats, and in the nonlinear fashion consistent with a real encounter. Melodies, in other words, seem to have the signature structure of stories built from Hello–HowAreYou–Good-byes.

In this and the previous sections we have seen that melody behaves in some respects like the Doppler pitch modulations of a person moving. But there’s much more to the similarity between Doppler and melody, and I discuss additional similarities in detail in the Encore. Here I will only hint at them, but I encourage you to read the hints, because they are exciting, and they are crucial to the case I am making.

Encore 3: “Fancy Footwork” This section will discuss how, when people turn and their Doppler pitch changes, their gait often becomes more complex. And I will provide evidence that music behaves in the same way. I referred to Encore 3 earlier in this chapter when we wrapped up the discussion of rhythm, because “Fancy Footwork” concerns how rhythm and pitch interact.

Figure 35. Average duration in beats of each note for classical themes with number of notes near that of the generic encounter (i.e., for 8, 9, and 10 notes). One can see the expected “W” shape, showing that the eight-note-hills in Figure 33 are truly hills.

Encore 5: “Home Pitch” This Encore section will discuss three similarities between melody and Doppler pitch modulations, each concerning how pitch distributes itself. Like Doppler pitch modulations, melodic contours have a fixed home range (called the tessitura); tend to distribute themselves uniformly over their home range; and tend to dwell longer at the edges of the pitch range.

Encore 6: “Fast Tempo, Wide Pitch” Just as faster movers have wider Doppler pitch ranges, faster-tempo music has a wider pitch range.

Encore 7: “Newton’s First Law of Music” If melodic contours are Doppler pitch modulations, then we expect a variety of asymmetries between pitch rises and falls. Pitch changes are not generally expected to have “momentum” (i.e., a tendency to continue going in the same direction) with the exception of small downward changes in pitch. Also, melody is expected to have a tendency to drift gradually down, but to have larger upswings in pitch.


Where’s the Moving Pitch?

Throughout this chapter thus far, I have been suggesting that the Doppler effect is the ecological foundation of melody. One potential stumbling block for this hypothesis that I have until now avoided mentioning is that we’re not generally aware of these Doppler shifts in pitch. We may sometimes consciously notice them coming from skateboards, bikes, cars, and trains, but do we notice them coming from moving people? If our auditory system is listening to the Doppler shifts of people, then wouldn’t we have noticed? Not necessarily. As we discussed in Chapter 1, in the section titled “Under the Radar,” your conscious self not only doesn’t acknowledge these lower-level stimuli; it typically does not even have access to them. Just as your conscious self sees objects, not visual contour junctions, your conscious self hears the behaviors and actions of movers, not the auditory substrate out of which the whole is built. Your conscious self tends to latch onto the invariants out there—a pink room with a variety of colored lights will have a variety of spectra, but you’ll typically see the walls as uniformly pink. Similarly, when a mover in your midst is making sounds, your conscious self will focus on invariants such as what the mover is doing. Your auditory system will hear all of the lower-level structure, including the (detectable) pitch modulations due to the Doppler effect, but all that will tend to stay below the radar because you don’t need to know the details.

It would not, then, be surprising if our auditory system detects (and uses) Doppler shifts from human movers and yet we don’t consciously notice it. But one might wonder whether our auditory system can possibly detect Doppler shifts for movers going human speeds in the first place. If I am walking at one meter per second, then the difference between the highest and lowest pitches in the Doppler range is small, roughly a tenth of a semitone on the piano. The fastest sprints possible by humans are about 10 meters per second, and even at these speeds the Doppler pitch range is only about a single semitone on the piano (e.g., from C to C#). Thus, whereas the Doppler pitch modulations for trains, planes, and automobiles are sizable because those vehicles move quickly, the pitch modulations for humans are small. Tiny as these Doppler pitch shifts may be, however, our auditory system is exquisitely sensitive to pitch changes, capable of detecting differences as small as about 0.3 percent of the frequency, or about 5 percent of a semitone—sensitive enough to distinguish even the pitch shifts of walkers.
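
The arithmetic behind these figures is easy to make explicit. Using the standard Doppler formula with a rounded 340 m/s speed of sound (my assumption), the full toward-versus-away pitch range works out as follows; a semitone is a factor of 2^(1/12) in frequency.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, approximate (assumption)

def doppler_range_semitones(speed_mps):
    """Width in semitones of a mover's full Doppler pitch range: the
    interval between the pitch heard when the mover heads straight
    toward the listener and when it heads straight away."""
    f_toward = SPEED_OF_SOUND / (SPEED_OF_SOUND - speed_mps)
    f_away = SPEED_OF_SOUND / (SPEED_OF_SOUND + speed_mps)
    return 12 * math.log2(f_toward / f_away)

walking = doppler_range_semitones(1.0)     # roughly a tenth of a semitone
sprinting = doppler_range_semitones(10.0)  # roughly one semitone
```

These match the figures in the text: about 0.1 semitone at walking speed and about one semitone at a full sprint.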

So our ears are sensitive enough. But if one thinks more carefully about how our feet make sounds, there appears to be a big problem with the suggestion that the sounds of human movers make Doppler shifts at all. The biggest bang your body makes in motion is when your foot takes a step. But notice that when you are walking and your foot hits the ground, it is moving neither forward nor backward relative to the ground. Your feet are stationary when they touch the ground. But if your sound-making foot is not moving forward when it makes its sound, then its pitch will not get Doppler shifted upward. That is, even though you are moving steadily forward, your feet are standing still at the moments when they are making their footstep sounds, and so your footstep sounds undergo no Doppler shift. (See Figure 36a.) (This is not a problem that applies to between-the-steps gangly bangings, which do Doppler shift.)

Footsteps are, however, more subtle than a simple vertical thud on the ground, and there are multiple avenues by which Doppler shifts may occur. First, even if your footsteps stomp the ground without any forward speed, your body is still moving forward. When the footstep sound waves rise into the air, your big old body bumps into them and reflects them. The waves that your front runs into get reflected forward and are consequently Doppler upshifted, and the waves that reflect off your back get Doppler downshifted, as illustrated in Figure 36b (i). Second, footsteps aren’t the simple vertical ground bangers that Figure 36a would have us believe. For one thing, the dynamics of a footstep are complicated by the fact that the ground is often complicated. Throughout most of our evolutionary history the ground was not smooth as it is today, but covered in grass, brush, and pebbles. When your foot goes in for a landing, it has some forward velocity, and will undergo a sequence of microcollisions with the ground material. This sequence of collisions is a forward-moving sequence, and will sound higher in pitch (or compressed) to a listener toward whom the mover is directed. (See Figure 36b [ii].) Not only is the ground more complicated than the “stomping” in Figure 36a indicates, but the foot dynamics even on smooth ground are more subtle than I have let on. Our heel hits first, and the contact points move forward along the foot toward the toes (see Figure 36b [iii]). Whether on smooth ground or natural terrain, this sequence of microhits underlying a single step is moving in the direction of the mover, and thus will Doppler shift.

Figure 36. (a) When a foot hits the ground, it is not moving forward or backward, and therefore has no Doppler shift. But as we’ll discuss, there’s more to the story. (b) (i) Top: The footstep leads to sound waves going in all directions, all at the same frequency (indicated by the spacing between the wave fronts). Bottom: These waves hit the body and reflect off it. Because the mover is moving forward, the sound waves reflected forward will be Doppler shifted to a higher pitch; the waves hitting the mover’s rear will reflect at a lower pitch. (ii) When feet land they don’t simply move vertically downward for a thud. The surface of the ground very often has complex material on it, which the landing foot strikes as it is still moving forward. These complex sounds will have Doppler shifts. (iii) If our feet were like a pirate’s peg leg, then the single thud it makes when hitting the ground would have no Doppler shift. But our feet aren’t peg legs. Instead, our foot lands on its heel, and the point of contact tends to move forward along the bottom of the foot.

Footsteps can, then, Doppler shift, and these shifts are detectable. There is now a third difficulty that can be raised: if Doppler shifts for human movers are fairly meager, then why doesn’t musical melody have meager tessitura width (i.e., meager pitch range for the melody)? The actual tessitura in melody tends to be wider than that achievable by a human mover, corresponding to speeds faster than humans can achieve. Why, if melodic pitch contours are about Doppler pitches, would music exaggerate the speed of the depicted observer? Perhaps for the same reason that exaggeration is commonplace in other art forms. Facial expressions in cartoons, for example, tend to be hyperexaggerations of human facial expressions. Presumably such exaggerations serve as superstimuli, hyperactivating our brain’s mechanisms for detecting the characteristic (e.g., smile or speed), and something about this hyperactivation feels good to us (perhaps a bit like being on a roller coaster).

One final thought concerning the mismatch between the Doppler pitch range and the tessitura widths found in music: could I have been underestimating the size of the Doppler shifts humans are capable of? Although we may only move in the one- to ten-meters-per-second range, and our limbs may swing forward at a little more than twice our body’s speed, parts of us may be moving at faster speeds. Recall that your feet hit the ground from heel to toe. The sequence of microhits travels forward along the bottoms of your feet, and the entire sound the sequence makes will be Doppler shifted. An interesting characteristic of this kind of sound is that it can act like the sound of an object moving much faster than the object that actually generates it. As an example, when you close scissors, the actual objects—the two blades—are simply moving toward each other. But the point of contact between the blades moves outward along the blades. The sound of closing scissors is a sound whose principal source location is moving, even though no object is actually moving in that manner. This kind of faux-moving sound maker can go very fast. If two flattish surfaces hit each other with one end just ever so slightly ahead of the other, then the speed of the faux mover can be huge. For example, if you drop a yardstick so as to make it land flat, and one end hits the ground one millisecond before the other end, then the faux mover will have traveled between the yardstick and the ground from one end to the other at about one kilometer per second, or about two thousand miles per hour! The faux mover beneath our stepping feet may, in principle, be moving much faster than we are, and any scissor-like sound it makes will thus acquire a Doppler pitch range much wider than that due to our body’s natural speed.
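The yardstick figure checks out. A quick sketch of the arithmetic (the yardstick length and the one-millisecond lag are from the text; the meters-to-miles conversion is standard):

```python
# Speed of the "faux mover" (the traveling point of contact) when a
# yardstick lands flat with one end touching 1 ms before the other.

YARD_M = 0.9144  # length of a yardstick, in meters
DT_S = 0.001     # lag between the two ends hitting, in seconds

speed_m_per_s = YARD_M / DT_S        # ~914 m/s, i.e., about a kilometer per second
speed_mph = speed_m_per_s * 2.23694  # ~2,045 mph, about two thousand miles per hour

print(f"{speed_m_per_s:.0f} m/s is about {speed_mph:.0f} mph")
```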

Human movers do make sounds that Doppler shift, and these shifts are detectable by our auditory system. And their exaggeration in music is sensible in light of the common role of exaggeration in artistic forms. Melodic contour, we have seen thus far, has many of the signature properties expected of Doppler shifts, lending credence to the idea that the role of melodic pitch contours is to tell the story of the sequence of directions in which a mover is headed. That’s a fundamental part of the kinematic information music imparts about the fictional mover. But that’s only half the story of “kinemusic.” It doesn’t tell us how far away the mover is, something more explicitly spatial. That is the role of loudness, the topic of the rest of this chapter.


Loud and in 3-D

Do you know why I love going to live shows like plays or musicals? Sure, the dialogue can be hilarious or touching, the songs a hoot, the action and suspense thrilling. But I go for another reason: the 3-D stereo experience. Long before movies were shot and viewed in 3-D, people were putting on real live performances, which provide a 3-D experience for all the two-eyeds watching. And theater performances don’t simply approximate the 3-D experience—they are the genuine article.

“But,” you might respond, “one goes to the theater for the dance, the dialogue, the humans—for the art. No one goes to live performances for the ‘3-D feel!’ What kind of lowbrow rube are you? And, at any rate, most audiences sit too far away to get much of a stereo 3-D effect.”

“Ah,” I respond, “but that’s why I sit right up front, or go to very small theater houses. I just love that 3-D popping-out feeling, I tell ya!”

At this point you’d walk away, muttering something about the gene pool. And you’d be right. That would be a dopey thing for me to say. We see people doing their thing in 3-D all the time. I just saw the waitress here at the coffee shop walk by. Wow, she was in 3-D! Now I’m looking at my coffee, and my mug’s handle appears directed toward me. Whoa, it’s 3-D!

No. We don’t go to the live theater for the 3-D experience. We get plenty of 3-D thrown at us every waking moment. But this leaves us with a mystery. Why do people like 3-D movies? If people are all 3-D’ed out in their regular lives, why do we jump at the chance to wear funny glasses at the movie house? Part of the attraction surely is that movies can show you places you have never been, whether real or imaginary, and so with 3-D you can more fully experience what it is like to have a Tyrannosaurus rex make a snout-reaching grab for you.

But there is more to it. Even when the movie is showing everyday things, there is considerable extra excitement when it is in 3-D. Watching a live performance in a tiny theater is still not the same as watching a 3-D movie version of that same performance. But what is the difference?

Have you ever been to one of those shows where actors come out into the audience? Specific audience members are sometimes targeted, or maybe even pulled up onstage. In such circumstances, if you’re not the person the actors target, you might find yourself thinking, “Oh, that person is having a blast!” If you’re the shy type, however, you might be thinking, “Thank God they didn’t target me because I’d have been terrified!” If you are the target, then, whether you liked it or not, your experience of the evening’s performance will be very different from that of everyone else in the audience. The show reached out into your space and grabbed you. While everyone else merely watched the show, you were part of it.

The key to understanding the “3-D movie” experience can be found in this targeting. 3-D movies differ from their real-life versions because everyone in the audience is a target, all at the same time. This is simply because the 3-D technology (projecting left- and right-eye images onto the screen, with glasses designed to let each eye see only the image intended for it) gives everyone in the audience the same 3-D effect. If the dragon’s flames appear to me to nearly singe my hair but spare everyone else’s, your experience at the other side of the theater is that the dragon’s flames nearly singe your hair and spare everyone else’s, including mine. If I experience a golf ball shooting over the audience to my left, then the audience to my left also experiences the golf ball going over their left. 3-D movies put on a show that is inextricably tied to each listener, and invades each listener’s space equally. Everyone’s experience is identical in the sense that they’re all treated to the same visual and auditory vantage point. But everyone’s experience is unique because each experiences himself as the target—each believes he has a specially targeted vantage point.

The difference, then, between a live show seen up close and a 3-D movie of the same show is that the former pulls just one or several audience members into the thick of the story, whereas 3-D movies have this effect on everyone. So the fun of 3-D movies is not that they are 3-D at all. We can have the same fun when we happen to be the target in a real live show. The fun is in being targeted. When the show doesn’t merely leap off the screen, but leaps at you, it fundamentally alters the emotional experience. It no longer feels like a story about others, but becomes a story that invades your space, perhaps threateningly, perhaps provocatively, perhaps joyously. You are immersed in the story, not an audience member at all.

What does all this have to do with music and the auditory sense? Imagine yourself again at a live show. You hear the performers’ rhythmic banging ganglies as they carry out behaviors onstage. And as they move onstage and vary their direction, the sounds they make will change pitch due to the Doppler effect. Sitting there in the audience, watching from a vantage point outside of the story, you get the rhythm and pitch modulations of human movers. You get the attitude (rhythm) and action (pitch). But you are not immersed in the story. You can more easily remain detached.

Now imagine that the performers suddenly begin to target you. Several just jumped off the stage, headed directly toward you. A minute later, there you are, grinning and red-faced, with tousled hair and the bright red lipstick mark of a mistress’s kiss on your forehead . . . and, for good measure, a pirate is in your face calling you “salty.” During all this targeting you hear the gait sounds and pitch modulations of the performers, but you also heard these sounds when you were still in detached, untargeted audience-member mode. The big auditory consequence of being targeted by the actors is not in the rhythm or pitch, but in the loudness. When the performers were onstage, most of the time they were more or less equidistant, and fairly far away—and so there was little loudness modulation as they carried on. But when the performers broke through the “screen,” they ramped up the volume. It is these high-loudness parts of music—the fortissimos, or ffs—that are often highly evocative and thrilling, as when the dinosaur reaches out of the 3-D theater’s screen to get you.

And that’s the final topic of this chapter: loudness, and its musical meaning. I will try to convince you that loudness modulations are used in music in the 3-D, invade-the-listener’s-space fashion I just described. In particular, this means that the loudness modulations in music tend to mimic loudness modulations due to changes in the proximity of a mover. Before getting into the evidence for this, let’s discuss why I don’t think loudness mimics something else.


Nearness versus Stompiness

I will be suggesting that loudness in music is primarily driven by spatial proximity. Rather than musical pitch being a spatial indicator, as is commonly suggested (see the earlier section “Why Pitch Seems Spatial”), it is loudness in music that has the spatial meaning. As was the case with pitch, here, too, there are several stumbling blocks preventing us from seeing the spatial meaning of loudness. The first is the bias for pitch: if one mistakenly believes that pitch codes for space, then loudness must code for something else. A second stumbling block to interpreting loudness as spatial concerns musical notation, which codes loudness primarily via letters (pp, p, mf, f, ff, and so on), rather than as a spatial code (which is, confusingly, how it codes pitch, as we’ve seen). Musical instruments throw a third smokescreen over the spatial meaning of loudness, because most instruments modulate loudness not by spatial modulations of one’s body, but by hitting, bowing, plucking, or blowing harder.

Therefore, several factors are conspiring to obfuscate the spatial meaning of loudness. But, in addition, the third source of confusion I just mentioned suggests an alternative interpretation: that loudness concerns the energy level of the sound maker. A musician must use more energy to play more loudly, and this can’t help but suggest that louder music might be “trying” to sound like a more energetic mover. The energy with which a behavior is carried out is an obvious real-world source of loudness modulations. These energy modulations are, in addition, highly informative about the behavior and expressions of the mover. A stomper walking nearby means something different than a tiptoer walking nearby. So energy or “stompiness” is a potential candidate for what loudness might mean in music.

Loudness in the real world can, then, come both from the energy of a mover and from the spatial proximity of the mover. And each seems to be the right sort of thing to potentially explain why the loudest parts of music are often so thrilling and evocative: stompiness, because the mover is energized (maybe angry); proximity, because the mover is very close by. Which of these ecological meanings is more likely to drive musical loudness, supposing that music mimics movement? Although I suspect music uses high loudness for both purposes—sometimes to describe a stompy mover, and sometimes to describe a nearby mover—I’m putting my theoretical money on spatial proximity.

One reason to go with the spatial-proximity interpretation of loudness, at the expense of the stompiness interpretation, is pragmatic: the theory is easier! Spatial proximity is simply distance from the listener, and so changes in loudness are due to changes in distance. That’s something I can wrap my theoretical head around. But I don’t know how to make predictions about how walkers vary in their stompiness. Stompers vary their stompiness when they want to, not when physics wants them to. That is, if musical loudness is stompiness, then what exactly does this predict? It depends on the psychological dynamics of stompiness, and I don’t know that. So, like any good theorist, I make spatial proximity my friend, and ignore stompiness.

But there is a second reason, this one substantive, for latching onto spatial proximity as the meaning of musical loudness. Between proximity and stompiness, proximity can better explain the large range of loudness that is possible in music. Sound intensity falls off as the square of the distance to the source, and so loudness rises dramatically as a mover nears the listener. Spatial proximity can therefore bring huge swings in loudness, far greater than the loudness changes that can be obtained by stomping softly and then loudly at a constant distance from a listener. That’s why I suspect proximity is the larger driver of loudness modulations in music. And as we will see, the totality of loudness phenomena in music is consistent with proximity, and less plausibly explained by stompiness (including the phenomenon discussed in Encore 5, that note density rises with greater loudness).
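The inverse-square point can be made concrete with a small sketch. The distances below are assumed for illustration; the free-field inverse-square law itself is standard acoustics:

```python
import math

def loudness_gain_db(d_far, d_near):
    """Rise in sound level (dB) as a source moves from d_far to d_near
    meters away, assuming free-field inverse-square spreading."""
    return 10 * math.log10((d_far / d_near) ** 2)

print(f"{loudness_gain_db(4.0, 2.0):.1f} dB")   # halving the distance: ~6 dB
print(f"{loudness_gain_db(32.0, 2.0):.1f} dB")  # closing from 32 m to 2 m: ~24 dB
```

Every halving of distance adds about 6 dB, so an approach from the back of a room to arm's length piles up a very large swing in level.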

Thus, to the question “Is it nearness or stompiness that drives musical loudness modulations?” the answer, for both pragmatic and substantive reasons, is nearness, or proximity. Nearness can modulate loudness much more than stompiness can, and nearness is theoretically tractable in a way that stompiness is not. Let’s see if proximity can make sense of the behavior of loudness in music.


Slow Loudness, Fast Pitch

Have you ever wondered why our musical notation system is as it is? In particular, why does our Western music notation system indicate pitch by shifting the notes up and down on the staff, while it indicates loudness symbolically by letters (e.g., pp, f ) along the bottom? Figure 37 shows a typical piece of music. Even if you don’t read music—and thus don’t know exactly which pitch each note is on—you can instantly interpret how the pitch varies in the melody. In this piece of music, pitch rises, wiggles, falls, falls, falls yet again, only to rise and tumble down. You can see what pitch does because the notation system creates what is roughly a plot of pitch versus time. Loudness, on the other hand, must be read off the letters along the bottom, and their meaning unpacked from your mental dictionary: p for “quiet,” f for “loud,” and so on. Why does pitch get a nice mapping onto spatial position, whereas loudness only gets a lookup table, or glossary?

Figure 37. Standard music notation, in which vertical position indicates pitch and intensities are shown along the bottom. The music is a simplification of the seventh through twelfth measures of Johann Christoph Friedrich Bach’s Menuet and Alternativo. This notation is sensible because pitches vary much more quickly than loudness, so it tends not to be a problem to read the loudness levels along the bottom.

Music notation didn’t have to be like this. It could do the reverse: give loudness the spatial metaphor, and relegate pitch to being read along the bottom in symbols. Figure 38 shows the same excerpt we just saw in Figure 37, but now in this alternative musical notation system. Going from the lowest horizontal line upward, the lines now mean pianissimo (pp), piano (p), mezzo forte (mf), forte (f), and fortissimo (ff). The pitches for each note of the song are now shown along the bottom. Once one sees this alternative notation system in Figure 38, it becomes obvious why it is a terrible idea. When vertical height represents loudness, vertical height tends to just stay constant for long periods of time. The first eight notes are all at one loudness level (piano), and the remaining 12 are all at a second loudness level (forte). Visually there are just two plateaus, severely underutilizing your visual talents for seeing spatial wiggles. In standard notation, where pitch is spatially represented, on the other hand, the notes vary vertically much more on the page. Not only does our hypothetical alternative notation underutilize the capabilities of the visuospatial code, it overutilizes the letter codes. We end up with “word salad” along the bottom. In this case, there are 15 instances where the pitch had to be written down, nearly as many as there are notes in the excerpt. In standard notation, where loudness is coded via letters, there were just two letters along the bottom (see Figure 37 again).

Figure 38. A notation system in which vertical position indicates intensity, and pitches are shown along the bottom. This is the same excerpt from J.C.F. Bach as in Figure 37 but here the notation schemes for pitch and loudness have been swapped. This is not a good way to notate music because most of the note-to-note modulations are in pitch, not in loudness, which means an overabundance of pitch labels along the bottom, and little use of the vertical dimension within the horizontal bars. (To read the pitches along the bottom, “E5,” for example, is the E pitch at the fifth octave. When the octave is not labeled, it is presumed to be that of the last mentioned octave.)

The reason this alternative music notation system is so bad is, in one sense, obvious. In music, pitch typically varies quickly, often from note to note. Loudness, on the other hand, is much less variable over time. As exemplified by the excerpt in Figure 37, pitch can change at very short time scales, such as a 16th or 32nd note, but intensities typically persist for many measures before they change. To illustrate this, consider the fictional piece of music shown in Figure 39 (using standard music notation). In this example, pitch hardly ever changes, and loudness is changing very quickly. Music is never like this, which is why the standard notation system is a good one. The standard music notation system is so useful for music because it gives the quickly varying musical quality—pitch—the spatial metaphor, and relegates to the glossary the musical quality that stays much more constant: loudness.

Figure 39. An alternative kind of music that never happens (shown in regular notation). If music were often like this, then the alternative notation system in Figure 38 would be sensible. (This fictional music was created by taking J.C.F. Bach’s piece in Figure 38, keeping the notes as in that reverse notation system, but then pretending they represent pitches, as in normal notation; “f4,” for example, means “ffff.”)

To get a more quantitative measure of how quickly pitches and loudnesses change over time, Sean Barnett measured the distribution of time spent on a pitch before changing (Figure 19, earlier), and Eric Jordan measured the distribution of time spent at a given level of loudness before changing (Figure 40 below). One can see that pitches typically switch after about half a beat to a beat, whereas intensities change after about 10 beats.

Figure 40. Distribution of time spent at a loudness level before switching. One can see that loudness durations tend to be about 10 beats, and very often as long as 30 or more beats. This is more than an order of magnitude greater than the time scales for durations at the same pitch, which tend to be about a half a beat to a beat. These data were measured from Denes Agay’s An Anthology of Piano Music, Vol. II: The Classical Period.

While it is obvious why the standard notation system is smarter than my hypothetical “reverse” one for music, it is not obvious why music is like this in the first place. Why does music have “fast pitches” and “slow loudnesses”? If music sounds like movement, and loudness modulations are selected to sound like those due to spatial proximity, then the answer is straightforward. In order to significantly change loudness, the mover must move some distance through space. Melodic pitch, on the other hand, is about directedness toward the listener. In contrast to movement through space, which takes a relatively long time, a mover can “turn on a dime.” In a single step, a mover can turn about 45 degrees with ease (and typically does), which would translate to a fourth of the tessitura width (on average). Melodic pitch changes more quickly than loudness in music simply because human movers can change direction more quickly than they can change their proximity to the listener. That’s why music never sounds like the fictional piece of music in Figure 39, and that’s why our Western musical notation system is an efficient one. The comparative time scales of loudness and melodic pitch are what we should expect if music sounds like human movers, with loudness modulated by the spatial proximity and melodic pitch by the direction of the depicted mover.
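The asymmetry between turning and traveling can be sketched numerically. The walking speed, stride length, and starting distance below are assumed round numbers, purely for illustration:

```python
import math

C = 343.0  # speed of sound, m/s (assumed)
V = 2.0    # walking speed, m/s (assumed)

def doppler(theta_deg):
    """Observed/emitted frequency ratio for a mover headed at angle
    theta_deg relative to the listener (0 = straight toward)."""
    return C / (C - V * math.cos(math.radians(theta_deg)))

def semitones(hi, lo):
    return 12 * math.log2(hi / lo)

full_range = semitones(doppler(0), doppler(180))     # entire Doppler range
one_step_turn = semitones(doppler(45), doppler(90))  # a single 45-degree turn

d_before, stride = 10.0, 0.7  # meters (assumed)
loudness_change_db = 20 * math.log10(d_before / (d_before - stride))

print(f"one turn covers {one_step_turn / full_range:.0%} of the pitch range")
print(f"one step adds only {loudness_change_db:.2f} dB")
```

In a single step the mover sweeps a sizable chunk of the available pitch range, but nudges loudness by well under a decibel; changing loudness appreciably takes many steps.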

In addition to the time scale for loudness modulations being consistent with that for changes in proximity, additional evidence for proximity as the meaning of loudness is provided in four Encore sections:

Encore 5: “Distant Beat” I will discuss how the nearer movers are, the more of their gait sounds are audible, and how this is also found in music: louder portions of music tend to have more notes per beat. (This was also mentioned earlier in this chapter as we finished up our discussion of rhythm, because it concerns the interaction between rhythm and loudness.)

Encore 6: “Fast Tempo, Wide Pitch” I discuss how, as expected from theory, music with a faster tempo has a wider pitch range for its melody. This Encore section also shows, however, that—as predicted—the range of loudnesses in a piece is not correlated with tempo.

Encore 7: “Newton’s First Law of Music” This Encore section takes up a variety of predictions related to the inertia of moving objects, on the one hand, and the asymmetry between pitch rises and pitch falls, on the other. We will predict, and data will confirm, that this asymmetry changes as a function of loudness: when music indicates (by high loudness level) that the mover is close, the probability of long downward pitch runs rises.

Encore 8: “Medium Encounters” This Encore section concerns regularities in how movers distribute themselves in distance from a listener, and makes predictions about how frequently music makes use of various loudness levels.


Summary

In this chapter and the previous chapter, we have covered a great deal of musical ground (and we will cover still more in the Encore chapter). In Chapter Three, we presented general arguments for the music-is-movement theory, clearing three of the four hurdles for a theory of music: why we have a brain for music, why music should be emotionally moving, and why music should elicit movements in us. In this chapter, we have addressed the fourth hurdle, explaining the structural features of music. As I hope readers can see, there is a wealth of suspicious similarities between music and the sounds of people moving—42 suspicious similarities—which are summarized in the table below.

Section | Human movers | Music
1. Drum Core | Footsteps are regularly repeating. | The beat is regularly repeating.
2. Drum Core | Footsteps are the most fundamental auditory feature of human movement. | The beat is the most fundamental quality of music.
3. Drum Core | Footsteps tend to be around one to two per second. | Beats tend to be around one to two per second.
4. Drum Core | Footsteps usually are not as regular as a metronome. | Beats are often looser than that of a metronome.
5. Drum Core | People’s footstep rates lower prior to stopping (ritardando). | The number of beats per second lowers prior to musical endings (ritardando).
6. Gangly Notes | Footsteps are usually higher-energy collisions than between-the-step bangs. | On-beat notes usually have greater emphasis than off-beat notes.
7. Gangly Notes | In addition to footsteps, people’s gangly limbs make sounds in between the footsteps. | In addition to notes on the beat, music has notes in between the beats.
8. Gangly Notes | The between-the-steps gangly bangs are time-locked to the steps. | The between-the-beats notes are time-locked to the beat.
9. Gangly Notes | The pattern of steps and between-the-steps gangly bangs is crucial to identifying the mover’s behavior. | The pattern of on-beat and off-beat notes (the rhythm) is crucial to the identity of a song.
10. Gangly Notes | Human-mover gait sounds (steps and between-the-steps banging ganglies) have rings, and often pitches. | Musical notes often have pitches.
11. The Length of Your Gangly | People typically make about zero or one between-the-step bangs. | Music typically has about one off-beat note per beat.
12. Backbone | Footsteps can be highly variable in intensity, and we perceptually sense a step even when inaudible. | Beats are felt even when no note occurs on a beat.
13. Backbone | It is not merely the temporal pattern of gait sounds that identifies a mover’s behavior. It matters which sounds are on the beat. | The feel of a musical rhythm does not depend solely on the temporal pattern, but on where the listener interprets the beats to be.
14. The Long and Short of Hit (Encore) | People are likely to make a between-the-steps gangly bang near the middle of a step cycle. | Off-beat notes most commonly occur near the middle of a beat cycle.
15. The Long and Short of Hit (Encore) | People are more likely to make a between-the-steps gangly bang just before a step than just after. (“Long-shorts” are more common.) | Off-beat notes more commonly occur in the second half of a beat cycle (just before the beat) than in the first half (just after the beat).
16. Measure of What? (Encore) | Patterns of footstep emphases are informative as to the mover’s behavior. | Time signature matters to the identity of music.
17. Gangly Chords | Gait sounds have temporal patterns and pitch patterns (due to the pitches of the constituent ganglies). | Music typically has rhythm and chords.
18. Gangly Chords | A mover’s temporal pattern of hits is matched to the pitch pattern (because the pitches are due to the constituent gangly bangs). | Chords (e.g., as played with the left hand on the piano) have the same time signature as the rhythm.
19. Gangly Chords | Footsteps tend to have lower pitch than other gangly bangs. | For chords, the pitch played on the beat tends to be lower than that played off the beat.
20. Gangly Chords | The pitches among gangly bang sounds can occur simultaneously (unlike Doppler shifts, see below). | Chords are often struck simultaneously.
21. Fancy Footwork (Encore) | When people turn, they tend to have more complex gangly bangings. | When melodic contour rises or falls, the rhythm tends to be more complex.
22. Distant Beat (Encore) | People that are nearer have more audible gangly bangs per step. | Louder music has more notes per beat.
23. Choreographed for You | Doppler shift pitch contours and loudness contours matter for the appropriate visual-auditory fit of a human mover. | Melodic contours and loudness contours (not just rhythm) are relevant for choreographers in creating visual movements to match music.
24. Why Pitch Seems Spatial | Doppler pitches change continuously over time. | Melodic contour tends to change fairly continuously.
25. Only One Finger Needed | A mover is only moving in one direction at any moment, and thus has only one Doppler shift for a listener. | Melodies are inherently one pitch at a time.
26. Home Pitch (Encore) | For a mover at constant speed, Doppler shifts are confined to a fixed range, the highest (lowest) corresponding to heading directly toward (away from) the listener. | Melodies tend to be confined to a fixed range of pitches called the tessitura.
27. Home Pitch (Encore) | People tend to move in all directions relative to a listener, and to fairly uniformly sample from Doppler pitches within the Doppler range. | Melodies tend to sample fairly uniformly across their tessitura.
28. Home Pitch (Encore) | Pitches at the top and bottom of the Doppler range tend to have longer duration (due to trigonometry). | Melodies tend to have longer-duration notes when the pitch is at the top or bottom of the tessitura.
29. Fast Tempo, Wide Pitch (Encore) | Faster movers have a wider Doppler pitch range. | Faster tempo music tends to have a wider tessitura.
30. Fast Tempo, Wide Pitch (Encore) | Faster movers do not have a wider range of proximity-based loudnesses. | Faster tempo music does not tend to have a wider range of loudness.
31. Human Curves | People take about two steps to make a right-angle turn. | Music takes about two beats to traverse the top or bottom half of the tessitura (which corresponds to a right-angle turn).
32. Musical Encounters | The most generic kind of human encounter is the Hello–HowAreYou–Good-bye, involving a circling movement beginning when a mover headed away begins to turn toward the listener. The Doppler pitch contour is that of a hill, with flatter slopes at the bottom and top. | Melodies in music have a tendency to be built out of pitch hills.
33. Musical Encounters | The generic encounter tends to be around eight steps (two steps per right-angle turn). | The constituent pitch hills in melodies tend to be roughly eight beats long.
34. Newton’s First Law of Music (Encore) | Changing Doppler pitches have little or no tendency to continue changing, consistent with Newton’s First Law of Motion (inertia). | When melodic contour varies, there is little or no tendency to continue changing.
35. Newton’s First Law of Music (Encore) | More subtly, Doppler shifts possess “momentum” only when falling by a small amount. | More subtly, melodic contours possess “momentum” only when falling by a small amount.
36. Newton’s First Law of Music (Encore) | Small Doppler pitch changes are more likely downward, and large pitch changes more likely upward. | Small melodic contour changes are more likely downward, and large changes more likely upward.
37. Newton’s First Law of Music (Encore) | Extended segments of falling Doppler pitch are more common than extended segments of rising Doppler pitch (due to passing movers). | Downward melodic runs are more common than upward melodic runs.
38. Newton’s First Law of Music (Encore) | More proximal, and thus louder, movers are more likely to undergo large downward Doppler pitch runs. | Louder portions of music are more likely to feature large pitch runs downward (compared to upward).
39. Slow Loudness, Fast Pitch | People can turn quickly, and can thus change Doppler pitch quickly (i.e., half a tessitura in about two steps). But people cannot typically change loudness quickly, because that requires actually moving many steps across space. | Melodic contour changes quickly, but loudness changes at much slower time scales.
40. Medium Encounters (Encore) | Encounters with a person have an average distance, spending more total time at near-average distances than at disproportionately near or far distances (in contrast to the fairly uniform distribution of Doppler pitches). | Most pieces have a typical loudness level (e.g., mezzo forte), spending most of their time at that loudness level, and progressively less time at loudness levels deviating more from this average (in contrast to the fairly uniform distribution of melodic pitches).
41. Medium Encounters (Encore) | In any given encounter, a person is more commonly more distant than average than more proximal (because there’s more “far” real estate than “near”). | The distribution of times spent at each loudness level is not only peaked, but asymmetrically disfavors the louder loudness levels.
42. Medium Encounters (Encore) | Nearer-than-average portions of a mover’s encounter tend to be shorter in duration than farther-than-average portions. | Louder-than-average segments of music tend to be more transient than softer-than-average segments.

