Shepard's Tones
Shepard Tones were introduced by psychologist Roger Shepard in 1964 [1]. Much like Escher's ever-ascending staircase, repeated sequences of Shepard Tones are perceived as continually rising in pitch; for example, when the sequence is played twice, the first note of the second sequence sounds higher than the last note of the first sequence.
A sequence of Shepard Tones consists of twelve tones. In contrast to a pure tone, which may be described as a sinusoidal function of a single frequency (see Equation 1), each Shepard Tone is built from several harmonics of a single (fundamental) frequency. In other words, one tone may be described as the sum of sinusoids whose frequencies are integer multiples of a fundamental frequency (see Equation 2).
$$x(t) = \sin(2\pi f t)$$
Equation 1: A pure tone
$$x(t) = \sin(2\pi f_0 t) + \sin(2\pi \cdot 2 f_0 t) + \sin(2\pi \cdot 3 f_0 t)$$
Equation 2: A tone constructed from three harmonics of a fundamental frequency $f_0$.
In Shepard Tones, the only harmonics are multiples of the fundamental frequency that are powers of 2. That is, $f_0$, $2f_0$, $4f_0$, and $8f_0$ may be present, but $3f_0$ and $5f_0$ cannot. Thus, the only frequencies involved are $2^i f_0$, where $i$ is an integer, and one tone may be described as:
$$x(t) = \sum_i \sin(2\pi \, 2^i f_0 \, t)$$
(This equation assumes that the amplitude of each sinusoid is 1. However, the amplitudes of the frequency components differ, as will be discussed later.)
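The octave-only construction can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names, the choice of 55 Hz as the fundamental, and the cut-off at four components are all assumptions made for the example, and every sinusoid has amplitude 1 as in the equation above.

```python
import math

def octave_frequencies(f0, num_octaves):
    """Return the components 2**i * f0 for i = 0 .. num_octaves - 1."""
    return [f0 * 2 ** i for i in range(num_octaves)]

def unit_amplitude_tone(f0, num_octaves, t):
    """Value at time t (seconds) of the sum of unit-amplitude sinusoids
    at the octave-spaced frequencies of f0."""
    return sum(math.sin(2 * math.pi * f * t)
               for f in octave_frequencies(f0, num_octaves))

# Assumed example values: a 55 Hz fundamental and four components.
freqs = octave_frequencies(55.0, 4)
print(freqs)  # [55.0, 110.0, 220.0, 440.0]
```

Each frequency is exactly double the previous one, so multiples such as 3 x 55 Hz = 165 Hz never appear in the list.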
Consecutive Shepard Tones are separated by a semitone, or half-step; that is, the ratio between the fundamental frequencies of two consecutive tones is $2^{1/12}$. A new fundamental frequency which is $j$ semitones above the fundamental frequency $f_0$ is therefore given by $2^{j/12} f_0$. We may then describe the $j$th Shepard Tone as:
$$x_j(t) = \sum_i a_{ij} \sin(2\pi \, 2^{i + j/12} f_0 \, t)$$
or, more concisely,
$$x_j(t) = \sum_i a_{ij} \sin(2\pi f_{ij} t)$$
where
$$f_{ij} = 2^{i + j/12} f_0$$
and $a_{ij}$ is the amplitude of $f_{ij}$, the $i$th harmonic of the $j$th tone.
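As a quick numerical check of the semitone spacing, the sketch below computes the component frequencies under the assumption that the ith component of the jth tone sits at 2^(i + j/12) times the fundamental. The function name and the 27.5 Hz fundamental are hypothetical choices for illustration only.

```python
import math

def shepard_frequency(f0, i, j):
    """Frequency of the i-th octave component of the j-th tone,
    assuming f_ij = 2**(i + j/12) * f0."""
    return f0 * 2.0 ** (i + j / 12.0)

f0 = 27.5  # assumed fundamental in Hz

# Ratio between consecutive tones: one semitone.
ratio = shepard_frequency(f0, 0, 1) / shepard_frequency(f0, 0, 0)
print(round(ratio, 4))  # 1.0595

# Twelve semitones up lands exactly one octave component higher.
print(math.isclose(shepard_frequency(f0, 0, 12),
                   shepard_frequency(f0, 1, 0)))  # True
```

The second check is the arithmetic behind the circularity discussed later: after twelve steps, every component has moved onto the position its upper neighbour held at the start.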
What is $a_{ij}$? The amplitude of each frequency component is described by a Gaussian “envelope”, as shown below.
(Plot: amplitude vs. log frequency, from http://asa.aip.org/demo27.html. Note that log frequency appears on the x-axis. The lines displayed correspond to the harmonics present in the first Shepard Tone.)
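Denoting the center of the Gaussian as $f_c$, we may replace $a_{ij}$ with the following:
$$a_{ij} = \exp\left(-\frac{(\log f_{ij} - \log f_c)^2}{2\sigma^2}\right)$$
where $\sigma$ sets the width of the envelope on the log-frequency axis.
This link (http://icg.harvard.edu/~scia49/demonstrations/shepards_tone_animation.html) shows an animation of the frequency components moving under the envelope.
We may now form the final equation for the $j$th Shepard Tone:
$$x_j(t) = \sum_i \exp\left(-\frac{(\log(2^{i + j/12} f_0) - \log f_c)^2}{2\sigma^2}\right) \sin(2\pi \, 2^{i + j/12} f_0 \, t)$$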
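The following sketch evaluates one sample of a Gaussian-weighted tone. It is an illustration under stated assumptions, not the original author's code: the envelope is taken to be a Gaussian on a base-2 log-frequency axis (so its width is measured in octaves), and the function names and default values (a 27.5 Hz fundamental, eight components, a 440 Hz center, a width of 1.5 octaves) are hypothetical.

```python
import math

def gaussian_amplitude(f, f_center, sigma_octaves):
    """Gaussian weight on a log2-frequency axis; equals 1 at f == f_center.
    The log base and the width parameter are assumptions of this sketch."""
    d = math.log2(f / f_center)
    return math.exp(-d * d / (2.0 * sigma_octaves ** 2))

def shepard_sample(j, t, f0=27.5, num_octaves=8,
                   f_center=440.0, sigma_octaves=1.5):
    """Value at time t (seconds) of the j-th tone: a sum of octave-spaced
    sinusoids, each weighted by the Gaussian envelope."""
    total = 0.0
    for i in range(num_octaves):
        f = f0 * 2.0 ** (i + j / 12.0)
        total += (gaussian_amplitude(f, f_center, sigma_octaves)
                  * math.sin(2 * math.pi * f * t))
    return total

# Components one octave above and below the center get the same weight.
print(math.isclose(gaussian_amplitude(880.0, 440.0, 1.5),
                   gaussian_amplitude(220.0, 440.0, 1.5)))  # True
```

Because the envelope depends only on frequency and not on j, it stays fixed while the components slide underneath it from one tone to the next.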
Pitch is the way our brains perceive frequency – the “height” of a tone. In a pure tone, the pitch closely corresponds to the frequency of the sinusoid; for complex tones, our pitch judgement is associated with the periodicity of the overall function.
Each Shepard Tone consists only of octaves of the fundamental frequency. When several octaves are present in a tone, our brains can identify the pitch class (if octaves of a C are played, for example, we hear a “C”). However, the height of the pitch is uncertain: with which octave shall we associate the perceived note? This phenomenon, in conjunction with the ear’s preference for locality (we hear each new tone as the nearest available pitch to the previous one), produces an illusion of circularity; upon repeating the sequence, we tend to hear an octave above the starting tone rather than the starting tone itself. The addition of more harmonics increases this ambiguity.
Shaping the amplitudes with a stationary Gaussian envelope enhances the illusion. As we proceed through the sequence, the intensity of higher harmonics fades, while the intensity of lower harmonics grows. Thus, the lower harmonics of a C# have greater amplitudes than the lower harmonics of a C, and the higher harmonics of a C# have smaller amplitudes than the higher harmonics of a C. Circularity ensues: as we approach what should be an octave above the starting tone, the shifting of weights places components at the bottom of the frequency range, and we have again produced the starting tone.
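A twelve-tone sequence along these lines could be synthesized as in the Python sketch below, which uses only the standard library. All constants (sample rate, 27.5 Hz fundamental, eight components, 440 Hz envelope center, a width of one octave, 0.4 s per tone) and all function names are assumptions chosen for illustration; the envelope is again assumed to be Gaussian on a base-2 log-frequency axis.

```python
import math
import struct
import wave

SAMPLE_RATE = 22050   # samples per second (assumed)
F0 = 27.5             # lowest fundamental in Hz (assumed)
NUM_OCTAVES = 8       # number of octave components per tone (assumed)
F_CENTER = 440.0      # center of the Gaussian envelope in Hz (assumed)
SIGMA = 1.0           # envelope width in octaves (assumed)

def components(j):
    """Return (frequency, amplitude) pairs for the j-th tone."""
    pairs = []
    for i in range(NUM_OCTAVES):
        f = F0 * 2.0 ** (i + j / 12.0)
        d = math.log2(f / F_CENTER)
        pairs.append((f, math.exp(-d * d / (2.0 * SIGMA ** 2))))
    return pairs

def tone_samples(j, duration=0.4):
    """Samples of the j-th tone, scaled so that every value lies in [-1, 1]."""
    comps = components(j)
    norm = sum(a for _, a in comps)
    n = int(SAMPLE_RATE * duration)
    return [sum(a * math.sin(2 * math.pi * f * k / SAMPLE_RATE)
                for f, a in comps) / norm
            for k in range(n)]

def write_sequence(path, repeats=2):
    """Write the twelve-tone sequence, repeated, as a 16-bit mono WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        for _ in range(repeats):
            for j in range(12):
                frames = b"".join(struct.pack("<h", int(32767 * 0.9 * s))
                                  for s in tone_samples(j))
                w.writeframes(frames)
```

Calling `write_sequence("shepard.wav")` would write two passes through the sequence to a file. Comparing `components(12)` with `components(0)` shows the circularity numerically: all but the outermost component of each tone coincide in both frequency and amplitude, and with these settings the unmatched components at the two ends of the range carry weights below 0.001.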
[1] R. N. Shepard (1964), "Circularity in judgments of relative pitch," J. Acoust. Soc. Am. 36, 2346-2353.
[2]
Acoustical Society of