You are likely familiar with the Doppler effect. You might have experienced it as a change in pitch of a car horn, ambulance siren or train whistle as the car/ambulance/train moved past you. In scientific terms, this change in pitch is described as a shift in the frequency of the sound.
The equation for the observed sound frequency $f$, as seen by a receiver moving at a speed $v_r$ relative to the air, is
\begin{equation} f = \frac{c+v_r}{c+v_s}f_0, \label{1} \end{equation}where the sound is originally emitted at a frequency $f_0$ by a source that moves at a speed $v_s$ relative to the air. Here, $c$ is the speed of sound.
We focus on motion along a straight line, where we define $v_r$ as positive when the receiver moves toward the source, and $v_s$ as positive when the source moves away from the receiver.
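As a quick sanity check of \eqref{1} and this sign convention, here is a minimal Python sketch. The function name and the example speeds are ours, and we take $c = 340.29$ m/s, the value used later in the text:

```python
def observed_frequency(f0, v_r=0.0, v_s=0.0, c=340.29):
    """Observed frequency per eq. (1): f = (c + v_r)/(c + v_s)*f0.

    Sign convention:
    v_r > 0: receiver moves toward the source.
    v_s > 0: source moves away from the receiver.
    """
    return (c + v_r)/(c + v_s)*f0

# A 440 Hz source approaching a stationary receiver at 20 m/s (v_s = -20)
# is heard at a higher pitch; receding at 20 m/s, at a lower one:
f_approach = observed_frequency(440.0, v_s=-20.0)
f_recede = observed_frequency(440.0, v_s=+20.0)
print(f_approach > 440.0 > f_recede)  # True
```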
A classical application of \eqref{1} is the calculation of the speed of a train moving past a stationary receiver. Let the observed frequency of the train whistle as the train moves towards the receiver be $f_1$ and the observed frequency as the train moves away from the receiver be $f_2$.
In this case, we can use \eqref{1} to obtain expressions for $f_1$ and $f_2$: $$ \begin{align} f_1 &= \frac{c}{c-v}f_0,\\ f_2 &= \frac{c}{c+v}f_0, \end{align} $$ where $v$ is now the speed of the train; the train is the source, so $v_s = -v$ as it approaches and $v_s = +v$ as it recedes, while $v_r = 0$. We use these two equations to eliminate the unknown $f_0$ and solve the resulting equation for $v$:
\begin{equation} v = \frac{f_1 - f_2}{f_1 + f_2}c. \label{2} \end{equation}We can now determine the speed of a train $v$ using \eqref{2}. However, there will be a slight twist: we will numerically analyse experimental data, in the form of a sound recording of a passing train, to compute $f_1$ and $f_2$. This will in turn yield $v$.
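Before touching real data, we can sanity-check \eqref{2} with a pair of hypothetical whistle frequencies (the numbers below are made up for illustration):

```python
c = 340.29             # speed of sound [m/s]
f1, f2 = 660.0, 620.0  # hypothetical observed frequencies [Hz]
v = (f1 - f2)/(f1 + f2)*c
print("v = %.2f m/s = %.2f km/h" % (v, 3.6*v))  # v = 10.63 m/s = 38.28 km/h
```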
To do this, we will use an implementation of a powerful tool known as the fast Fourier transform (FFT), found in the numpy library. Essentially, this is nothing but a computationally efficient way of calculating the discrete Fourier transform (DFT), a discrete approximation of the continuous Fourier transform.
A sound file is represented in Python as a vector $\vec{x}$ with $N$ elements, where each element is the sound amplitude sampled at time intervals $\Delta t$. The DFT of $\vec{x}$ is then also a vector with $N$ elements, which we call $\vec{X}$. Suppose now that we know $\vec{X}$. Then we can compute each element $x_n$ of $\vec{x}$ by applying the formula $$x_n = \frac{1}{N} \sum_{k=0}^{N-1}X_k \exp\left(i2\pi\frac{k}{N\Delta t}n\Delta t\right).$$ Let us look closer at this expression and try to figure out what it means. It tells us, quite simply, that $\vec{x}$ is a superposition of exponential functions with different frequencies $f_k = \frac{k}{N\Delta t}$ and amplitudes $X_k$. Therefore, we can view the squared magnitudes $|X_k|^2$ as a measure of the "weight of the frequency $f_k$" in $\vec{x}$!
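To see this superposition at work, the sketch below reconstructs a short test signal directly from the formula above, using coefficients $X_k$ obtained from NumPy (the test signal itself is arbitrary):

```python
import numpy as np

N, dt = 64, 0.01  # number of samples and sampling interval
t = np.arange(N)*dt
x = np.sin(2*np.pi*10*t) + 0.5*np.sin(2*np.pi*25*t)  # test signal

X = np.fft.fft(x)  # coefficients X_k
k = np.arange(N)
# Superposition formula: x_n = (1/N) sum_k X_k exp(i 2 pi (k/(N dt)) n dt)
x_rec = np.array([np.sum(X*np.exp(2j*np.pi*k*(n*dt)/(N*dt)))/N
                  for n in range(N)])

print(np.allclose(x_rec.real, x))  # True
```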
Q: How do we calculate $\vec{X}$ to begin with?
A: We could apply the formula for the discrete Fourier transform,
$$X_k =\sum_{n=0}^{N-1}x_n \exp\left(-i2\pi\frac{k}{N\Delta t}n\Delta t\right).$$This requires $\mathcal{O}(N^2)$ operations. In contrast, the FFT is a more computationally efficient way to calculate $\vec{X}$, requiring only $\mathcal{O}(N \ln N)$ operations. There are several FFT algorithms and many make use of the fact that the exponentials can all be written as
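For comparison, here is a direct $\mathcal{O}(N^2)$ evaluation of the DFT sum; it agrees with numpy.fft.fft but becomes far slower for large $N$ (a sketch for illustration only, not something you would use in practice):

```python
import numpy as np

def dft_naive(x):
    """Evaluate the DFT sum directly: O(N^2) operations."""
    N = len(x)
    n = np.arange(N)
    k = n.reshape((N, 1))
    return np.sum(x*np.exp(-2j*np.pi*k*n/N), axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal(128)
print(np.allclose(dft_naive(x), np.fft.fft(x)))  # True
```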
$$\left(\exp\left(-\frac{2\pi i}{N}\right)\right)^{kn}.$$In Python, you can calculate $\vec{x}$ from $\vec{X}$ by using x=numpy.fft.ifft(X), or the other way around, using X=numpy.fft.fft(x).
Assume now that we store the sound of the train as it moves towards the observer in the vector sample1. Likewise, we store the sound of the train as it moves away from the observer in sample2.
We calculate the FFTs of these signals and store them in the vectors p1 and p2 respectively.
p1=fft(sample1)
p2=fft(sample2)
To obtain a measure of the magnitude of the amplitudes, we calculate their absolute values squared, element by element:
P1=np.absolute(p1)**2
P2=np.absolute(p2)**2
Finally, we calculate the frequency corresponding to each of the elements in P1 and P2.
f = np.linspace(0, N-1, N)/(N*dt)
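As an aside, this frequency axis agrees with NumPy's own helper numpy.fft.fftfreq below the Nyquist index; fftfreq folds the upper half of the spectrum to negative frequencies, which we do not need here since we only look at low-frequency peaks:

```python
import numpy as np

N, dt = 8, 0.001
f = np.linspace(0, N - 1, N)/(N*dt)
# The two conventions coincide for indices below N/2:
print(np.allclose(f[:N//2], np.fft.fftfreq(N, dt)[:N//2]))  # True
```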
All the details discussed above are implemented in the function fftwrapper() below. A few more technical details are needed, such as how to import the audio file, but these will not be discussed here. However, you are encouraged to look at the code and see if you can make sense of it.
%matplotlib inline
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
from numpy.fft import fft, ifft
import scipy.io.wavfile as wav
from IPython.core.display import HTML, display
# Set common figure parameters:
newparams = {'axes.labelsize': 11, 'axes.linewidth': 1, 'savefig.dpi': 300,
'lines.linewidth': 1.0, 'figure.figsize': (8, 3),
'ytick.labelsize': 10, 'xtick.labelsize': 10,
'ytick.major.pad': 5, 'xtick.major.pad': 5,
'legend.fontsize': 10, 'legend.frameon': True,
'legend.handlelength': 1.5}
plt.rcParams.update(newparams)
def fftwrapper():
    """
    Output
    P1 : 1D vector
    P2 : 1D vector
    f  : 1D vector
    Example usage
    (P1, P2, f) = fftwrapper()
    """
    N = 80000        # Number of samplings in sample1 and sample2
    shift1 = 190000  # First index of sample1
    shift2 = 320000  # First index of sample2
    fcutoff = 700    # Highest frequency in returned spectrum

    # Load sound file and convert from stereo to mono
    Fs, ystereo = wav.read('lwrhuntrd-ns197.wav')
    ymono = (ystereo[:, 0] + ystereo[:, 1])/2
    ymono = ymono/np.max(np.abs(ymono))
    deltat = 1/Fs
    sample1 = ymono[shift1:shift1+N]
    sample2 = ymono[shift2:shift2+N]

    # Do FFTs
    p1 = fft(sample1)
    p2 = fft(sample2)
    P1 = np.absolute(p1)**2
    P2 = np.absolute(p2)**2
    f = np.linspace(0, N-1, N)/(N*deltat)

    # Crop vectors to the frequency range we are interested in
    ifcutoff = np.argmin(np.abs(f - fcutoff))
    f = f[:ifcutoff]
    P1 = P1[:ifcutoff]
    P2 = P2[:ifcutoff]
    return (P1, P2, f)
We start with a call to fftwrapper():
(P1, P2, f) = fftwrapper()
Then we plot P1 and P2, normalized such that the biggest elements in the plotted vectors are 1.
plt.plot(f, P1/max(P1), f, P2/max(P2))
plt.xlabel(r"$f$ (Hz)")
plt.ylabel(r"$P/P_{max}$")
plt.legend(['Sample 1', 'Sample 2'], loc='upper left');
The two samples contain frequency peaks. The peaks in sample2 are shifted towards smaller frequencies relative to the peaks in sample1. This is consistent with what we hear when a train passes by: the sound it makes as it moves away from us is more low-pitched than that when it moves towards us.
We choose the frequency $f_1$ corresponding to the tallest peak in sample1 in the plot. We see that the corresponding peak in sample2 is also the tallest and we denote its frequency by $f_2$. We locate $f_1$ and $f_2$ as follows.
f1 = f[np.argmax(P1)]
f2 = f[np.argmax(P2)]
print("f1 = %f, f2 = %f" % (f1, f2))
f1 = 660.397500, f2 = 619.880625
Note that we could not know in advance that the tallest peak in sample1 would also be the tallest in sample2. In principle, this is not a given, and we needed the plot to confirm it.
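Had the tallest peaks not matched, one option would be to restrict the search for $f_2$ to a window around $f_1$. The helper below is hypothetical (it is not part of fftwrapper()) and is demonstrated on a synthetic spectrum:

```python
import numpy as np

def peak_in_window(f, P, f_center, half_width):
    """Frequency of the largest value of P within f_center +/- half_width."""
    mask = np.abs(f - f_center) <= half_width
    return f[mask][np.argmax(P[mask])]

# Synthetic spectrum: the globally tallest peak (200 Hz) is *not* the
# Doppler-shifted partner of a peak near 660 Hz; the window rejects it.
f = np.linspace(0, 700, 7001)
P = np.exp(-(f - 620)**2) + 2*np.exp(-(f - 200)**2)
print(round(peak_in_window(f, P, 660, 100), 6))  # 620.0
```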
Having found $f_1$ and $f_2$, we can calculate $v$ by use of \eqref{2}. We take the speed of sound $c$ to be 340.29 m/s and do the following:
c = 340.29 # Speed of sound [m/s]
v = (f1-f2)/(f1+f2)*c # Speed of train [m/s]
v = 3.6*v # Speed of train [km/h]
print("Speed of train is %0.2f km/h." % v)
Speed of train is 38.77 km/h.
To run this on your own computer, remember to download the sound file 2wa_lwrhuntrd_ns197.wav$^1$ into the same directory as the IPython Notebook file. Listen closely to the sound file. Does it sound reasonable that the train is moving at 38.77 km/h?
$^1$ Courtesy of David Safdy and Greg Lavoie of fwarailfan.net