from pylab import *
import os
def play(data,r=1,pad=8000):
    data = array(data*16000.0/amax(abs(data)),'int16')
    if r>1: data = array(list(data)*r)
    data = r_[zeros(pad,data.dtype),data,zeros(pad,data.dtype)]
    data.tofile("_out.raw")
    os.system("play -s2L -r 8000 _out.raw")
A sine wave completes one full cycle over the interval $[0,2\pi]$. We define a range of $x$ values covering this interval with 1024 sample points.
x = 2*pi*arange(1024,dtype='d')/1024.0
Now let's plot the sine and cosine. Remember how these relate to rotation and polar coordinates.
These functions also frequently come up in differential equations because $\sin'(x) = \cos(x)$ and $\cos'(x) = -\sin(x)$, so
$$\frac{d^2 f(x)}{dx^2} = -f(x)$$
has the solution
$$f(x) = \sin(x)$$
plot(x,sin(x))
plot(x,cos(x))
We can now generate sounds with these.
Pure sounds (and many other oscillations) are sine waves because they are generated by harmonic oscillators. Harmonic oscillators are oscillators in which the restoring force is proportional to the deviation from equilibrium. By Newton's second law, $F = ma$, this gives rise to the second order differential equation above, and hence to sine waves.
Most oscillators are at least approximately harmonic for small oscillations, because the Taylor series expansion of the restoring force around equilibrium is linear to first order.
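To see this numerically, here is a minimal sketch (not part of the original notebook) that integrates $f'' = -f$ with a simple symplectic Euler step and checks that the trajectory tracks $\sin(t)$:

```python
import numpy as np

# Integrate f'' = -f with f(0)=0, f'(0)=1 using symplectic Euler.
dt = 0.001
t = np.arange(0, 2*np.pi, dt)
f, v = 0.0, 1.0
trajectory = []
for _ in t:
    trajectory.append(f)
    v -= f*dt        # restoring force F = -f (unit mass and spring constant)
    f += v*dt
trajectory = np.array(trajectory)
err = np.max(np.abs(trajectory - np.sin(t)))
print(err)  # small: the harmonic oscillator traces out a sine wave
```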
The frequency of an oscillation is how many waves occur per time period (i.e., how "frequently" the wave occurs in one time period).
Since we defined a time period as 1024 samples, we can generate sine waves as above.
Note that usually frequency is expressed in Hertz, that is vibrations per second. Since we have a sampling rate of 8000/sec in the player, this means that the frequency of this tone in Hertz is $128*8000/1024$ or 1000 Hz.
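As a quick sanity check on this conversion (a hypothetical helper, not from the notebook), cycles-per-window to Hertz is just a ratio:

```python
# Convert k cycles per analysis window into Hertz, given the playback rate.
def hz(cycles_per_window, sample_rate=8000, window=1024):
    return cycles_per_window * sample_rate / float(window)

print(hz(128))  # -> 1000.0, matching the 1000 Hz worked out above
```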
data128 = sin(128*x)
data128 /= norm(data128)
plot(data128)
play(data128,r=10)
data64 = sin(64*x)
data64 /= norm(data64)
plot(data64)
play(data64,r=10)
data32 = sin(32*x)
data32 /= norm(data32)
plot(data32)
play(data32,r=10)
data32c = cos(32*x)
data32c /= norm(data32c)
plot(data32c)
play(data32c,r=10)
Note that we often work with periodic signals. That is, we repeat a waveform indefinitely. The simple audio player above already does this.
When working with periodic signals, it is important to pay attention to boundary conditions.
dataodd = sin(31.3*x)
plot(dataodd)
play(dataodd,r=10)
Although we thought of our data as audio signals with a given frequency, we can also view them as vectors.
When we do so, there are some interesting relationships.
Since we normalized our signals, the dot product of a signal with itself (as a vector) is 1.
dot(data32,data32)
1.0
The dot product of a sine signal and a cosine signal is 0; the two signals are orthogonal as vectors.
dot(data32,data32c)
-3.6831216008103521e-17
Likewise, the dot product of a signal and a signal of a different frequency is 0, provided both signals actually correspond to periodic signals.
(Actually, it also works for half frequencies; this gives rise to the cosine transform).
dot(data32,data64)
1.7017035905939927e-15
dot(data32,dataodd)
-4.9462849353357798
A general pure sine wave is parameterized by frequency $\nu$ and phase $\phi$:
$$f(x) = \sin(2\pi \nu x + \phi)$$
Although there are many different sine waves of different frequencies, for any given frequency, there are really "only two" sine waves of different phase.
That is, any sine wave of a given frequency and an arbitrary phase
can be represented as a linear combination of two sine waves of that
frequency, one with a phase of $0$ and another one with a phase of
$\frac{\pi}{2}$ (recall that $\cos x = \sin(x+\frac{\pi}{2})$).
The reason is this simple trigonometric identity:
$$\sin(u+v) = \sin u \cos v + \cos u \sin v$$This means that for a given $v$, we can write any $\sin(u+v)$ as a linear combination of $\sin v$ and $\cos v$.
data32p = sin(32*x+0.3)
plot(data32p)
play(data32p,r=10)
dot(data32,data32p)
21.616797112821242
dot(data32c,data32p)
6.6868589474523219
dot(data32,data64)
1.7017035905939927e-15
rec = dot(data32,data32p)*data32 + dot(data32c,data32p)*data32c
print norm(rec-data32p)
plot(rec)
plot(0.9*data32p)
1.48311824805e-13
For periodic signals of length $N$ (here, $N=1024$), we have made the following observations:
- normalized sine waves whose frequency is an integer number of cycles per window are orthogonal to each other,
- the sine and the cosine of the same frequency are orthogonal,
- any sine wave of a given frequency and arbitrary phase is a linear combination of the sine and the cosine of that frequency.
This suggests constructing a complete basis in the following way: take the sine waves $\sin(kx)$ for $k=1,\dots,512$ together with the cosine waves $\cos(kx)$ for $k=0,\dots,511$.
This way, we end up with 1024 orthogonal functions, and since our space is only 1024-dimensional, this must be a complete basis for our linear space.
sins = [sin(i*x) for i in range(1,513)]
figsize(12,8)
plot(sins[0])
plot(sins[1])
plot(sins[2])
coss = [cos(2.0*pi*arange(1024)*i/1024.0) for i in range(0,512)]
figsize(12,8)
plot(coss[0])
plot(coss[1])
plot(coss[2])
ylim(-0.1,1.1)
plot(sin(0*x)[-10:])
plot(cos(0*x)[-10:])
plot(sin(512*x)[-10:])
plot(cos(512*x)[-10:])
Let's construct matrices out of these basis functions.
sins = array(sins)
coss = array(coss)
Now let's verify the orthogonality conditions.
figsize(24,12)
gray()
subplot(131); imshow(dot(sins,sins.T),vmax=1); print amin(dot(sins,sins.T)),amax(dot(sins,sins.T))
subplot(132); imshow(dot(coss,coss.T),vmax=1); print amin(dot(coss,coss.T)),amax(dot(coss,coss.T))
subplot(133); imshow(dot(sins,coss.T),vmax=1); print amin(dot(sins,coss.T)),amax(dot(sins,coss.T))
-2.03463569425e-11 512.0 -2.88673035129e-11 1024.0 -2.14634190463e-11 2.21060387877e-11
We can also look at these matrices themselves.
figsize(12,12)
subplot(121); imshow(sins[:100,:100])
Now we can perform transforms with this matrix.
figsize(10,5)
plot(dot(sins,data32))
plot(dot(coss,data32))
When there is a phase shift, we get both sine and cosine components. Note that the sines start at frequency 1 while the cosines start at frequency 0, so the indices of the non-zero components differ by 1 between the two plots.
figsize(10,5)
plot(dot(sins,data32p)[:50])
plot(dot(coss,data32p)[:50])
It's useful to combine these two signals in a way in which each component is independent of phase. This is the spectrum.
figsize(10,5)
plot(dot(sins,data64)[:-1]**2+dot(coss,data64)[1:]**2)
ft = r_[sins,coss]
ift = inv(ft)
def transform(signal):
    return dot(ft,signal)
def itransform(signal):
    return dot(ift,signal)
def spectrum(signal):
    return dot(sins,signal)[:-1]**2+dot(coss,signal)[1:]**2
from scipy.ndimage import filters
wave = filters.gaussian_filter(randn(1024),5.0)
plot(wave)
plot(spectrum(wave))
plot(transform(wave))
norm(wave-itransform(transform(wave)))
2.4632919247364998e-14
for phase in [0.0,pi/2,0.3,0.7]:
    frequency = 100
    plot(transform(sin(2.0*pi*arange(1024)*frequency/1024.0+phase)))
for phase in [0.0,pi/2,0.3,0.7]:
    frequency = 100
    plot(transform(sin(2.0*pi*arange(1024)*frequency/1024.0+phase))[95:105])
for phase in [0.0,pi/2,0.3,0.7]:
    frequency = 100
    signal = sin(frequency*x+phase)
    plot(spectrum(signal))
for phase in [0.0,pi/2,0.3,0.7]:
    frequency = 100
    signal = sin(frequency*x+phase)
    plot(spectrum(signal)[95:105])
for frequency in [1,10,100,500]:
    signal = sin(frequency*x)
    plot(transform(signal))
Recall that we can define complex numbers by introducing an abstract symbol $j$ for an imaginary quantity such that $j^2=-1$. In fact, that's how complex numbers are defined: as the simplest extension of the reals in which the polynomial equation $x^2+1=0$, which has no solution over the reals, gains a solution.
We can then write numbers like:
$c = x + jy$
and derive formulas for addition, multiplication, etc. For example:
$$c_1 \cdot c_2 = (x_1 + jy_1)(x_2 + jy_2) = x_1x_2 + x_1 jy_2 + jy_1 x_2 + jy_1 jy_2 = (x_1x_2-y_1y_2) + j(x_1y_2+x_2y_1)$$
Complex numbers are closely tied to trigonometric functions: a complex number can be thought of as a 2D value, and multiplication by a complex number acts on the plane as a rotation combined with a scaling.
We can write a complex number also as a phase angle and magnitude:
$c = x + jy = r (\cos \alpha + j \sin \alpha)$
In fact, a shorter way of writing this is as Euler's formula:
$c = r e^{j\alpha} = r (\cos \alpha + j \sin \alpha)$
There are many ways of seeing that this identity is true, but the simplest one is probably by thinking of $\sin$ and $\cos$ as solutions to differential equations and seeing that Euler's formula satisfies these equations.
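We can also just check Euler's formula numerically (a quick sketch, not from the notebook):

```python
import numpy as np

alpha = np.linspace(0, 2*np.pi, 100)
# exp(j*alpha) should agree with cos(alpha) + j*sin(alpha) everywhere
lhs = np.exp(1j*alpha)
rhs = np.cos(alpha) + 1j*np.sin(alpha)
print(np.allclose(lhs, rhs))  # -> True
```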
To convert between the two representations, note that
$$\alpha = \arctan \frac{y}{x}$$
$$r = \sqrt{x^2+y^2}$$
(Note that you can get these values with numpy.angle and numpy.abs; numpy.angle also takes care of choosing the correct quadrant.)
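For example, a round trip between the two representations with numpy.angle and numpy.abs (a small sketch):

```python
import numpy as np

c = 3.0 + 4.0j
r, alpha = np.abs(c), np.angle(c)   # magnitude 5.0, angle atan2(4, 3)
back = r*np.exp(1j*alpha)           # reconstruct x + jy from (r, alpha)
print(r, np.allclose(back, c))      # -> 5.0 True
```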
From Euler's formula, we also immediately see that:
$$c_1 \cdot c_2 = r_1 e^{j\alpha_1} \cdot r_2 e^{j\alpha_2} = r_1 r_2 e^{j(\alpha_1+\alpha_2)}$$
With complex numbers, we can also now write the basis functions for our transform in a more compact way:
$$b(n,k) = e^{-2\pi j \frac{kn}{N}}$$
If you expand this, you'll see that it gives rise to the same sine- and cosine-based components we have been using above, except for the odd behavior at $k=0$ and $k=N/2$ (with complex numbers, we don't have to treat those cases specially).
Note that complex numbers behave otherwise very much like real numbers; in particular, we can also apply most of linear algebra. So we can perform expansions just like before.
def b(k,n,N):
    return exp(-2*pi*1j*k*n/N)
N = 420
n = arange(0,420)
s = b(3,n,N)
ylim(-1.1,1.1)
plot(real(s))
plot(imag(s))
plot(abs(s),color='r')
The formal definition of the discrete Fourier transform is now:
$$X_k = \sum_{n=0}^{N-1} x_n e^{-\frac{2 \pi i}{N} k n} \quad \quad k = 0, \dots, N-1$$
As before, we can put the Fourier transform into a matrix; call it F.
This is now a complex matrix, but its real and imaginary components operate completely separately (provided we give it real input vectors).
F = array([b(k,n,N) for k in range(0,N)])
a1 = 1.5
a2 = 2.3
omega1 = 6*2*pi/N
omega2 = 14*2*pi/N
y = a1*sin(n*omega1) + a2*sin(n*omega2)
plot(y)
Y = dot(F,y)
plot(abs(Y)**2)
plot(Y) # ???
[i for i in range(N) if abs(Y[i])>100]
[6, 14, 406, 414]
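The mirrored peaks at $406 = 420-14$ and $414 = 420-6$ are no accident: the DFT of a real signal satisfies $X_{N-k} = \overline{X_k}$, so every peak appears again reflected about $N$. A quick check of this symmetry (a sketch, not from the notebook):

```python
import numpy as np

N = 420
x = np.random.randn(N)       # any real-valued signal
X = np.fft.fft(x)
k = np.arange(1, N)
# conjugate symmetry: X[N-k] equals the complex conjugate of X[k]
print(np.allclose(X[N - k], np.conj(X[k])))  # -> True
```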
def max1(x): return x*1.0/amax(abs(x))
signal = n
transform = dot(F,signal)
plot(max1(signal))
plot(max1(abs(transform)))
Remember that the Fourier transform implements convolutions with circular boundary conditions. Therefore, the signal that the Fourier transform sees isn't the nice-looking smooth signal above, but one with a sharp discontinuity. We can see this by shifting the signal circularly.
plot(roll(max1(signal),200))
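We can verify the circular-convolution property directly (a sketch using numpy's built-in fft rather than the matrix F): pointwise multiplication of two transforms corresponds to circular convolution of the signals.

```python
import numpy as np

N = 64
f = np.random.randn(N)
g = np.random.randn(N)
# Circular convolution computed directly from the definition...
direct = np.array([sum(f[m]*g[(n - m) % N] for m in range(N))
                   for n in range(N)])
# ...matches pointwise multiplication in the Fourier domain.
via_fft = np.real(np.fft.ifft(np.fft.fft(f)*np.fft.fft(g)))
print(np.allclose(direct, via_fft))  # -> True
```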
A double sawtooth illustrates this as well and has a similar transform.
signal = concatenate([n,n])[2*n]
transform = dot(F,signal)
plot(max1(signal))
plot(max1(abs(transform)))
The Fourier transform of an impulse is a flat spectrum.
signal = (n==100)
transform = dot(F,signal)
ylim(-0.1,1.1)
plot(max1(signal))
plot(max1(abs(transform)))
Again, we're actually looking at a circular convolution, so what the Fourier transform "sees" is this.
ylim(-0.1,1.1)
plot(max1(concatenate([signal for i in range(8)])))
If the signal isn't a perfect impulse but a little broader, the transform changes significantly.
signal = (n>=100) * (n<102)
transform = dot(F,signal)
plot(max1(signal))
plot(max1(abs(transform)))
signal = (n>=100) * (n<110)
transform = dot(F,signal)
plot(max1(signal))
plot(max1(abs(transform)))
The Fourier transform of the Gaussian is very important in practice.
signal = exp(-(n-200)**2/100.0)
signal /= amax(signal)
transform = dot(F,signal)
plot(max1(signal))
plot(max1(abs(transform)))
It's kind of difficult to see what's going on here. It becomes easier if we shift the output of the transform to overlap with the original signal. Then it becomes pretty clear that the Fourier transform of a Gaussian is again a Gaussian. Some experimentation shows that the wider the original Gaussian, the narrower its transform and vice versa.
plot(max1(signal))
plot(roll(max1(abs(transform)),200))
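We can quantify that reciprocal-width observation (a sketch; the width measure here is just a count of spectrum bins above half maximum, a hypothetical helper):

```python
import numpy as np

N = 512
n = np.arange(N)

def transform_width(sigma):
    # Width of the transform of a Gaussian of width sigma,
    # measured as the number of bins above half the peak magnitude.
    g = np.exp(-(n - N//2)**2/(2.0*sigma**2))
    G = np.abs(np.fft.fft(g))
    return np.sum(G > G.max()/2)

# The narrower Gaussian has the wider transform, and vice versa.
print(transform_width(5), transform_width(10))
```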
Of course, we can transform back as well.
$F$ is a matrix, so we just take its inverse.
signal = (n>=100) * (n<110)
transform = dot(F,signal)
reconstructed = dot(inv(F),transform)
plot(max1(signal))
plot(max1(abs(transform)))
plot(max1(reconstructed))
What happens if we apply the Fourier transform matrix $F$ twice?
double = dot(F,transform)
plot(max1(double))
imshow(abs(dot(F,F))); gray()
So, from $F \cdot F$, we see that the Fourier transform is almost its own inverse, up to a reflection of the indices and an overall factor of $N$.
The actual formula for the inverse discrete Fourier transform (IDFT) is
$$x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k e^{\frac{2\pi i}{N} k n} \quad \quad n = 0,\dots,N-1.$$
Note the sign change in the exponent and the normalization factor $\frac{1}{N}$. Other than that, the formula is the same as for the regular discrete Fourier transform.
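A direct check of this formula (a sketch: build both matrices straight from the definitions and verify that they invert each other):

```python
import numpy as np

N = 64
n = np.arange(N)
F    = np.exp(-2j*np.pi*np.outer(n, n)/N)     # forward DFT matrix
Finv = np.exp( 2j*np.pi*np.outer(n, n)/N)/N   # inverse: sign flip plus 1/N
x = np.random.randn(N)
print(np.allclose(Finv @ (F @ x), x))  # -> True
```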
Properties of the Fourier Transform
The Fourier transform has a number of important properties; you should know these by heart.
The Fourier transform is linear:
$$ {\cal F}[a f + b g] = a {\cal F}[f] + b {\cal F} [g] $$
The Fourier transform turns translations into multiplications with a complex number (and the other way around):
$$ {\cal F} [ f(x - x_0) ] ( \omega ) = e^{ -2\pi i x_0 \omega} {\cal F} [ f(x) ] (\omega)$$
The Fourier transform computes inverse scaling:
$$ {\cal F} [ f(ax) ] (\omega) = \frac{1}{|a|}{\cal F} [ f(x) ] \left(\frac{\omega}{a}\right) $$
The Fourier transform turns convolutions into multiplication:
$$ {\cal F}[f * g] = {\cal F}[f] \cdot {\cal F}[g] $$
Danielson-Lanczos-Cooley-Tukey-Gauss Lemma
Let's look again at the definition of the discrete Fourier transform:
$$ X_k = \sum_{n=0}^{N-1} x_n e^{-\frac{2 \pi i}{N} k n} \quad \quad k = 0, \dots, N-1 $$
We also wrote this as a set of dot products:
$$ X_k = x \cdot b^{(k)} $$
where
$$ b^{(k)}_n = e^{-\frac{2 \pi i}{N} k n} $$
Or we can write it as a matrix multiplication:
$$ X = F \cdot x $$
where
$$ F_{kn} = b^{(k)}_n $$
For a signal of length $N$, this takes $N^2$ operations.
It turns out that we can compute this particular matrix-vector product in time $O(N \log N)$.
The key observation is the Danielson-Lanczos lemma, which is also at the heart of the Cooley-Tukey FFT algorithm; Gauss already knew it as well.
The insight is that we can write the DFT $S$ of a signal $s$ as the weighted combination of two half-length DFTs. In particular,
$$S_j = S^e_j + e^{-\frac{2\pi i j}{N}} S^o_j$$
$$S_{j+\frac{N}{2}} = S^e_j - e^{-\frac{2\pi i j}{N}} S^o_j$$
Here, $S^e$ and $S^o$ are the DFTs of the even-indexed and odd-indexed elements of $s$, respectively.
figsize(10,10); xticks([]); yticks([])
imshow(imread("fft-image.png"))
This observation is easy to translate into a recursive function for computing the Fourier transform. Note that this function only works for input signals whose length is a power of two.
For other input lengths, there are more complicated decompositions, or such signals need to be padded. For signals of prime length, there is no decomposition like this (i.e., we can only decompose the problem into subproblems whose lengths are factors of the original length).
def myfft(s):
    N = len(s)
    if N<=1: return s
    Se = myfft(s[::2])
    So = myfft(s[1::2])
    e = exp(-arange(0,N/2)*2j*pi/N)
    return r_[Se+e*So,Se-e*So]
myfft(array([1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0]))
array([ 2.00000000+0.j , 1.70710678-0.70710678j, 1.00000000-1.j , 0.29289322-0.70710678j, 0.00000000+0.j , 0.29289322+0.70710678j, 1.00000000+1.j , 1.70710678+0.70710678j])
k = arange(256)
signal = 1.0*(k>=100)*(k<110)
transform = myfft(signal)
plot(max1(signal))
plot(max1(abs(transform)))
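As a sanity check (a self-contained sketch using the negative-exponent DFT convention defined earlier), the recursive FFT should agree with numpy's built-in fft on any power-of-two input:

```python
import numpy as np

def myfft(s):
    N = len(s)
    if N <= 1:
        return np.asarray(s, dtype=complex)
    Se = myfft(s[::2])                        # DFT of even-indexed samples
    So = myfft(s[1::2])                       # DFT of odd-indexed samples
    e = np.exp(-2j*np.pi*np.arange(N//2)/N)   # twiddle factors
    return np.r_[Se + e*So, Se - e*So]

s = np.random.randn(256)
print(np.allclose(myfft(s), np.fft.fft(s)))  # -> True
```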
mix = data32+data128
play(mix,r=10)
plot(abs(fft(mix)))
plot(abs(fft(mix))[:200])
plot(abs(fft(mix))[120:140])
transform = fft(mix)
transform[120:130] = 0
transform[-130:-120] = 0
plot(abs(transform))
plot(real(ifft(transform))[:100])
plot(mix[:100])
play(mix,r=10)
play(real(ifft(transform)),r=10)
test = numpy.fromfile("test.raw","int32")[::2]
test = test*1.0/amax(abs(test))
play(test)
print amax(test)
1.0
noise = test+sin(arange(len(test))/10.0)
play(noise)
Noise = fft(noise)
plot(abs(Noise)**.5)
plot((abs(Noise)**.5)[400:500])
Noise[480:500] = 0
Noise[-500:-480] = 0
plot(abs(Noise)**.5)
denoised = ifft(Noise)
play(real(denoised))
play(noise)