I've written an interesting look at 1D and 2D histogram performance here: https://iscinumpy.gitlab.io/post/histogram-speeds-in-python/
The plotting library Bokeh has just reached version 1.0. We'll use that today instead of matplotlib just to make things interesting. It's not as old and respected as matplotlib, but is prettier in a browser.
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
output_notebook()
import numpy as np
Our goal will be to decompose (periodic) functions into Fourier components. A few definitions:
$T$ is the period. $\omega = \frac{2\pi}{T}$ is the true frequency. The Fourier series is then written:
$$ y(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos n \omega t + b_n \sin n \omega t \right) \tag{1} $$We can compute it by integrating over the function of interest and computing the components:
$$ \left(\begin{matrix}a_n \\ b_n\end{matrix}\right) = \frac{2}{T} \int^{T}_{0} \left(\begin{matrix}\cos n \omega t \\ \sin n \omega t \end{matrix}\right) y(t) \, dt \tag{2} $$Useful tips:
Let's define a step function valid from $-L$ to $L$ (it repeats outside that range). We have:
$$ f(t) = \begin{cases}1 & t \ge 0 \\ -1 & t < 0\end{cases}. $$This is odd, so we only need to integrate from $0$ to $L=T/2$ and multiply by 2. The average value is 0, so we can drop $a_0$. We have $\omega = 2\pi / T = \pi / L$ So equation 2 becomes:
$$ a_n =2 \frac{1}{L} \int^{L}_0 \sin \frac{n \pi t}{L} \, dt. $$Since $\int_0^A \sin a t \, dt = \frac{1 - \cos(a A)}{a}$, we have:
$$ a_n = \frac{2}{L} \frac{L}{n \pi} \left( 1 - \cos n \pi \right), $$which is non-zero only for odd n, giving the final formula:
$$ a_n = \frac{4}{n \pi}, \quad n\ \textrm{odd}. $$Combining with equation 1 gives:
$$ f(t) = \frac{4}{\pi} \sum^{\infty}_{n=1,3,5,\ldots} \frac{1}{n} \sin\left( \frac{n \pi t}{L} \right). $$t = np.linspace(-1.25 * np.pi, 1.25 * np.pi, 2000)
y = np.zeros_like(t)
for n in range(1, 100, 2):
y += 1 / n * np.sin(n * t)
y *= 4 / np.pi
p = figure(width=500, height=300)
p.line(t, y)
show(p)
The function is a best fit to the sawtooth, but it always "overshoots" the edges - this is the Gibbs phenomenon.
We can also look at the spectrum of frequencies:
x = np.arange(1, 100)
s = 4 / (np.pi * x)
s[1::2] = 0
p = figure(width=500, height=300)
p.vbar(x - 0.5, 1, s)
show(p)
From the book, Example 10.2.1.
$$ y(t) = \frac{2}{\pi} \biggr[ \sin \omega t - \sin 2 \omega t + \sin 3 \omega t - \cdots \biggr] $$x = np.linspace(-2.5 * np.pi, 2.5 * np.pi, 2000)
y = np.zeros_like(x)
L = 2 * np.pi
for n in range(1, 50):
y += 1 / n * (-1) ** (n + 1) * np.sin(2 * n * np.pi * x / L)
y *= 2 / np.pi
p = figure(width=500, height=300)
p.line(x, y)
show(p)
We can also look at the spectrum of frequencies:
x = np.arange(1, 50)
s = (-1) ** (x + 1) * 2 / (np.pi * x)
p = figure(width=500, height=300)
p.vbar(x - 0.5, 1, s)
show(p)
Now briefly look at the Fourier transform, using the complex exponential function. Computationally, we will be converting the integrals to series anyway, so this becomes equivalent to the Fourier series.
$$ y(t) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} e^{i \omega t} Y(\omega) \, d\omega. \tag{3} $$And its inverse:
$$ Y(\omega) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} e^{- i \omega t} y(t) \, dt. \tag{4} $$You'll notice that different conventions exist for the sign and the location of the scaling factor - just be consistent and you'll be okay. We will follow the book's lead here.
We can apply the transform and the inverse together to obtain the Dirac delta function:
$$ \int_{-\infty}^{\infty} e^{i (\omega' - \omega) t}\, dt \equiv 2 \pi \delta (\omega' - \omega) $$This "function" is useful analytically because it can select on point out of an integration range, but ugly computationally. You should do it by hand before coding, rather than later.
Fourier tools are great for analysis (as long as you remember their limitations). Real measurements generally are discrete. So we need a discrete version of this machinery.
We start with a signal $y(t)$ sampled in $N$ time intervals ($N+1$ measurements). We can assume constant time spacing $h \equiv \Delta t$. The total time is $T$ (carefully selected to look like a period as well, we'll see more later). We have a sampling rate $s \equiv N/T=1/h$ So:
$$ y_k \equiv y(t_k) \\ t_k \equiv k h $$At this point, we can view this Fourier transform as a Fourier Series, since we now have a periodicity imposed; the final measurement must be equal to the first $y_0 = y_N$.
Since we only have $N$ data points (left), we can only determine $N$ independent output Fourier components. We can write all the frequencies as multiples of the base frequency:
$$ \omega_n = n \omega_1 = n \frac{2 \pi}{N h} $$Now we can evaluate the integral in (3). We can just evaluate over the range 0 to T (due to above requirements on periodicity), and we can use the trapezoid rule. Since the first point is equal to the last point, we don't even need special handling at the end points.
Forward transform:
$$ Y_n \equiv \frac{1}{h} Y(\omega_n) \approx \sum_{k=1}^{N} y_k \frac{e^{-2 \pi i k n / N}}{\sqrt{2 \pi}} \tag{5} $$And inverse:
$$ y(t) \approx \sum_{n=1}^{N} \frac{2 \pi}{N h} Y_n \frac{e^{i \omega_n t}}{\sqrt{2 \pi}} \tag{6} $$Let's add one more trick to make this easier to compute: we will introduce $Z=e^{-2\pi i / N}$, and rewrite the above expressions as
$$ y_k = \frac{\sqrt{2 \pi}}{N} \sum_{n=1}^{N} Z^{-n k} Y_n $$$$ Y_n = \frac{1}{\sqrt{2 \pi}} \sum_{k=1}^{N} Z^{n k} y_k $$N = 50
x = np.linspace(0, 2 * np.pi, N + 1)
signal = 30 * np.cos(3 * x) + 20 * np.sin(7 * x) # Change these values
h = 2 * np.pi / N
p = figure(width=500, height=300)
p.circle(x, signal)
show(p)
n = np.arange(N).reshape(-1, 1)
k = np.arange(N).reshape(1, -1)
zexpo = 1j * 2 * np.pi * k * n / N
zsum = np.sum(signal[:N] * np.exp(-zexpo), axis=1)
dftz = zsum / np.sqrt(2 * np.pi)
Note: Formula taken from the book. I think there may be a missing element on the end - but due to symmetric nature, the end element would be the same as the beginning element, so we don't lose anything.
p = figure(width=500, height=300)
p.line(k.ravel(), dftz.real)
p.line(k.ravel(), dftz.imag, color="firebrick")
show(p)
zfft = np.fft.fft(signal)
p = figure(width=500, height=300)
p.line(np.arange(N + 1), zfft.real)
p.line(np.arange(N + 1), zfft.imag, color="firebrick")
show(p)
We'll use the FFT to compute and look at a problem where the frequencies are not quite so perfectly aligned. This example is taken from the SciPy documentation, but is implemented in Numpy and Bokeh.
N = 600
T = 1.0 / 800.0
x = np.linspace(0.0, N * T, N)
y = np.sin(50.0 * 2.0 * np.pi * x) + 0.5 * np.sin(80.0 * 2.0 * np.pi * x)
yf = np.fft.fft(y)
xf = np.linspace(0.0, 1.0 / (2.0 * T), N)
norm_yf = 2.0 / N * np.abs(yf)
p = figure(width=500, height=300)
p.line(x, y)
show(p)
p = figure(width=500, height=300)
p.line(xf[: N // 2 + 1], norm_yf[: N // 2 + 1])
show(p)
rfft
with the full range.Let's look at that in more detail: (the description here is good).
t = np.linspace(0, 1, 1000)
f0 = 13 # Sampling frequency
f1 = 3 # First frequency
k = 1 # Any integer
f2 = f1 + k * f0 # Identical under the sampling frequency!
tp = np.linspace(0, 1, f0 + 1)
y1 = np.sin(2 * f1 * np.pi * t)
y2 = np.sin(2 * f2 * np.pi * t)
ytp = np.sin(2 * f2 * np.pi * tp)
p = figure(width=700, height=300)
p.line(t, y1)
p.line(t, y2, color="firebrick")
p.circle(tp, ytp, color="black")
show(p)