This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.
Download the Weather dataset on the book's website. (http://ipython-books.github.io)
The data has been obtained here: http://www.ncdc.noaa.gov/cdo-web/datasets#GHCND.
scipy.fftpack
which includes many FFT-related routines.import datetime
import numpy as np
import scipy as sp
import scipy.fftpack
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
-9999
is used for N/A values. Pandas can easily handle this. In addition, we tell Pandas to parse dates contained in the DATE
column.df0 = pd.read_csv('data/weather.csv',
na_values=(-9999),
parse_dates=['DATE'])
df = df0[df0['DATE']>='19940101']
df.head()
groupby
method provided by Pandas lets us do that easily. We also remove any NA value.df_avg = df.dropna().groupby('DATE').mean()
df_avg.head()
date = df_avg.index.to_datetime()
temp = (df_avg['TMAX'] + df_avg['TMIN']) / 20.
N = len(temp)
plt.figure(figsize=(6,3));
plt.plot_date(date, temp, '-', lw=.5);
plt.ylim(-10, 40);
plt.xlabel('Date');
plt.ylabel('Mean temperature');
fft
function.temp_fft = sp.fftpack.fft(temp)
temp_psd = np.abs(temp_fft) ** 2
fftfreq
utility function does just that. It takes as input the length of the PSD vector, as well as the frequency unit. Here, we choose an annual unit: a frequency of 1 corresponds to 1 year (365 days). We provide 1./365
because the original unit is in days.fftfreq = sp.fftpack.fftfreq(len(temp_psd), 1./365)
fftfreq
function returns positive and negative frequencies. We are only interested in positive frequencies here since we have a real signal (this will be explained in How it works...).i = fftfreq>0
1/year
).plt.figure(figsize=(8,4));
plt.plot(fftfreq[i], 10*np.log10(temp_psd[i]));
plt.xlim(0, 5);
plt.xlabel('Frequency (1/year)');
plt.ylabel('PSD (dB)');
We observe a peak for $f=1/year$: this is because the fundamental frequency of the signal is the yearly variation of the temperature.
temp_fft_bis = temp_fft.copy()
temp_fft_bis[np.abs(fftfreq) > 1.1] = 0
temp_slow = np.real(sp.fftpack.ifft(temp_fft_bis))
plt.figure(figsize=(6,3));
plt.plot_date(date, temp, '-', lw=.5);
plt.plot_date(date, temp_slow, '-');
plt.xlim(datetime.date(1994, 1, 1), datetime.date(2000, 1, 1));
plt.ylim(-10, 40);
plt.xlabel('Date');
plt.ylabel('Mean temperature');
You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).
IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).