#!/usr/bin/env python # coding: utf-8 # > This is one of the 100 recipes of the [IPython Cookbook](http://ipython-books.github.io/), the definitive guide to high-performance scientific computing and data science in Python. # # # 10.1. Analyzing the frequency components of a signal with a Fast Fourier Transform # Download the *Weather* dataset on the book's website. (http://ipython-books.github.io) # The data has been obtained here: http://www.ncdc.noaa.gov/cdo-web/datasets#GHCND. # 1. Let's import the packages, including `scipy.fftpack` which includes many FFT-related routines. # In[ ]: import datetime import numpy as np import scipy as sp import scipy.fftpack import pandas as pd import matplotlib.pyplot as plt get_ipython().run_line_magic('matplotlib', 'inline') # 2. We import the data from the CSV file. The number `-9999` is used for N/A values. Pandas can easily handle this. In addition, we tell Pandas to parse dates contained in the `DATE` column. # In[ ]: df0 = pd.read_csv('data/weather.csv', na_values=(-9999), parse_dates=['DATE']) # In[ ]: df = df0[df0['DATE']>='19940101'] # In[ ]: df.head() # 3. Each row contains the precipitation and extremal temperatures recorded one day by one weather station in France. For every date, we want to get a single average temperature for the whole country. The `groupby` method provided by Pandas lets us do that easily. We also remove any NA value. # In[ ]: df_avg = df.dropna().groupby('DATE').mean() # In[ ]: df_avg.head() # 4. Now, we get the list of dates and the list of corresponding temperature. The unit is in tenth of degree, and we get the average value between the minimal and maximal temperature, which explains why we divide by 20. # In[ ]: date = df_avg.index.to_datetime() temp = (df_avg['TMAX'] + df_avg['TMIN']) / 20. N = len(temp) # 5. Let's take a look at the evolution of the temperature. # In[ ]: plt.figure(figsize=(6,3)); plt.plot_date(date, temp, '-', lw=.5); plt.ylim(-10, 40); plt.xlabel('Date'); plt.ylabel('Mean temperature'); # 6. We now compute the Fourier transform and the spectral density of the signal. The first step is to compute the FFT of the signal using the `fft` function. # In[ ]: temp_fft = sp.fftpack.fft(temp) # 7. Once the FFT has been obtained, one needs to take the square of its absolute value to get the **power spectral density** (PSD). # In[ ]: temp_psd = np.abs(temp_fft) ** 2 # 8. The next step is to get the frequencies corresponding to the values of the PSD. The `fftfreq` utility function does just that. It takes as input the length of the PSD vector, as well as the frequency unit. Here, we choose an annual unit: a frequency of 1 corresponds to 1 year (365 days). We provide `1./365` because the original unit is in days. # In[ ]: fftfreq = sp.fftpack.fftfreq(len(temp_psd), 1./365) # 9. The `fftfreq` function returns positive and negative frequencies. We are only interested in positive frequencies here since we have a real signal (this will be explained in *How it works...*). # In[ ]: i = fftfreq>0 # 10. We now plot the power spectral density of our signal, as a function of the frequency (in unit of `1/year`). # In[ ]: plt.figure(figsize=(8,4)); plt.plot(fftfreq[i], 10*np.log10(temp_psd[i])); plt.xlim(0, 5); plt.xlabel('Frequency (1/year)'); plt.ylabel('PSD (dB)'); # We observe a peak for $f=1/year$: this is because the fundamental frequency of the signal is the yearly variation of the temperature. # 11. Now, we cut out the frequencies higher than the fundamental frequency. # In[ ]: temp_fft_bis = temp_fft.copy() temp_fft_bis[np.abs(fftfreq) > 1.1] = 0 # 12. The next step is to perform an **inverse FFT** to convert the modified Fourier transform back to the temporal domain. This way, we recover a signal that mainly contains the fundamental frequency, as shown in the figure below. # In[ ]: temp_slow = np.real(sp.fftpack.ifft(temp_fft_bis)) # In[ ]: plt.figure(figsize=(6,3)); plt.plot_date(date, temp, '-', lw=.5); plt.plot_date(date, temp_slow, '-'); plt.xlim(datetime.date(1994, 1, 1), datetime.date(2000, 1, 1)); plt.ylim(-10, 40); plt.xlabel('Date'); plt.ylabel('Mean temperature'); # > You'll find all the explanations, figures, references, and much more in the book (to be released later this summer). # # > [IPython Cookbook](http://ipython-books.github.io/), by [Cyrille Rossant](http://cyrille.rossant.net), Packt Publishing, 2014 (500 pages).