This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.

10.2. Applying a linear filter to a digital signal

Download the Nasdaq dataset from the book's website (http://ipython-books.github.io).

The data was obtained from: http://finance.yahoo.com/q/hp?s=^IXIC&a=00&b=1&c=1990&d=00&e=1&f=2014&g=d

  1. Let's import the packages.
In [ ]:
import numpy as np
import scipy as sp
import scipy.signal as sg
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

  2. We load the Nasdaq data with pandas.

In [ ]:
nasdaq_df = pd.read_csv('data/nasdaq.csv')
In [ ]:
nasdaq_df.head()
  3. Let's extract two columns: the date and the daily closing value.
In [ ]:
date = pd.to_datetime(nasdaq_df['Date'])
nasdaq = nasdaq_df['Close']
  4. Let's take a look at the raw signal.
In [ ]:
plt.figure(figsize=(6,4));
# plt.plot handles datetime values directly (plot_date is deprecated
# in recent Matplotlib versions).
plt.plot(date, nasdaq, '-');
  5. Now, we follow a first approach to get the slow component of the signal's variations. We convolve the signal with a triangular window: this corresponds to an FIR filter. We explain the idea behind this method in the "How it works..." section. For now, let's just say that we replace each value with a weighted mean of the signal around that value.
In [ ]:
# We get a triangular window with 60 samples.
h = sg.get_window('triang', 60)
# We convolve the signal with this window.
fil = sg.convolve(nasdaq, h/h.sum())
In [ ]:
plt.figure(figsize=(6,4));
# We plot the original signal...
plt.plot(date, nasdaq, '-', lw=1);
# ... and the filtered signal, truncated to the length of the input.
plt.plot(date, fil[:len(nasdaq)], '-');
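
To make the "weighted mean" idea concrete, here is a minimal sketch on a synthetic random walk (a made-up stand-in for the Nasdaq series, not the real data). It checks that one sample of the convolution is a weighted mean of the 60 most recent input values, and it shows how `mode='same'`, an option this recipe does not use, keeps the centered part of the output instead.

```python
import numpy as np
import scipy.signal as sg

# Synthetic stand-in for the closing prices (hypothetical data).
rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(500))

# Normalized triangular window: the weights sum to 1.
h = sg.get_window('triang', 60)
w = h / h.sum()

# Full convolution: the output has len(x) + len(w) - 1 samples.
full = sg.convolve(x, w)

# Sample n of the full output is a weighted mean of x[n-59], ..., x[n].
n = 200
manual = np.sum(w[::-1] * x[n - 59:n + 1])

# mode='same' returns len(x) samples taken from the center of the full
# output, which removes the lag of about half a window.
same = sg.convolve(x, w, mode='same')
offset = (len(w) - 1) // 2
```

Slicing the full output with `[:len(x)]` keeps the causal version, which lags the signal by roughly half the window length; `mode='same'` trades that lag for edge effects at both ends.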
  6. Now, let's use another method. We create an IIR Butterworth low-pass filter to extract the slow variations of the signal. The filtfilt function applies a filter forward and then backward in order to avoid phase delays.
In [ ]:
plt.figure(figsize=(6,4));
plt.plot(date, nasdaq, '-', lw=1);
# We create a 4th-order Butterworth low-pass filter. The cutoff is
# given as a fraction of the Nyquist frequency.
b, a = sg.butter(4, 2./365)
# We apply this filter to the signal, forward and backward.
plt.plot(date, sg.filtfilt(b, a, nasdaq), '-');
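
The effect of forward-backward filtering can be illustrated with a small sketch. The signal below is a synthetic bump and the cutoff `0.02` is an arbitrary value chosen for the illustration; neither comes from the recipe. We compare `sg.lfilter` (forward only) with `sg.filtfilt` by looking at where the peak of the filtered bump ends up.

```python
import numpy as np
import scipy.signal as sg

# Synthetic test signal (hypothetical): a smooth bump centered on sample 500.
t = np.arange(1000)
x = np.exp(-((t - 500) / 50.) ** 2)

# A 4th-order Butterworth low-pass filter (cutoff chosen for illustration).
b, a = sg.butter(4, 0.02)

# Forward-only (causal) filtering delays the bump.
y_causal = sg.lfilter(b, a, x)
# Forward-backward filtering cancels the phase delay.
y_zero_phase = sg.filtfilt(b, a, x)

peak_causal = int(np.argmax(y_causal))
peak_zero_phase = int(np.argmax(y_zero_phase))
print(peak_causal, peak_zero_phase)
```

With `filtfilt` the peak should stay at the center of the bump, whereas `lfilter` shifts it to the right. Note that `filtfilt` needs the whole signal in advance, so it only suits offline analysis like this recipe.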
  7. Finally, we use the same method to create a high-pass filter and extract the fast variations of the signal.
In [ ]:
plt.figure(figsize=(6,4));
plt.plot(date, nasdaq, '-', lw=1);
# We create a 4th-order Butterworth high-pass filter.
b, a = sg.butter(4, 2*5./365, btype='high')
plt.plot(date, sg.filtfilt(b, a, nasdaq), '-', lw=.5);

The fast variations around 2000 correspond to the dot-com bubble burst, reflecting the high market volatility and the fast fluctuations of the stock market indices at that time. (http://en.wikipedia.org/wiki/Dot-com_bubble)
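
The cutoff values passed to `sg.butter` in steps 6 and 7 are easier to read once we recall that, when no sampling rate is given, SciPy expects the cutoff as a fraction of the Nyquist frequency (half the sampling rate). The following sketch, which is an addition and not part of the original recipe, converts both cutoffs to periods in samples and uses `sg.freqz` to check that the gain at the cutoff is about 1/sqrt(2), that is -3 dB, as expected for a Butterworth filter. The variable names are ours.

```python
import numpy as np
import scipy.signal as sg

# The two cutoffs used in this recipe, as fractions of the Nyquist frequency.
wn_low = 2. / 365
wn_high = 2 * 5. / 365

b_lo, a_lo = sg.butter(4, wn_low)
b_hi, a_hi = sg.butter(4, wn_high, btype='high')

# freqz works in radians per sample by default, where Nyquist is pi.
_, h_lo = sg.freqz(b_lo, a_lo, worN=np.array([np.pi * wn_low]))
_, h_hi = sg.freqz(b_hi, a_hi, worN=np.array([np.pi * wn_high]))

gain_lo = abs(h_lo[0])
gain_hi = abs(h_hi[0])

# Cutoff periods in samples: 2 / Wn.
period_low = 2. / wn_low
period_high = 2. / wn_high
print(period_low, period_high, gain_lo, gain_hi)
```

So the low-pass filter keeps variations slower than about 365 samples, and the high-pass filter keeps variations faster than about 73 samples. If the file only contains trading days, 365 samples span somewhat more than one calendar year.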

You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).

IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).