Real Estate Values in Humboldt County, CA

This is a quick-and-dirty example of an IPython Notebook. Notebooks are great for documenting projects as they are developed: each code chunk can be executed as stand-alone code, and Markdown syntax can be used to narrate what the code does.

In this quick example I'm going to do something really basic: import and display some data on home values over time for Humboldt County, CA.

In [1]:
# First we need to import some libraries.
# The original "import pandas.io.data as web" is dropped: nothing below uses it,
# and pandas.io.data is deprecated (it moved to the pandas-datareader package),
# so importing it only triggers a FutureWarning.
import pylab
import pandas as pd
import datetime
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np
from Quandl import Quandl as quandl
style.use('ggplot')

Next, I'm going to bring data in from Quandl. The Quandl module for Python makes this really easy. If you want to run this on your own, you need to do two things:

  1. Install the Quandl module for Python with your Python distribution. I am using the Anaconda distribution, so this was pretty easy for me: I was able to run "conda install Quandl" from the terminal and get it done.

  2. Sign up for an account at Quandl. When you do this, they will give you an API key that you can use to pull data from their databases.
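Once you have a key, one option is to keep it out of the notebook itself and read it from the environment. The following is a minimal sketch of that idea, not part of the Quandl module: the helper name `load_quandl_key` and the variable name `QUANDL_API_KEY` are hypothetical choices made for this example.

```python
import os


def load_quandl_key(env_var="QUANDL_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this sketch;
    Quandl does not require any particular name.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {} environment variable to your Quandl API key".format(env_var)
        )
    return key
```

The returned string could then be passed as the `authtoken` argument of `quandl.get`, in place of a key typed directly into the cell.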

The Quandl data series I'm going to use is ZILL/CO00399_MSP. This is one of the ~1.4 million Zillow data sets available through Quandl. It has the Zillow estimate of median home value of all homes in Humboldt County, CA. Data are available monthly from 1996.

In [3]:
df = quandl.get("ZILL/CO00399_MSP", authtoken="1i2uuiN7DQ-Ltizgjb_q")
print(df.head())
print(df.tail())
               Value
Date                
1996-04-30  103825.0
1996-05-31  101125.0
1996-06-30  101000.0
1996-07-31  103200.0
1996-08-31  101050.0
               Value
Date                
2016-02-29  242550.0
2016-03-31  243450.0
2016-04-30  263500.0
2016-05-31  267750.0
2016-06-30  263875.0
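
As a quick sanity check on these numbers, here is a small sketch that computes the overall percent change between the first and last observations. It uses only the first and last rows printed above, so it runs without an API key; the helper name `total_change_pct` is made up for this example.

```python
import pandas as pd

# First and last rows copied from the head() and tail() output above.
values = pd.Series(
    [103825.0, 263875.0],
    index=pd.to_datetime(["1996-04-30", "2016-06-30"]),
    name="Value",
)


def total_change_pct(series):
    """Percent change from the first to the last observation in a series."""
    first = series.iloc[0]
    last = series.iloc[-1]
    return (last - first) / first * 100.0


change = total_change_pct(values)
print("Change from {:%Y-%m} to {:%Y-%m}: {:.1f}%".format(
    values.index[0], values.index[-1], change))
```

With the full data in hand, the same helper could be applied to the downloaded column, for example `total_change_pct(df["Value"])`.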

Finally, I'm going to plot these data using a really basic plot call.

In [3]:
df.plot()
Out[3]:
<matplotlib.axes._subplots.AxesSubplot at 0x114d43450>
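
The bare `df.plot()` call leaves the chart without a title or a y-axis label. A slightly more descriptive version might look like the sketch below. The helper name `plot_home_values` and the "USD" unit in the axis label are my assumptions, and the stand-in frame is built from the five `head()` rows printed earlier so the example runs without an API key.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_home_values(frame, title):
    """Plot the 'Value' column against its date index with labeled axes."""
    fig, ax = plt.subplots(figsize=(8, 4))
    frame["Value"].plot(ax=ax)
    ax.set_title(title)
    ax.set_xlabel("Date")
    ax.set_ylabel("Median value (USD)")  # unit is an assumption
    return ax


# Stand-in frame built from the five head() rows printed earlier;
# in the notebook you would pass the full df instead.
sample = pd.DataFrame(
    {"Value": [103825.0, 101125.0, 101000.0, 103200.0, 101050.0]},
    index=pd.to_datetime(
        ["1996-04-30", "1996-05-31", "1996-06-30", "1996-07-31", "1996-08-31"]
    ),
)
sample.index.name = "Date"

ax = plot_home_values(sample, "Humboldt County, CA")
```

Returning the axes object keeps the helper easy to extend, for example to draw a second series on the same chart.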

Let's try one more. I'm going to change the series to 'ZILL/C00822_MSP' (Zillow's estimated median sale price for Eureka, CA).

In [4]:
df = quandl.get("ZILL/C00822_MSP", authtoken="1i2uuiN7DQ-Ltizgjb_q")
print(df.head())
print(df.tail())
df.plot()
              Value
Date               
1996-04-30  90175.0
1996-05-31  93600.0
1996-06-30  92450.0
1996-07-31  91350.0
1996-08-31  89450.0
               Value
Date                
2016-02-29  230950.0
2016-03-31  233600.0
2016-04-30  230700.0
2016-05-31  228150.0
2016-06-30  242450.0
Out[4]:
<matplotlib.axes._subplots.AxesSubplot at 0x11736d9d0>