Understand what pyplot stores under the hood when we create a single plot:
A Figure object is the container for a plot. Create one with:
fig = plt.figure()
This needs to be called every time you want to create a new plot.
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
unrate = pd.read_csv('unrate.csv')
# Convert the DATE column to datetime because this is a time series
unrate['DATE'] = pd.to_datetime(unrate['DATE'])
fig = plt.figure()
<Figure size 432x288 with 0 Axes>
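The output above reports 0 Axes because a new figure starts out empty. A minimal sketch of how this could be confirmed, assuming a non-interactive Agg backend so it runs without a display (the backend choice is an assumption for illustration, not part of the original notes):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure()
axes_before = len(fig.axes)    # a fresh figure holds no Axes yet

ax = fig.add_subplot(1, 1, 1)  # add a single plot to the figure
axes_after = len(fig.axes)     # the figure now tracks that one Axes

print(axes_before, axes_after)
```

The figure keeps a list of its Axes in fig.axes, which is why the count changes only after add_subplot() is called.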
An Axes object represents one plot placed in a grid inside the figure. This is useful when we have multiple plots. Create one with:
axes_obj = fig.add_subplot(nrows, ncols, plot_number)
This needs to be called once for every plot you want to add to the figure.
# Create a grid of 2 rows by 1 column
# Stack two plots, one above the other
fig = plt.figure()
ax1 = fig.add_subplot(2,1,1)
ax2 = fig.add_subplot(2,1,2)
plt.show()
In matplotlib, the plot number starts at the top-left position in the grid, moves through the remaining positions in that row, then jumps to the leftmost plot in the second row, and so forth.
# Two-row, two-column grid
fig = plt.figure()
ax1 = fig.add_subplot(2,2,1)
ax2 = fig.add_subplot(2,2,2)
ax3 = fig.add_subplot(2,2,3)
ax4 = fig.add_subplot(2,2,4)
plt.show()
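The numbering rule described above (left to right, then top to bottom) can be sketched in plain Python. The helper name grid_position is hypothetical and exists only for illustration; it is not a matplotlib function.

```python
def grid_position(plot_number, ncols):
    """Return the 1-based (row, column) of a plot number in a row-major grid."""
    row, col = divmod(plot_number - 1, ncols)
    return row + 1, col + 1

# Positions of plot numbers 1..4 in a grid with 2 columns
positions = {n: grid_position(n, ncols=2) for n in range(1, 5)}
print(positions)
```

Here plot numbers 1 and 2 land in the first row, and 3 and 4 in the second row, matching the order of the add_subplot() calls above.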
Axes.plot() accepts any iterable object for the x and y values, including NumPy arrays and pandas Series objects.
# Create 2 line subplots in a 2-row by 1-column layout:
fig = plt.figure()
ax1 = fig.add_subplot(2,1,1)
ax2 = fig.add_subplot(2,1,2)
first_twelve_values = unrate[:12]
second_twelve_values = unrate[12:24]
ax1.plot(first_twelve_values['DATE'], first_twelve_values['VALUE'])
ax2.plot(second_twelve_values['DATE'], second_twelve_values['VALUE'])
plt.show()
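As a quick check of the point that plot() takes different kinds of iterables, here is a small sketch that passes a list, a tuple, a range, and a NumPy array. The numbers are made up for illustration, and the Agg backend is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x_list = [1, 2, 3]                      # plain Python list
y_tuple = (4.0, 5.5, 5.0)               # tuple (made-up values)
y_array = np.array([3.0, 3.5, 4.0])     # NumPy array (made-up values)

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
(line_a,) = ax.plot(x_list, y_tuple)        # list for x, tuple for y
(line_b,) = ax.plot(range(1, 4), y_array)   # range for x, array for y

# Read the plotted data back from the line objects
ys_a = [float(v) for v in line_a.get_ydata()]
xs_b = [int(v) for v in line_b.get_xdata()]
print(ys_a, xs_b)
```

Each call to ax.plot() returns a list of line objects, which is why the single line is unpacked with (line_a,) = ....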
To change the dimensions of the plotting area, pass the figsize parameter (width and height, in inches) when calling the pyplot function figure():
fig = plt.figure(figsize = (width, height))
fig = plt.figure(figsize = (12,12))
ax1 = fig.add_subplot(2,1,1)
ax2 = fig.add_subplot(2,1,2)
first = unrate[:12]
second = unrate[12:24]
ax1.plot(first['DATE'], first['VALUE'])
ax2.plot(second['DATE'], second['VALUE'])
ax1.set_title('Monthly Unemployment Rate, 1948')
ax2.set_title('Monthly Unemployment Rate, 1949')
plt.show()
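To see what figsize actually controls, the sketch below reads the size back from the figure and converts it to pixels using the figure's dpi. The 12 by 6 size is an arbitrary illustration value, and the Agg backend is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12, 6))            # width=12, height=6, in inches
width_in, height_in = fig.get_size_inches()  # read the size back

# Pixel dimensions depend on the figure's dots-per-inch setting
width_px = width_in * fig.dpi
height_px = height_in * fig.dpi
print(width_in, height_in, width_px, height_px)
```

The pixel values depend on the dpi configured in the environment, so they may differ between a notebook and a script.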
We're going to visualize data from a few more years to see if we find any evidence for seasonality between those years.
Use a for loop over range() to avoid repeating code:
# Plot the years from 1948 to 1952
fig = plt.figure(figsize=(12,12))
for i in range(5):
    axes = fig.add_subplot(5,1,i+1)
    start_plot = i*12
    end_plot = (i+1)*12
    subset = unrate[start_plot:end_plot]
    axes.plot(subset['DATE'], subset['VALUE'])
plt.show()
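The index arithmetic in the loop above can be checked on its own, without any data. This sketch assumes 12 rows per year (monthly data with no gaps); the helper name year_slices is hypothetical.

```python
def year_slices(n_years, rows_per_year=12):
    """Return (start, end) row bounds for each consecutive year."""
    return [(i * rows_per_year, (i + 1) * rows_per_year) for i in range(n_years)]

bounds = year_slices(5)
for i, (start, end) in enumerate(bounds):
    # Subplot numbers are 1-based, so year i goes in subplot i + 1
    print(f"subplot {i + 1}: rows {start} to {end - 1}")
```

Because slice ends are exclusive, each year's end index is the next year's start index, so no row is skipped or plotted twice.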
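To avoid scanning back and forth between graphs, put the line charts together in a single subplot:
Extract the month from the DATE column with pandas.Series.dt.month (e.g. 1 for Jan, 2 for Feb). Note: here the pandas Series is unrate['DATE'].
Call pyplot.plot() multiple times to draw several lines on the same subplot. Note: pyplot is imported as plt.
Pass the c parameter, e.g. c = 'red', when calling plot() to set the line color.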
fig = plt.figure(figsize = (6,3))
unrate['MONTH'] = unrate['DATE'].dt.month
first = unrate[:12]
second = unrate[12:24]
plt.plot(first['MONTH'], first['VALUE'], c = 'red')
plt.plot(second['MONTH'], second['VALUE'], c = 'blue')
plt.show()
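The month extraction can be tried on a tiny Series without loading unrate.csv. The three dates below are made up for illustration and only stand in for the DATE column.

```python
import pandas as pd

# Made-up dates standing in for the DATE column
dates = pd.to_datetime(pd.Series(["1948-01-01", "1948-02-01", "1948-12-01"]))

# .dt.month gives the month number of each date (1 for Jan, 12 for Dec)
months = dates.dt.month.tolist()
print(months)
```

Using the month number as the x-axis value is what lets lines from different years overlap on the same horizontal scale.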
Visualize 5 years' worth of unemployment rates on the same subplot:
fig = plt.figure(figsize = (10,6))
# One color per year, indexed by the loop variable
colors = ['red', 'blue', 'green', 'orange', 'black']
for i in range(5):
    start = i*12
    end = (i+1)*12
    # Select the month and value columns for this year's rows
    months, values = unrate[start:end]['MONTH'], unrate[start:end]['VALUE']
    plt.plot(months, values, c=colors[i])
plt.show()
fig = plt.figure(figsize=(10,6))
colors = ['red', 'blue', 'green', 'orange', 'black']
for i in range(5):
    start_index = i*12
    end_index = (i+1)*12
    subset = unrate[start_index:end_index]
    plt.plot(subset['MONTH'], subset['VALUE'], c=colors[i])
plt.show()
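To confirm that repeated plt.plot() calls draw on the same subplot, the sketch below plots three lines and then inspects the current Axes. The values are made up and do not come from unrate.csv, and the Agg backend is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

months = list(range(1, 13))
colors = ['red', 'blue', 'green']

fig = plt.figure(figsize=(10, 6))
for i, color in enumerate(colors):
    # Made-up values for illustration only
    values = [3.0 + i + 0.1 * m for m in months]
    plt.plot(months, values, c=color)

# All three lines live on the single current Axes
lines = plt.gca().get_lines()
line_colors = [line.get_color() for line in lines]
print(len(lines), line_colors)
```

Each call adds one more line to the same Axes, so the colors list is what keeps the years distinguishable.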