My BGE Smart Meter Data and Weather

This notebook was composed by Justin Elszasz.

The analysis below explores the relationship between the weather and electricity consumption for my apartment. My current apartment has a split A/C unit that incorporates an electric resistance heater, so our heating runs entirely on electricity and our electric loads are much higher in cold weather.

In [2]:
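import pandas as pd
import numpy as np
import import_funcs
import statsmodels.api as sm
# Plotting names used in later cells; the original session presumably got these from pylab mode (an assumption)
import matplotlib.pyplot as plt
from matplotlib.pyplot import plot, xlabel, ylabel, ylim, title, legend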
In [3]:
# Import the BGE hourly electricity data and weather data using import_funcs.py class
elec_data = import_funcs.BGEdata()
weather = import_funcs.weather()

# Merge into one Pandas dataframe
elec_and_weather = pd.merge(weather,elec_data,left_index=True,right_index=True)
elec_and_weather[:5]
Out[3]:
                     hum  precipm  tempm  wspdm  tempF  COST  NOTES            TYPE  UNITS  USAGE        timestamp_end
timestamp
2014-01-18 00:00:00   56        0      4    7.4   39.2   NaN    NaN  Electric usage    kWh   1.13  2014-01-18 00:59:00
2014-01-18 01:00:00   61        0      4    7.4   39.2   NaN    NaN  Electric usage    kWh   0.98  2014-01-18 01:59:00
2014-01-18 02:00:00   61        0      4    7.4   39.2   NaN    NaN  Electric usage    kWh   0.94  2014-01-18 02:59:00
2014-01-18 03:00:00   65        0      4    7.4   39.2   NaN    NaN  Electric usage    kWh   1.11  2014-01-18 03:59:00
2014-01-18 04:00:00   70        0      4    7.4   39.2   NaN    NaN  Electric usage    kWh   1.34  2014-01-18 04:59:00
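
The merge above joins the two frames on their shared hourly timestamp index. As a minimal sketch of what that does, here is the same call on two tiny frames with made-up values (the real data comes from import_funcs, which is not reproduced here). With the default inner join, only timestamps that appear in both frames survive.

```python
import pandas as pd

# Made-up values for illustration only; the real frames come from import_funcs.
weather_demo = pd.DataFrame(
    {"tempF": [39.2, 39.2, 37.4]},
    index=pd.to_datetime(["2014-01-18 00:00", "2014-01-18 01:00", "2014-01-18 02:00"]),
)
elec_demo = pd.DataFrame(
    {"USAGE": [0.98, 0.94, 1.11]},
    index=pd.to_datetime(["2014-01-18 01:00", "2014-01-18 02:00", "2014-01-18 03:00"]),
)

# Default how="inner": keep only timestamps present in both indexes.
merged = pd.merge(weather_demo, elec_demo, left_index=True, right_index=True)
print(merged)
```

Here the 00:00 weather row and the 03:00 usage row are dropped because each lacks a partner, so any hour missing from either source silently disappears from the merged frame.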

Exploratory

Here is a snapshot of the time series data for the first seven days of February 2014.

In [18]:
fig = plt.figure()

plot1=elec_and_weather['tempF']['2014-02-01 00:00:00':'2014-02-07 23:00:00'].plot(color='b',label='Temp',grid=False)
plot1.set_ylabel('Ambient Temp. (deg F)')
legend(['Temp'],loc='upper left')
plot1.set_ylim([15,65])
plot1.set_xlabel('')

plot2=elec_and_weather['USAGE']['2014-02-01 00:00:00':'2014-02-07 23:00:00'].plot(color='g',secondary_y=True,label='Elec',mark_right=False,grid=False)
ylabel('Elec. Usage (kWh)')
legend(['Elec.'],loc='upper right')
plot2.set_ylim([1,2.8])
plot2.set_xlabel('')

fig.savefig('Elec_and_Temp_TS.png')

To begin exploring the relationship between weather and electricity consumption, I'll start with a simple scatter plot, with outdoor temperature on the horizontal axis and usage (kWh) on the vertical axis.

In [19]:
fig = plt.figure()
plot_scatter = plot(elec_and_weather['tempF'],elec_and_weather['USAGE'],'gx')
xlabel('Outdoor Temperature (deg F)')
ylabel('Hourly Electricity Use (kWh)')
ylim(ymin=0)
title('Hourly Electricity Consumption \n Jan. 18 through March 2014')

fig.savefig('Elec_and_Temp_Scatter.png')

Linear Regression

There's definitely an inverse relationship between the outdoor temperature and electricity usage. Later in the year the relationship should flip once we start using the air conditioning, but we haven't turned on the A/C yet. I'll start by seeing how strong the correlation is with an ordinary least squares regression. The Statsmodels module has a wide range of statistics tools available.

In [12]:
# Linear regression on elec. vs. outdoor T (deg F)
model = sm.OLS(elec_and_weather['USAGE'],sm.add_constant(elec_and_weather['tempF']))
res = model.fit()
print(res.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  USAGE   R-squared:                       0.483
Model:                            OLS   Adj. R-squared:                  0.483
Method:                 Least Squares   F-statistic:                     1634.
Date:                Thu, 17 Apr 2014   Prob (F-statistic):          7.53e-253
Time:                        07:55:23   Log-Likelihood:                -1481.8
No. Observations:                1750   AIC:                             2968.
Df Residuals:                    1748   BIC:                             2978.
Df Model:                           1                                         
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const          2.9632      0.040     73.916      0.000         2.885     3.042
tempF         -0.0453      0.001    -40.428      0.000        -0.048    -0.043
==============================================================================
Omnibus:                      153.943   Durbin-Watson:                   0.131
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               55.711
Skew:                          -0.176   Prob(JB):                     7.99e-13
Kurtosis:                       2.200   Cond. No.                         106.
==============================================================================
In [20]:
fig = plt.figure()

plot_scatter = plot(elec_and_weather['tempF'],elec_and_weather['USAGE'],'gx')
plot_model = plot(elec_and_weather['tempF'],res.fittedvalues,'b',label='OLS $R^2$=%.2f' % res.rsquared)
xlabel('Outdoor Temperature (deg F)')
ylabel('Hourly Electricity Use (kWh)')
ylim(ymin=0)
title('Hourly Electricity Consumption \n Jan. 18 through March 2014')
legend()


fig.savefig('Elec_And_Temp_OLS.png')
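
To make the fitted line concrete, here is a small sketch that plugs the two coefficients from the OLS summary above into the linear model. The function name and the example temperatures are my own choices for illustration; in the live session the same numbers should be available from res.params.

```python
# Coefficients copied from the OLS summary above.
const = 2.9632   # intercept: fitted kWh per hour at 0 deg F
slope = -0.0453  # change in kWh per hour for each additional deg F


def predicted_usage(temp_f):
    """Fitted hourly usage (kWh) at a given outdoor temperature (deg F)."""
    return const + slope * temp_f


# Example temperatures chosen arbitrarily for illustration.
for temp_f in (20, 40, 60):
    print("%d deg F -> %.2f kWh" % (temp_f, predicted_usage(temp_f)))
```

This works out to roughly 2.06, 1.15, and 0.25 kWh per hour, so each 10 deg F drop adds about 0.45 kWh to the hourly load. With an R-squared of 0.483, though, temperature alone explains only about half of the variation in usage.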

Other Weather Variables

Wind Speed

In [12]:
fig = plt.figure()

plot1=elec_and_weather['wspdm']['2014-02-01 00:00:00':'2014-02-07 23:00:00'].plot(color='b',label='Wind',grid=False)
plot1.set_ylabel('Mean Wind Speed (km/h)')
legend(loc='upper left')
#plot1.set_ylim([15,65])
plot1.set_xlabel('')

plot2=elec_and_weather['USAGE']['2014-02-01 00:00:00':'2014-02-07 23:00:00'].plot(color='g',secondary_y=True,label='Elec',mark_right=False,grid=False)
ylabel('Elec. Usage (kWh)')
legend(['Elec.'],loc='upper right')
plot2.set_ylim([1,2.8])
plot2.set_xlabel('')

#fig.savefig('Elec_and_Temp_TS.png')
In [14]:
fig = plt.figure()
plot_scatter = plot(elec_and_weather['wspdm'],elec_and_weather['USAGE'],'gx')
xlabel('Wind Speed (km/h)')
ylabel('Elec. Usage (kWh)')
#ylim(ymin=0)
#title('Hourly Electricity Consumption \n Jan. 18 through March 2014')

#fig.savefig('Elec_and_Temp_Scatter.png')
In [21]:
elec_and_weather['wspdm'].hist()

Humidity

In [17]:
fig = plt.figure()

plot1=elec_and_weather['hum']['2014-02-01 00:00:00':'2014-02-07 23:00:00'].plot(color='b',label='Hum.',grid=False)
plot1.set_ylabel('Relative Humidity (%)')
legend(loc='lower right')
#plot1.set_ylim([15,65])
plot1.set_xlabel('')

plot2=elec_and_weather['USAGE']['2014-02-01 00:00:00':'2014-02-07 23:00:00'].plot(color='g',secondary_y=True,label='Elec',mark_right=False,grid=False)
ylabel('Elec. Usage (kWh)')
legend(['Elec.'],loc='upper right')
plot2.set_ylim([1,2.8])
plot2.set_xlabel('')

#fig.savefig('Elec_and_Temp_TS.png')
In [19]:
fig = plt.figure()
plot_scatter = plot(elec_and_weather['hum'],elec_and_weather['USAGE'],'gx')
xlabel('Relative Humidity (%)')
ylabel('Elec. Usage (kWh)')
#ylim(ymin=0)
#title('Hourly Electricity Consumption \n Jan. 18 through March 2014')

#fig.savefig('Elec_and_Temp_Scatter.png')
In [20]:
elec_and_weather['hum'].hist()
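
The wind speed and humidity scatter plots are harder to read by eye than the temperature one, so a numeric summary may help. The sketch below computes the Pearson correlation of each weather column with usage. It runs on a synthetic stand-in frame in which usage depends only on temperature; every number in it is made up, so it shows the mechanics and not my actual results.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for elec_and_weather; every value here is made up.
rng = np.random.RandomState(0)
n = 500
demo = pd.DataFrame({
    "tempF": rng.uniform(10, 60, n),
    "wspdm": rng.uniform(0, 40, n),
    "hum": rng.uniform(20, 100, n),
})
# Assumed toy relationship: usage depends on temperature only, plus noise.
demo["USAGE"] = 3.0 - 0.045 * demo["tempF"] + rng.normal(0, 0.4, n)

# Pearson correlation of each weather column with usage.
corr_with_usage = demo.corr()["USAGE"].drop("USAGE")
print(corr_with_usage.round(2))
```

On the real data the equivalent call would be elec_and_weather[['tempF','wspdm','hum','USAGE']].corr()['USAGE']. A coefficient near zero for wind speed or humidity would suggest that variable adds little as a linear predictor on its own.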

Bayesian Regression

In [ ]: