In [1]:

```
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
# Make the graphs a bit prettier. The old pandas 'display.mpl_style' option
# has been removed, so we set a matplotlib style directly instead.
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = (15, 5)
plt.rcParams['font.family'] = 'sans-serif'
# This is necessary to show lots of columns in pandas 0.12.
# Not necessary in pandas 0.13.
pd.set_option('display.width', 5000)
pd.set_option('display.max_columns', 60)
```

First, we need to load up the data. We've done this before.

In [2]:

```
bikes = pd.read_csv('../data/bikes.csv', sep=';', encoding='latin1', parse_dates=['Date'], dayfirst=True, index_col='Date')
bikes['Berri 1'].plot()
```

Out[2]:

Next up, we're just going to look at the Berri bike path. Berri is a street in Montreal, with a pretty important bike path. I use it mostly on my way to the library now, but I used to take it to work sometimes when I worked in Old Montreal.

So we're going to create a dataframe with just the Berri bike path in it:

In [3]:

```
berri_bikes = bikes[['Berri 1']].copy()
```

In [4]:

```
berri_bikes[:5]
```

Out[4]:

In [5]:

```
berri_bikes.index
```

Out[5]:

You can see that some of the days are missing -- only 310 days of the year are actually there. Who knows why.
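If you're curious which days are missing, here's a minimal sketch of one way to find out. The `toy` dataframe and its dates are made up for illustration; they are not taken from the bike data.

```python
import pandas as pd

# Hypothetical stand-in for berri_bikes: ten days with two of them removed.
full_range = pd.date_range('2012-01-01', '2012-01-10', freq='D')
toy = pd.DataFrame({'Berri 1': range(10)}, index=full_range)
toy = toy.drop(pd.to_datetime(['2012-01-04', '2012-01-07']))

# Build the complete calendar between the first and last date,
# then keep only the dates that are absent from the index.
expected = pd.date_range(toy.index.min(), toy.index.max(), freq='D')
missing_days = expected.difference(toy.index)
print(missing_days)
```

On the real data you would use `berri_bikes` in place of `toy`.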

Pandas has a bunch of really great time series functionality, so if we wanted to get the day of the month for each row, we could do it like this:

In [6]:

```
berri_bikes.index.day
```

Out[6]:

We actually want the weekday, though:

In [7]:

```
berri_bikes.index.weekday
```

Out[7]:

These are the days of the week, where 0 is Monday. I found out that 0 was Monday by checking on a calendar.
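If you don't have a calendar handy, pandas can probably tell you too. This is a small sketch, assuming a pandas version recent enough to have `DatetimeIndex.day_name()`; the dates are just an example week that starts on a Monday.

```python
import pandas as pd

# 2012-01-02 was a Monday, so this index covers one Monday-to-Sunday week.
idx = pd.date_range('2012-01-02', periods=7, freq='D')

weekday_numbers = list(idx.weekday)   # integer codes
weekday_names = list(idx.day_name())  # matching English day names

# Print each code next to its name to see the mapping.
for number, name in zip(weekday_numbers, weekday_names):
    print(number, name)
```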

Now that we know how to *get* the weekday, we can add it as a column in our dataframe like this:

In [8]:

```
berri_bikes.loc[:,'weekday'] = berri_bikes.index.weekday
berri_bikes[:5]
```

Out[8]:

Adding up the cyclists by weekday turns out to be really easy!

Dataframes have a `.groupby()` method that is similar to SQL groupby, if you're familiar with that. I'm not going to explain more about it right now -- if you want to know more, the documentation is really good.

In this case, `berri_bikes.groupby('weekday').aggregate(sum)` means "Group the rows by weekday and then add up all the values with the same weekday".
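To make the grouping concrete, here's a tiny sketch with made-up numbers (the `toy` dataframe is hypothetical, not the real bike counts). It passes the string `'sum'` rather than the built-in `sum`; the two should give the same totals here, and recent pandas versions may warn about the built-in.

```python
import pandas as pd

# Made-up counts: two Mondays (weekday 0) and two Tuesdays (weekday 1).
toy = pd.DataFrame({
    'Berri 1': [10, 20, 30, 40],
    'weekday': [0, 1, 0, 1],
})

# Rows sharing a weekday are collected into one group, then added up.
toy_counts = toy.groupby('weekday').aggregate('sum')
print(toy_counts)
```

The result has one row per weekday, with the weekday values as the index.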

In [9]:

```
weekday_counts = berri_bikes.groupby('weekday').aggregate(sum)
weekday_counts
```

Out[9]:

It's hard to remember what 0, 1, 2, 3, 4, 5, 6 mean, so we can fix it up and graph it:

In [10]:

```
weekday_counts.index = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
weekday_counts
```

Out[10]:

In [11]:

```
weekday_counts.plot(kind='bar')
```

Out[11]:

So it looks like Montrealers are commuter cyclists -- they bike much more during the week. Neat!

Let's put all that together, to prove how easy it is. 6 lines of magical pandas!

If you want to play around, try changing `sum` to `max`, `numpy.median`, or any other function you like.
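You can also try several aggregations at once. This is a rough sketch on made-up numbers (the `toy` dataframe is hypothetical), using `.agg()` with a list of function names so that each one becomes its own column.

```python
import pandas as pd

# Made-up counts: three Mondays (weekday 0) and three Tuesdays (weekday 1).
toy = pd.DataFrame({
    'Berri 1': [10, 20, 30, 40, 60, 5],
    'weekday': [0, 1, 0, 1, 0, 1],
})

# One row per weekday, one column per aggregation.
summary = toy.groupby('weekday')['Berri 1'].agg(['sum', 'max', 'median'])
print(summary)
```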

In [12]:

```
bikes = pd.read_csv('../data/bikes.csv',
                    sep=';', encoding='latin1',
                    parse_dates=['Date'], dayfirst=True,
                    index_col='Date')
# Add the weekday column
berri_bikes = bikes[['Berri 1']].copy()
berri_bikes.loc[:,'weekday'] = berri_bikes.index.weekday
# Add up the number of cyclists by weekday, and plot!
weekday_counts = berri_bikes.groupby('weekday').aggregate(sum)
weekday_counts.index = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
weekday_counts.plot(kind='bar')
```

Out[12]: