I'm training for a marathon and I use MapMyFitness (MMF) on my iPhone to track my mileage and pace for each workout. MMF has a public API, and Jason Sanford has written a Python front end for it, which means that I can easily get hold of all my data in Python and explore it with Pandas! To run this notebook, you'll need:
You should also
pip install mapmyfitness
Here we go...
%matplotlib inline
import pandas as pd
from mapmyfitness import MapMyFitness
import matplotlib.pyplot as plt
import matplotlib
import numpy as np
pd.options.display.mpl_style = 'default'
The next cell logs into MMF, grabs all my workout data, filters for a specified activity type (running, in my case), and extracts the date, distance and pace for each workout. You'll need to enter your API key and access token to use it.
def get_workouts(verbose=True, workout_type='run'):
    # Log in
    mmf = MapMyFitness(api_key='your-key',
                       access_token='your-token')
    # Get all workouts, one page at a time
    workout_pages = mmf.workout.search(user=48155002, per_page=40, cache_finds=True)  # doesn't work if per_page>40
    paces = []
    distances = []
    dates = []
    workouts = []
    for pagenum in workout_pages.page_range:
        workout_list = workout_pages.page(pagenum)
        for i, workout in enumerate(workout_list):
            if verbose:
                print("processing workout " + str(i+1) + " of " + str(len(workout_list)))
            # Keep only workouts whose root activity type matches (e.g. 'run')
            if workout_type in workout.activity_type.root_activity_type.name.lower():
                workouts.append(workout)
                distances.append(workout.distance_total/1609.344)  # convert meters to miles
                paces.append(26.8224/workout.speed_avg)  # convert m/s to minutes per mile
                dates.append(workout.start_datetime)
    return distances, paces, dates, workouts
Be warned that this function takes a while. You can set verbose=True to have it print its progress as it goes.
distances, paces, dates, workout_list = get_workouts(verbose=0)
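The two magic numbers in get_workouts both come from the length of a mile (1609.344 meters), since 1609.344/60 = 26.8224. As a small sketch, the same conversions can be written as named helpers. The function names are my own and not part of the mapmyfitness package, and the guard against a zero or missing speed is an assumption about what the API might return.

```python
METERS_PER_MILE = 1609.344  # length of the international mile in meters


def meters_to_miles(meters):
    """Convert a distance in meters to miles."""
    return meters / METERS_PER_MILE


def speed_to_pace(speed_mps):
    """Convert a speed in meters per second to a pace in minutes per mile.

    Returns NaN for a zero or missing speed (an assumed edge case).
    """
    if not speed_mps:
        return float('nan')
    # (meters per mile) / (meters per second) = seconds per mile; / 60 -> minutes
    return METERS_PER_MILE / speed_mps / 60.0


print(meters_to_miles(5000))   # a 5 km run, expressed in miles
print(speed_to_pace(3.3528))   # pace in minutes per mile
```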
The basic Pandas object we'll use is a dataframe. We can create it as follows:
dist_ts = pd.Series(distances,index=dates)
pace_ts = pd.Series(paces,index=dates)
df = pd.DataFrame({'Distance': dist_ts, 'Pace': pace_ts})
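If you don't have an MMF account, you can still follow along with made-up data. This sketch builds a frame of the same shape from three invented workouts (the numbers are purely illustrative, not from my log) and sorts it by date. I've called it demo so that it doesn't clobber df.

```python
import pandas as pd

# Invented sample data, for illustration only
dates = pd.to_datetime(['2014-09-04 16:49', '2014-09-02 03:24', '2014-09-06 18:10'],
                       utc=True)
distances = [2.1, 1.7, 2.9]   # miles
paces = [9.4, 9.3, 9.4]       # minutes per mile

demo = pd.DataFrame({'Distance': distances, 'Pace': paces}, index=dates)
demo = demo.sort_index()      # put the rows in chronological order
print(demo)
```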
Let's see what's in it. Rather than printing the whole (long) table, I'll use head to print the first few rows.
df.head()
Start time (UTC) | Distance (miles) | Pace (min/mile) |
---|---|---|
2014-09-02 03:24:23+00:00 | 1.69787 | 9.307203 |
2014-09-04 16:49:09+00:00 | 2.05400 | 9.356798 |
2014-09-06 18:10:46+00:00 | 2.90366 | 9.370637 |
2014-09-08 03:00:49+00:00 | 3.36122 | 9.943686 |
2014-09-10 03:02:17+00:00 | 3.09783 | 9.788455 |
As you can see, the workouts are sorted chronologically. I started training (and using MapMyFitness) at the beginning of September, about 4 months ago. At the time I could only comfortably run 2-3 miles, and my pace was slower than 9 minutes per mile.
Now we can easily plot workout distance and pace versus date:
fs = 15
plotargs = {'figsize' : (12,4), 'fontsize' : fs}
df.Distance.plot(title='Workout distance (miles)',lw=2,**plotargs);
df.Pace.plot(title='Average pace (minutes per mile)',lw=2,**plotargs);
Clearly, I'm running farther and faster as my training program progresses! My pace is down to around 8 minutes per mile, and my typical runs are 5 miles or more. Here's a histogram of the distances for all my workouts since I started:
df.Distance.plot(kind='hist',title='Workout distance (miles)');
plt.xlabel('Miles'); plt.ylabel('Number of runs');
How far have I run in total? Here's a cumulative distance plot, showing that I've run more than 200 miles.
df.Distance.cumsum().plot(title='Total workout distance (miles)',lw=2,**plotargs);
Let's see how my paces in the last two months compare to those in the first two months. I suspect Pandas has a better way to do this than what I've implemented below, but this works...
pace1 = []
pace2 = []
for date, pace in zip(df.index, df['Pace']):
    if date.month < 11:
        # September and October
        pace1.append(pace)
        pace2.append(np.nan)
    else:
        # November and December
        pace1.append(np.nan)
        pace2.append(pace)
df2 = pd.DataFrame({'First two months pace' : pd.Series(pace1, index=dates),
                    'Last two months pace' : pd.Series(pace2, index=dates)})
df2.plot(kind='hist', stacked=True, fontsize=fs); plt.ylabel('# of runs');
Again, it's clear that my pace has improved.
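For what it's worth, a more Pandas-flavored way to do this split is a boolean mask on the index instead of a loop. This is only a sketch on made-up paces (the values are invented), assuming a DatetimeIndex like the one in df. Series.where keeps the values where the mask is true and puts NaN everywhere else.

```python
import pandas as pd

# Invented paces, for illustration only
idx = pd.to_datetime(['2014-09-10', '2014-10-05', '2014-11-12', '2014-12-20'])
pace = pd.Series([9.8, 9.3, 8.6, 8.4], index=idx)

early = idx.month < 11  # True for September and October
split = pd.DataFrame({'First two months pace': pace.where(early),
                      'Last two months pace': pace.where(~early)})
print(split)
```

With the real data, pace would be df['Pace'] and idx would be df.index.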
Here are my five longest runs:
df.sort('Distance')[-5:][::-1]
Start time (UTC) | Distance (miles) | Pace (min/mile) |
---|---|---|
2014-12-27 04:39:34+00:00 | 10.28480 | 8.698211 |
2014-12-06 04:36:18+00:00 | 9.00748 | 8.506801 |
2014-12-13 04:39:00+00:00 | 8.97491 | 8.635715 |
2014-11-22 03:09:55+00:00 | 8.47542 | 8.469062 |
2014-11-15 03:09:18+00:00 | 7.65717 | 8.535385 |
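One caveat: DataFrame.sort was later removed from Pandas, so the cell above fails on recent versions. Here is a sketch of the same query using nlargest, which exists in current releases, shown on invented distances rather than my real log.

```python
import pandas as pd

# Invented runs, for illustration only
runs = pd.DataFrame({'Distance': [3.1, 10.3, 5.0, 9.0, 7.7, 8.5, 4.2],
                     'Pace': [9.5, 8.7, 8.9, 8.5, 8.5, 8.5, 9.1]})

# The five rows with the largest Distance, longest first
longest = runs.nlargest(5, 'Distance')
print(longest)
```

runs.sort_values('Distance', ascending=False).head(5) should give the same rows.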
Next we aggregate data for each week and each month:
weekly = df.resample('W',how=['mean','sum'],kind='period')
monthly = df.resample('M',how=['mean','sum'],kind='period')
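Likewise, the how= argument to resample was dropped in later Pandas releases, where the aggregation is chained onto the resampler instead. Below is a minimal sketch of the weekly aggregation in that style, on invented data. Note that it yields a timestamp index (week-ending dates) rather than the period index requested by kind='period' above.

```python
import pandas as pd

# Invented runs spanning two calendar weeks, for illustration only
idx = pd.to_datetime(['2014-09-02', '2014-09-04', '2014-09-06',
                      '2014-09-09', '2014-09-11'])
runs = pd.DataFrame({'Distance': [1.7, 2.1, 2.9, 3.4, 3.1],
                     'Pace': [9.3, 9.4, 9.4, 9.9, 9.8]}, index=idx)

# 'W' bins by weeks ending on Sunday; agg computes both statistics per column
weekly_demo = runs.resample('W').agg(['mean', 'sum'])
print(weekly_demo['Distance']['sum'])
```

The result has two-level columns, so weekly_demo.Pace['mean'] and weekly_demo.Distance['sum'] can be plotted just as in the cells that follow.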
Now we can plot the average pace for each week:
fig = weekly.Pace['mean'].plot(title='Average pace',**plotargs)
fig.set_ylabel('minutes per mile',fontsize=fs);
The total mileage per week:
fig = weekly.Distance['sum'].plot(kind='bar',title='Total miles per week',**plotargs)
fig.set_ylabel('Miles',fontsize=fs);
The plot above shows that I have not been terribly consistent, and have missed a number of workouts due to travel or sickness. For instance, the second week of October I was on vacation in Jordan, and the last week of November I was in Mexico on business.
Or the average mileage per run, by month:
fig = monthly.Distance['mean'].plot(kind='bar',title='Average miles per run',**plotargs)
fig.set_ylabel('Miles',fontsize=fs);
MapMyFitness records the actual routes, so we can also plot them.
import mpld3
mpld3.enable_notebook()
plt.figure(figsize=(8,8))
for w in workout_list:
    if w.route is not None:
        points = [(p['lat'], p['lng']) for p in w.route.points()]
        lat, lng = zip(*points)
        if min(lng) > 39:  # Omit workouts in other countries
            plt.plot(lng, lat)
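The filtering step can be pulled out into a small helper that is easy to try without the API. This is just a sketch: route_to_xy is a name I made up, the coordinates are invented, and I'm assuming each point is a dict with 'lat' and 'lng' keys, as in the loop above.

```python
def route_to_xy(points, min_lng=39):
    """Return (lngs, lats) for a route, or None if the route is empty
    or any of its points lies at or west of min_lng degrees longitude."""
    if not points:
        return None
    lngs = [p['lng'] for p in points]
    lats = [p['lat'] for p in points]
    if min(lngs) <= min_lng:
        return None  # treated as a workout in another country
    return lngs, lats


# Invented coordinates, for illustration only
home = [{'lat': 22.31, 'lng': 39.10}, {'lat': 22.32, 'lng': 39.11}]
away = [{'lat': 31.95, 'lng': 35.93}]
print(route_to_xy(home))
print(route_to_xy(away))
```

Each non-None result can be passed straight to plt.plot(lngs, lats).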