Velib' Open Data

Let's take a look at the Velib' open data. You'll need to register for free to get an API key if you want to run this notebook.

Python Preamble

We first import some modules to load and analyze the data.

In [1]:
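import urllib2  # Python 2 module
import json
import os
import datetime
import numpy as np
import pandas as pd
# The plotting calls further down (figure, scatter, title, ...) assume pylab mode.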

We read the API key from ~/.velib, a text file that contains only the key.

In [2]:
with open(os.path.expanduser('~/.velib'), 'r') as f:
    key = f.read().strip()  # drop the trailing newline so it does not end up in the URL

This function generates the full URL from a short REST path.

In [3]:
def geturl(path):
    # Use '&' if the path already has a query string, '?' otherwise.
    delim = '&' if '?' in path else '?'
    return "https://api.jcdecaux.com/vls/v1/{0:s}{1:s}apiKey={2:s}".format(path, delim, key)
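The same URL can also be assembled with the standard library's query-string encoder, which escapes any special characters in the key. The following is a minimal sketch for Python 3, not the notebook's own code; `build_url`, `BASE_URL` and the placeholder key are names chosen for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE_URL = "https://api.jcdecaux.com/vls/v1/"  # same base URL as in geturl


def build_url(path, api_key):
    """Append apiKey to a short REST path, keeping any existing query parameters."""
    parts = urlsplit(BASE_URL + path)
    params = parse_qsl(parts.query)        # e.g. [('contract', 'Paris')]
    params.append(("apiKey", api_key))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(build_url("contracts", "PLACEHOLDER_KEY"))
print(build_url("stations?contract=Paris", "PLACEHOLDER_KEY"))
```

Parsing and re-encoding the query string removes the need to choose between `?` and `&` by hand.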

This function returns the requested data as parsed JSON, that is, a Python list or dictionary depending on the path.

In [4]:
def get(path):
    url = geturl(path)
    return json.loads(urllib2.urlopen(url).read())
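On Python 3, `urllib2` no longer exists: `urlopen` lives in `urllib.request`, and the response body is bytes that should be decoded before parsing. A hedged sketch of an equivalent helper follows. `fetch_json` and its `opener` parameter are hypothetical names, introduced here so that the function can be exercised without network access.

```python
import io
import json
from urllib.request import urlopen


def fetch_json(url, opener=urlopen):
    """Download `url` and return the decoded JSON payload (a list or a dict)."""
    response = opener(url)
    try:
        return json.loads(response.read().decode("utf-8"))
    finally:
        response.close()


# Offline check with a fake opener that returns a canned response body.
def fake_opener(url):
    return io.BytesIO(b'[{"name": "Paris", "commercial_name": "Velib"}]')


print(fetch_json("https://example.invalid/contracts", opener=fake_opener))
```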

Here, we retrieve the list of all contracts, and show only the Paris contract.

In [5]:
filter(lambda d: d['name'] == 'Paris', get('contracts'))
Out[5]:
[{u'cities': [u'Arcueil',
   u'Aubervilliers',
   u'Bagnolet',
   u'Boulogne Billancourt',
   u'Charenton',
   u'Clichy',
   u'Fontenay-sous-Bois',
   u'Gentilly',
   u'Issy les Moulineaux',
   u'Ivry',
   u'Joinville',
   u'Le Kremlin Bic\xeatre',
   u'Le Pr\xe9 St Gervais',
   u'Les Lilas',
   u'Levallois-Perret',
   u'Malakoff',
   u'Montreuil',
   u'Montrouge',
   u'Neuilly',
   u'Nogent',
   u'Pantin',
   u'Paris',
   u'Puteaux',
   u'Saint Cloud',
   u'Saint Denis',
   u'Saint Mand\xe9',
   u'Saint Maurice',
   u'Saint Ouen',
   u'Suresnes',
   u'Vanves',
   u'Vincennes'],
  u'commercial_name': u'Velib',
  u'name': u'Paris'}]
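Note that `filter` returns a list only on Python 2; on Python 3 it returns a lazy iterator. A list comprehension behaves the same way on both. The sketch below uses a small hand-written `sample_contracts` list (illustrative records, not real API output) in place of `get('contracts')`.

```python
# Illustrative stand-in for the list returned by get('contracts').
sample_contracts = [
    {"name": "Paris", "commercial_name": "Velib", "cities": ["Paris", "Pantin"]},
    {"name": "ExampleTown", "commercial_name": "ExampleBike", "cities": ["ExampleTown"]},
]

# Keep only the contract whose name is exactly 'Paris'.
paris = [c for c in sample_contracts if c["name"] == "Paris"]
print(paris[0]["commercial_name"], len(paris[0]["cities"]))
```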

Now, we retrieve the list of all stations in the Paris contract.

In [6]:
stations = get('stations?contract=Paris')

We also build a pandas DataFrame from this list of station records (one dictionary per station).

In [7]:
stations_df = pd.DataFrame(stations)

Bike stands

Let's analyze the bike stands.

In [8]:
stands = stations_df.bike_stands
print("""There are {0:d} stations with a total of {1:d} bike stands near Paris.
Each station has between {2:d} and {3:d} stands, with a mean of {4:.1f} stands.
""".format(
    stands.count(),
    stands.sum(),
    stands.min(),
    stands.max(),
    stands.mean(),
))
There are 1227 stations with a total of 39920 bike stands near Paris.
Each station has between 7 and 72 stands, with a mean of 32.5 stands.

Let's plot a histogram of the number of bike stands per station.

In [9]:
stands.hist();
title("Number of bike stands per station.");

Available bike stands

When was the last bike availability update across all stations?

In [10]:
timestamp = stations_df.last_update.max()
date = datetime.datetime.fromtimestamp(timestamp / 1000.).strftime('%Y-%m-%d %H:%M:%S')
print(date)
2013-05-05 15:05:24
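The division by 1000 is there because `last_update` is treated as a count of milliseconds since the Unix epoch. Called without a time zone, `fromtimestamp` converts to the local time of the machine running the notebook, so the printed value depends on where it runs. A minimal sketch of a conversion with an explicit time zone follows (Python 3; `format_update` and the sample value are illustrative, not taken from the API response).

```python
from datetime import datetime, timezone


def format_update(timestamp_ms, tz=timezone.utc):
    """Convert a millisecond Unix timestamp to a 'YYYY-MM-DD HH:MM:SS' string in `tz`."""
    moment = datetime.fromtimestamp(timestamp_ms / 1000.0, tz=tz)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


sample_ms = 1356998400000  # illustrative value: 2013-01-01 00:00:00 UTC
print(format_update(sample_ms))
```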

Let's count the stations that had no bikes available at that time.

In [11]:
available_bike_stands = stations_df.available_bike_stands
available_bikes = stations_df.available_bikes
print("""There are {0:d} stations with no bikes out of {1:d} stations on {2:s}.""".format(
    np.sum(available_bikes == 0),
    available_bikes.count(),
    date,))
There are 207 stations with no bikes out of 1227 stations on 2013-05-05 15:05:24.

Station positions

We retrieve the coordinates of all stations and remove the stations with no coordinates, i.e. those whose position is given as 0.

In [12]:
positions = np.array([(d['position']['lng'], d['position']['lat']) for d in stations])
indices = positions.min(axis=1) != 0.0
positions = positions[indices,:]
x, y = positions.T
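The mask above appears to rely on missing coordinates being reported as 0 and on valid coordinates being positive, which holds around Paris. A more explicit variant, sketched in plain Python on hand-written sample records (illustrative, not real stations), keeps a station only when both coordinates are non-zero. `has_coordinates` is a hypothetical helper name.

```python
# Illustrative records in the shape used above: d['position']['lng'] / ['lat'].
sample_stations = [
    {"name": "A", "position": {"lng": 2.35, "lat": 48.85}},
    {"name": "B", "position": {"lng": 0.0, "lat": 0.0}},  # no coordinates
    {"name": "C", "position": {"lng": 2.29, "lat": 48.87}},
]


def has_coordinates(station):
    """True when both longitude and latitude are non-zero."""
    pos = station["position"]
    return pos["lng"] != 0.0 and pos["lat"] != 0.0


located = [s for s in sample_stations if has_coordinates(s)]
x = [s["position"]["lng"] for s in located]
y = [s["position"]["lat"] for s in located]
print([s["name"] for s in located])
```

Filtering the station records themselves keeps the coordinates and the other per-station fields aligned without a separate index array.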

Let's get the total number of bike stands and the number of available bike stands for these stations.

In [13]:
sizes = stations_df.bike_stands[indices]
available_stands = stations_df.available_bike_stands[indices]

We now display all stations, with the marker size proportional to the number of stands and the color indicating the number of available bike stands (red = few free stands, green = many free stands).

In [14]:
figure(figsize=(12,8));
scatter(x, y, c=available_stands, s=sizes, edgecolors='none', cmap=get_cmap('RdYlGn'));
xticks([]);
yticks([]);
title("Available bike stands in Paris Velib' stations, {0:s}".format(date));

This was a sunny day, so there were probably a lot of people around the Seine...