Connecting to Facebook API

Written by Kat Chuang @katychuang


The goal of this exercise is to connect to the Facebook Graph API to collect information about my most recent posts, along with each post's comments and likes.


  • I first created a virtual environment for my notebooks: mkvirtualenv --python=/usr/local/bin/python3 dataAnalysis
  • Then installed the notebook server to the machine. Instructions can be found here
  • Finally, started the server from the root directory: jupyter notebook

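In addition to getting the console ready, I saved my access token information in a separate local folder, _keys, in a file named facebook.py (the module name used by the import in the first cell). Inside, save each credential as a string variable: USER_ID, ACCESS_TOKEN, and paging_token.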
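
As a rough sketch, the keys file might look like the following. The variable names come from the import in the first cell; every value here is a placeholder I made up, not a real credential or a required format.

```python
# _keys/facebook.py (sketch; all values below are placeholders)
USER_ID = "1234567890"              # placeholder for your Facebook user ID
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"  # placeholder for your Graph API access token
paging_token = "YOUR_PAGING_TOKEN"  # placeholder for the token used to request page 2
```

Keeping this file outside version control (for example, by listing _keys in .gitignore) avoids publishing the token along with the notebook.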


Making a request

In [1]:
from _keys.facebook import USER_ID, ACCESS_TOKEN, paging_token
import requests

host = ''

u = '{}/{}/posts?access_token={}'.format(host, USER_ID, ACCESS_TOKEN)
data1 = requests.get(u).json()

pg2 = '{}/{}/posts?limit=25&until=1486832400&__paging_token={}&access_token={}'.format(host, USER_ID, paging_token, ACCESS_TOKEN)
data2 = requests.get(pg2).json()

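# Each response keeps its posts under the 'data' key; combine both pages
data = data1['data'] + data2['data']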

print(data[0].keys(), "\n")
dict_keys(['message', 'story', 'created_time', 'id']) 

The JSON data is saved into the variable data, which is a list of dictionaries, so we can now use ordinary list and dictionary operations to access information for each post. Here are the two most recent posts' IDs and timestamps.
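
The requests above hard-code the URL of the second page. A possible alternative, assuming each response carries a paging.next URL when more results exist, is to follow those links in a loop. The helper name collect_pages and its fetch_json parameter are hypothetical, and the demonstration uses stand-in pages rather than a live request.

```python
def collect_pages(first_url, fetch_json, max_pages=10):
    """Follow 'paging.next' links and gather every item from the 'data' lists."""
    items = []
    url = first_url
    pages = 0
    while url and pages < max_pages:
        body = fetch_json(url)
        items.extend(body.get("data", []))
        # A missing 'paging' or 'next' entry ends the loop
        url = body.get("paging", {}).get("next")
        pages += 1
    return items

# Offline demonstration with a stand-in fetcher (made-up URLs and IDs)
fake_pages = {
    "page1": {"data": [{"id": "1_a"}, {"id": "1_b"}], "paging": {"next": "page2"}},
    "page2": {"data": [{"id": "1_c"}]},
}
posts = collect_pages("page1", fake_pages.__getitem__)
print(len(posts))
```

With the requests library, the fetcher could be lambda url: requests.get(url).json(), passing u as the first URL. The max_pages cap is there only to keep a sketch like this from looping indefinitely.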

In [2]:
# Two most recent posts' IDs and timestamps:
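
The body of this cell did not survive, so here is a minimal sketch of what such a lookup could look like. It assumes the posts arrive newest first and uses made-up sample records in place of the real data list; the helper most_recent is hypothetical.

```python
# Made-up sample records with the same keys as the real posts
sample_data = [
    {"id": "100_3", "created_time": "2017-02-12T09:30:00+0000", "message": "third"},
    {"id": "100_2", "created_time": "2017-02-11T17:00:00+0000", "message": "second"},
    {"id": "100_1", "created_time": "2017-02-05T20:15:00+0000", "message": "first"},
]

def most_recent(posts, n=2):
    """Return (id, created_time) pairs for the first n posts in the list."""
    return [(p["id"], p["created_time"]) for p in posts[:n]]

for post_id, created in most_recent(sample_data):
    print(post_id, created)
```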

Parsing data

The next step is to save this data for analysis. We could save it into a text file. We could save it into a database. Since this is a small amount of data (n=25), I chose to iterate quickly on developing the code by loading the JSON response directly into a Python dataframe.

Let's see what happens when we parse the timestamps:

In [3]:
import datetime
from dateutil import parser

# returns a (day of week, hour of day) tuple for a timestamp string
def scrub(timestamp):
    d = parser.parse(timestamp)
    return dow(d), hod(d)

# returns day of week
def dow(date): return date.strftime("%A")

# returns hour of day, e.g. 5:00PM (%-I drops the leading zero; not supported on Windows)
def hod(time): return time.strftime("%-I:%M%p")

a = list(map((lambda x: scrub(x['created_time'])), data ))
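In [4]:
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Named day_names rather than dow so the dow() function above is not overwritten
day_names = [x[0] for x in a]

print("day\t\tposts")
print("===========    =======")
for day in days:
    print(day, " \t", day_names.count(day))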
day		posts
===========    =======
Monday  	 2
Tuesday  	 3
Wednesday  	 8
Thursday  	 8
Friday  	 7
Saturday  	 6
Sunday  	 12
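
The same tally can be sketched with only the standard library, using collections.Counter in place of repeated list.count() calls. This assumes timestamps of the form 2017-02-11T17:00:00+0000; the sample records are made up, and day_of_week is a hypothetical helper, not the scrub() function used above.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def day_of_week(timestamp):
    """Parse a timestamp string and return the English weekday name."""
    d = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S%z")
    return DAYS[d.weekday()]  # weekday() returns 0 for Monday

# Made-up sample records
sample_posts = [
    {"created_time": "2017-02-12T09:30:00+0000"},
    {"created_time": "2017-02-11T17:00:00+0000"},
    {"created_time": "2017-02-05T20:15:00+0000"},
]

counts = Counter(day_of_week(p["created_time"]) for p in sample_posts)
for day in DAYS:
    print(day, "\t", counts[day])  # a Counter returns 0 for days with no posts
```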

Saving data to file

You can easily save the list of dictionaries to a text file using the function json.dump(), like so:

import json

output_file = 'posts.json'  # hypothetical file name; pick any path you like

with open(output_file, 'w') as jsonfile:
    json.dump(data, jsonfile)
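
To check that the file round-trips, the saved posts can be read back with json.load(). This is a small sketch using made-up sample records and a temporary directory, so it does not touch any real output file.

```python
import json
import os
import tempfile

# Made-up sample records standing in for the real list of posts
sample_data = [
    {"id": "100_2", "created_time": "2017-02-11T17:00:00+0000", "message": "second"},
    {"id": "100_1", "created_time": "2017-02-05T20:15:00+0000", "story": "first"},
]

output_file = os.path.join(tempfile.mkdtemp(), "posts.json")

# Write the list of dictionaries to disk as JSON
with open(output_file, "w") as jsonfile:
    json.dump(sample_data, jsonfile)

# Read it back into a new list of dictionaries
with open(output_file) as jsonfile:
    restored = json.load(jsonfile)

print(restored == sample_data)
```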