
Connecting to Facebook API

Written by Kat Chuang @katychuang

Objective

The goal of this exercise is to connect to the Facebook Graph API to collect information about my most recent posts, and also to collect each post's subsequent comments and likes.

Setup

I first created a virtual environment for my notebooks: mkvirtualenv --python=/usr/local/bin/python3 dataAnalysis
Then I installed the notebook server on the machine. Instructions can be found here
Finally, I started the server from the root directory: jupyter notebook

In addition to getting the console ready, I saved my access token information in a separate local folder, _keys, in a file named facebook.py. Inside it, save a string variable like the following (the first cell below also imports USER_ID and paging_token from the same file, so define those there in the same way):

ACCESS_TOKEN="XXXXXX"

Making a request

In [1]:

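from _keys.facebook import USER_ID, ACCESS_TOKEN, paging_token
import requests

host = 'https://graph.facebook.com/v2.8'

# First page: the most recent posts
u = '{}/{}/posts?access_token={}'.format(host, USER_ID, ACCESS_TOKEN)
data1 = requests.get(u).json()

# Second page: up to 25 older posts, located by a hard-coded 'until' timestamp and the saved paging token
pg2 = '{}/{}/posts?limit=25&until=1486832400&__paging_token={}&access_token={}'.format(host, USER_ID, paging_token, ACCESS_TOKEN)
data2 = requests.get(pg2).json()

# Combine both pages into one list of post dictionaries
data = []
data.extend(data1["data"])
data.extend(data2["data"])

# Show the fields of the most recent post
print(data[0].keys(), "\n")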

dict_keys(['message', 'story', 'created_time', 'id'])
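
The second request above hard-codes its until and __paging_token values. Graph API list responses normally carry a paging object whose next field holds the full URL of the following page, so the same collection could be done in a loop. The helper below is only a sketch of that idea: fetch_all_posts, get_json and max_pages are names I made up for illustration, and the fetcher is passed in as a function so the loop does not depend on how the request is made.

```python
def fetch_all_posts(first_url, get_json, max_pages=10):
    """Collect posts by following each response's paging.next URL.

    get_json is any function that takes a URL and returns the decoded
    JSON response as a dictionary (hypothetical parameter).
    max_pages is a safety cap so the loop always terminates.
    """
    posts = []
    url = first_url
    for _ in range(max_pages):
        if not url:
            break
        page = get_json(url)
        posts.extend(page.get("data", []))
        # The last page has no 'next' link, which ends the loop.
        url = page.get("paging", {}).get("next")
    return posts
```

With the variables from the cell above, this could be called as fetch_all_posts(u, lambda url: requests.get(url).json()).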

The JSON data is saved into the variable data, which is a list of dictionaries, so we can now use ordinary list and dictionary operations to access the information for each post. Here is the timestamp of the most recent post.

In [2]:

# Timestamp of the most recent post:
print(data[0]['created_time'])

2017-03-19T22:23:36+0000
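
The objective also mentions each post's comments and likes, which the cells in this notebook do not fetch. As a rough sketch, assuming the comments and likes edges of a post and the summary=true option are available for this API version and token, the request URLs could be built from each post's id like this. The names HOST and edge_url are hypothetical helpers, not part of any library.

```python
from urllib.parse import urlencode

HOST = 'https://graph.facebook.com/v2.8'


def edge_url(post_id, edge, access_token, host=HOST):
    """Build the URL for one edge ('comments' or 'likes') of a post."""
    # summary=true asks the API to include a total count (assumed option).
    query = urlencode({'summary': 'true', 'access_token': access_token})
    return '{}/{}/{}?{}'.format(host, post_id, edge, query)
```

For example, requests.get(edge_url(data[0]['id'], 'comments', ACCESS_TOKEN)).json() would request the comments on the most recent post.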

Parsing data

The next step is to save this data for analysis. We could save it to a text file or to a database. Since this is a small amount of data (two pages of at most 25 posts each), I chose to iterate quickly on the code by keeping the JSON response in memory as a Python list of dictionaries.

Let's see what happens when we parse the timestamps:

In [3]:

from dateutil import parser

# returns (day of week, hour of day) for a timestamp string
def scrub(timestamp):
    d = parser.parse(timestamp)
    return dow(d), hod(d)

# returns day of week, e.g. "Sunday"
def dow(date): return date.strftime("%A")

# returns hour of day, e.g. "10:23PM"
# (the "-" flag in %-I is platform-specific and is not supported on Windows)
def hod(time): return time.strftime("%-I:%M%p")

a = [scrub(post['created_time']) for post in data]
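
If dateutil is not installed, or the notebook has to run on Windows where %-I fails, the same pair can be produced with the standard library alone. This is a sketch under the assumption that every created_time has the shape shown earlier (for example 2017-03-19T22:23:36+0000); scrub_portable is a name made up for this example.

```python
from datetime import datetime


def scrub_portable(timestamp):
    """Return (day of week, hour of day) for a Graph API timestamp."""
    d = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S%z")
    # %I zero-pads the hour; lstrip("0") gives the same text as %-I.
    return d.strftime("%A"), d.strftime("%I:%M%p").lstrip("0")
```

Under the default locale, scrub_portable("2017-03-19T22:23:36+0000") returns ("Sunday", "10:23PM").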

In [4]:

days=["Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday"]

# named post_days so the list does not shadow the dow() function from the previous cell
post_days = [day for day, _ in a]

print("day\t\tposts")
print("===========    =======")
for day in days:
    print(day, " \t", post_days.count(day))

day		posts
===========    =======
Monday  	 2
Tuesday  	 3
Wednesday  	 8
Thursday  	 8
Friday  	 7
Saturday  	 6
Sunday  	 12
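
Calling count() once per weekday rescans the whole list each time. That is fine at this size, but collections.Counter does the tally in a single pass. A minimal sketch, where DAYS and posts_per_day are illustrative names:

```python
from collections import Counter

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def posts_per_day(day_names):
    """Count posts per weekday, returned in Monday-to-Sunday order."""
    counts = Counter(day_names)
    # A Counter returns 0 for days that never appear.
    return [(day, counts[day]) for day in DAYS]
```

With the list a from the earlier cell, posts_per_day(day for day, _ in a) gives the same numbers as the table above as a list of (day, count) pairs.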

Saving data to file

You can easily save the list of dictionaries to a text file using the function json.dump(), like so:

import json

output_file = 'posts.json'  # hypothetical file name; use any path you like

with open(output_file, 'w') as jsonfile:
    json.dump(data, jsonfile)
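
To pick the analysis up again in a later session without calling the API, the file can be read back with json.load(). A small sketch, where load_posts is a hypothetical helper name:

```python
import json


def load_posts(path):
    """Read back a list of post dictionaries written by json.dump()."""
    with open(path) as jsonfile:
        return json.load(jsonfile)
```

The result is again a list of dictionaries, so the parsing cells above work on it unchanged.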
