Written by Kat Chuang @katychuang
The goal of this exercise is to connect to the Facebook Graph API to collect information about my most recent posts, along with each post's subsequent comments and likes.
mkvirtualenv --python=/usr/local/bin/python3 dataAnalysis
jupyter notebook
In addition to getting the console ready, I saved my access token information in a separate local folder, _keys, in a file named facebook.py. Inside, save a string variable like the following (the import below also expects USER_ID and paging_token to be defined in the same file):
ACCESS_TOKEN="XXXXXX"
from _keys.facebook import USER_ID, ACCESS_TOKEN, paging_token
import requests
host = 'https://graph.facebook.com/v2.8'
u = '{}/{}/posts?access_token={}'.format(host, USER_ID, ACCESS_TOKEN)
data1 = requests.get(u).json()
pg2 = '{}/{}/posts?limit=25&until=1486832400&__paging_token={}&access_token={}'.format(host, USER_ID, paging_token, ACCESS_TOKEN)
data2 = requests.get(pg2).json()
data = []
data.extend(data1["data"])
data.extend(data2["data"])
print(data[0].keys(), "\n")
dict_keys(['message', 'story', 'created_time', 'id'])
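The second page above is requested with a hard-coded until value and __paging_token. Graph API list responses normally include a paging object whose next field holds the URL of the following page, so the same collection could be sketched as a loop. This is a minimal sketch, not the code used for this post: collect_pages, fetch, and max_pages are hypothetical names, and fetch is assumed to be any callable that takes a URL and returns the decoded JSON (for example, lambda url: requests.get(url).json()). The demonstration uses fake pages so it runs without a token.

```python
def collect_pages(first_url, fetch, max_pages=10):
    """Follow paging.next links and return all items from the 'data' lists.

    fetch is assumed to be a callable: URL string -> decoded JSON dict.
    """
    items = []
    url = first_url
    pages = 0
    while url and pages < max_pages:
        page = fetch(url)
        items.extend(page.get("data", []))
        # The last page has no 'next' link, which ends the loop.
        url = page.get("paging", {}).get("next")
        pages += 1
    return items


# Offline demonstration: two fake pages stand in for real HTTP responses.
fake_pages = {
    "page1": {"data": [{"id": "a"}, {"id": "b"}], "paging": {"next": "page2"}},
    "page2": {"data": [{"id": "c"}], "paging": {}},
}
all_items = collect_pages("page1", fake_pages.__getitem__)
print([item["id"] for item in all_items])
```

The max_pages cap is only a safety limit so that a long timeline does not trigger an unbounded number of requests.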
The JSON data is saved in the variable data, which is a list of dictionaries, so we can now use ordinary list and dictionary operations to access information for each post. Here is the timestamp of the most recent post.
# Most recent post's timestamp:
print(data[0]['created_time'])
2017-03-19T22:23:36+0000
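Each post's id is also the handle for reaching its comments and likes, the other half of the goal stated at the top. As a rough sketch, assuming the comments and likes edges of a post are readable with the token in use, the request URLs could be built the same way as the posts URL. The helper edge_url and the example id are hypothetical and only illustrate the URL shape; no request is sent here.

```python
HOST = "https://graph.facebook.com/v2.8"  # same API version as above


def edge_url(post_id, edge, access_token, host=HOST):
    """Build the URL for one edge ('comments' or 'likes') of a post."""
    return "{}/{}/{}?access_token={}".format(host, post_id, edge, access_token)


# Placeholder values; a real post id would come from data[i]['id'].
example_id = "123_456"
for edge in ("comments", "likes"):
    print(edge_url(example_id, edge, "XXXXXX"))
```

Passing each URL to requests.get(...).json() would then mirror the posts request above.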
The next step is to save this data for analysis. We could save it to a text file or to a database. Since this is a small amount of data (n=25), I chose to iterate quickly on developing the code by loading the JSON response directly into a pandas DataFrame.
Let's see what happens when we parse the timestamps:
from dateutil import parser
# returns (day of week, hour of day) for a timestamp string
def scrub(timestamp):
    d = parser.parse(timestamp)
    return dow(d), hod(d)
# returns day of week, e.g. "Sunday"
def dow(date): return date.strftime("%A")
# returns hour of day, e.g. "10:23PM" (%-I is not supported on Windows)
def hod(time): return time.strftime("%-I:%M%p")
a = list(map(lambda x: scrub(x['created_time']), data))
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
# use a new name so the dow() function above is not overwritten
post_days = list(map(lambda x: x[0], a))
print("day\t\tposts")
print("=========== =======")
for day in days:
    print(day, " \t", post_days.count(day))
day         posts
=========== =======
Monday      2
Tuesday     3
Wednesday   8
Thursday    8
Friday      7
Saturday    6
Sunday      12
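The same tally can be sketched with only the standard library, which avoids the dateutil dependency and the platform-specific %-I format code. This is an alternative sketch, not the code that produced the table above: day_and_hour is a hypothetical helper, it assumes every timestamp has the +0000 layout shown earlier, and the sample records are invented apart from the first timestamp.

```python
from collections import Counter
from datetime import datetime


def day_and_hour(timestamp):
    """Return (weekday name, 12-hour time) for a '+0000'-style timestamp."""
    d = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S%z")
    # %I is zero-padded on every platform; strip the zero instead of using %-I.
    return d.strftime("%A"), d.strftime("%I:%M%p").lstrip("0")


# Sample records shaped like the API response; only the first timestamp
# comes from the post above, the other two are made up for illustration.
sample = [
    {"created_time": "2017-03-19T22:23:36+0000"},
    {"created_time": "2017-03-20T09:05:00+0000"},
    {"created_time": "2017-03-26T13:00:00+0000"},
]
counts = Counter(day_and_hour(p["created_time"])[0] for p in sample)
print(day_and_hour(sample[0]["created_time"]))
for day in ("Monday", "Sunday"):
    print(day, counts[day])
```

Counter returns 0 for days with no posts, so the loop over day names needs no special casing.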
You can easily save the list of dictionaries to a text file using json.dump(), like so:
import json
# output_file is a path of your choosing and must be defined beforehand
with open(output_file, 'w') as jsonfile:
    json.dump(data, jsonfile)
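To pick the analysis up again later, the file can be read back with json.load(). The sketch below is a self-contained round trip under stated assumptions: the records list and the posts.json file name are placeholders, and a temporary directory is used so nothing is left on disk.

```python
import json
import os
import tempfile

# Placeholder record shaped like one entry of the posts list.
records = [{"id": "1", "created_time": "2017-03-19T22:23:36+0000"}]

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "posts.json")
    with open(path, "w") as jsonfile:
        json.dump(records, jsonfile)
    with open(path) as jsonfile:
        restored = json.load(jsonfile)

print(restored == records)
```

Because the records contain only strings, the restored list compares equal to the original.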