To install the packages for this demo:
pip install -r requirements.txt
#These are the packages we will use in the demonstration
import os
import json
import pandas as pd
import datetime
from youtube_api import YoutubeDataApi
key = os.environ.get('YT_KEY')
yt = YoutubeDataApi(key)
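If the YT_KEY environment variable is not set, os.environ.get returns None, and the problem may only surface later when a request is made. A minimal sketch of an earlier check, assuming a hypothetical helper named require_env (it is not part of the youtube_api package):

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error.

    `require_env` is a hypothetical helper for this demo, not part of youtube_api.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


# Usage (commented out so the sketch runs without a key):
# yt = YoutubeDataApi(require_env('YT_KEY'))
```

Failing at this point keeps a missing key from being confused with a quota or network error further down.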
#Formats json items to print for readability
def dump(doc):
    def default_handler(o):
        if isinstance(o, datetime.datetime):
            return o.isoformat()
    print(json.dumps(doc, sort_keys=True, indent=4, default=default_handler))
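To see what the default handler does, the sketch below uses a variant of dump that returns the formatted string instead of printing it. The name to_pretty_json and the sample record are illustrative only.

```python
import datetime
import json


def to_pretty_json(doc):
    """Like dump, but return the formatted string instead of printing it."""
    def default_handler(o):
        # json cannot serialize datetime objects, so convert them to ISO 8601 strings
        if isinstance(o, datetime.datetime):
            return o.isoformat()
        # Unlike dump, fail loudly on other unsupported types
        raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")
    return json.dumps(doc, sort_keys=True, indent=4, default=default_handler)


# Illustrative record with one datetime value
sample = {
    "title": "LastWeekTonight",
    "account_creation_date": datetime.datetime(2014, 3, 18, 17, 41, 39),
}
print(to_pretty_json(sample))
```

One design difference: dump's handler implicitly returns None for any other unsupported type, which json.dumps writes as null, while this variant raises a TypeError instead.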
You may start with a channel name like 'LastWeekTonight' or 'TheNewYorkTimes'. Any data collected about channels must be collected using the channel ID, not the channel name.
The channel ID can be retrieved by running yt.get_channel_id_from_user(USERNAME), where USERNAME is the channel name.
# get the channel ID for the TV show Last Week Tonight
channel_id = yt.get_channel_id_from_user('LastWeekTonight')
channel_id
'UC3XTzVzaHQEd30rQbuvCtTQ'
From this channel ID, we can get a wide variety of data, starting with the channel's metadata:
# get channel metadata for the channel ID we pulled for Last Week Tonight
channel_meta = yt.get_channel_metadata(channel_id)
dump(channel_meta)
{ "account_creation_date": "2014-03-18T17:41:39", "channel_id": "UC3XTzVzaHQEd30rQbuvCtTQ", "collection_date": "2019-04-12T07:12:21.012418", "country": null, "description": "Breaking news on a weekly basis. Sundays at 11PM - only on HBO.\nSubscribe to the Last Week Tonight channel for the latest videos from John Oliver and the LWT team.", "keywords": null, "playlist_id_likes": "LL3XTzVzaHQEd30rQbuvCtTQ", "playlist_id_uploads": "UU3XTzVzaHQEd30rQbuvCtTQ", "subscription_count": "6926188", "title": "LastWeekTonight", "topic_ids": "https://en.wikipedia.org/wiki/Entertainment|https://en.wikipedia.org/wiki/Humour|https://en.wikipedia.org/wiki/Television_program", "video_count": "267", "view_count": "1936918852" }
In addition to channel metadata, we can pull relational data for channels, such as the channels they subscribe to or feature.
# Get the channels that Last Week Tonight subscribes to
subscriptions = yt.get_subscriptions(channel_id)
dump(subscriptions[:2])
[ { "collection_date": "2019-04-12T07:12:22.720954", "subscription_channel_id": "UCWPQB43yGKEum3eW0P9N_nQ", "subscription_kind": "youtube#channel", "subscription_publish_date": "2014-03-20T19:05:54", "subscription_title": "HBOBoxing" }, { "collection_date": "2019-04-12T07:12:22.721991", "subscription_channel_id": "UCy6kyFxaMqGtpE3pQTflK8A", "subscription_kind": "youtube#channel", "subscription_publish_date": "2014-12-11T18:55:41", "subscription_title": "Real Time with Bill Maher" } ]
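Each subscription record carries the ID of the subscribed-to channel, so the result can feed further calls such as yt.get_channel_metadata. A small sketch, using two records copied from the output above (trimmed, with dates as displayed) rather than a live API call:

```python
# Two records copied from the sample output above, trimmed to the fields used here
subscriptions = [
    {"subscription_channel_id": "UCWPQB43yGKEum3eW0P9N_nQ",
     "subscription_title": "HBOBoxing",
     "subscription_publish_date": "2014-03-20T19:05:54"},
    {"subscription_channel_id": "UCy6kyFxaMqGtpE3pQTflK8A",
     "subscription_title": "Real Time with Bill Maher",
     "subscription_publish_date": "2014-12-11T18:55:41"},
]

# Map each subscribed-to channel ID to its title
subscribed = {s["subscription_channel_id"]: s["subscription_title"]
              for s in subscriptions}

for cid, title in subscribed.items():
    print(cid, title)
```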
YouTube is constructed such that a user's video uploads are stored in a playlist whose ID is based on the user's channel ID. We can use this detail to generate the upload playlist ID for a given user and collect all videos posted by them.
# Import some utility functions from the YouTube package
from youtube_api import youtube_api_utils as utils
# get the playlist ID for Last Week Tonight's uploads
playlist_id = utils.get_upload_playlist_id(channel_id)
playlist_id
'UU3XTzVzaHQEd30rQbuvCtTQ'
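In the sample output, the channel ID begins with UC and the uploads playlist ID is the same string with UU in its place. A minimal sketch of that mapping, assuming the prefix pattern holds for the channel in question. The function name is hypothetical, and the package's own utils.get_upload_playlist_id may be implemented differently.

```python
def upload_playlist_id_from_channel_id(channel_id):
    """Derive the uploads playlist ID by swapping the 'UC' prefix for 'UU'.

    Hypothetical helper based on the pattern visible in the sample output.
    """
    if not channel_id.startswith("UC"):
        raise ValueError(f"Unexpected channel ID format: {channel_id!r}")
    return "UU" + channel_id[2:]


print(upload_playlist_id_from_channel_id("UC3XTzVzaHQEd30rQbuvCtTQ"))
```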
The function yt.get_videos_from_playlist_id(PLAYLIST_ID) returns a list of the videos in a playlist, in this case the uploads. Each entry includes the video ID, its channel ID, and the publish date.
# get the videos posted by Last Week Tonight and display the first 5 returned
videos = yt.get_videos_from_playlist_id(playlist_id)
df = pd.DataFrame(videos[:5])
df
| | publish_date | video_id | channel_id | collection_date |
|---|---|---|---|---|
| 0 | 2019-04-08 06:30:00 | jCC8fPQOaxU | UC3XTzVzaHQEd30rQbuvCtTQ | 2019-04-12 07:12:28.690187 |
| 1 | 2019-04-01 06:30:01 | m8UQ4O7UiDs | UC3XTzVzaHQEd30rQbuvCtTQ | 2019-04-12 07:12:28.691222 |
| 2 | 2019-03-18 06:30:01 | Yq7Eh6JTKIg | UC3XTzVzaHQEd30rQbuvCtTQ | 2019-04-12 07:12:28.691222 |
| 3 | 2019-03-11 06:30:00 | FO0iG_P0P6M | UC3XTzVzaHQEd30rQbuvCtTQ | 2019-04-12 07:12:28.691222 |
| 4 | 2019-03-04 07:30:01 | _h1ooyyFkF0 | UC3XTzVzaHQEd30rQbuvCtTQ | 2019-04-12 07:12:28.691222 |
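Because the result is a list of records, it loads directly into pandas for filtering. The sketch below rebuilds part of the table by hand, so it runs without an API key, and keeps only the videos published on or after an arbitrary cutoff date.

```python
import pandas as pd

# Rows copied from the table above (publish_date and video_id only)
df = pd.DataFrame({
    "publish_date": ["2019-04-08 06:30:00", "2019-04-01 06:30:01",
                     "2019-03-18 06:30:01", "2019-03-11 06:30:00",
                     "2019-03-04 07:30:01"],
    "video_id": ["jCC8fPQOaxU", "m8UQ4O7UiDs", "Yq7Eh6JTKIg",
                 "FO0iG_P0P6M", "_h1ooyyFkF0"],
})

# Make sure the column is a datetime type before comparing against a cutoff
df["publish_date"] = pd.to_datetime(df["publish_date"])

# Keep only videos published on or after 1 April 2019 (cutoff chosen for illustration)
recent = df[df["publish_date"] >= pd.Timestamp("2019-04-01")]
print(recent["video_id"].tolist())
```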
From the list of videos in the uploads playlist, we can pull more detailed information about each video by passing its video ID to the function yt.get_video_metadata(VIDEO_ID). This function can handle a single video ID or a list of video IDs.
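For the video IDs passed, the package returns fields such as the title, description, tags, thumbnail, publish date, and the view, like, dislike, and comment counts: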
# Get the video metadata for Last Week Tonight's videos
video_meta = yt.get_video_metadata(df.video_id.tolist())
dump(video_meta[:1])
[ { "channel_id": "UC3XTzVzaHQEd30rQbuvCtTQ", "channel_title": "LastWeekTonight", "collection_date": "2019-04-12T07:12:36.841405", "video_category": "24", "video_comment_count": "12526", "video_description": "Mobile homes may seem like an affordable housing option, but large investment companies are making them less and less so.\n\nConnect with Last Week Tonight online... \n\nSubscribe to the Last Week Tonight YouTube channel for more almost news as it almost happens: www.youtube.com/lastweektonight \n\nFind Last Week Tonight on Facebook like your mom would: www.facebook.com/lastweektonight \n\nFollow us on Twitter for news about jokes and jokes about news: www.twitter.com/lastweektonight \n\nVisit our official site for all that other stuff at once: www.hbo.com/lastweektonight", "video_dislike_count": "3104", "video_id": "jCC8fPQOaxU", "video_like_count": "105534", "video_publish_date": "2019-04-08T06:30:00", "video_tags": "", "video_thumbnail": "https://i.ytimg.com/vi/jCC8fPQOaxU/hqdefault.jpg", "video_title": "Mobile Homes: Last Week Tonight with John Oliver (HBO)", "video_view_count": "4927325" } ]
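Note that the counts in this output are strings, so they need converting before any arithmetic. A minimal sketch using values copied from the record above, assuming every count field is present and holds a digit string (a video with hidden statistics may need extra handling):

```python
# Fields copied from the sample record above; the counts are strings
video = {
    "video_id": "jCC8fPQOaxU",
    "video_view_count": "4927325",
    "video_like_count": "105534",
    "video_dislike_count": "3104",
    "video_comment_count": "12526",
}

# Convert every *_count field to an integer
counts = {k: int(v) for k, v in video.items() if k.endswith("_count")}

# Share of ratings that are likes
ratings = counts["video_like_count"] + counts["video_dislike_count"]
like_share = counts["video_like_count"] / ratings
print(f"{like_share:.3f}")
```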
Searching with yt.search is the functional equivalent of going onto YouTube, typing a phrase into the search bar, and seeing the results.
# pull search results for 'John Oliver'
searches = yt.search('john oliver', max_results=5)
dump(searches[:2])
[ { "channel_id": "UC3XTzVzaHQEd30rQbuvCtTQ", "channel_title": "LastWeekTonight", "collection_date": "2019-04-12T07:12:39.248427", "video_category": null, "video_description": "Mobile homes may seem like an affordable housing option, but large investment companies are making them less and less so. Connect with Last Week Tonight ...", "video_id": "jCC8fPQOaxU", "video_publish_date": "2019-04-08T06:30:00", "video_thumbnail": "https://i.ytimg.com/vi/jCC8fPQOaxU/hqdefault.jpg", "video_title": "Mobile Homes: Last Week Tonight with John Oliver (HBO)" }, { "channel_id": "UC3XTzVzaHQEd30rQbuvCtTQ", "channel_title": "LastWeekTonight", "collection_date": "2019-04-12T07:12:39.249427", "video_category": null, "video_description": "John Oliver discusses how the WWE takes care of its wrestlers \u2014 and how it doesn't. Connect with Last Week Tonight online... Subscribe to the Last Week ...", "video_id": "m8UQ4O7UiDs", "video_publish_date": "2019-04-01T06:30:01", "video_thumbnail": "https://i.ytimg.com/vi/m8UQ4O7UiDs/hqdefault.jpg", "video_title": "WWE: Last Week Tonight with John Oliver (HBO)" } ]
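Search results carry only a summary of each video (video_category is null in the output above), so a natural next step is to collect the video IDs and pass them to yt.get_video_metadata, which accepts a list. A sketch using trimmed records from the output above instead of a live search:

```python
from collections import Counter

# Records copied from the search output above, trimmed to the fields used here
searches = [
    {"channel_title": "LastWeekTonight", "video_id": "jCC8fPQOaxU"},
    {"channel_title": "LastWeekTonight", "video_id": "m8UQ4O7UiDs"},
]

# Video IDs to hand to yt.get_video_metadata for full details
video_ids = [s["video_id"] for s in searches]

# How many results each channel contributed
by_channel = Counter(s["channel_title"] for s in searches)

print(video_ids)
print(by_channel.most_common())
```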
Recommended videos are the videos listed to the side of or below a YouTube video while it plays. They are YouTube's best guess as to what you may like based on the video you are currently watching. You can get the recommended videos for a given video ID with yt.get_recommended_videos(VIDEO_ID).
# get recommended videos for Last Week Tonight's video on the WWE
recommendations = yt.get_recommended_videos('m8UQ4O7UiDs')
dump(recommendations[:2])
[ { "channel_id": "UC3XTzVzaHQEd30rQbuvCtTQ", "channel_title": "LastWeekTonight", "collection_date": "2019-04-12T07:12:41.364671", "video_category": null, "video_description": "Mobile homes may seem like an affordable housing option, but large investment companies are making them less and less so.\n\nConnect with Last Week Tonight online... \n\nSubscribe to the Last Week Tonight YouTube channel for more almost news as it almost happens: www.youtube.com/lastweektonight \n\nFind Last Week Tonight on Facebook like your mom would: www.facebook.com/lastweektonight \n\nFollow us on Twitter for news about jokes and jokes about news: www.twitter.com/lastweektonight \n\nVisit our official site for all that other stuff at once: www.hbo.com/lastweektonight", "video_id": "jCC8fPQOaxU", "video_publish_date": "2019-04-08T00:45:08", "video_thumbnail": "https://i.ytimg.com/vi/jCC8fPQOaxU/hqdefault.jpg", "video_title": "Mobile Homes: Last Week Tonight with John Oliver (HBO)" }, { "channel_id": "UC3XTzVzaHQEd30rQbuvCtTQ", "channel_title": "LastWeekTonight", "collection_date": "2019-04-12T07:12:41.364671", "video_category": null, "video_description": "Ivanka Trump and Jared Kushner hold an incredible amount of political power. 
That's troubling considering their incredibly small amount of political experience.\n\nConnect with Last Week Tonight online...\nSubscribe to the Last Week Tonight YouTube channel for more almost news as it almost happens: www.youtube.com/user/LastWeekTonight\n\nFind Last Week Tonight on Facebook like your mom would:\nhttp://Facebook.com/LastWeekTonight\n\nFollow us on Twitter for news about jokes and jokes about news:\nhttp://Twitter.com/LastWeekTonight\n\nVisit our official site for all that other stuff at once:\nhttp://www.hbo.com/lastweektonight", "video_id": "wD8AwgO0AQI", "video_publish_date": "2017-04-24T02:20:55", "video_thumbnail": "https://i.ytimg.com/vi/wD8AwgO0AQI/hqdefault.jpg", "video_title": "Ivanka & Jared: Last Week Tonight with John Oliver (HBO)" } ]
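Recommendations are naturally relational: each call links one seed video to the videos recommended from it. A minimal sketch that turns the two records above into (source, target) pairs, using trimmed copies of the output rather than a live call:

```python
# The video whose recommendations were requested above
seed_video_id = "m8UQ4O7UiDs"

# Records copied from the output above, trimmed to the fields used here
recommendations = [
    {"video_id": "jCC8fPQOaxU", "channel_id": "UC3XTzVzaHQEd30rQbuvCtTQ"},
    {"video_id": "wD8AwgO0AQI", "channel_id": "UC3XTzVzaHQEd30rQbuvCtTQ"},
]

# One directed edge per recommendation: seed video -> recommended video
edges = [(seed_video_id, rec["video_id"]) for rec in recommendations]

for source, target in edges:
    print(source, "->", target)
```

Collecting these pairs over many seed videos gives an edge list that can be loaded into a DataFrame or a graph library for further analysis.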