This project compares the types of posts that users make on the "Hacker News" web platform: posts that ask a question (Ask HN), posts that show something (Show HN), and posts made for any other reason.
The analysis then narrows to question-type posts and compares the average number of comments per post for each hour of the day. The purpose is to determine whether the data points to a best time of day to post a question, that is, an hour at which I could expect the most comments/responses.
# This cell is set up to read the file of interest, 'hacker_news.csv',
# and create a list of rows named 'hn'.
from csv import reader

# The 'with' block closes the file once the rows have been read.
with open('hacker_news.csv') as opened_file:
    read_file = reader(opened_file)
    hn = list(read_file)
print(hn[:5])
[['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at'], ['12224879', 'Interactive Dynamic Video', 'http://www.interactivedynamicvideo.com/', '386', '52', 'ne0phyte', '8/4/2016 11:52'], ['10975351', 'How to Use Open Source and Shut the Fuck Up at the Same Time', 'http://hueniverse.com/2016/01/26/how-to-use-open-source-and-shut-the-fuck-up-at-the-same-time/', '39', '10', 'josep2', '1/26/2016 19:30'], ['11964716', "Florida DJs May Face Felony for April Fools' Water Joke", 'http://www.thewire.com/entertainment/2013/04/florida-djs-april-fools-water-joke/63798/', '2', '1', 'vezycash', '6/23/2016 22:20'], ['11919867', 'Technology ventures: From Idea to Enterprise', 'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429', '3', '1', 'hswarna', '6/17/2016 0:01']]
# This cell is set up to create two lists: one with the header row only
# and one with the data rows, excluding the header.
headers = hn[0]
hn = hn[1:]
print(headers)
print('\n')
print(hn[:5])
['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at']

[['12224879', 'Interactive Dynamic Video', 'http://www.interactivedynamicvideo.com/', '386', '52', 'ne0phyte', '8/4/2016 11:52'], ['10975351', 'How to Use Open Source and Shut the Fuck Up at the Same Time', 'http://hueniverse.com/2016/01/26/how-to-use-open-source-and-shut-the-fuck-up-at-the-same-time/', '39', '10', 'josep2', '1/26/2016 19:30'], ['11964716', "Florida DJs May Face Felony for April Fools' Water Joke", 'http://www.thewire.com/entertainment/2013/04/florida-djs-april-fools-water-joke/63798/', '2', '1', 'vezycash', '6/23/2016 22:20'], ['11919867', 'Technology ventures: From Idea to Enterprise', 'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429', '3', '1', 'hswarna', '6/17/2016 0:01'], ['10301696', 'Note by Note: The Making of Steinway L1037 (2007)', 'http://www.nytimes.com/2007/11/07/movies/07stein.html?_r=0', '8', '2', 'walterbell', '9/30/2015 4:12']]
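As an aside, the header can also be separated from the data by letting the standard library's `csv.DictReader` treat the first row as field names. The sketch below is not the approach used in this notebook. It reads two rows copied from the output above out of an in-memory string (`sample_csv` is a stand-in for the real file) so that it runs on its own.

```python
import csv
import io

# Two data rows copied from the output above; `sample_csv` stands in for
# the real 'hacker_news.csv' file so the sketch runs without it.
sample_csv = (
    "id,title,url,num_points,num_comments,author,created_at\n"
    "12224879,Interactive Dynamic Video,http://www.interactivedynamicvideo.com/,386,52,ne0phyte,8/4/2016 11:52\n"
    "11919867,Technology ventures: From Idea to Enterprise,https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429,3,1,hswarna,6/17/2016 0:01\n"
)

dict_reader = csv.DictReader(io.StringIO(sample_csv))
sample_headers = dict_reader.fieldnames  # header row, taken from the first line
sample_rows = list(dict_reader)          # one dict per data row, keyed by header

print(sample_headers)
print(sample_rows[0]['title'], int(sample_rows[0]['num_comments']))
```

With this form a column is looked up by name (`row['num_comments']`) rather than by position (`row[4]`), at the cost of building a dictionary for every row.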
# This cell is set up to extract the "Ask HN" and "Show HN" type posts and to quantify
# the number of each over the total period of time that the data file covered.
ask_posts = []
show_posts = []
other_posts = []
for row in hn:
    title = row[1]
    if title.lower().startswith('ask hn'):
        ask_posts.append(row)
    elif title.lower().startswith('show hn'):
        show_posts.append(row)
    else:
        other_posts.append(row)
print('Total Ask HNs Posts = ', len(ask_posts))
print('Total Show HNs Posts = ', len(show_posts))
print('Total Other Posts = ', len(other_posts))
Total Ask HNs Posts = 1744
Total Show HNs Posts = 1162
Total Other Posts = 17194
As shown above, there are 1744 Ask HN posts and 1162 Show HN posts in total. I consider these totals to be reasonable sample sizes for continuing with the planned analysis and drawing reliable conclusions.
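The prefix test used in the cell above can be packaged as a small function, which makes the three-way split easy to check on its own. This is only a sketch: `classify_title` is a name introduced here for illustration, and the first two example titles are made up to exercise the branches.

```python
def classify_title(title):
    """Return 'ask', 'show' or 'other' based on how the title starts."""
    lowered = title.lower()  # make the prefix check case-insensitive
    if lowered.startswith('ask hn'):
        return 'ask'
    if lowered.startswith('show hn'):
        return 'show'
    return 'other'


# The first two titles are made up for illustration; the third appears in the data.
example_titles = [
    'Ask HN: What are you reading?',
    'SHOW HN: My weekend project',
    'Interactive Dynamic Video',
]
labels = [classify_title(title) for title in example_titles]
print(labels)
```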
# This cell is set up to calculate the average number of comments/responses
# for each of the two types of posts: Ask HN and Show HN.
total_ask_comments = 0
for row in ask_posts:
    comments = int(row[4])
    total_ask_comments = total_ask_comments + comments
avg_ask_comments = total_ask_comments / len(ask_posts)
print('Total Ask Comments =', total_ask_comments)
print('\n')
print('Average Ask Comments =', avg_ask_comments)
print('Average Rounded Ask Comments =', round(avg_ask_comments))
total_show_comments = 0
for row in show_posts:
    comments = int(row[4])
    total_show_comments = total_show_comments + comments
avg_show_comments = total_show_comments / len(show_posts)
print('\n')
print('Total Show Comments =', total_show_comments)
print('\n')
print('Average Show Comments =', avg_show_comments)
print('Average Rounded Show Comments =', round(avg_show_comments))
Total Ask Comments = 24483

Average Ask Comments = 14.038417431192661
Average Rounded Ask Comments = 14

Total Show Comments = 11988

Average Show Comments = 10.31669535283993
Average Rounded Show Comments = 10
As shown in the output above, Ask HN posts average about 14 comments per post while Show HN posts average about 10. It does not surprise me that the Ask HN average is the greater of the two.
I would expect a posted question to receive more responses or comments than a post that shows something. People who post questions are looking for responses, whereas people who post something to show are not necessarily looking for any. Questions also invite debate among responders, which may attract more people to comment than something shown on "Hacker News" would.
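The two averaging loops above follow the same pattern, so the calculation can be sketched as one helper. `average_comments` is a hypothetical name introduced here, and the two sample rows are ordinary rows from the earlier output (not Ask HN posts), used only to show the arithmetic.

```python
def average_comments(rows, comments_index=4):
    """Mean of the integer comment counts stored in column `comments_index`."""
    if not rows:
        return 0.0  # avoid dividing by zero on an empty list
    total = sum(int(row[comments_index]) for row in rows)
    return total / len(rows)


# Two rows copied from the earlier output, with 52 and 1 comments.
sample_rows = [
    ['12224879', 'Interactive Dynamic Video', 'http://www.interactivedynamicvideo.com/', '386', '52', 'ne0phyte', '8/4/2016 11:52'],
    ['11919867', 'Technology ventures: From Idea to Enterprise', 'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429', '3', '1', 'hswarna', '6/17/2016 0:01'],
]
print(average_comments(sample_rows))

# The averages reported above follow directly from the printed totals.
print(24483 / 1744, 11988 / 1162)
```

In the notebook the same helper could be called as `average_comments(ask_posts)` and `average_comments(show_posts)`.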
# This cell is set up to calculate the average number of comments per Ask HN post
# for each hour of the day. The number of comments per hour is calculated and then
# the total number of posts per hour. The average number of comments per post for
# each hour is then calculated by dividing the number of comments per hour by the
# number of posts per hour.
import datetime as dt

result_list = []
for row in ask_posts:
    created_at = row[6]
    comments = int(row[4])
    result_list.append([created_at, comments])
print('\n')
print(result_list[:10])

counts_by_hour = {}
comments_by_hour = {}
date_format = "%m/%d/%Y %H:%M"
for row in result_list:
    entry_date = row[0]
    comments = row[1]
    # Keep only the two-digit hour ('00' to '23') of the creation time.
    time = dt.datetime.strptime(entry_date, date_format).strftime("%H")
    if time in counts_by_hour:
        counts_by_hour[time] += 1
        comments_by_hour[time] += comments
    else:
        counts_by_hour[time] = 1
        comments_by_hour[time] = comments
print(comments_by_hour)
print('\n')
print(counts_by_hour)

avg_by_hour = []
for hour in counts_by_hour:
    avg_by_hour.append([hour, comments_by_hour[hour] / counts_by_hour[hour]])
print('\n')
print('Average Comments by Hour')
print('------------------------')
for row in avg_by_hour:
    hour = row[0]
    average = row[1]
    print(hour, round(average, 2))
[['8/16/2016 9:55', 6], ['11/22/2015 13:43', 29], ['5/2/2016 10:14', 1], ['8/2/2016 14:20', 3], ['10/15/2015 16:38', 17], ['9/26/2015 23:23', 1], ['4/22/2016 12:24', 4], ['11/16/2015 9:22', 1], ['2/24/2016 17:57', 1], ['6/4/2016 17:17', 2]]
{'09': 251, '13': 1253, '10': 793, '14': 1416, '16': 1814, '23': 543, '12': 687, '17': 1146, '15': 4477, '21': 1745, '20': 1722, '02': 1381, '18': 1439, '03': 421, '05': 464, '19': 1188, '01': 683, '22': 479, '08': 492, '04': 337, '00': 447, '06': 397, '07': 267, '11': 641}

{'09': 45, '13': 85, '10': 59, '14': 107, '16': 108, '23': 68, '12': 73, '17': 100, '15': 116, '21': 109, '20': 80, '02': 58, '18': 109, '03': 54, '05': 46, '19': 110, '01': 60, '22': 71, '08': 48, '04': 47, '00': 55, '06': 44, '07': 34, '11': 58}

Average Comments by Hour
------------------------
09 5.58
13 14.74
10 13.44
14 13.23
16 16.8
23 7.99
12 9.41
17 11.46
15 38.59
21 16.01
20 21.52
02 23.81
18 13.2
03 7.8
05 10.09
19 10.8
01 11.38
22 6.75
08 10.25
04 7.17
00 8.13
06 9.02
07 7.85
11 11.05
We see in the above output the average comments per Ask HN post by hour. Neither the hours nor averages are in sequential order.
The cell below is set up to change the average comments per hour from the second column to the first column by "swapping" and then to sort them from highest to lowest using the "sorted" function. This should make it a little easier to draw conclusions.
sorted_by_hour = sorted(avg_by_hour)
print(sorted_by_hour)
swap_avg_by_hour = []
for row in avg_by_hour:
    swap_avg_by_hour.append([row[1], row[0]])
sorted_swap = sorted(swap_avg_by_hour, reverse=True)
print('\n')
print(sorted_swap)
print('\n')
time_format = "%H"
print('Top 5 Hours for Ask Posts Comments Across All Days of Week ')
print('-----------------------------------------------------------')
for top_5_hours in sorted_swap[:5]:
    avg_comments = top_5_hours[0]
    hour = top_5_hours[1]
    hour1 = dt.datetime.strptime(hour, time_format).strftime("%H:%M")
    print(hour1, round(avg_comments, 2), ' average comments per post')
[['00', 8.127272727272727], ['01', 11.383333333333333], ['02', 23.810344827586206], ['03', 7.796296296296297], ['04', 7.170212765957447], ['05', 10.08695652173913], ['06', 9.022727272727273], ['07', 7.852941176470588], ['08', 10.25], ['09', 5.5777777777777775], ['10', 13.440677966101696], ['11', 11.051724137931034], ['12', 9.41095890410959], ['13', 14.741176470588234], ['14', 13.233644859813085], ['15', 38.5948275862069], ['16', 16.796296296296298], ['17', 11.46], ['18', 13.20183486238532], ['19', 10.8], ['20', 21.525], ['21', 16.009174311926607], ['22', 6.746478873239437], ['23', 7.985294117647059]]

[[38.5948275862069, '15'], [23.810344827586206, '02'], [21.525, '20'], [16.796296296296298, '16'], [16.009174311926607, '21'], [14.741176470588234, '13'], [13.440677966101696, '10'], [13.233644859813085, '14'], [13.20183486238532, '18'], [11.46, '17'], [11.383333333333333, '01'], [11.051724137931034, '11'], [10.8, '19'], [10.25, '08'], [10.08695652173913, '05'], [9.41095890410959, '12'], [9.022727272727273, '06'], [8.127272727272727, '00'], [7.985294117647059, '23'], [7.852941176470588, '07'], [7.796296296296297, '03'], [7.170212765957447, '04'], [6.746478873239437, '22'], [5.5777777777777775, '09']]

Top 5 Hours for Ask Posts Comments Across All Days of Week
-----------------------------------------------------------
15:00 38.59 average comments per post
02:00 23.81 average comments per post
20:00 21.52 average comments per post
16:00 16.8 average comments per post
21:00 16.01 average comments per post
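The swap-and-sort step above can also be written without building a second list, by passing a `key` function to `sorted`. This is an alternative sketch rather than the method used in this notebook. `avg_by_hour_sample` holds six of the [hour, average] pairs copied from the output above.

```python
# Six [hour, average] pairs copied from the output above.
avg_by_hour_sample = [
    ['09', 5.5777777777777775],
    ['15', 38.5948275862069],
    ['02', 23.810344827586206],
    ['20', 21.525],
    ['16', 16.796296296296298],
    ['21', 16.009174311926607],
]

# Sort on the second element (the average), highest first, leaving the
# [hour, average] column order untouched.
by_average = sorted(avg_by_hour_sample, key=lambda pair: pair[1], reverse=True)

for hour, average in by_average[:5]:
    print(f"{hour}:00 {average:.2f} average comments per post")
```

One difference worth knowing: when two hours have exactly the same average, the swapped version breaks the tie by comparing the hour strings, while the `key` version keeps the tied rows in their original order.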
The top 5 hours with the highest average comments per Ask HN post are shown above. I tried to find out which time zone "Hacker News" uses to log posts but was not able to, so the conclusions here assume the Eastern Time Zone, which seems plausible based on my observations.
The top average comments per post (38.59) occurs at the 15:00 hour of the day, which really means between 3:00 PM and 4:00 PM. This would make sense to me if the majority of those who respond with comments are of school age, since that is about when students get out of school. However, the age of responders is not given in this data set.
Further, the results above suggest that 3:00 PM to 4:00 PM is the peak period for comments per post when all days of the week are pooled together. This may not hold for Saturday and Sunday, since school is not attended on the weekend. Also, if we plotted average comments per post by hour, the distribution would not be unimodal with only one peak (at 15:00) but multimodal with 3 peaks: the top 3 times shown above (3:00 PM, 2:00 AM and 8:00 PM). This causes me to suspect that the day of the week matters for which hour has the highest average comments per post.
This prompts me to dig deeper and break out the data by day as well!
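To make the shape of that distribution visible without a plotting library, the hourly averages can be drawn as a text bar chart (strictly a bar chart of averages by hour rather than a histogram). The sketch below uses the 24 averages from the "Average Comments by Hour" output above, rounded to two decimals. The `#` scale of one mark per comment is an arbitrary choice.

```python
# Average comments per Ask HN post for hours 00 to 23, rounded to two
# decimals and copied from the "Average Comments by Hour" output above.
hourly_avg = [
    8.13, 11.38, 23.81, 7.80, 7.17, 10.09, 9.02, 7.85,
    10.25, 5.58, 13.44, 11.05, 9.41, 14.74, 13.23, 38.59,
    16.80, 11.46, 13.20, 10.80, 21.52, 16.01, 6.75, 7.99,
]

# One '#' per comment (rounded), so higher averages give longer bars.
for hour, average in enumerate(hourly_avg):
    bar = '#' * round(average)
    print(f"{hour:02d}:00 {bar} {average:.2f}")

# Hours ranked by average, highest first; the first three are the peaks
# discussed in the text.
top_three = sorted(range(len(hourly_avg)), key=lambda h: hourly_avg[h], reverse=True)[:3]
print(top_three)
```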
# This cell is set up to call out day of the week in format
# ('Monday', 'Tuesday' ...) and calculate number of posts
# and number of comments by day of the week.
# Then I use the same procedures as in a previous cell further
# above to calculate average number of comments per post by day.
counts_by_day = {}
comments_by_day = {}
date_format = "%m/%d/%Y %H:%M"
for row in result_list:
    entry_date = row[0]
    comments = row[1]
    day = dt.datetime.strptime(entry_date, date_format).strftime("%A")
    if day in counts_by_day:
        counts_by_day[day] += 1
        comments_by_day[day] += comments
    else:
        counts_by_day[day] = 1
        comments_by_day[day] = comments
print('\n')
print(comments_by_day)
print('\n')
print(counts_by_day)
avg_by_day = []
for day in counts_by_day:
    avg_by_day.append([day, comments_by_day[day] / counts_by_day[day]])
print('\n')
print(avg_by_day)
{'Tuesday': 3051, 'Sunday': 3125, 'Monday': 3589, 'Thursday': 3334, 'Saturday': 2971, 'Friday': 4758, 'Wednesday': 3655}

{'Tuesday': 288, 'Sunday': 162, 'Monday': 285, 'Thursday': 254, 'Saturday': 190, 'Friday': 271, 'Wednesday': 294}

[['Tuesday', 10.59375], ['Sunday', 19.290123456790123], ['Monday', 12.592982456140351], ['Thursday', 13.125984251968504], ['Saturday', 15.636842105263158], ['Friday', 17.55719557195572], ['Wednesday', 12.431972789115646]]
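Looking at the totals per day, every day has a reasonably high number of comments (at least 2971), spread over 162 to 294 posts. I consider that enough to break the results down by hour within each day and draw conclusions from them. Note, though, that the sample size behind each average is the number of posts, not comments: with 24 hours in a day, each hourly average rests on only about 7 to 12 posts on average, so a single heavily commented post can move an hourly figure a great deal.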
It's interesting that the number of posts per day is lowest on Saturday and Sunday. Average comments per post within each day is highest for Friday, Saturday and Sunday.
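Before breaking the averages down by hour within each day, it can help to see how many posts fall into each (day, hour) combination. The sketch below groups on a `(day, hour)` tuple using `collections.defaultdict`. It is an illustration rather than the method used in the next cell. The ten sample rows are the first ten `result_list` entries printed earlier, and `day_names`, `posts_per_cell` and `comments_per_cell` are names introduced here.

```python
import datetime as dt
from collections import defaultdict

# First ten [created_at, comments] entries from the result_list output above.
sample = [
    ['8/16/2016 9:55', 6], ['11/22/2015 13:43', 29], ['5/2/2016 10:14', 1],
    ['8/2/2016 14:20', 3], ['10/15/2015 16:38', 17], ['9/26/2015 23:23', 1],
    ['4/22/2016 12:24', 4], ['11/16/2015 9:22', 1], ['2/24/2016 17:57', 1],
    ['6/4/2016 17:17', 2],
]

date_format = "%m/%d/%Y %H:%M"
# weekday() returns 0 for Monday through 6 for Sunday.
day_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
             'Friday', 'Saturday', 'Sunday']

posts_per_cell = defaultdict(int)     # (day, hour) -> number of posts
comments_per_cell = defaultdict(int)  # (day, hour) -> total comments

for created_at, comments in sample:
    parsed = dt.datetime.strptime(created_at, date_format)
    cell = (day_names[parsed.weekday()], parsed.strftime("%H"))
    posts_per_cell[cell] += 1
    comments_per_cell[cell] += comments

for cell in sorted(posts_per_cell):
    average = comments_per_cell[cell] / posts_per_cell[cell]
    print(cell, posts_per_cell[cell], 'post(s),', round(average, 2), 'comments per post')
```

Run on all 1744 Ask HN posts, the post counts show how much data stands behind each hourly average; spread evenly over 7 x 24 = 168 combinations, that is roughly 10 posts per combination.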
# This cell is set up to break out post date into date and hour and then
# create a list with 3 elements: date, hour and comment qty.
# Then, loops are set up separately within each day ('Monday', 'Tuesday' ...)
# to determine average number of comments per post by hour within each day.
import datetime as dt
results_list2 = []
for row in ask_posts:
    created_at = row[6]
    date, time = created_at.split()
    comments = int(row[4])
    results_list2.append([date, time, comments])
print('\n')
print(results_list2[:10])
print('\n')
date_format = "%m/%d/%Y"
hour_format = "%H:%M"
results_list3 = []
for row in results_list2:
    entry_date = row[0]
    entry_hour = row[1]
    comments = row[2]
    date2 = dt.datetime.strptime(entry_date, date_format).strftime("%A")
    hour2 = dt.datetime.strptime(entry_hour, hour_format).strftime("%H")
    results_list3.append([date2, hour2, comments])
print(results_list3[:10])
# The same steps are applied to each day in turn: keep that day's rows, total
# the posts and comments per hour, then print the hourly averages in hour order.
time_format = "%H"
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
for day in days:
    counts_for_day = {}
    comments_for_day = {}
    for row in results_list3:
        if row[0] != day:
            continue  # skip rows that belong to other days
        hour = row[1]
        comments = row[2]
        if hour in counts_for_day:
            counts_for_day[hour] += 1
            comments_for_day[hour] += comments
        else:
            counts_for_day[hour] = 1
            comments_for_day[hour] = comments

    avg_by_hour_for_day = []
    for hour in counts_for_day:
        avg_by_hour_for_day.append([hour, comments_for_day[hour] / counts_for_day[hour]])

    heading = 'Average Comments by Hour on ' + day + 's'
    print('\n')
    print(heading)
    print('-' * len(heading))
    for row in sorted(avg_by_hour_for_day):
        hour1 = dt.datetime.strptime(row[0], time_format).strftime("%H:%M")
        avg_comments = row[1]
        print(hour1, round(avg_comments, 2))
[['8/16/2016', '9:55', 6], ['11/22/2015', '13:43', 29], ['5/2/2016', '10:14', 1], ['8/2/2016', '14:20', 3], ['10/15/2015', '16:38', 17], ['9/26/2015', '23:23', 1], ['4/22/2016', '12:24', 4], ['11/16/2015', '9:22', 1], ['2/24/2016', '17:57', 1], ['6/4/2016', '17:17', 2]]

[['Tuesday', '09', 6], ['Sunday', '13', 29], ['Monday', '10', 1], ['Tuesday', '14', 3], ['Thursday', '16', 17], ['Saturday', '23', 1], ['Friday', '12', 4], ['Monday', '09', 1], ['Wednesday', '17', 1], ['Saturday', '17', 2]]

Average Comments by Hour on Mondays
-----------------------------------
00:00 9.4
01:00 10.5
02:00 3.6
03:00 7.0
04:00 8.5
05:00 2.9
06:00 19.71
07:00 6.64
08:00 8.43
09:00 5.0
10:00 17.33
11:00 5.71
12:00 11.0
13:00 5.92
14:00 10.25
15:00 97.2
16:00 5.52
17:00 6.62
18:00 7.62
19:00 6.83
20:00 3.45
21:00 11.8
22:00 4.6
23:00 5.67

Average Comments by Hour on Tuesdays
------------------------------------
00:00 2.85
01:00 17.33
02:00 7.87
03:00 10.0
04:00 8.75
05:00 2.33
06:00 3.4
07:00 2.6
08:00 3.86
09:00 2.56
10:00 8.75
11:00 6.0
12:00 5.43
13:00 21.08
14:00 14.5
15:00 22.24
16:00 5.08
17:00 7.11
18:00 28.56
19:00 5.06
20:00 17.33
21:00 7.8
22:00 10.93
23:00 4.1

Average Comments by Hour on Wednesdays
--------------------------------------
00:00 10.0
01:00 5.2
02:00 6.46
03:00 5.6
04:00 2.5
05:00 27.6
06:00 16.14
07:00 16.17
08:00 17.89
09:00 11.57
10:00 10.78
11:00 3.57
12:00 17.0
13:00 27.21
14:00 15.47
15:00 25.72
16:00 9.91
17:00 5.47
18:00 7.33
19:00 6.05
20:00 10.91
21:00 12.29
22:00 6.07
23:00 8.3

Average Comments by Hour on Thursdays
-------------------------------------
00:00 11.89
01:00 10.73
02:00 3.6
03:00 10.36
04:00 5.11
05:00 9.0
06:00 6.8
07:00 2.5
08:00 3.29
09:00 2.71
10:00 8.33
11:00 40.71
12:00 6.5
13:00 12.44
14:00 8.74
15:00 60.47
16:00 8.73
17:00 6.77
18:00 19.5
19:00 4.43
20:00 7.5
21:00 5.73
22:00 3.77
23:00 6.56

Average Comments by Hour on Fridays
-----------------------------------
00:00 5.0
01:00 10.75
02:00 17.89
03:00 8.78
04:00 6.0
05:00 16.38
06:00 6.75
07:00 2.5
08:00 8.17
09:00 6.2
10:00 18.67
11:00 4.14
12:00 9.79
13:00 4.64
14:00 6.0
15:00 38.8
16:00 43.0
17:00 18.72
18:00 7.18
19:00 18.89
20:00 45.85
21:00 29.14
22:00 9.44
23:00 6.77

Average Comments by Hour on Saturdays
-------------------------------------
00:00 5.25
01:00 15.29
02:00 106.11
03:00 5.25
04:00 8.83
05:00 4.14
06:00 4.1
07:00 9.33
08:00 7.2
09:00 11.2
10:00 10.75
11:00 3.71
12:00 8.29
13:00 7.2
14:00 20.91
15:00 17.0
16:00 13.0
17:00 7.2
18:00 16.0
19:00 9.6
20:00 8.67
21:00 26.0
22:00 3.25
23:00 4.12

Average Comments by Hour on Sundays
-----------------------------------
00:00 53.0
01:00 12.33
02:00 13.5
03:00 9.0
04:00 8.67
05:00 17.25
06:00 4.5
07:00 9.2
08:00 19.57
09:00 2.62
10:00 36.75
11:00 19.88
12:00 9.86
13:00 11.43
14:00 23.4
15:00 3.83
16:00 41.86
17:00 36.22
18:00 6.89
19:00 25.54
20:00 50.45
21:00 13.09
22:00 6.17
23:00 28.5
The peak hours for average number of comments per post are not the same for each day of the week!!
For Monday through Friday, 3:00 PM is still the dominant peak, either the top one or in the top 3. This would make sense if the majority of the "comment responders" were students just after getting out of school between 3:00 PM and 4:00 PM.
However, on Saturday and Sunday the peak hours for average comments per post are either "late-late" night (12:00 midnight - 3:00 AM) or "mid-evening" (8:00 PM - 10:00 PM). The mid-evening peak also appears on Friday night.
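The peak hours for a given day can be read off the tables by eye, or ranked with a few lines of code. The sketch below does this for Monday and Saturday only, using the 24 hourly averages for each copied from the output above. `top_hours` is a hypothetical helper introduced here, not part of the notebook.

```python
# Hourly averages (00:00 to 23:00) copied from the Monday and Saturday
# tables in the output above.
averages = {
    'Monday': [9.4, 10.5, 3.6, 7.0, 8.5, 2.9, 19.71, 6.64, 8.43, 5.0, 17.33, 5.71,
               11.0, 5.92, 10.25, 97.2, 5.52, 6.62, 7.62, 6.83, 3.45, 11.8, 4.6, 5.67],
    'Saturday': [5.25, 15.29, 106.11, 5.25, 8.83, 4.14, 4.1, 9.33, 7.2, 11.2, 10.75, 3.71,
                 8.29, 7.2, 20.91, 17.0, 13.0, 7.2, 16.0, 9.6, 8.67, 26.0, 3.25, 4.12],
}


def top_hours(hourly_averages, n=3):
    """Return the n (hour label, average) pairs with the highest averages."""
    ranked = sorted(enumerate(hourly_averages), key=lambda pair: pair[1], reverse=True)
    return [(f"{hour:02d}:00", average) for hour, average in ranked[:n]]


for day, hourly_averages in averages.items():
    print(day, top_hours(hourly_averages))
```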
So, during which hours should I create a question type post to have a higher chance of receiving comments?
Well the answer is: "It depends".
First, it depends on the type of question I intend to pose.
Secondly, it depends on whether I have access during week days to "Hacker News" between 3:00 PM and 4:00 PM. (i.e. possibly restricted by my employer if I'm employed). If I'm unemployed or retired, then I would have freedom to pose questions in that time period.
Let's say I do have access to posting in "Hacker News" on week days between 3:00 PM - 4:00 PM. If I'm interested in comments from the younger generation, then I would post in that time frame on either Monday (97 comments per post) or Thursday (60 comments per post). On the other hand, if the type of question I want to pose is totally irrelevant to the younger generation, then I would most likely choose the best posting time period on the weekend.
If I'm a working person without access to "Hacker News" on week days at 3:00 PM, then I would most likely post "mid-evening" (8:00 PM - 10:00 PM) on either Friday (46 comments per post) or Sunday (50 comments per post).
Saturday morning at 2:00 AM had a higher average (106 comments per post), but I'm not keen on staying up that late just to pose a question.
Notice that all of the day-specific averages cited above (the lowest was 46 and the highest was 106) are greater than the highest average across all days of the week (39). Digging deeper was worth the extra time and effort to get more detail and improve the accuracy of the conclusion!