#!/usr/bin/env python
# coding: utf-8
# # Days of the Week of each Month through the Years
# ### [Album Shen](http://albumshen.com/)
# *Wednesday the 21st, November, 2018*
#
# I've been helping my 5th grade sister with her math league problems, and some brainteasers involving leap years came up. This sparked a seemingly trivial (and probably actually trivial) question: how do the days of the week compare to one another in how often they appear in each month of our calendar? One would guess that over time the frequency of each weekday in each month would more or less even out, right? (i.e. an equal number of mon, tues, wed, etc.) Let's find out by taking a look at how the months on our calendar actually cycle through the days of the week, and when (or if) the sequence comes back to itself. Some questions we may answer are:
#
# * Over time, will there be more Mondays in January than Tuesdays in January?
# * How many Wednesdays have there been in March since the start of the Gregorian calendar?
# * If I host an event every 5th Thursday in February, would I be hosting more events than if I were to do it every 5th Friday?
#
# There are a number of [existing algorithms that can determine the day of the week](https://en.wikipedia.org/wiki/Determination_of_the_day_of_the_week) for any given date. But I wanted to see how the distribution of these days in each month of the year plays out over the course of an entire Gregorian cycle. We'll be running a program to evaluate the frequency of each weekday as it occurs in each month throughout the years since our current calendar's conception.
#
# **Fun fact:** The first day of the [Gregorian calendar](https://en.wikipedia.org/wiki/Gregorian_calendar) was October 15, 1582. A *Friday*.
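#
# As a small aside, here is one of those existing algorithms in sketch form. This is Sakamoto's method, not the approach used in the rest of this notebook, and the function name `sakamoto_weekday` is made up for this example. It returns 0 for Sunday through 6 for Saturday, matching the `%w` convention used further down.
# In[ ]:
from datetime import date

def sakamoto_weekday(year, month, day):
    # Per-month offsets used by the algorithm (Jan..Dec)
    offsets = [0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4]
    # Treat Jan and Feb as part of the previous year so that
    # the leap day is counted at the end of the "year"
    if month < 3:
        year -= 1
    return (year + year//4 - year//100 + year//400 + offsets[month-1] + day) % 7

# First day of the Gregorian calendar: 5 means Friday
print(sakamoto_weekday(1582, 10, 15))
# Cross-check against the standard library (isoweekday: Mon=1 .. Sun=7)
print(date(1582, 10, 15).isoweekday() % 7)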
# In[3]:
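# Setting up the libraries & functions we'll be using
import pandas as pd
from IPython.display import display_html, HTML
from datetime import date, datetime, timedelta as td
# Note: plotly 4+ moved these two modules to the separate chart_studio
# package (chart_studio.grid_objs and chart_studio.plotly)
from plotly.grid_objs import Grid, Column
import plotly.plotly as py
import time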
# Dictionaries for dataframe column & row headers
weekdays = {0:"sun", 1:"mon", 2:"tues", 3:"wed",
4:"thurs", 5:"fri", 6:"sat"}
months = {0:"jan", 1:"feb", 2:"mar", 3:"apr",
4:"may", 5:"jun", 6:"jul", 7:"aug",
8:"sept", 9:"oct", 10:"nov", 11:"dec"}
ordinal = {0:"1st", 1:"2nd", 2:"3rd", 3:"4th",
4:"5th", 5:"6th", 6:"7th"}
# Print the dates + weekday information given the
# intervals of days from a specified origin date
def printWeekDays(daysFrom, originDate):
    DayOnes = pd.to_datetime(daysFrom, unit='D', origin=pd.Timestamp(originDate))
    for day in DayOnes:
        print(day.strftime("%Y %b %d: %A (Day %w)"))
# Display dataframes side-by-side with their names on top
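def disp_dfs(*args):
    html_str = ''
    for df in args:
        # The wrapper tags here are an assumed minimal reconstruction:
        # each dataframe's name as a heading, then its table
        html_str += '<div style="display:inline-block">'\
                    '<h4>'\
                    +df.name+'</h4>'\
                    +df.to_html()+'</div>'
    display_html(html_str.replace('table', 'table style=display:inline'), raw=True)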
# Date ranges using datetime dates "date(%Y,%m,%d)" as input
def dateRange(start_date, end_date):
    for n in range(int((end_date - start_date).days)):
        yield start_date + td(n)
# #### Weekdays & Leap Cycles
#
# How long does it take for a given date to cycle back and coincide on the same day of the week again?
#
# The day of the week of any given date shifts forward 1 day for each nonleap year, and 2 days for each leap year.
# In[4]:
# What a difference 4 years make
daysFrom = [0, 365+1, 365*2+1, 365*3+1, 365*4+1]
printWeekDays(daysFrom, '2000-1-1')
# Over the course of each leap year interval, the total shift is by 5 (or -2) weekdays.
#
# So after 7 intervals (4*7 = 28 years), we should be back to the same day of the week on that date of the year.
# In[5]:
# 7 leap cycles
daysFrom = [1461*n for n in range(8)]
printWeekDays(daysFrom, '2000-1-1')
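# The arithmetic behind those shifts is just remainders modulo 7. A quick sketch using only the standard library (the variable name `same_weekday` is illustrative):
# In[ ]:
from datetime import date

print(365 % 7)         # 1: a nonleap year shifts a date forward 1 weekday
print(366 % 7)         # 2: a leap year shifts it forward 2 weekdays
print(1461 % 7)        # 5: one 4-year leap interval (3*365 + 366 days)
print((7 * 1461) % 7)  # 0: seven intervals (28 years) is a whole number of weeks

# Check with actual dates, staying clear of century years
same_weekday = date(2000, 1, 1).weekday() == date(2028, 1, 1).weekday()
print(same_weekday)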
# #### The Lesser Known Century Rule
# 28-year intervals maintain the same day of the week, but this does not account for an additional adjustment: every century year that is not divisible by 400 is not a leap year. So if we iterate every 28 years, we start to see a drift, overcounting by a day in the date each time we pass such a century.
# In[4]:
# Drifting through the centuries
daysFrom = [1461*7*n for n in range(10)]
printWeekDays(daysFrom, '2000-1-1')
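# The century rule itself fits in one line. A minimal sketch (the helper name `is_leap` is illustrative; the standard library's `calendar.isleap` implements the same rule):
# In[ ]:
import calendar
from datetime import date, timedelta

def is_leap(year):
    # Divisible by 4, except century years, unless divisible by 400
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

for year in (1900, 2000, 2100, 2400):
    print(year, is_leap(year), calendar.isleap(year))

# Stepping 4 x 28 years' worth of days from 2000-01-01 crosses the
# nonleap year 2100, so we land one day past New Year's Day
drifted = date(2000, 1, 1) + timedelta(days=1461 * 7 * 4)
print(drifted)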
# A full Gregorian calendar cycle is 400 years, with 3 leap years omitted because they are century years nondivisible by 400.
#
# The total number of days in a full Gregorian cycle is 146097 = (400 yr * 365 days/yr) + 97 days from leap years.
# In[6]:
# Full Greg
daysFrom = [0, 146097]
printWeekDays(daysFrom, '1700-1-1')
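# One way to double-check that day count is to tally the leap years in any 400-year span. A sketch using `calendar.isleap` (the span 1600-1999 is an arbitrary choice, and the variable names are made up):
# In[ ]:
import calendar

leap_years = sum(calendar.isleap(y) for y in range(1600, 2000))
total_days = 400 * 365 + leap_years
print(leap_years)      # 100 multiples of 4, minus 1700, 1800 and 1900
print(total_days)
print(total_days % 7)  # remainder when splitting the cycle into whole weeks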
# 146097 days is divisible by 7, so we see that every 400 years we are back to the day of the week in which we started, on the date of the calendar in which we started.
#
# But each month occurs 400 times per cycle, and 400 is not divisible by the 7 days of the week. And because every month other than a nonleap February has a number of days not divisible by 7 (i.e. 29, 30, or 31), there will be extra counts for the first one to three days of the week that the month began on. So we should expect an unequal distribution of weekday frequencies for each month.
# In[6]:
# Number of days in each month for each type of year
nonleap = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
leap = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
# Number of days shifted month-to-month
nonShift = [0]*12
leapShift = [0]*12
# Dataframes for each type of year's distribution.
# Every month has 28 or more days, so we can start with
# at least 4 counts for each day of the week.
nonDist = pd.DataFrame([[4]*7]*12)
nonDist.name = ('Distribution of Days in Each Month by Order '
                'of Appearance in Non-Leap Years')
leapDist = pd.DataFrame([[4]*7]*12)
leapDist.name = ('Distribution of Days in Each Month by Order '
                 'of Appearance in Leap Years')
# Then we can add in extra days depending on how many days
# over 28 each month has.
for month in range(0,12):
    nonShift[month] = nonleap[month]%7
    for extraDay in range(0, nonShift[month]):
        nonDist.at[month, extraDay] = 5
    leapShift[month] = leap[month]%7
    for extraDay in range(0, leapShift[month]):
        leapDist.at[month, extraDay] = 5
nonDist.rename(columns=ordinal, index=months, inplace=True)
leapDist.rename(columns=ordinal, index=months, inplace=True)
disp_dfs(nonDist, leapDist)
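# To see one row of these tables in action, here is a sketch that counts the weekdays in a single month with the standard library. February 2016 is just an example month, and `weekday_counts` is a made-up helper name.
# In[ ]:
import calendar
from collections import Counter
from datetime import date

def weekday_counts(year, month):
    names = ['sun', 'mon', 'tues', 'wed', 'thurs', 'fri', 'sat']
    # monthrange returns (weekday of the 1st, number of days in the month)
    n_days = calendar.monthrange(year, month)[1]
    # isoweekday() % 7 maps Sun..Sat onto 0..6
    return Counter(names[date(year, month, d).isoweekday() % 7]
                   for d in range(1, n_days + 1))

# A leap-year February has 29 days, so only the weekday it
# starts on shows up a 5th time
feb_2016 = weekday_counts(2016, 2)
print(feb_2016)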
# ### The Distribution of Days of the Week for Each Month Over 400 Years
#
# If we were 5th graders competing in a pencil + paper math contest, we could use the tables above, count the number of nonleap and leap years, and make adjustments for which weekday each year begins on.
#
# But the easiest way to get this distribution would be to iterate through a period of 400 years. On a computer it only takes about a minute.
# In[67]:
# Initialize blank dataframe
counts = pd.DataFrame([[0]*7]*12)
counts.rename(columns=weekdays, index=months, inplace=True)
# Count range (end_date non-inclusive)
start_date = date(1582,10,15)
end_date = date.today()
# Columns for making grid
yList = []
mList = list(months.values())
wList = list(weekdays.values())
current_columns = []
# Adds a column every year
def addColumns(columns, wList, y):
    for i in wList:
        y_col_name = '{year}_{weekday}_{header}'.format(year=y, weekday=i, header='year')
        y_col = Column((list([y]*12)), y_col_name)
        columns.append(y_col)
        mListInt = [(j*10+(1.5*wList.index(i))) for j in range(0, 12)]
        m_col_name = '{year}_{weekday}_{header}'.format(year=y, weekday=i, header='month')
        m_col = Column(mListInt, m_col_name)
        columns.append(m_col)
        c_col_name = '{year}_{weekday}_{header}'.format(year=y, weekday=i, header='count')
        c_col = Column(counts[i].tolist(), c_col_name)
        columns.append(c_col)
# Let's count
for single_date in dateRange(start_date, end_date):
    # Updates grid/csv on new year's day of every 4th year,
    # recording the counts through the end of the previous year
    if single_date.month == 1 and single_date.day == 1 and single_date.year % 4 == 0:
        y = single_date.year - 1
        addColumns(current_columns, wList, y)
        yList.append(str(y))
    # Updates count in dataframe (month index 0-11, weekday 0 = sun)
    m = single_date.month - 1
    w = int(single_date.strftime("%w"))
    counts.iloc[m, w] += 1
# Update if end date not on new year's day
if (end_date.month, end_date.day) != (1, 1):
    addColumns(current_columns, wList, end_date.year)
    yList.append(str(end_date.year))
# Upload grid to plotly
countGrid = Grid(current_columns)
url = py.grid_ops.upload(countGrid, 'weekday_counter_1582_2018_grid'+str(time.time()), auto_open=False)
url
# I ran and stored the final dataframe for a full Gregorian cycle to '400_years.csv'. Drumroll please.
# In[7]:
gregCycle = pd.read_csv('400_years.csv', index_col=0)
gregCycle.name = 'Number of Weekdays That Occur in Each Month in Each Gregorian Calendar Cycle'
# To check that we did in fact count all the days
print ("Counted " + str(gregCycle.values.sum()) + " days")
disp_dfs(gregCycle)
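# Since every month with the same length and the same starting weekday contributes the same counts, the full cycle can also be tallied month by month rather than day by day. A sketch of that shortcut, relying on the fact that any 400 consecutive years give the same totals (2000-2399 is an arbitrary choice, and the names `totals` and `cycle_counts` are made up):
# In[ ]:
import calendar
import pandas as pd

day_names = ['sun', 'mon', 'tues', 'wed', 'thurs', 'fri', 'sat']
month_names = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
               'jul', 'aug', 'sept', 'oct', 'nov', 'dec']
totals = [[0] * 7 for _ in range(12)]

for year in range(2000, 2400):
    for month in range(1, 13):
        # monthrange gives the weekday of the 1st (Mon=0 .. Sun=6)
        # and the number of days in the month
        first, n_days = calendar.monthrange(year, month)
        first = (first + 1) % 7  # convert to Sun=0 .. Sat=6
        for w in range(7):
            # Every weekday occurs 4 times; the first (n_days - 28)
            # weekdays of the month occur a 5th time
            extra = 1 if (w - first) % 7 < n_days - 28 else 0
            totals[month - 1][w] += 4 + extra

cycle_counts = pd.DataFrame(totals, index=month_names, columns=day_names)
print(cycle_counts)
print("Counted", cycle_counts.values.sum(), "days")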
# ### Animated Graph
# In[68]:
# Figure
figure = {
'data': [],
'layout': {},
'frames': [],
'config': {'scrollzoom': True}
}
# Fill in most of layout
figure['layout']['xaxis'] = {'title': 'Month', 'gridcolor': '#FFFFFF', 'range': [-2, 120], 'zeroline': False,
'tickvals': [(i*10+3) for i in range(0, 12)], 'ticktext': mList}
figure['layout']['yaxis'] = {'title': 'Days Counted', 'type': 'linear', 'range': [0, 450], 'gridcolor': '#FFFFFF', 'autorange':False}
figure['layout']['title'] = 'Counting the Days of Each Weekday in Each Month'
figure['layout']['hovermode'] = 'x'
figure['layout']['plot_bgcolor'] = 'rgb(223, 232, 243)'
figure['layout']['autosize'] = True
# Year Slider
sliders_dict = {
'active': 0,
'yanchor': 'top',
'xanchor': 'left',
'currentvalue': {
'font': {'size': 20},
'prefix': 'Year:',
'visible': True,
'xanchor': 'right'
},
'transition': {'duration': 100, 'easing': 'cubic-in-out'},
'pad': {'b': 10, 't': 50},
'len': 0.9,
'x': 0.1,
'y': 0,
'steps': [],
}
# Play & Pause
figure['layout']['updatemenus'] = [
{
'buttons': [
{
'args': [None, {'frame': {'duration': 300, 'redraw': False},
'fromcurrent': True, 'transition': {'duration': 400, 'easing': 'quadratic-in-out'}}],
'label': 'Play',
'method': 'animate'
},
{
'args': [[None], {'frame': {'duration': 0, 'redraw': False}, 'mode': 'immediate',
'transition': {'duration': 0}}],
'label': 'Pause',
'method': 'animate'
},
{
'args': [{'yaxis.autorange': True, 'xaxis.autorange': True}],
'label': 'Rescale',
'method': 'relayout'
},
],
'direction': 'left',
'pad': {'r': 10, 't': 87},
'showactive': False,
'type': 'buttons',
'x': 0.1,
'xanchor': 'right',
'y': 0,
'yanchor': 'top'
}
]
# Custom marker styles
color = {
'sun': 'rgb(250, 249, 20)', 'mon': 'rgb(250, 20, 5)', 'tues': 'rgb(50, 170, 255)', 'wed': 'rgb(222, 182, 0)',
'thurs': 'rgb(90, 110, 250)', 'fri': 'rgb(115, 211, 143)', 'sat': 'rgb(20, 211, 43)'
}
symbol = {
'sun': 'circle-open-dot', 'mon': 'square-cross', 'tues': 'star-diamond', 'wed': 'hexagram',
'thurs': 'diamond', 'fri': 'pentagon', 'sat': 'star'
}
line_color = {
'sun': 'rgb(250, 99, 220)', 'mon': 'rgb(230, 99, 250)', 'tues': 'rgb(99, 110, 250)', 'wed': 'rgb(222, 222, 44)',
'thurs': 'rgb(50, 170, 255)', 'fri': 'rgb(115, 211, 143)', 'sat': 'rgb(220, 111, 243)'
}
gradient_color = {
'sun': 'rgb(0, 0, 0)', 'mon': 'rgb(230, 20, 0)', 'tues': 'rgb(22, 55, 250)', 'wed': 'rgb(222, 140, 0)',
'thurs': 'rgb(50, 170, 255)', 'fri': 'rgb(115, 211, 143)', 'sat': 'rgb(22, 22, 111)'
}
gradient_type = {
'sun': 'radial', 'mon': 'horizontal', 'tues': 'vertical', 'wed': 'horizontal',
'thurs': 'vertical', 'fri': 'radial', 'sat': 'radial'
}
set_size = 6
set_opacity = 0.6
set_line_width = 3
# Import data from grid
col_name_template = '{year}_{weekday}_{header}'
year = yList[0]
for day in wList:
    data_dict = {
        'xsrc': countGrid.get_column_reference(col_name_template.format(
            year=year, weekday=day, header='month'
        )),
        'ysrc': countGrid.get_column_reference(col_name_template.format(
            year=year, weekday=day, header='count'
        )),
        'mode': 'markers',
        'textsrc': countGrid.get_column_reference(col_name_template.format(
            year=year, weekday=day, header='month'
        )),
        'hoverinfo': 'y+name',
        'marker': {
            'size': set_size,
            'symbol': symbol[day],
            'color': color[day],
            'opacity': set_opacity,
            'line': {'color': line_color[day], 'width': set_line_width },
            'gradient': {'color': gradient_color[day], 'type':gradient_type[day]}
        },
        'name': day
    }
    figure['data'].append(data_dict)
# Updating frames
for year in yList:
    frame = {'data': [], 'name': str(year), 'layout':[]}
    for day in wList:
        data_dict = {
            'xsrc': countGrid.get_column_reference(col_name_template.format(
                year=year, weekday=day, header='month'
            )),
            'ysrc': countGrid.get_column_reference(col_name_template.format(
                year=year, weekday=day, header='count'
            )),
            'mode': 'markers',
            'textsrc': countGrid.get_column_reference(col_name_template.format(
                year=year, weekday=day, header='month'
            )),
            'marker': {
                'size': set_size,
                'symbol': symbol[day],
                'color': color[day],
                'opacity': set_opacity,
                'line': {'color': line_color[day], 'width': set_line_width },
                'gradient': {'color': gradient_color[day], 'type':'radial'}
            },
            'name': day,
        }
        frame['data'].append(data_dict)
    layout_dict = {
        'yaxis': {'autorange': True}
    }
    frame['layout'].append(layout_dict)
    figure['frames'].append(frame)
    slider_step = {'args': [
        [year],
        {'frame': {'duration': 30, 'redraw': False},
         'mode': 'immediate',
         'transition': {'duration': 10}}
        ],
        'label': year,
        'method': 'animate'}
    sliders_dict['steps'].append(slider_step)
figure['layout']['sliders'] = [sliders_dict]
# Default home zoom
yMin = (int(min(yList)) - 1582) * 4
yMax = ((int(max(yList)) - int(min(yList))) * 4.5) + yMin
figure['layout']['yaxis']['range'] = [yMin, yMax]
# **Have you ever seen those [marble racing videos](https://www.youtube.com/watch?v=iG_jGYqsZZo)?**
#
# Here are some dots to represent the days of the week for each month. Press "play" to watch them race from the start of the Gregorian calendar to present day.
#
# *(Pressing the "Rescale" button zooms in on the action. I haven't figured out how to get the scale to autoupdate with each frame yet-- if you know, please let me know! You might have to keep clicking rescale to follow the movement, or hover over the top of the graph and use the pan tool. The house-shaped icon will reset the axes.)*
# In[69]:
py.icreate_animations(figure, 'weekday-counter-1582-2018'+str(time.time()))
# **Here are some takeaways about each month for a full Gregorian Cycle:**
# * **Jan:** least frequent day is Mon.
# * **Feb:** has 99-159 fewer of each day than the other months.
# * **Mar:** least frequent day is Wed.
# * **May:** least frequent day is Mon.
# * **Jul:** least frequent day is Sat.
# * **Aug:** least frequent day is Tues.
# *With 1772 Fridays and 1772 Saturdays, August is the month that has the most weekend days!!*
# * **Oct:** least frequent day is Sun.
# * **Dec:** least frequent day is Fri.
#
# **Highs & Lows**
# * Highest count for any weekday: 1772
# * Lowest count for non-Feb months: 1714
# * Range for Feb days: 1613-1615
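#
# Coming back to the 5th-Thursday question from the start: February only has a 5th anything in leap years, and only for the weekday it begins on. A sketch that tallies those over one cycle (2000-2399 is an arbitrary 400-year span, and the names `fifths` and `iso_names` are made up):
# In[ ]:
import calendar
from collections import Counter

iso_names = ['mon', 'tues', 'wed', 'thurs', 'fri', 'sat', 'sun']
fifths = Counter()
for year in range(2000, 2400):
    if calendar.isleap(year):
        # calendar.weekday uses Mon=0 .. Sun=6
        fifths[iso_names[calendar.weekday(year, 2, 1)]] += 1

# Number of Februaries per cycle with a 5th occurrence of each weekday
print(fifths)
print(sum(fifths.values()))  # one per leap year in the cycle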
#
# So there you have it. You can file that under ~~useful~~ information.
# #### Further Reading
#
# I had written a similar shorter program in js on Feb 24, 2016, which I later learned coincided with the 434th anniversary of the papal bull known as the *Inter gravissimas*, issued by Pope_Greg13, which gave us the calendar that we have all come to know and know.
#
# If you want to go down this rabbit hole some more, here are some links related to time-related adjustments we face as a consequence of living on this planet:
# * **Gregorian Calendar:** https://en.wikipedia.org/wiki/Gregorian_calendar
# * **Inter Gravissimas:** https://en.wikisource.org/?curid=566140
# * **Perpetual Calendars:** https://en.wikipedia.org/wiki/Perpetual_calendar
# * **Determination of the day of the week:** https://en.wikipedia.org/wiki/Determination_of_the_day_of_the_week
# * **"How to Figure Out the Day of the Week For Any Date Ever [with just one hand]":** https://www.youtube.com/watch?v=714LTMNJy5M
# * **Doomsday Rule:** https://en.wikipedia.org/wiki/Doomsday_rule
# * **"Leap years: we can do better":** https://www.youtube.com/watch?v=qkt_wmRKYNQ
# * **Leap Seconds:** https://en.wikipedia.org/wiki/Leap_second
# * **Leap Smears:** https://www.webopedia.com/TERM/L/leap-smear.html
# * **Google Smears:** https://developers.google.com/time/smear#othersmears
# * **Leap second Linux server crashes:** https://serverfault.com/questions/403732/anyone-else-experiencing-high-rates-of-linux-server-crashes-during-a-leap-second
# * **Unix Time:** https://en.wikipedia.org/wiki/Unix_time
# * **Year 2038 Problem:** https://en.wikipedia.org/wiki/Year_2038_problem
# * **Names of the Days of the Week:** https://en.wikipedia.org/wiki/Names_of_the_days_of_the_week
# * **Lightning calculation and other "mathemagic" | Arthur Benjamin:** https://www.youtube.com/watch?v=M4vqr3_ROIk
#
# *Thanks to Nick Lanam, Ethan McIntyre for pointing me in the direction of some more helpful links.*