Days of the Week of each Month through the Years

Album Shen

Wednesday the 21st, November, 2018

I've been helping my 5th grade sister with her math league problems, and some brainteasers involving leap years came up. This sparked a seemingly trivial (and probably actually trivial) question: how do the days of the week compare with one another in how often they occur in each month of our calendar? One would guess that over time the frequency of each weekday in each month more or less evens out, right? (i.e. an equal number of mon, tues, wed, etc.) Let's find out by looking at how the months on our calendar actually cycle through the days of the week, and when (or if) the sequence comes back to itself. Some questions we may answer are:

  • Over time, will there be more Mondays in January than Tuesdays in January?
  • How many Wednesdays have there been in March since the start of the Gregorian calendar?
  • If I host an event every 5th Thursday in February, would I be hosting more events than if I were to do it every 5th Friday?

There are a number of existing algorithms that can determine the day of the week for any given date. But I wanted to see how the distribution of these days in each month of the year plays out over the course of an entire Gregorian cycle. We'll be running a program to evaluate the frequency of each weekday as it has occurred in each month throughout the years since our current calendar's conception.

Fun fact: The first day of the Gregorian calendar was October 15, 1582. A Friday.
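As one illustration of the existing algorithms mentioned above, here is a minimal sketch of Zeller's congruence for the Gregorian calendar. The function name `zeller_weekday` is made up for this sketch, and the result is remapped to the 0 = Sunday ... 6 = Saturday numbering used in the rest of this notebook. It is cross-checked against Python's `datetime`, which uses the proleptic Gregorian calendar.

```python
from datetime import date

def zeller_weekday(year, month, day):
    """Day of the week via Zeller's congruence (Gregorian calendar).

    Returns 0 = Sunday ... 6 = Saturday.
    """
    # January and February count as months 13 and 14 of the previous year
    if month < 3:
        month += 12
        year -= 1
    k = year % 100   # year within the century
    j = year // 100  # zero-based century
    # Zeller's h uses 0 = Saturday, 1 = Sunday, ..., 6 = Friday
    h = (day + (13 * (month + 1)) // 5 + k + k // 4 + j // 4 + 5 * j) % 7
    # Shift to 0 = Sunday ... 6 = Saturday
    return (h + 6) % 7

names = ["sun", "mon", "tues", "wed", "thurs", "fri", "sat"]

# First day of the Gregorian calendar
print(names[zeller_weekday(1582, 10, 15)])  # fri

# Cross-check one date against datetime (isoweekday: Mon = 1 ... Sun = 7)
d = date(2018, 11, 21)
print(zeller_weekday(d.year, d.month, d.day) == d.isoweekday() % 7)  # True
```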

In [3]:
# Setting up the libraries & functions we'll be using
import pandas as pd
from IPython.display import display_html, HTML
from datetime import date, datetime, timedelta as td
from plotly.grid_objs import Grid, Column
import plotly.plotly as py
import time

# Dictionaries for dataframe column & row headers
weekdays = {0:"sun", 1:"mon", 2:"tues", 3:"wed",
            4:"thurs", 5:"fri", 6:"sat"}
months = {0:"jan", 1:"feb", 2:"mar", 3:"apr",
          4:"may", 5:"jun", 6:"jul", 7:"aug",
          8:"sept", 9:"oct", 10:"nov", 11:"dec"}
ordinal = {0:"1st", 1:"2nd", 2:"3rd", 3:"4th",
           4:"5th", 5:"6th", 6:"7th"}

# Print the dates + weekday information given the
# intervals of days from a specified origin date
def printWeekDays(daysFrom, originDate):
    DayOnes = pd.to_datetime(daysFrom, unit='D', origin=pd.Timestamp(originDate))
    for day in DayOnes:
        print (day.strftime("%Y %b %d: %A (Day %w)"))

# Display dataframes side-by-side with their names on top
def disp_dfs(*args):
    html_str = ''
    for df in args:
        html_str += '<div style=max-width:45%;float:left;>'\
            '<p style=font-weight:bold;text-align:center;>'\
            +df.name+'</p>'\
            +df.to_html()+'</div>'
    display_html(html_str.replace('table', 'table style=display:inline'), raw=True)
    
# Date ranges using datetime dates "date(%Y,%m,%d)" as input
def dateRange(start_date, end_date):
    for n in range(int ((end_date - start_date).days)):
        yield start_date + td(n)

Weekdays & Leap Cycles

How long does it take for a given date to cycle back and coincide on the same day of the week again?

The day of the week of any given date shifts forward 1 day for each nonleap year, and 2 days for each leap year.

In [4]:
# What a difference 4 years make
daysFrom = [0, 365+1, 365*2+1, 365*3+1, 365*4+1]
printWeekDays(daysFrom, '2000-1-1')
2000 Jan 01: Saturday (Day 6)
2001 Jan 01: Monday (Day 1)
2002 Jan 01: Tuesday (Day 2)
2003 Jan 01: Wednesday (Day 3)
2004 Jan 01: Thursday (Day 4)

Over the course of each leap year interval, the total shift is by 5 (or -2) weekdays.

So after 7 intervals (4*7 = 28 years), we should be back to the same day of the week on that date of the year.

In [5]:
# 7 leap cycles
daysFrom = [0] * 8
for day in range(len(daysFrom)):
    daysFrom[day] = 1461*day
printWeekDays(daysFrom, '2000-1-1')
2000 Jan 01: Saturday (Day 6)
2004 Jan 01: Thursday (Day 4)
2008 Jan 01: Tuesday (Day 2)
2012 Jan 01: Sunday (Day 0)
2016 Jan 01: Friday (Day 5)
2020 Jan 01: Wednesday (Day 3)
2024 Jan 01: Monday (Day 1)
2028 Jan 01: Saturday (Day 6)
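The arithmetic behind the two cells above can be checked directly with remainders modulo 7. This is only a sketch of the reasoning; the variable name `intervals` is introduced here for illustration.

```python
# Weekday shift (in days) contributed by each kind of year
print(365 % 7)         # non-leap year: 1
print(366 % 7)         # leap year: 2
print(1461 % 7)        # 4-year leap interval (3*365 + 366 days): 5
print((7 * 1461) % 7)  # 7 leap intervals = 28 years: 0

# Smallest number of leap intervals that lands on the same weekday again
intervals = next(n for n in range(1, 8) if (n * 1461) % 7 == 0)
print(intervals, "intervals =", 4 * intervals, "years")  # 7 intervals = 28 years
```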

The Lesser Known Century Rule

28-year intervals maintain the same day of the week, but this does not account for an additional adjustment: every century year that is not divisible by 400 is not a leap year. So if we step forward 28 years' worth of days at a time (7 * 1461 days), we overcount by a day each time we pass such a century, and the date starts to drift.

In [4]:
# Drifting through the centuries
daysFrom = [0] * 10
for day in range(len(daysFrom)):
    daysFrom[day] = 1461*7*day
printWeekDays(daysFrom, '2000-1-1')
2000 Jan 01: Saturday (Day 6)
2028 Jan 01: Saturday (Day 6)
2056 Jan 01: Saturday (Day 6)
2084 Jan 01: Saturday (Day 6)
2112 Jan 02: Saturday (Day 6)
2140 Jan 02: Saturday (Day 6)
2168 Jan 02: Saturday (Day 6)
2196 Jan 02: Saturday (Day 6)
2224 Jan 03: Saturday (Day 6)
2252 Jan 03: Saturday (Day 6)
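The century rule itself fits in one line. Below is a minimal sketch (the helper name `is_leap` is made up here; the standard library's `calendar.isleap` does the same job), followed by a check of where the extra day in the output above comes from.

```python
from datetime import date

def is_leap(year):
    """Gregorian rule: every 4th year, except century years not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Of these century years, only the ones divisible by 400 are leap years
print([y for y in (2000, 2100, 2200, 2300, 2400) if is_leap(y)])  # [2000, 2400]

# Actual days in the 28 years that straddle 2100, vs. the 7*1461 we stepped by
actual = (date(2112, 1, 1) - date(2084, 1, 1)).days
print(actual, 7 * 1461)  # 10226 10227
print(actual % 7)        # 6, so the weekday no longer lines up after 28 years
```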

A full Gregorian calendar cycle is 400 years, with 3 leap years omitted because they are century years nondivisible by 400.

The total number of days in a full Gregorian cycle is 146097 = (400 yr * 365 days/yr) + 97 days from leap years.

In [6]:
# Full Greg
daysFrom = [0, 146097]
printWeekDays(daysFrom, '1700-1-1')
1700 Jan 01: Friday (Day 5)
2100 Jan 01: Friday (Day 5)

146097 days is divisible by 7, so we see that every 400 years we are back to the day of the week in which we started, on the date of the calendar in which we started.
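As a quick sanity check of the numbers above, the leap years in a cycle can be counted with the standard library. The choice of 2000-2399 is arbitrary; any 400 consecutive years should give the same totals.

```python
import calendar

# Any 400 consecutive years will do; 2000-2399 is an arbitrary choice
leap_years = sum(calendar.isleap(y) for y in range(2000, 2400))
total_days = 400 * 365 + leap_years

print(leap_years)       # 97
print(total_days)       # 146097
print(total_days % 7)   # 0, i.e. a whole number of weeks
print(total_days // 7)  # 20871 weeks
```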

But each month occurs 400 times in a cycle, and 400 is not divisible by 7, so the weekday a month starts on cannot be spread evenly across the seven days of the week. And because every month other than nonleap year Feb has a number of days not divisible by 7 (i.e. 29, 30, 31), there will be an extra count for the first one to three days of the week that the month began on. So we should expect an unequal distribution of the frequency of days of the week for each month.
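To see the unevenness at its source, here is a small sketch that tallies which weekday January 1st falls on across one 400-year cycle. The cycle 2000-2399 is an arbitrary choice, and `starts` is a name introduced only for this example. Since 400 is not a multiple of 7, the seven tallies cannot all be equal.

```python
from collections import Counter
from datetime import date

names = ["sun", "mon", "tues", "wed", "thurs", "fri", "sat"]

# Weekday of Jan 1 for each year of one (arbitrarily chosen) 400-year cycle.
# isoweekday() is Mon = 1 ... Sun = 7, so % 7 gives 0 = Sunday ... 6 = Saturday.
starts = Counter(names[date(y, 1, 1).isoweekday() % 7] for y in range(2000, 2400))

for name in names:
    print(name, starts[name])
```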

In [6]:
# Number of days in each month for each type of year
nonleap = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
leap = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

# Number of days shifted month-to-month
nonShift = [0]*12
leapShift = [0]*12

# Dataframes for each type of year's distribution.
# Every month has 28 or more days, so we can start with
# at least 4 counts for each day of the week.
nonDist = pd.DataFrame([[4]*7]*12)
# Adjacent string literals are joined, so no stray indentation ends up in the name
nonDist.name = ('Distribution of Days in Each Month by Order '
                'of Appearance in Non-Leap Years')
leapDist = pd.DataFrame([[4]*7]*12)
leapDist.name = ('Distribution of Days in Each Month by Order '
                 'of Appearance in Leap Years')

# Then we can add in extra days depending on how many days
# over 28 each month has.
for month in range(0,12):
    nonShift[month] = nonleap[month]%7
    for extraDay in range(0, nonShift[month]):
        nonDist.at[month, extraDay] = 5
    leapShift[month] = leap[month]%7
    for extraDay in range(0, leapShift[month]):
        leapDist.at[month, extraDay] = 5
        
nonDist.rename(columns=ordinal, index=months, inplace=True)
leapDist.rename(columns=ordinal, index=months, inplace=True)
disp_dfs(nonDist, leapDist)

Distribution of Days in Each Month by Order of Appearance in Non-Leap Years

      1st  2nd  3rd  4th  5th  6th  7th
jan     5    5    5    4    4    4    4
feb     4    4    4    4    4    4    4
mar     5    5    5    4    4    4    4
apr     5    5    4    4    4    4    4
may     5    5    5    4    4    4    4
jun     5    5    4    4    4    4    4
jul     5    5    5    4    4    4    4
aug     5    5    5    4    4    4    4
sept    5    5    4    4    4    4    4
oct     5    5    5    4    4    4    4
nov     5    5    4    4    4    4    4
dec     5    5    5    4    4    4    4

Distribution of Days in Each Month by Order of Appearance in Leap Years

      1st  2nd  3rd  4th  5th  6th  7th
jan     5    5    5    4    4    4    4
feb     5    4    4    4    4    4    4
mar     5    5    5    4    4    4    4
apr     5    5    4    4    4    4    4
may     5    5    5    4    4    4    4
jun     5    5    4    4    4    4    4
jul     5    5    5    4    4    4    4
aug     5    5    5    4    4    4    4
sept    5    5    4    4    4    4    4
oct     5    5    5    4    4    4    4
nov     5    5    4    4    4    4    4
dec     5    5    5    4    4    4    4
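The tables above are in order of appearance, so turning them into real weekday counts needs the weekday a month starts on. Here is a minimal sketch using the standard library's `calendar.monthrange`; the function name `month_weekday_counts` is an assumption made for illustration and is not part of the notebook's own code.

```python
import calendar

names = ["sun", "mon", "tues", "wed", "thurs", "fri", "sat"]

def month_weekday_counts(year, month):
    """Weekday counts for one month, indexed 0 = sun ... 6 = sat."""
    # monthrange returns (weekday of the 1st with 0 = Monday, number of days)
    first, n_days = calendar.monthrange(year, month)
    first = (first + 1) % 7   # convert to 0 = Sunday
    counts = [4] * 7          # every month has at least 28 days
    # The first (n_days - 28) weekdays of the month occur a 5th time
    for extra in range(n_days - 28):
        counts[(first + extra) % 7] += 1
    return counts

# November 2018 started on a Thursday and has 30 days
print(dict(zip(names, month_weekday_counts(2018, 11))))
# {'sun': 4, 'mon': 4, 'tues': 4, 'wed': 4, 'thurs': 5, 'fri': 5, 'sat': 4}
```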

The Distribution of Days of the Week for Each Month Over 400 Years

If we were 5th graders competing in a pencil + paper math contest, we could use the tables above, count the number of nonleap and leap years, and make adjustments for which weekday each year begins on.

But the easiest way to get this distribution would be to iterate day by day through a period of 400 years. On a computer it only takes about a minute.

In [67]:
# Initialize blank dataframe
counts = pd.DataFrame([[0]*7]*12)
counts.rename(columns=weekdays, index=months, inplace=True)

# Count range (end_date non-inclusive)
start_date = date(1582,10,15)
end_date = date.today()

# Columns for making grid
yList = []
mList = list(months.values())
wList = list(weekdays.values())
current_columns = []

# Adds year, month-position, and count columns for each weekday,
# snapshotting the current counts under the label of year y
def addColumns(columns, wList, y):
    for i in wList:
        y_col_name = '{year}_{weekday}_{header}'.format(year=y, weekday=i, header='year')
        y_col = Column((list([y]*12)), y_col_name)
        columns.append(y_col)
        mListInt = [(j*10+(1.5*wList.index(i))) for j in range(0, 12)]
        m_col_name = '{year}_{weekday}_{header}'.format(year=y, weekday=i, header='month')
        m_col = Column(mListInt, m_col_name)
        columns.append(m_col)
        c_col_name = '{year}_{weekday}_{header}'.format(year=y, weekday=i, header='count')
        c_col = Column(counts[i].tolist(), c_col_name)
        columns.append(c_col)

# Let's count
for single_date in dateRange(start_date, end_date):
    # Snapshots the counts on new year's day of every 4th year
    # (two-digit year divisible by 4), labelled with the year just completed
    if (single_date.strftime('%m %d')=='01 01') and (int(single_date.strftime('%y'))%4==0):
        y = int(single_date.strftime('%Y'))-1
        addColumns(current_columns, wList, y)
        yList.append(str(int(single_date.strftime('%Y'))-1))
    # Updates count in dataframe
    m = (1 * int(single_date.strftime("%m")))-1
    w = int(single_date.strftime("%w"))
    counts.iloc[m, w] += 1

# Update if end date not on new year's day
# (year passed as an int, consistent with the calls inside the loop)
if (end_date.strftime('%m %d')!='01 01'):
    addColumns(current_columns, wList, end_date.year)
    yList.append(str(end_date.year))
    
# Upload grid to plotly
countGrid = Grid(current_columns)
url = py.grid_ops.upload(countGrid, 'weekday_counter_1582_2018_grid'+str(time.time()), auto_open=False)
url
Out[67]:
'https://plot.ly/~album/114/'

I ran and stored the final dataframe for a full Gregorian cycle to '400_years.csv'. Drumroll please.

In [7]:
gregCycle = pd.read_csv('400_years.csv', index_col=0)
gregCycle.name = 'Number of Weekdays That Occur in Each Month in Each Gregorian Calendar Cycle'

# To check that we did in fact count all the days
print ("Counted " + str(gregCycle.values.sum()) + " days")

disp_dfs(gregCycle)
Counted 146097 days

Number of Weekdays That Occur in Each Month in Each Gregorian Calendar Cycle

        sun    mon   tues    wed  thurs    fri    sat
jan    1772   1770   1772   1771   1772   1772   1771
feb    1613   1615   1613   1615   1613   1614   1614
mar    1772   1771   1772   1770   1772   1771   1772
apr    1714   1715   1714   1715   1714   1714   1714
may    1772   1770   1772   1771   1772   1772   1771
jun    1714   1715   1714   1714   1714   1714   1715
jul    1772   1771   1772   1772   1771   1772   1770
aug    1771   1772   1770   1772   1771   1772   1772
sept   1715   1714   1715   1714   1714   1714   1714
oct    1770   1772   1771   1772   1772   1771   1772
nov    1715   1714   1714   1714   1714   1715   1714
dec    1771   1772   1772   1771   1772   1770   1772
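For anyone who wants to reproduce the table without pandas or plotly, here is a stripped-down sketch of the same day-by-day count using only the standard library. The function name `count_weekdays` and the choice of 2000-2399 as the cycle are assumptions made for this example, not the code that produced '400_years.csv'.

```python
from datetime import date

names = ["sun", "mon", "tues", "wed", "thurs", "fri", "sat"]

def count_weekdays(start, end):
    """12 x 7 table of weekday counts per month for start <= day < end."""
    table = [[0] * 7 for _ in range(12)]
    for n in range(start.toordinal(), end.toordinal()):
        d = date.fromordinal(n)
        # isoweekday() is Mon = 1 ... Sun = 7, so % 7 gives 0 = Sunday
        table[d.month - 1][d.isoweekday() % 7] += 1
    return table

# One full cycle; the starting year is an arbitrary choice
cycle = count_weekdays(date(2000, 1, 1), date(2400, 1, 1))
print(sum(map(sum, cycle)))        # 146097
print(dict(zip(names, cycle[0])))  # January row
```

The same function can take a stab at the March question from the introduction: `count_weekdays(date(1582, 10, 15), date.today())[2][3]` counts the Wednesdays in March since the first Gregorian day.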

Animated Graph

In [68]:
# Figure
figure = {
    'data': [],
    'layout': {},
    'frames': [],
    'config': {'scrollZoom': True}
}

# Fill in most of layout
figure['layout']['xaxis'] = {'title': 'Month', 'gridcolor': '#FFFFFF', 'range': [-2, 120], 'zeroline': False,
                             'tickvals': [(i*10+3) for i in range(0, 12)], 'ticktext': mList}
figure['layout']['yaxis'] = {'title': 'Days Counted', 'type': 'linear', 'range': [0, 450], 'gridcolor': '#FFFFFF', 'autorange':False}
figure['layout']['title'] = 'Counting the Days of Each Weekday in Each Month'
figure['layout']['hovermode'] = 'x'
figure['layout']['plot_bgcolor'] = 'rgb(223, 232, 243)'
figure['layout']['autosize'] = True

# Year Slider
sliders_dict = {
    'active': 0,
    'yanchor': 'top',
    'xanchor': 'left',
    'currentvalue': {
        'font': {'size': 20},
        'prefix': 'Year:',
        'visible': True,
        'xanchor': 'right'
    },
    'transition': {'duration': 100, 'easing': 'cubic-in-out'},
    'pad': {'b': 10, 't': 50},
    'len': 0.9,
    'x': 0.1,
    'y': 0,
    'steps': [],
}

# Play & Pause
figure['layout']['updatemenus'] = [
    {
        'buttons': [
            {
                'args': [None, {'frame': {'duration': 300, 'redraw': False},
                         'fromcurrent': True, 'transition': {'duration': 400, 'easing': 'quadratic-in-out'}}],
                'label': 'Play',
                'method': 'animate'
            },
            {
                'args': [[None], {'frame': {'duration': 0, 'redraw': False}, 'mode': 'immediate',
                'transition': {'duration': 0}}],
                'label': 'Pause',
                'method': 'animate'
            },
            {
                'args': [{'yaxis.autorange': True, 'xaxis.autorange': True}],
                'label': 'Rescale',
                'method': 'relayout'
            },
        ],
        'direction': 'left',
        'pad': {'r': 10, 't': 87},
        'showactive': False,
        'type': 'buttons',
        'x': 0.1,
        'xanchor': 'right',
        'y': 0,
        'yanchor': 'top'
    }
]

# Custom marker styles
color = {
    'sun': 'rgb(250, 249, 20)', 'mon': 'rgb(250, 20, 5)', 'tues': 'rgb(50, 170, 255)', 'wed': 'rgb(222, 182, 0)',
    'thurs': 'rgb(90, 110, 250)', 'fri': 'rgb(115, 211, 143)', 'sat': 'rgb(20, 211, 43)'
}
symbol = {
    'sun': 'circle-open-dot', 'mon': 'square-cross', 'tues': 'star-diamond', 'wed': 'hexagram',
    'thurs': 'diamond', 'fri': 'pentagon', 'sat': 'star'
}
line_color = {
    'sun': 'rgb(250, 99, 220)', 'mon': 'rgb(230, 99, 250)', 'tues': 'rgb(99, 110, 250)', 'wed': 'rgb(222, 222, 44)',
    'thurs': 'rgb(50, 170, 255)', 'fri': 'rgb(115, 211, 143)', 'sat': 'rgb(220, 111, 243)'
}
gradient_color = {
    'sun': 'rgb(0, 0, 0)', 'mon': 'rgb(230, 20, 0)', 'tues': 'rgb(22, 55, 250)', 'wed': 'rgb(222, 140, 0)',
    'thurs': 'rgb(50, 170, 255)', 'fri': 'rgb(115, 211, 143)', 'sat': 'rgb(22, 22, 111)'
}
gradient_type = {
    'sun': 'radial', 'mon': 'horizontal', 'tues': 'vertical', 'wed': 'horizontal',
    'thurs': 'vertical', 'fri': 'radial', 'sat': 'radial'
}
set_size = 6
set_opacity = 0.6
set_line_width = 3

# Import data from grid
col_name_template = '{year}_{weekday}_{header}'
year = yList[0]
for day in wList:
    data_dict = {
        'xsrc': countGrid.get_column_reference(col_name_template.format(
            year=year, weekday=day, header='month'
        )),
        'ysrc': countGrid.get_column_reference(col_name_template.format(
            year=year, weekday=day, header='count'
        )),
        'mode': 'markers',
        'textsrc': countGrid.get_column_reference(col_name_template.format(
            year=year, weekday=day, header='month'
        )),
        'hoverinfo': 'y+name',
        'marker': {
            'size': set_size,
            'symbol': symbol[day],
            'color': color[day],
            'opacity': set_opacity,
            'line': {'color': line_color[day], 'width': set_line_width },
            'gradient': {'color': gradient_color[day], 'type':gradient_type[day]}
        },
        'name': day
    }
    figure['data'].append(data_dict)

# Updating frames
for year in yList:
    frame = {'data': [], 'name': str(year), 'layout':[]}
    for day in wList:
        data_dict = {
            'xsrc': countGrid.get_column_reference(col_name_template.format(
                year=year, weekday=day, header='month'
            )),
            'ysrc': countGrid.get_column_reference(col_name_template.format(
                year=year, weekday=day, header='count'
            )),
            'mode': 'markers',
            'textsrc': countGrid.get_column_reference(col_name_template.format(
                year=year, weekday=day, header='month'
                )),
            'marker': {
                'size': set_size,
                'symbol': symbol[day],
                'color': color[day],
                'opacity': set_opacity,
                'line': {'color': line_color[day], 'width': set_line_width },
                'gradient': {'color': gradient_color[day], 'type':'radial'}
            },
            'name': day,
        }
        frame['data'].append(data_dict)
        layout_dict = {
            'yaxis': {'autorange': True}
        }
        frame['layout'].append(layout_dict)
    figure['frames'].append(frame)

    slider_step = {'args': [
        [year],
        {'frame': {'duration': 30, 'redraw': False},
         'mode': 'immediate',
       'transition': {'duration': 10}}
     ],
     'label': year,
     'method': 'animate'}
    sliders_dict['steps'].append(slider_step)
    figure['layout']['sliders'] = [sliders_dict]

# Default home zoom
yMin = (int(min(yList)) - 1582) * 4
yMax = ((int(max(yList)) - int(min(yList))) * 4.5) + yMin
figure['layout']['yaxis']['range'] = [yMin, yMax]

Have you ever seen those marble racing videos?

Here are some dots to represent the days of the week for each month. Press "play" to watch them race from the start of the Gregorian calendar to present day.

(Pressing the "Rescale" button zooms in on the action. I haven't figured out how to get the scale to autoupdate with each frame yet; if you know, please let me know! You might have to keep clicking "Rescale" to follow the movement, or hover over the top of the graph and use the pan tool. The house-shaped icon will reset the axes.)

In [69]:
py.icreate_animations(figure, 'weekday-counter-1582-2018'+str(time.time()))
Out[69]:

Here are some takeaways about each month for a full Gregorian Cycle:

  • Jan: least frequent day is Mon.
  • Feb: has 99-159 fewer of each day than the other months.
  • Mar: least frequent day is Wed.
  • May: least frequent day is Mon.
  • Jul: least frequent day is Sat.
  • Aug: least frequent day is Tues.
    With 1772 Fridays and 1772 Saturdays, August is the only month where Fridays and Saturdays both hit the maximum count!!
  • Oct: least frequent day is Sun.
  • Dec: least frequent day is Fri.

Highs & Lows

  • Highest count for any weekday: 1772
  • Lowest count for non-Feb months: 1714
  • Range for Feb days: 1613-1615
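The February question from the introduction (5th Thursdays vs. 5th Fridays) can be sketched the same way. A weekday shows up a 5th time in February only in a leap year, and only the weekday that Feb 29 falls on. The list name `fifths` and the 2000-2399 cycle are choices made for this example.

```python
import calendar
from datetime import date

names = ["sun", "mon", "tues", "wed", "thurs", "fri", "sat"]

# Tally which weekday Feb 29 lands on in every leap year of one cycle.
# That weekday is the only one with a 5th occurrence that February.
fifths = [0] * 7
for year in range(2000, 2400):  # arbitrary 400-year cycle
    if calendar.isleap(year):
        fifths[date(year, 2, 29).isoweekday() % 7] += 1

print(dict(zip(names, fifths)))
print("5th Thursdays:", fifths[4], "5th Fridays:", fifths[5])
# 5th Thursdays: 13 5th Fridays: 14
```

This agrees with the Feb row of the table (1613 Thursdays vs. 1614 Fridays, each being 1600 plus the number of 5th occurrences), so the Friday host gets exactly one more event per 400 years.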

So there you have it. You can file that under useful information.

Further Reading

I had written a similar, shorter program in js on Feb 24, 2016, which I later learned coincided with the 434th anniversary of the papal bull known as Inter gravissimas, issued by Pope Gregory XIII, which gave us the calendar that we have all come to know and know.

If you want to go down this rabbit hole some more, here are some links related to time-related adjustments we face as a consequence of living on this planet:

Thanks to Nick Lanam and Ethan McIntyre for pointing me in the direction of some more helpful links.