Note: GitHub has trouble rendering some of the LaTeX formulas and equation numbers. Please view this notebook in nbviewer.

Colley's Matrix Method

Introduction

Welcome! This is the first in a two-part series on Colley's Matrix Method for creating a resume rating system. In this first part we will work through Colley's brilliantly clean way to rate the resume of every FBS team in college football. Any system that attempts to rank all 130 FBS teams will have its problems, but I believe that this is the best system that can be created under Colley's strict rules for keeping our resume ratings unbiased.

In the second part, we'll have some fun and break most of Colley's rules. A resume rating simply attempts to measure what every team has achieved relative to one another. It is not concerned with being predictive. In the second part we will introduce some simple priors and hyperparameters to move the resume ratings in the direction of power ratings -- ones that better match common knowledge about the wide range of team ability in college football.

Background

The Colley Matrix Method is a resume rating system that was a part of the Official Bowl Championship Series Ranking from 2001 to 2013. To be a good resume rating system means a few things to Colley, as he explains here:

eliminates any bias toward conference, history or tradition,
eliminates the need to invoke some ad hoc means of deflating runaway scores, and
eliminates any other ad hoc adjustments, such as home/away tweaks.

Following these self-imposed restrictions, Colley begins by giving every FBS team a starting rating of 1/2 and then arrives at each team's final rating by taking into account only wins and losses. What makes his method so powerful is that a simple piece of mathematics lets it go beyond a team's win percentage and account for strength of schedule, meaning the ability of the teams played on a given schedule.

Consider teams A, B, and C. If A beats B, and B beats C, then we would say that A has a transitive win over C. It is natural to want to consider transitive wins when ranking teams, because beating a team with a winning record is better than beating a team with no wins at all. In a system that only cares about wins and losses, strength of schedule is simply a proper valuation of transitive wins and losses. Colley found a way to account for strength of schedule by looking at the complete college football picture of who beat whom and who lost to whom. What's amazing is that what sounds like a complicated spider-web of transitive wins and losses can be completely encapsulated in a simple formula.

A Little Math

Colley does an excellent job describing his system and its motivation here, which I will now abbreviate. This derivation will be very important to us in the second part of this series. Let's consider the following simple rating for a team,

$\begin{equation} r = \frac{1 + n_w}{2+n_{tot}} \end{equation}$

where $n_w$ is their number of wins, and $n_{tot}$ is the number of games they have played. Notice that a team's rating must be between 0 and 1. A team that has played no games begins with a rating of $r = \frac{1+0}{2+0} = \frac{1}{2}$ . If they play 10 games in a season and win 7 they will have a rating of $r = \frac{1 + 7}{2+10} = \frac{2}{3}$ . This seems reasonable. Now it just takes a moderate amount of algebra to account for strength of schedule. Let's multiply both sides by the denominator.

$\begin{equation} (2+ n_{tot}) r = 1 + n_w \end{equation}$

It wouldn't be algebra without a clever identity, so let's add one now.

$\begin{equation} n_w = \frac{n_w - n_\ell}{2} + \frac{n_w + n_\ell}{2} \end{equation}$

We've added a new symbol, $n_\ell$ , the number of losses. Let's now replace $n_w$ in equation (2) with what we have in equation (3).

$\begin{equation} (2+ n_{tot}) r = 1 + \frac{n_w - n_\ell}{2} + \frac{n_w + n_\ell}{2} \end{equation}$

Let's move some stuff to the other side.

$\begin{equation} (2+ n_{tot}) r - \frac{n_w + n_\ell}{2} = 1 + \frac{n_w - n_\ell}{2} \end{equation}$

Notice that every game is a win or a loss. $n_{tot} = n_w + n_\ell$ .

$\begin{equation} (2+ n_{tot}) r - \frac{n_{tot}}{2} = 1 + \frac{n_w - n_\ell}{2} \end{equation}$

Bring the $n_{tot}$ terms together on the left hand side.

$\begin{equation} 2r + n_{tot}(r - \frac{1}{2}) = 1 + \frac{n_w - n_\ell}{2} \end{equation}$

Now remember that multiplication is simply repeated addition.

$\begin{equation} 2r + \displaystyle\sum^{n_{tot}}(r - \frac{1}{2}) = 1 + \frac{n_w - n_\ell}{2} \end{equation}$

The $\Sigma$ symbol means we will sum $n_{tot}$ times.

Whew! That was a lot of small steps that really added up. Let's take a step back and interpret our equation. If zero games have been played, everything goes away except for $2r = 1$ , which recovers a rating of $\frac{1}{2}$ . As more games are played, the right hand side increases by a half with every win and decreases by a half with every loss. In order to maintain equality, the rating r on the left hand side has to increase or decrease to match.
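As a quick sanity check on the algebra, here is a minimal sketch that plugs the 7-win, 10-game example from earlier into both sides of equation (8). The helper name `simple_rating` is made up for this illustration, and exact fractions are used so that no rounding gets in the way.

```python
from fractions import Fraction

def simple_rating(n_w, n_tot):
    """Rating from equation (1): (1 + wins) / (2 + games played)."""
    return Fraction(1 + n_w, 2 + n_tot)

n_w, n_l = 7, 3            # the 7-3 example season
n_tot = n_w + n_l
r = simple_rating(n_w, n_tot)

# Left hand side of equation (8): 2r plus one (r - 1/2) term per game played.
lhs = 2 * r + sum(r - Fraction(1, 2) for _ in range(n_tot))
# Right hand side of equation (8): 1 plus half of (wins - losses).
rhs = 1 + Fraction(n_w - n_l, 2)

print(r, lhs, rhs)   # 2/3 3 3
```

Both sides come out to 3, so the rearranged equation really is the same statement as the simple rating we started with.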

Notice that the summation on the left hand side is over every game played. For every game we take the difference between the team's rating, r, and the average rating of an opponent, $\frac{1}{2}$ . Colley's insight was that instead of taking the difference from the average rating, we can actually take the difference from the rating of the teams they have played. In order to do this we need a little more notation. Adding a superscript $i$ will denote that a given symbol pertains to team i.

$\begin{equation} 2r^i + \displaystyle\sum^{n^i_{tot}}(r^i - \frac{1}{2}) = 1 + \frac{n^i_w - n^i_\ell}{2} \end{equation}$

Let's use a subscript of j for each team played by team i. Then $r^i_j$ is the rating of the $j^{th}$ team played by team i. Let's replace the $\frac{1}{2}$ term with these $r^i_j$ .

$\begin{equation} 2r^i + \displaystyle\sum_{j=1}^{n^i_{tot}}(r^i - r^i_j) = 1 + \frac{n^i_w - n^i_\ell}{2} \end{equation}$
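To get a feel for what this equation is saying, we can solve it for $r^i$, which gives $r^i = \left(1 + \frac{n^i_w - n^i_\ell}{2} + \sum_j r^i_j\right) / (2 + n^i_{tot})$, and then apply that update over and over. The sketch below does this on a made-up three-team season. This is a simple fixed-point (Jacobi-style) iteration chosen for illustration, not the solver used later in this notebook, and all of the `toy_` names are hypothetical.

```python
# Hypothetical toy season: A beat B, B beat C, A beat C.
toy_results = [('A', 'B'), ('B', 'C'), ('A', 'C')]  # (winner, loser)

toy_names = sorted({t for game in toy_results for t in game})
opponents = {t: [] for t in toy_names}   # every opponent faced, one entry per game
margin = {t: 0 for t in toy_names}       # wins minus losses
for winner, loser in toy_results:
    opponents[winner].append(loser)
    opponents[loser].append(winner)
    margin[winner] += 1
    margin[loser] -= 1

# Start everyone at 1/2, then repeatedly solve each team's equation for r^i
# using the opponents' ratings from the previous pass.
toy_ratings = {t: 0.5 for t in toy_names}
for _ in range(100):
    toy_ratings = {
        t: (1 + margin[t] / 2 + sum(toy_ratings[o] for o in opponents[t]))
           / (2 + len(opponents[t]))
        for t in toy_names
    }

print({t: round(r, 4) for t, r in toy_ratings.items()})
```

The ratings settle at 0.7, 0.5, and 0.3 for A, B, and C. Each pass pulls a team's rating toward the ratings of the teams it played, which is exactly the strength-of-schedule effect we were after.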

Every team will have one of these equations, so we can package the whole system as a matrix equation.

$\begin{gather} \begin{bmatrix} 2+n^1_{tot} & -n^{1,2} & \ldots & -n^{1,M} \\ -n^{2,1} & 2+n^2_{tot} & \ldots & -n^{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ -n^{M,1} & -n^{M,2} & \ldots &2+n^M_{tot} \\ \end{bmatrix} \begin{bmatrix} r^1 \\ r^2 \\ \vdots \\ r^M \\ \end{bmatrix}= \begin{bmatrix} 1 + \frac{n^1_w - n^1_\ell}{2} \\ 1+ \frac{n^2_w - n^2_\ell}{2} \\ \vdots \\ 1+ \frac{n^M_w - n^M_\ell}{2} \\ \end{bmatrix} \end{gather}$

Here we assume that we have M teams. Collecting the $r^i$ terms in team i's equation gives a coefficient of $2 + n^i_{tot}$, which is the diagonal entry for team i. Each off-diagonal entry is $-n^{i,j}$, where $n^{i,j}$ counts how many times team i has played team j. Note that $n^{i,j} = n^{j,i}$ , so this matrix is symmetric. The r column vector holds the ratings we want to calculate, and the column vector after the equals sign accounts for each team's wins and losses. Now all we need to do is build this matrix and use a solver to get those ratings!
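Before building the full matrix, it may help to see the whole system written out for a tiny made-up league. The sketch below assumes three hypothetical teams where A beat B, B beat C, and A beat C; it fills in the matrix and the right hand side one game at a time and then solves the system with NumPy. The variable names are just for this illustration.

```python
import numpy as np

toy_names = ['A', 'B', 'C']
toy_results = [('A', 'B'), ('B', 'C'), ('A', 'C')]  # (winner, loser)

idx = {name: k for k, name in enumerate(toy_names)}
M = len(toy_names)
C = 2.0 * np.eye(M)   # the "2 +" part of every diagonal entry
b = np.ones(M)        # the "1 +" part of every right hand side entry

for winner, loser in toy_results:
    i, j = idx[winner], idx[loser]
    C[i, i] += 1      # one more game played by the winner
    C[j, j] += 1      # one more game played by the loser
    C[i, j] -= 1      # off-diagonal entries are minus the number of meetings
    C[j, i] -= 1
    b[i] += 0.5       # a win adds a half
    b[j] -= 0.5       # a loss subtracts a half

toy_r = np.linalg.solve(C, b)
print(C)
print(b)
print(toy_r)
```

Every diagonal entry is 4 (two games plus 2), every off-diagonal entry is -1, and the right hand side is (2, 1, 0). Solving gives ratings of 0.7, 0.5, and 0.3, so the undefeated team lands on top and the winless team on the bottom, as we would hope.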

Building the matrix

Finally some Python! collegefootballdata.com has an excellent API for getting all of the games we want.

In [39]:

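import pandas as pd
import requests
import numpy as np
from io import StringIO

year = 2019

# Request every regular-season and postseason game for the chosen year.
response = requests.get('https://api.collegefootballdata.com/games?'
                        'year={year}&seasonType=both'.format(year = year))
# Wrapping the JSON text in StringIO avoids the deprecation warning that newer
# pandas versions raise when read_json is handed a literal JSON string.
games = pd.read_json(StringIO(response.text))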

games.head()

Out[39]:

	id	season	week	season_type	start_date	start_time_tbd	neutral_site	conference_game	attendance	venue_id	...	home_points	home_line_scores	home_post_win_prob	away_id	away_team	away_conference	away_points	away_line_scores	away_post_win_prob	excitement_index
0	401110723	2019	1	regular	2019-08-24T23:00:00.000Z	NaN	True	False	66543.0	4013	...	24	[7, 0, 10, 7]	0.905953	2390	Miami	ACC	20	[3, 10, 0, 7]	0.094047	8.767910
1	401114164	2019	1	regular	2019-08-25T02:30:00.000Z	NaN	False	False	22396.0	3610	...	45	[14, 14, 7, 10]	0.688630	12	Arizona	Pac-12	38	[0, 21, 14, 3]	0.311370	7.842417
2	401117855	2019	1	regular	2019-08-29T23:00:00.000Z	NaN	False	False	19648.0	3892	...	24	[7, 3, 14, 0]	0.728942	2681	Wagner	None	21	[0, 0, 14, 7]	0.271058	1.834351
3	401119255	2019	1	regular	2019-08-29T23:00:00.000Z	NaN	False	False	18412.0	3965	...	38	[21, 7, 10, 0]	0.999788	2523	Robert Morris	None	10	[7, 3, 0, 0]	0.000212	0.118588
4	401119254	2019	1	regular	2019-08-29T23:00:00.000Z	NaN	False	False	17620.0	3700	...	46	[13, 17, 7, 9]	0.999979	2415	Morgan State	None	3	[0, 3, 0, 0]	0.000021	0.472968

5 rows × 24 columns

Great! Now, let's simplify. The next three lines do three things:

Take just the FBS games (no FCS games)
Drop any unplayed or canceled games
Take just the columns we need

In [40]:

games = games[(~games['home_conference'].isnull()) & (~games['away_conference'].isnull())]
games = games[(games['home_points'] > 0) | (games['away_points'] > 0)]
games = games[['home_team','home_points','away_team','away_points']]

games.head()

Out[40]:

	home_team	home_points	away_team	away_points
0	Florida	24	Miami	20
1	Hawai'i	45	Arizona	38
5	Cincinnati	24	UCLA	14
9	Clemson	52	Georgia Tech	14
11	Tulane	42	Florida International	14

That looks better! Let's add a $\pm1$ for whether the home or away team wins, and a column of ones.

In [41]:

games['home_win'] = -1+ 2*(games['home_points'] > games['away_points']).astype(int)
games['ones'] = 1

games.head()

Out[41]:

	home_team	home_points	away_team	away_points	home_win	ones
0	Florida	24	Miami	20	1	1
1	Hawai'i	45	Arizona	38	1	1
5	Cincinnati	24	UCLA	14	1	1
9	Clemson	52	Georgia Tech	14	1	1
11	Tulane	42	Florida International	14	1	1

It will be useful to have a list of the teams, so let's get that now.

In [42]:

# Series.append was removed in pandas 2.0, so stack the two columns with pd.concat.
teams = pd.DataFrame(pd.concat([games['home_team'], games['away_team']]).unique(),columns = ['team'])
teams = teams.sort_values(by = ['team']).reset_index(drop = True)

teams.head()

Out[42]:

	team
0	Air Force
1	Akron
2	Alabama
3	Appalachian State
4	Arizona

Okay! Now let's get the vector on the right hand side of the matrix equation.

In [43]:

colley_vec = 1+(games[['home_team','home_win']].groupby('home_team').sum()\
         -games[['away_team','home_win']].groupby('away_team').sum())/2
colley_vec = colley_vec.rename(columns = {'home_win':'str_of_rec'})

colley_vec.head()

Out[43]:

	str_of_rec
home_team
Air Force	5.0
Akron	-5.0
Alabama	5.0
Appalachian State	6.5
Arizona	-1.5

Creating the matrix takes a couple of clever moves. First we will make a vector that counts games played and use that to create the diagonal of the Colley matrix. We'll only look at a few teams since this matrix is 130x130.

In [44]:

games_played = (games[['home_team','ones']].groupby('home_team').sum()+games[['away_team','ones']].groupby('away_team').sum())
diag = pd.DataFrame(2*np.identity(len(colley_vec))+np.diag(games_played['ones']),teams['team'],teams['team'])

diag.loc[['Michigan','Wisconsin','Ohio State'],['Michigan','Wisconsin','Ohio State']]

Out[44]:

team	Michigan	Wisconsin	Ohio State
team
Michigan	15.0	0.0	0.0
Wisconsin	0.0	16.0	0.0
Ohio State	0.0	0.0	16.0

In order to create the off-diagonal entries, we will pivot on our dataframe twice, once for counting games for the home team, and once more for the away team. Adding this to our diagonal gives the Colley Matrix.

In [45]:

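# Count meetings where the row team was at home (piv1) and away (piv2).
# Passing aggfunc as the string 'sum' avoids the warning newer pandas raises for np.sum.
piv1 = pd.pivot_table(games,values = 'ones',index = 'home_team',
                      columns = 'away_team', aggfunc = 'sum').fillna(0)

piv2 = pd.pivot_table(games,values = 'ones',index = 'away_team',
                      columns = 'home_team', aggfunc = 'sum').fillna(0)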
    
colley_mat = diag - piv1 - piv2

colley_mat.loc[['Michigan','Wisconsin','Ohio State'],['Michigan','Wisconsin','Ohio State']]

Out[45]:

team	Michigan	Wisconsin	Ohio State
team
Michigan	15.0	-1.0	-1.0
Wisconsin	-1.0	16.0	-2.0
Ohio State	-1.0	-2.0	16.0

Great! We can see that each pair of these teams met at least once, and Wisconsin and Ohio State played each other twice.
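One thing to keep in mind is that pandas lines up rows and columns by label when it subtracts DataFrames, so a team that never appears as a home team (or never as an away team) would be missing from one of the pivots and leave NaN entries behind. That does not seem to be a problem for this season's data, but here is a hedged sketch of a more defensive build on a made-up schedule in which team C never plays at home. The `toy_` names are hypothetical, and `pd.crosstab` is swapped in for the pivot tables used above.

```python
import numpy as np
import pandas as pd

# Hypothetical schedule: C only ever plays on the road.
toy_games = pd.DataFrame({
    'home_team': ['A', 'B', 'A'],
    'away_team': ['B', 'C', 'C'],
})
toy_names = sorted(set(toy_games['home_team']) | set(toy_games['away_team']))

# Count home-vs-away meetings, then force a square table over every team
# so that teams missing from either side get explicit zeros.
counts = pd.crosstab(toy_games['home_team'], toy_games['away_team'])
counts = counts.reindex(index=toy_names, columns=toy_names, fill_value=0)

# A meeting counts for both teams, so add the transpose to symmetrize.
meetings = counts.values + counts.values.T
games_played = meetings.sum(axis=1)

toy_mat = pd.DataFrame(np.diag(2 + games_played) - meetings,
                       index=toy_names, columns=toy_names)
print(toy_mat)

# Two cheap checks that any Colley matrix should pass.
print((toy_mat.values == toy_mat.values.T).all())   # symmetric
print((toy_mat.sum(axis=1) == 2).all())             # every row sums to 2
```

Each row sums to 2 because the diagonal is 2 plus games played while the off-diagonal entries subtract one for each of those same games. Running the same two checks on `colley_mat` is a quick way to catch a misaligned pivot.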

We just run a matrix solver at this point and we'll have our ratings!

In [46]:

colley_inv = pd.DataFrame(np.linalg.pinv(colley_mat.values),colley_mat.columns,colley_mat.index)
ratings = colley_inv.dot(colley_vec)
ratings.rename(columns={'str_of_rec':'rating'},inplace=True)

ratings = ratings.sort_values(by = ['rating'], ascending = False)

ratings.head(8)

Out[46]:

	rating
team
LSU	1.064182
Ohio State	0.986428
Clemson	0.943394
Georgia	0.926277
Penn State	0.891403
Florida	0.876903
Oregon	0.869208
Notre Dame	0.850672
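A Colley matrix is symmetric, and each diagonal entry is larger than the sum of the absolute values of the other entries in its row, so it is always invertible. That means the pseudo-inverse used above should agree with a direct linear solve. The sketch below checks this on a small made-up system (teams P, Q, and R, where Q and R meet twice and split those games, and P beats both). It also checks a handy property: because every row of the matrix sums to 2 and the right hand side sums to the number of teams, the ratings always average exactly 1/2.

```python
import numpy as np

# Hypothetical 3-team system: P played Q and R once each, Q and R played twice.
toy_mat = np.array([[ 4., -1., -1.],
                    [-1.,  5., -2.],
                    [-1., -2.,  5.]])
# P went 2-0, while Q and R each went 1-2.
toy_vec = np.array([1 + (2 - 0) / 2, 1 + (1 - 2) / 2, 1 + (1 - 2) / 2])

via_pinv = np.linalg.pinv(toy_mat) @ toy_vec
via_solve = np.linalg.solve(toy_mat, toy_vec)

print(via_solve)
print(np.allclose(via_pinv, via_solve))    # both routes give the same ratings
print(np.isclose(via_solve.mean(), 0.5))   # ratings average to one half
```

Checking that `ratings['rating'].mean()` is very close to 0.5 is an easy test that the full 130-team system was assembled correctly.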

Awesome! It looks reasonable too! We can compare this to Colley's ratings to see if we're right. As of 2007, Colley added in a roundabout way of including FCS teams, but our ratings should be very close to his. We can check 2006 and see that they agree up to four decimal places, which is good enough for me!

Next time, we'll take this resume ranking system and see what we can do to make it more representative of a team's power. Colley's Matrix Method is a compelling way of accounting for strength of schedule. If we can find a way to add in more information than simply wins and losses, we may be able to create some pretty reliable power ratings!