Note: Github is having trouble rendering some of the LaTeX formulas and equation numbers. Please view in nbviewer.
Welcome! This is the first in a two-part series on Colley's Matrix Method for creating a resume rating system. In this first part we will work through Colley's brilliantly clean way to rate the resume of every FBS team in college football. Any system that attempts to rank all 130 FBS teams will have its problems, but I believe that this is the best system that can be created under Colley's strict rules for keeping our resume ratings unbiased.
In the second part, we'll have some fun and break most of Colley's rules. A resume rating simply attempts to measure what every team has achieved relative to one another. It is not concerned with being predictive. In the second part we will introduce some simple priors and hyperparameters to move the resume ratings in the direction of power ratings -- ones that better match common knowledge about the wide range of team ability in college football.
The Colley Matrix Method is a resume rating system that was a part of the Official Bowl Championship Series Ranking from 2001 to 2013. To be a good resume rating system means a few things to Colley, as he explains here:
Following these self-imposed restrictions, Colley begins by giving every FBS team a starting rating of 1/2 and then arrives at a final rating by taking into account only wins and losses. What makes his method so powerful is that it uses simple mathematics to account for the fact that we don't want to rate a team on win percentage alone. We also want to account for strength of schedule, that is, the ability of the teams played on a given schedule.
Consider teams A, B, and C. If A beats B, and B beats C, then we would say that A has a transitive win over C. It is natural to want to consider transitive wins when ranking teams, because beating a team with a winning record is better than beating a team with no wins at all. In a system that only cares about wins and losses, strength of schedule is simply a proper valuation of transitive wins and losses. Colley found a way to account for strength of schedule by looking at the complete college football picture of who beat whom and who lost to whom. What's amazing is that what sounds like a complicated spider-web of transitive wins and losses can be completely encapsulated in a simple formula.
Colley does an excellent job describing his system and its motivation here, which I will now abbreviate. This derivation will be very important to us in the second part of this series. Let's consider the following simple rating for a team,
$$r = \frac{1+n_w}{2+n_{tot}} \tag{1}$$
where $n_w$ is their number of wins, and $n_{tot}$ is the number of games they have played. Notice that a team's rating must be between 0 and 1. A team that has played no games begins with a rating of $r = \frac{1+0}{2+0} = \frac{1}{2}$. If they play 10 games in a season and win 7 they will have a rating of $r = \frac{1+7}{2+10} = \frac{2}{3}$. This seems reasonable. Now it just takes a moderate amount of algebra to account for strength of schedule. Let's multiply both sides by the denominator.
$$(2+n_{tot})\,r = 1+n_w \tag{2}$$
It wouldn't be algebra without a clever identity, so let's add one now.
$$n_w = \frac{n_w-n_\ell}{2} + \frac{n_w+n_\ell}{2} \tag{3}$$
We've added a new symbol, $n_\ell$, the number of losses. Let's now replace $n_w$ in equation (2) with what we have in equation (3).
$$(2+n_{tot})\,r = 1 + \frac{n_w-n_\ell}{2} + \frac{n_w+n_\ell}{2} \tag{4}$$
Let's move some stuff to the other side.
$$(2+n_{tot})\,r - \frac{n_w+n_\ell}{2} = 1 + \frac{n_w-n_\ell}{2} \tag{5}$$
Notice that every game is a win or a loss, so $n_{tot}=n_w+n_\ell$.
$$(2+n_{tot})\,r - \frac{n_{tot}}{2} = 1 + \frac{n_w-n_\ell}{2} \tag{6}$$
Bring the $n_{tot}$ terms together on the left hand side.
$$2r + n_{tot}\left(r-\frac{1}{2}\right) = 1 + \frac{n_w-n_\ell}{2} \tag{7}$$
Now remember that multiplication is simply repeated addition.
$$2r + \sum^{n_{tot}}\left(r-\frac{1}{2}\right) = 1 + \frac{n_w-n_\ell}{2} \tag{8}$$
The $\Sigma$ symbol means we will sum $n_{tot}$ times.
Whew! That was a lot of small steps that really added up. Let's take a step back and interpret our equation. If zero games have been played, everything goes away except for $2r=1$, which recovers a rating of $\frac{1}{2}$. As more games are played, the right hand side increases or decreases by a half with each win or loss. In order to maintain equality, the rating $r$ on the left hand side has to increase or decrease to match.
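Before moving on, here is a quick check of that algebra in code. This is only a sketch: the helper names `simple_rating` and `rearranged_sides` are mine, and exact fractions are used so the two sides of the rearranged equation can be compared without rounding.

```python
from fractions import Fraction

def simple_rating(n_wins, n_total):
    """Simple rating r = (1 + n_w) / (2 + n_tot), kept as an exact fraction."""
    return Fraction(1 + n_wins, 2 + n_total)

def rearranged_sides(n_wins, n_total):
    """Return (lhs, rhs) of 2r + sum over games of (r - 1/2) = 1 + (n_w - n_l)/2."""
    r = simple_rating(n_wins, n_total)
    n_losses = n_total - n_wins
    # One (r - 1/2) term per game played, added up explicitly.
    lhs = 2 * r + sum((r - Fraction(1, 2) for _ in range(n_total)), Fraction(0))
    rhs = 1 + Fraction(n_wins - n_losses, 2)
    return lhs, rhs

print(simple_rating(0, 0))    # no games played yet
print(simple_rating(7, 10))   # 7 wins in 10 games
lhs, rhs = rearranged_sides(7, 10)
print(lhs, rhs, lhs == rhs)
```

For the 7-3 team both sides come out to 3, and the same equality holds for any record you plug in, since the rearranged form is just the simple rating rewritten.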
Notice that the summation on the left hand side is over every game played. For every game we take the difference between the team's rating, $r$, and the average rating of an opponent, $\frac{1}{2}$. Colley's insight was that instead of taking the difference from the average rating, we can actually take the difference from the rating of the teams they have played. In order to do this we need a little more notation. Adding a superscript $i$ will denote that a given symbol pertains to team $i$.
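$$2r^i + \sum^{n^i_{tot}}\left(r^i-\frac{1}{2}\right) = 1 + \frac{n^i_w-n^i_\ell}{2} \tag{9}$$
Let's use a subscript of $j$ for each team played by team $i$. Then $r^i_j$ is the rating of the $j$th team played by team $i$. Let's replace the $\frac{1}{2}$ term with these $r^i_j$.
$$2r^i + \sum_{j=1}^{n^i_{tot}}\left(r^i-r^i_j\right) = 1 + \frac{n^i_w-n^i_\ell}{2} \tag{10}$$
Every team will have one of these equations, so we can package the whole system as a matrix equation.
$$\begin{bmatrix} 2+n^1_{tot} & -n_{1,2} & \dots & -n_{1,M} \\ -n_{2,1} & 2+n^2_{tot} & \dots & -n_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ -n_{M,1} & -n_{M,2} & \dots & 2+n^M_{tot} \end{bmatrix} \begin{bmatrix} r^1 \\ r^2 \\ \vdots \\ r^M \end{bmatrix} = \begin{bmatrix} 1+\frac{n^1_w-n^1_\ell}{2} \\ 1+\frac{n^2_w-n^2_\ell}{2} \\ \vdots \\ 1+\frac{n^M_w-n^M_\ell}{2} \end{bmatrix} \tag{11}$$
Here we assume that we have $M$ teams. Each diagonal entry is 2 plus the number of games played by team $i$. Each off-diagonal entry is minus the number of times team $i$ has played team $j$. Note that $n_{i,j}=n_{j,i}$, so this matrix is symmetric. The $r$ column vector has the ratings we want to calculate, and the column vector after the equals sign accounts for the total wins and losses. Now all we need to do is build this matrix and use a solver to get those ratings!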
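To make the matrix equation concrete, here is a hand-worked toy league of my own (it is not taken from Colley's write-up). Suppose there are only three teams, A beats B, and B beats C. Then A and C have each played one game and B has played two, so the system reads:

```latex
\begin{bmatrix} 3 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 3 \end{bmatrix}
\begin{bmatrix} r^A \\ r^B \\ r^C \end{bmatrix}
=
\begin{bmatrix} 3/2 \\ 1 \\ 1/2 \end{bmatrix}
\quad\Longrightarrow\quad
r^A = \tfrac{2}{3},\qquad r^B = \tfrac{1}{2},\qquad r^C = \tfrac{1}{3}
```

A and C never played each other, yet A ends up a full third above C, because A's win came against a team that itself beat C. That is the transitive win showing up in the numbers.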
Finally some Python! collegefootballdata.com has an excellent API for getting all of the games we want.
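import io
import numpy as np
import pandas as pd
import requests

year = 2019
# Ask the API for every regular-season and postseason game in the given year.
response = requests.get(r'https://api.collegefootballdata.com/games?'
                        'year={year}&seasonType=both'.format(year = year))
# Wrap the JSON text in StringIO; newer pandas deprecates passing a raw string.
games = pd.read_json(io.StringIO(response.text))
games.head()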
| | id | season | week | season_type | start_date | start_time_tbd | neutral_site | conference_game | attendance | venue_id | ... | home_points | home_line_scores | home_post_win_prob | away_id | away_team | away_conference | away_points | away_line_scores | away_post_win_prob | excitement_index |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 401110723 | 2019 | 1 | regular | 2019-08-24T23:00:00.000Z | NaN | True | False | 66543.0 | 4013 | ... | 24 | [7, 0, 10, 7] | 0.905953 | 2390 | Miami | ACC | 20 | [3, 10, 0, 7] | 0.094047 | 8.767910 |
1 | 401114164 | 2019 | 1 | regular | 2019-08-25T02:30:00.000Z | NaN | False | False | 22396.0 | 3610 | ... | 45 | [14, 14, 7, 10] | 0.688630 | 12 | Arizona | Pac-12 | 38 | [0, 21, 14, 3] | 0.311370 | 7.842417 |
2 | 401117855 | 2019 | 1 | regular | 2019-08-29T23:00:00.000Z | NaN | False | False | 19648.0 | 3892 | ... | 24 | [7, 3, 14, 0] | 0.728942 | 2681 | Wagner | None | 21 | [0, 0, 14, 7] | 0.271058 | 1.834351 |
3 | 401119255 | 2019 | 1 | regular | 2019-08-29T23:00:00.000Z | NaN | False | False | 18412.0 | 3965 | ... | 38 | [21, 7, 10, 0] | 0.999788 | 2523 | Robert Morris | None | 10 | [7, 3, 0, 0] | 0.000212 | 0.118588 |
4 | 401119254 | 2019 | 1 | regular | 2019-08-29T23:00:00.000Z | NaN | False | False | 17620.0 | 3700 | ... | 46 | [13, 17, 7, 9] | 0.999979 | 2415 | Morgan State | None | 3 | [0, 3, 0, 0] | 0.000021 | 0.472968 |
5 rows × 24 columns
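Great! Now, let's simplify. The next three lines do three things: drop any game where either team has no conference listed (the non-FBS opponents), drop any game with no points on the board (games that haven't been played yet), and keep only the team and score columns.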
games = games[(~games['home_conference'].isnull()) & (~games['away_conference'].isnull())]
games = games[(games['home_points'] > 0) | (games['away_points'] > 0)]
games = games[['home_team','home_points','away_team','away_points']]
games.head()
| | home_team | home_points | away_team | away_points |
---|---|---|---|---|
0 | Florida | 24 | Miami | 20 |
1 | Hawai'i | 45 | Arizona | 38 |
5 | Cincinnati | 24 | UCLA | 14 |
9 | Clemson | 52 | Georgia Tech | 14 |
11 | Tulane | 42 | Florida International | 14 |
That looks better! Let's add a ±1 for whether the home or away team wins, and a column of ones.
games['home_win'] = -1+ 2*(games['home_points'] > games['away_points']).astype(int)
games['ones'] = 1
games.head()
| | home_team | home_points | away_team | away_points | home_win | ones |
---|---|---|---|---|---|---|
0 | Florida | 24 | Miami | 20 | 1 | 1 |
1 | Hawai'i | 45 | Arizona | 38 | 1 | 1 |
5 | Cincinnati | 24 | UCLA | 14 | 1 | 1 |
9 | Clemson | 52 | Georgia Tech | 14 | 1 | 1 |
11 | Tulane | 42 | Florida International | 14 | 1 | 1 |
It will be useful to have a list of the teams, so let's get that now.
teams = pd.DataFrame(pd.concat([games['home_team'], games['away_team']]).unique(), columns = ['team'])
teams = teams.sort_values(by = ['team']).reset_index(drop = True)
teams.head()
| | team |
---|---|
0 | Air Force |
1 | Akron |
2 | Alabama |
3 | Appalachian State |
4 | Arizona |
Okay! Now let's get the vector on the right hand side of the matrix equation.
colley_vec = 1+(games[['home_team','home_win']].groupby('home_team').sum()\
-games[['away_team','home_win']].groupby('away_team').sum())/2
colley_vec = colley_vec.rename(columns = {'home_win':'str_of_rec'})
colley_vec.head()
home_team | str_of_rec |
---|---|
Air Force | 5.0 |
Akron | -5.0 |
Alabama | 5.0 |
Appalachian State | 6.5 |
Arizona | -1.5 |
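One thing to watch in the subtraction above: pandas lines the two group sums up by team name, so a team that shows up only as a home team or only as an away team would come out as NaN. That does not bite over a full 2019 season, but it could early in a year. A possible guard, sketched here on invented games (the `toy` frame and `Team A/B/C` names are made up), is `sub(..., fill_value=0)`:

```python
import pandas as pd

# Invented results: Team A has only played at home, Team C only on the road.
toy = pd.DataFrame({
    'home_team': ['Team A', 'Team B', 'Team A'],
    'away_team': ['Team B', 'Team C', 'Team C'],
    'home_win':  [1, -1, 1],   # +1 home team won, -1 away team won
})

home_sum = toy.groupby('home_team')['home_win'].sum()
away_sum = toy.groupby('away_team')['home_win'].sum()

# Plain subtraction leaves NaN wherever a team is missing from one side.
plain = 1 + (home_sum - away_sum) / 2
# fill_value=0 treats "no games on that side" as a sum of zero.
guarded = 1 + home_sum.sub(away_sum, fill_value=0) / 2
print(plain)
print(guarded)
```

In this toy, A is 2-0, B is 0-2 and C is 1-1, so the guarded vector should read 2, 0 and 1, which is 1 plus half of wins minus losses for each team.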
Creating the matrix takes a couple of clever moves. First we will make a vector that counts games played and use that to create the diagonal of the Colley matrix. We'll only look at a few teams since this matrix is 130x130.
games_played = (games[['home_team','ones']].groupby('home_team').sum()+games[['away_team','ones']].groupby('away_team').sum())
diag = pd.DataFrame(2*np.identity(len(colley_vec))+np.diag(games_played['ones']),teams['team'],teams['team'])
diag.loc[['Michigan','Wisconsin','Ohio State'],['Michigan','Wisconsin','Ohio State']]
team | Michigan | Wisconsin | Ohio State |
---|---|---|---|
team | |||
Michigan | 15.0 | 0.0 | 0.0 |
Wisconsin | 0.0 | 16.0 | 0.0 |
Ohio State | 0.0 | 0.0 | 16.0 |
In order to create the off-diagonal entries, we will pivot on our dataframe twice, once for counting games for the home team, and once more for the away team. Adding this to our diagonal gives the Colley Matrix.
piv1 = pd.pivot_table(games,values = 'ones',index = 'home_team', \
columns = 'away_team', aggfunc = np.sum).fillna(0)
piv2 = pd.pivot_table(games,values = 'ones',index = 'away_team', \
columns = 'home_team', aggfunc = np.sum).fillna(0)
colley_mat = diag - piv1 - piv2
colley_mat.loc[['Michigan','Wisconsin','Ohio State'],['Michigan','Wisconsin','Ohio State']]
team | Michigan | Wisconsin | Ohio State |
---|---|---|---|
team | |||
Michigan | 15.0 | -1.0 | -1.0 |
Wisconsin | -1.0 | 16.0 | -2.0 |
Ohio State | -1.0 | -2.0 | 16.0 |
Great! We can see that each team played one another at least once, and Wisconsin and Ohio State played each other twice.
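If the pivot tables feel a bit opaque, the same matrix can also be assembled by walking the schedule one game at a time. This is a sketch on an invented four-game schedule rather than the approach used in this notebook, and `build_colley_matrix`, `toy_teams` and `toy_games` are names I made up for it.

```python
import numpy as np

def build_colley_matrix(team_names, schedule):
    """Start from 2 on the diagonal, then for every game add 1 to both
    teams' diagonal entries and subtract 1 from the two off-diagonal
    entries that pair them."""
    index = {team: k for k, team in enumerate(team_names)}
    mat = 2.0 * np.identity(len(team_names))
    for home, away in schedule:
        i, j = index[home], index[away]
        mat[i, i] += 1
        mat[j, j] += 1
        mat[i, j] -= 1
        mat[j, i] -= 1
    return mat

toy_teams = ['Team A', 'Team B', 'Team C']
# Invented schedule; B and C meet twice, like a rematch in a title game.
toy_games = [('Team A', 'Team B'), ('Team A', 'Team C'),
             ('Team B', 'Team C'), ('Team C', 'Team B')]

colley_toy = build_colley_matrix(toy_teams, toy_games)
print(colley_toy)
print(np.allclose(colley_toy, colley_toy.T))   # symmetric
print(colley_toy.sum(axis=1))                  # every row sums to 2
```

Every game adds one to a diagonal entry and removes one from an off-diagonal entry in the same row, so each row always sums to exactly 2. That makes a handy check to run on the full-size matrix as well.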
We just run a matrix solver at this point and we'll have our ratings!
colley_inv = pd.DataFrame(np.linalg.pinv(colley_mat.values),colley_mat.columns,colley_mat.index)
ratings = colley_inv.dot(colley_vec)
ratings.rename(columns={'str_of_rec':'rating'},inplace=True)
ratings = ratings.sort_values(by = ['rating'], ascending = False)
ratings.head(8)
team | rating |
---|---|
LSU | 1.064182 |
Ohio State | 0.986428 |
Clemson | 0.943394 |
Georgia | 0.926277 |
Penn State | 0.891403 |
Florida | 0.876903 |
Oregon | 0.869208 |
Notre Dame | 0.850672 |
Awesome! It looks reasonable too! We can compare this to Colley's ratings to see if we're right. As of 2007, Colley added in a roundabout way of including FCS teams, but our ratings should be very close to his. We can check 2006 and see that they agree up to four decimal places, which is good enough for me!
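One more note on the solve step. The +2 on the diagonal makes the Colley matrix strictly diagonally dominant, so it is always invertible, and `np.linalg.solve` should give the same ratings as multiplying by the pseudo-inverse without ever forming an inverse. Here is a sketch on a four-team league I invented for illustration (the matrix and right hand side are written out by hand):

```python
import numpy as np

# Toy league: A beat B, B beat C, C beat D, A beat D.
# Rows and columns are ordered A, B, C, D; every team has played two games.
colley_toy = np.array([[ 4., -1.,  0., -1.],
                       [-1.,  4., -1.,  0.],
                       [ 0., -1.,  4., -1.],
                       [-1.,  0., -1.,  4.]])
# 1 + (wins - losses) / 2 for each team: A is 2-0, B 1-1, C 1-1, D 0-2.
rhs_toy = np.array([2.0, 1.0, 1.0, 0.0])

via_solve = np.linalg.solve(colley_toy, rhs_toy)
via_pinv = np.linalg.pinv(colley_toy) @ rhs_toy

print(via_solve)
print(np.allclose(via_solve, via_pinv))   # do the two routes agree?
print(via_solve.mean())                   # Colley ratings average to 1/2
```

B and C both finish 1-1 here, yet B lands ahead of C, because B's loss came against the undefeated A while C's win came against the winless D. That is strength of schedule at work in a system that only sees wins and losses.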
Next time, we'll take this resume ranking system and see what we can do to make it more representative of teams' power. Colley's Matrix Method is a compelling way to account for strength of schedule. If we can find a way to add in more information than simply wins and losses, we may be able to create some pretty reliable power ratings!