In this notebook, Markov Chain is used to rank the teams that played in EPL 2016-2017.
The algorithm used discussed in --
ColumbiaX: Machine Learning
Lecture 20
Prof. John Paisley
Department of Electrical Engineering
& Data Science Institute
Columbia University
Data is downloaded from here. The key for the dataset is here.
# Import the required modules
from __future__ import division
import numpy as np
import csv
file_name = "../data/ranking-algorithm/epl_16_17.csv"
x = [];
# Read the CSV files
with open(file_name, 'rb') as csvfile:
csvreader = csv.reader(csvfile, delimiter=',')
next(csvreader)
for row in csvreader:
x.append([row[2], row[3], int(row[4]), int(row[5])])
# The team names
teams = {x: i for i, x in enumerate({match[0] for match in x})}
teams_rev = {v: k for k, v in teams.items()}
# Convert into nice numpy array
x = np.array([[teams[match[0]], teams[match[1]], match[2], match[3]] for match in x], dtype=np.int32)
no_teams = len(teams)
# Prepare the Transition matrix
trans = np.zeros((no_teams, no_teams), dtype=np.float32)
for match_id in xrange(x.shape[0]):
i, j = x[match_id][0], x[match_id][1]
i_score, j_score = x[match_id][2], x[match_id][3]
den = i_score + j_score
if den > 0:
trans[i, i] += (i_score > j_score) + i_score / den
trans[j, j] += (i_score < j_score) + j_score / den
trans[i, j] += (i_score < j_score) + j_score / den
trans[j, i] += (i_score > j_score) + i_score / den
# Normalize the transition matrix
norm = np.sum(trans, axis=1)
trans_norm = trans / np.expand_dims(norm, axis=0)
# Perform the eigenvalue decomposition of the transition matrix
w, v = np.linalg.eig(trans_norm.T)
# Normalize the 1st eigenvector that corresponds to eigenvalue = 1
s_d = v[:, 0].real / np.sum(v[:, 0].real)
# Sort s_d
sorted_ranking = np.argsort(s_d)[::-1]
# Prepare a list to display
best_teams = [(teams_rev[i], s_d[i]) for i in sorted_ranking]
print("The rankings of the teams are")
for team in best_teams:
print team
The rankings of the teams are ('Leicester', 0.13876668) ('Arsenal', 0.098007552) ('Tottenham', 0.086899467) ('West Ham', 0.077981375) ('Man United', 0.074728511) ('Southampton', 0.064768665) ('Man City', 0.062873565) ('Liverpool', 0.056575444) ('Chelsea', 0.044441242) ('Stoke', 0.036718827) ('Swansea', 0.035415944) ('West Brom', 0.033552695) ('Everton', 0.03319782) ('Watford', 0.026719317) ('Bournemouth', 0.026716128) ('Newcastle', 0.026127573) ('Crystal Palace', 0.024857888) ('Sunderland', 0.022549277) ('Norwich', 0.019476177) ('Aston Villa', 0.009625854)
One of the possible reason for City below United and Southampton might be that City performed better against the good teams.