The Schulze method is a refinemnet of the original Condorcet methods that are used for deciding the winning candidate of an election in which all the voters can express a number of different preferences.
To read all about the method we have been using, have a look at Wikipedia.
Create a dataframe with the ordered poreferences for each voter. The first column shall be the first preference for the i-th voter.
The entries of each cell are the names of the candidates.
Each voter can vote for all or part of the candidates. The non voted candidates are considered to be less preferred than the voted ones.
import condorcet
import numpy as np
import pandas as pd
Make sure that the structure of the dataframe is a in the example below: each row represents the preferences of each voter (there are six voters in the exaple below). The names of the candidates are a, b, c, d.
# create dataframe
table = np.asarray([
["b","c","a"],
["c","a","b"],
["c","a","b"],
["b","a","c"],
["d","a","c"],
["d","a","c"]
])
df = pd.DataFrame(table, columns = ["first_preference","second_preference","third_preference"])
df
first_preference | second_preference | third_preference | |
---|---|---|---|
0 | b | c | a |
1 | c | a | b |
2 | c | a | b |
3 | b | a | c |
4 | d | a | c |
5 | d | a | c |
Just run the condorcet.compute_ranks
method on the dataframe directly and the result will appear as a list of lists.
In the example below, the output is [['a', 'c'], ['b'], ['d']]
, which means that there are two tie candidates in the first place, a and c, one candidate in the second place b and the least preferred candidate is d.
# run the classification
condorcet.compute_ranks(df)
[['a', 'c'], ['b'], ['d']]
You can extract the d[V,W] and p[V,W] matrices with the methods:
condorcet._compute_d(weighted_ranks, candidate_names)
condorcet._compute_p(dmat, candidate_names)
To extract the candidate names and the weighted ranks to be input to the above methods, please use:
weighted_ranks = condorcet.weighted_ranks_from_df(df)
candidate_names = condorcet.candidate_names_from_df(df)
We test the method on wikipedia example to make sure that the algorithm works properly.
# create dataframe
df = pd.DataFrame()
# populate with wikipedia preferences
df["first_column"] = 5*['A']+5*['A']+8*['B']+3*['C']+7*['C']+2*['C']+7*['D']+8*['E']
df["second_column"] = 5*['C']+5*['D']+8*['E']+3*['A']+7*['A']+2*['B']+7*['C']+8*['B']
df["third_column"] = 5*['B']+5*['E']+8*['D']+3*['B']+7*['E']+2*['A']+7*['E']+8*['A']
df["fourth_column"] = 5*['E']+5*['C']+8*['A']+3*['E']+7*['B']+2*['D']+7*['B']+8*['D']
# we are not putting explicitly the last column, as it will be automatically inferred by the algorithm
df
first_column | second_column | third_column | fourth_column | |
---|---|---|---|---|
0 | A | C | B | E |
1 | A | C | B | E |
2 | A | C | B | E |
3 | A | C | B | E |
4 | A | C | B | E |
5 | A | D | E | C |
6 | A | D | E | C |
7 | A | D | E | C |
8 | A | D | E | C |
9 | A | D | E | C |
10 | B | E | D | A |
11 | B | E | D | A |
12 | B | E | D | A |
13 | B | E | D | A |
14 | B | E | D | A |
15 | B | E | D | A |
16 | B | E | D | A |
17 | B | E | D | A |
18 | C | A | B | E |
19 | C | A | B | E |
20 | C | A | B | E |
21 | C | A | E | B |
22 | C | A | E | B |
23 | C | A | E | B |
24 | C | A | E | B |
25 | C | A | E | B |
26 | C | A | E | B |
27 | C | A | E | B |
28 | C | B | A | D |
29 | C | B | A | D |
30 | D | C | E | B |
31 | D | C | E | B |
32 | D | C | E | B |
33 | D | C | E | B |
34 | D | C | E | B |
35 | D | C | E | B |
36 | D | C | E | B |
37 | E | B | A | D |
38 | E | B | A | D |
39 | E | B | A | D |
40 | E | B | A | D |
41 | E | B | A | D |
42 | E | B | A | D |
43 | E | B | A | D |
44 | E | B | A | D |
# run the classification
condorcet.compute_ranks(df)
[['E'], ['A'], ['C'], ['B'], ['D']]
# checking the d and p matrix
candidate_names = condorcet.candidate_names_from_df(df)
weighted_ranks = condorcet.weighted_ranks_from_df(df)
d = condorcet._compute_d(weighted_ranks, candidate_names)
p = condorcet._compute_p(d, candidate_names)
print("d matrix: ", d)
print("p matrix: ", p)
d matrix: defaultdict(<class 'int'>, {('A', 'C'): 26, ('A', 'B'): 20, ('A', 'E'): 22, ('A', 'D'): 30, ('C', 'B'): 29, ('C', 'E'): 24, ('C', 'D'): 17, ('B', 'E'): 18, ('B', 'D'): 33, ('E', 'D'): 31, ('D', 'E'): 14, ('D', 'C'): 28, ('D', 'B'): 12, ('E', 'C'): 21, ('E', 'B'): 27, ('B', 'A'): 25, ('B', 'C'): 16, ('E', 'A'): 23, ('D', 'A'): 15, ('C', 'A'): 19}) p matrix: {('A', 'B'): 28, ('A', 'C'): 28, ('A', 'D'): 30, ('A', 'E'): 24, ('B', 'A'): 25, ('B', 'C'): 28, ('B', 'D'): 33, ('B', 'E'): 24, ('C', 'A'): 25, ('C', 'B'): 29, ('C', 'D'): 29, ('C', 'E'): 24, ('D', 'A'): 25, ('D', 'B'): 28, ('D', 'C'): 28, ('D', 'E'): 24, ('E', 'A'): 25, ('E', 'B'): 28, ('E', 'C'): 28, ('E', 'D'): 31}
The algorithmis sound and we were able to reproduce exaclty Wikipedia's example.
We first import the csv file and then we run the method
# import file as dataframe
df_temp = pd.read_csv("data/Evaluation of Jupyter Notebooks.csv")
df = df_temp[df_temp.columns[-3:]]
df
Who would you choose as winner? | Who would you choose for second place? | Who would you choose for third place? | |
---|---|---|---|
0 | marabernardi | rorondre | lordgrilo |
1 | Antonio-Leitao | lordgrilo | marabernardi |
2 | marabernardi | rorondre | kazuyamagiwa |
3 | lordgrilo | Antonio-Leitao | marabernardi |
4 | kazuyamagiwa | filco306 | rorondre |
5 | filco306 | rorondre | marabernardi |
6 | rorondre | filco306 | marabernardi |
7 | Antonio-Leitao | filco306 | rorondre |
8 | marabernardi | filco306 | lordgrilo |
9 | rorondre | marabernardi | lordgrilo |
10 | marabernardi | kazuyamagiwa | rorondre |
11 | marabernardi | kazuyamagiwa | Antonio-Leitao |
12 | filco306 | lordgrilo | rorondre |
# run the final classification
condorcet.compute_ranks(df)
[['marabernardi'], ['filco306', 'rorondre'], ['lordgrilo'], ['kazuyamagiwa'], ['Antonio-Leitao']]
Congratulations to all the candidates that submitted a notebook to the challenge! All of the notebooks were original, well structured and clearly explained. Thank you for your efforts!
I truly hope you enjoyed the challenge and do not hesitate to share your notebooks with your friends and colleagues!
Bravo!! ;)