During this seminar we will:
import numpy as np
import pandas as pd
import scipy.spatial as spt
import matplotlib.pyplot as plt
plt.xkcd()
import networkx as nx
%matplotlib inline
HINT: For the correlation coefficient you can use np.corrcoef(); for the distances you may implement them yourself or use scipy.spatial.distance.pdist().
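If you go for the "implement them on your own" option, the three measures can be sketched in plain Python as below. This is only a minimal sketch: the function names and the toy rows are illustrative assumptions, and zero vectors or constant rows are not handled (they would cause a division by zero).

```python
import math

def euclidean_distance(u, v):
    # Square root of the summed squared coordinate differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    # 1 - cos(angle between u and v); undefined for all-zero vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def pearson_correlation(u, v):
    # Covariance divided by the product of standard deviations
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

def pairwise(rows, measure):
    # Full square matrix of the measure between every pair of rows
    return [[measure(u, v) for v in rows] for u in rows]

# Toy adjacency rows (hypothetical data, not one of the seminar graphs)
rows = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
euclid = pairwise(rows, euclidean_distance)
cosine = pairwise(rows, cosine_distance)
corr = pairwise(rows, pearson_correlation)
```

Each row of the adjacency matrix is treated as the feature vector of one node, which is the same convention pdist() and np.corrcoef() use for a 2-D input.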
# Put your code here..
#
#
G = nx.karate_club_graph()
# to_numpy_matrix was removed in NetworkX 3.0; to_numpy_array returns an ndarray directly
A = nx.to_numpy_array(G, dtype=int)
def plotDist(A):
    # Show the adjacency matrix next to three node-similarity matrices
    f, ax = plt.subplots(2, 2, figsize=(10, 10))
    ax[0, 0].imshow(A, cmap='Greys', interpolation='none')
    ax[0, 0].set_title('Adjacency Matrix')
    # Pearson correlation between rows of A
    D = np.corrcoef(A)
    ax[1, 0].imshow(D, cmap='Greys', interpolation='none')
    ax[1, 0].set_title('Correlation coeff.')
    # Pairwise Euclidean distance between rows of A
    dVec = spt.distance.pdist(A, metric='euclidean')
    D = spt.distance.squareform(dVec)
    ax[0, 1].imshow(D, cmap='Greys', interpolation='none')
    ax[0, 1].set_title('Euclidean Dist.')
    # Pairwise cosine distance between rows of A
    dVec = spt.distance.pdist(A, metric='cosine')
    D = spt.distance.squareform(dVec)
    ax[1, 1].imshow(D, cmap='Greys', interpolation='none')
    ax[1, 1].set_title('Cosine Dist.')
plotDist(A)
G = nx.read_gml('lesmis.gml')
# to_numpy_matrix was removed in NetworkX 3.0; to_numpy_array returns an ndarray directly
A = nx.to_numpy_array(G, dtype=int)
plotDist(A)
Without special preprocessing, a graph's adjacency matrix can look very noisy and hide the network's structure (just look at the matrices above). Of course, you usually don't know the structure itself (communities, groups of closely connected nodes, etc.) unless it is given. However, there are node reordering procedures that provide a better view of the network's adjacency matrix.
Reverse Cuthill-McKee is a heuristic that finds a permutation of the nodes that reduces the bandwidth of the matrix, which is calculated as: $$ \theta = \max_{a_{ij} > 0}|i-j|$$ Informally, this algorithm puts some mass on the diagonal of the adjacency matrix.
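To make the bandwidth formula concrete, here is a minimal sketch that evaluates it before and after a reordering. The helper names, the toy path graph, and the hand-picked ordering are all illustrative assumptions; the ordering is not produced by Reverse Cuthill-McKee, it only shows how a good permutation pulls the nonzero entries towards the diagonal.

```python
def bandwidth(matrix):
    # Largest |i - j| over all positive entries a_ij (0 for an empty matrix)
    return max(
        (abs(i - j)
         for i, row in enumerate(matrix)
         for j, value in enumerate(row)
         if value > 0),
        default=0,
    )

def permute(matrix, order):
    # Apply the same node ordering to rows and columns
    return [[matrix[i][j] for j in order] for i in order]

# Path graph with edges 0-3, 3-1, 1-2, stored in label order 0, 1, 2, 3
A_toy = [
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]

# Hand-picked ordering that follows the path from one end to the other
order_toy = [0, 3, 1, 2]

before = bandwidth(A_toy)
after = bandwidth(permute(A_toy, order_toy))
```

With the nodes listed along the path, every edge connects neighbours in the ordering, so the reordered matrix is tridiagonal.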
Run this reordering with nx.utils.reverse_cuthill_mckee_ordering(G) and compare with the results above
# Put your code here
#
#
G = nx.karate_club_graph()
# run procedure
cm = nx.utils.reverse_cuthill_mckee_ordering(G)
# get permutation
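order = list(cm)
order
# to_numpy_matrix was removed in NetworkX 3.0; to_numpy_array returns an ndarray directly
A = nx.to_numpy_array(G)
# apply reordering (karate club node labels are 0..33, so they double as matrix indices)
A = A[np.ix_(order, order)]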
plotDist(A)
For this task you should download some data, convert it to a network, and calculate the assortative mixing coefficient. In particular, download the characters and events datasets.
The first dataset provides information on characters of the Game Of Thrones universe. The second one describes some events that have occurred with them during the story. We are interested in killing events, since they can be considered binary relations and consequently graphs. The attribute with respect to which we are going to compute assortative mixing is called "Team".
We will explore the datasets with the pandas module. The list of useful functions:
# Put your code here
#
#
events = pd.read_csv('events.csv')
characters = pd.read_csv('characters.csv')
characters.head()
| | characterID | Name | Team | isDead | isCaptured | isHurt | title | prefix |
|---|---|---|---|---|---|---|---|---|
| 0 | 2 | Addam Marbrand | Lannister | 0 | 0 | 0 | knight | Ser |
| 1 | 1894 | Adrack Humble | Greyjoy | 0 | 0 | 0 | NaN | NaN |
| 2 | 7 | Aegon Frey (Jinglebell) | Frey (North) | 0 | 0 | 0 | NaN | NaN |
| 3 | 8 | Aegon I Targaryen | Targaryen | 1 | 0 | 0 | King of the Seven Kingdoms | King |
| 4 | 12 | Aegon Targaryen | Targaryen | 1 | 0 | 0 | prince | Prince |

5 rows × 8 columns
kill_events = events[events['event'] == 'killed']
kill_events = pd.DataFrame(kill_events, index = None, columns=['characterID', 'event', 'withID'])
kill_events = kill_events.dropna()
kill_events.head()
| | characterID | event | withID |
|---|---|---|---|
| 7 | 1808 | killed | 2068 |
| 9 | 1825 | killed | 1808 |
| 25 | 557 | killed | 456 |
| 289 | 1186 | killed | 1528 |
| 518 | 755 | killed | 629 |

5 rows × 3 columns
G = nx.DiGraph()
for _, data in kill_events.iterrows():
    # edge direction: killer -> killed
    killer = data['withID']
    killed = data['characterID']
    G.add_edge(killer, killed)
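ch_team = characters.set_index('characterID')['Team'].to_dict()
# keep only the characters that appear in the graph
# (deleting keys while iterating over the dict raises RuntimeError in Python 3)
ch_team = {k: team for k, team in ch_team.items() if k in G}
# NetworkX 2.0+ argument order: set_node_attributes(G, values, name)
nx.set_node_attributes(G, ch_team, 'Team')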
nx.assortativity.attribute_assortativity_coefficient(G, 'Team')
as_dict = nx.assortativity.attribute_mixing_dict(G, 'Team')
as_dict
{'Bolton (Lannister)': {}, 'Bolton (North)': {'Bolton (Lannister)': 1, 'Greyjoy': 1, 'Stark': 3, 'none': 1}, 'Brave Companions': {'Lannister': 2, 'none': 1}, 'Citadel': {}, 'Drogo': {'Drogo': 1, 'Targaryen': 1}, 'Essos': {'Targaryen': 2}, 'Faceless Men': {'Citadel': 1, 'Lannister': 2}, 'Frey (Lannister)': {}, 'Frey (North)': {'Stark': 3}, 'Greyjoy': {'Greyjoy': 4, 'Stark': 4, 'Tyrell': 1, 'Wildlings (north of wall)': 1}, 'Lannister': {'Frey (Lannister)': 2, 'Lannister': 1, 'Robert': 2, 'Stark': 4, 'none': 1}, 'Littlefinger': {'Littlefinger': 1, 'Robert': 1}, 'Martell': {'Robert': 1}, 'Night Watch': {'Night Watch': 4, 'Tyrell': 1, 'Wildlings (north of wall)': 5}, 'Red God': {'Renly': 2, 'Stannis': 4}, 'Renly': {'Brave Companions': 3, 'Stark': 1, 'none': 1}, 'Robert': {'Brave Companions': 2, 'Essos': 1, 'Martell': 1, 'Robert': 5, 'Second Sons': 1, 'Stark': 4}, 'Second Sons': {}, 'Stannis': {}, 'Stark': {'Bolton (Lannister)': 1, 'Brave Companions': 1, 'Drogo': 1, 'Frey (Lannister)': 2, 'Frey (North)': 6, 'Lannister': 4, 'Night Watch': 4, 'Robert': 1, 'Stark': 1}, 'Stormcrows': {'Stormcrows': 2}, 'Targaryen': {'Drogo': 1, 'Essos': 2}, 'Tyrell': {'Lannister': 1, 'Renly': 3}, 'Tyrion': {'Lannister': 1, 'Tyrion': 1}, 'Wildling refugees': {}, 'Wildlings (north of wall)': {'Bolton (Lannister)': 1, 'Night Watch': 5, 'Stark': 1}, 'none': {'Lannister': 3, 'Night Watch': 2, 'Stark': 2, 'none': 1}, 'the Others': {'Night Watch': 5, 'Tyrell': 1, 'Wildling refugees': 1}}
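As a cross-check, the coefficient can be recomputed by hand from a mixing dictionary shaped like the one above. The sketch below assumes the usual trace formula $$ r = \frac{\sum_i e_{ii} - \sum_i a_i b_i}{1 - \sum_i a_i b_i} $$ where $e$ is the mixing matrix normalized to sum to one and $a_i$, $b_i$ are its row and column sums. The function name and the two toy dictionaries are illustrative assumptions, not part of NetworkX.

```python
def attribute_assortativity_from_dict(mixing):
    # mixing[src][dst] = number of edges from attribute value src to dst
    total = sum(count for targets in mixing.values() for count in targets.values())
    row_sum = {}   # a_i: fraction of edges starting at value i
    col_sum = {}   # b_i: fraction of edges ending at value i
    trace = 0.0    # fraction of edges joining equal values
    for src, targets in mixing.items():
        for dst, count in targets.items():
            e = count / total
            row_sum[src] = row_sum.get(src, 0.0) + e
            col_sum[dst] = col_sum.get(dst, 0.0) + e
            if src == dst:
                trace += e
    s = sum(a * col_sum.get(value, 0.0) for value, a in row_sum.items())
    # Undefined (division by zero) when every edge joins the same single value
    return (trace - s) / (1.0 - s)

# Toy inputs (hypothetical, not the Game of Thrones data)
only_within = {'a': {'a': 2}, 'b': {'b': 3}}   # edges stay inside a team
only_across = {'a': {'b': 1}, 'b': {'a': 1}}   # edges always cross teams

r_within = attribute_assortativity_from_dict(only_within)
r_across = attribute_assortativity_from_dict(only_across)
```

The two toy cases hit the extremes of the coefficient: edges that never leave a team give perfect assortativity, and edges that always cross between two teams give perfect disassortativity.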