Page Rank and HITS

In [1]:
import numpy as np
import matplotlib.pyplot as plt
plt.xkcd()
import networkx as nx
%matplotlib inline

During with lab we will check the change of results of Page Rank and HITS algorithms with respect to the values of $\alpha$ and number of iterations.

Task 1.

a) Firstly let's make experiment for small network (Zachary or any other small one). Set any 3 distinct values for $\alpha$, fix the number of iterations, and run Page Rank. Plot results PR values wrt $\alpha$

In [2]:
G = nx.karate_club_graph()
# Continue here
# Explain the results

fig = plt.figure(1, figsize=(7,7))
ax = plt.subplot(111)

alphas = np.arange(0.1, 0.9, 0.2)

for alp in alphas:
    pr = nx.pagerank(G, alpha=alp)
    prval = pr.values()
    ax.plot(prval, c=np.random.rand(3,1), label='alpha {0:.2f}'.format(alp))
    
ax.legend()

    
Out[2]:
<matplotlib.legend.Legend at 0x7fe07082a6d0>
In [3]:
nx.draw(G)

b) Do the same for the number of iterations

In [4]:
# Continue here

c) Plot Page-Rank vs. Degree centrality

In [5]:
# Continue here
pr = nx.pagerank(G, alpha=0.8)
pr = pr.values()
d = nx.degree_centrality(G)
d = d.values()

plt.plot(d, pr, '*')
plt.xlabel('Degree Centrality')
plt.ylabel('Page-Rank')
Out[5]:
<matplotlib.text.Text at 0x7fe07081d390>

Task 2.

Let's switch to a bigger network. Download political blogs network. Check its basic properties. Run HITS algorithm. Try to inverstigate the top nodes.

In [6]:
# Continue here
# 
G = nx.read_gml('polblogs.gml')
In [7]:
G.is_directed()
A = nx.adjacency_matrix(G)

P = nx.DiGraph(A)
In [8]:
(h, a) = nx.hits(P)
a = a.values()
h = h.values()
In [9]:
plt.plot(h,a, '*')
plt.xlabel('Hubs')
plt.ylabel('Auth')
Out[9]:
<matplotlib.text.Text at 0x7fe031a3fa10>
In [10]:
idx = np.argsort(a)
idx[-3]
Out[10]:
54
In [11]:
G.node[54]
Out[11]:
{'id': 54,
 'label': u'atomicairship.com',
 'source': u'Blogarama,eTalkingHead',
 'value': 0}