Prof. Götz Pfeiffer

School of Mathematics, Statistics and Applied Mathematics

NUI Galway

Many real world networks are **small world networks**,
where most pairs of nodes are only a few steps away from each other.

More precisely, a network is a small world network if it has

- a small
**average shortest path length**(scaling with $\log n$, where $n$ is the number of nodes), and - a high
**clustering coefficient**.

Random networks do have a small average shortest path length, but not a high clustering coefficient. This observation justifies the need for a different model of random networks, if they are to be used to model the clustering behavior of real world networks.

In [ ]:

```
import networkx as nx
import matplotlib.pyplot as plt
```

Let $\mathcal{D} = (d_{ij})$ be the **distance matrix** of a connected graph $G = (X, E)$,
whose entry $d_{ij}$ is the length of the shortest path from node $i \in X$ to node $j \in X$. (Hence $d_{ii} = 0$ for all $i$.)

**Definition.** Let $G = (X, E)$ be a connected graph.
* The **eccentricity** $e_i$ of a node $i \in X$ is the maximum distance between $i$
and any other vertex in $G$,
$$
e_i = \max_j d_{ij}.
$$
* The **graph radius** $R$ is the minimum eccentricity,
$$
R = \min_i e_i.
$$
* The **graph diameter** $D$ is the maximum eccentricity,
$$
D = \max_i e_i.
$$
* The **characteristic path length** $L$ of $G$ is the average distance between pairs of distinct nodes,
$$
L = \frac1{n(n-1)} \sum_{i \neq j} d_{ij}.
$$

- The characteristic path length of a random graph $G(n, m)$, or $G(n, p)$ is $$ L = \frac{\ln n}{\ln \bar{k}} $$ (... express $\bar{k}$ in terms of $n, m, p$ ...)

In [ ]:

```
n, m = 16, 32
while True:
G = nx.gnm_random_graph(n, m)
if nx.is_connected(G):
break
nx.draw(G, with_labels=True, node_color='y')
```

In [ ]:

```
dist = dict(nx.shortest_path_length(G))
dist = [[dist[i][j] for j in range(n)] for i in range(n)]
```

In [ ]:

```
dist
```

In [ ]:

```
eccentricity = [max(d) for d in dist]
eccentricity
```

In [ ]:

```
nx.eccentricity(G)
```

In [ ]:

```
radius = min(eccentricity)
diameter = max(eccentricity)
radius, diameter
```

In [ ]:

```
sum([sum(d) for d in dist]) / n / (n - 1)
```

In [ ]:

```
nx.average_shortest_path_length(G)
```

In [ ]:

```
from math import log
kbar = sum(dict(G.degree()).values()) / n
log(n) / log(kbar)
```

**Definition (Small-world behaviour).**
A network $G = (X, E)$ is said to exhibit a **small world behaviour** if
its characteristic path length $L$ grows proportionally to the
logarithm of the number $n$ of nodes of $G$:
$$
L \sim \ln n.
$$

In this sense, the ensembles $G(n, m)$ and $G(n, p)$ of random graphs do exhibit small workd behavior (as $n \to \infty$).

Small world networks contain many triangles: it is not uncommon that a friend of one of my friends is my friend, too.
This **degree of transitivity** can be measured in several different ways.

**Definition (Graph transitivity).**
A **triad** is a tree of $3$ nodes or, equivalently, a graph consisting of $2$
adjacent edges (and the nodes they connect). The transitivity $T$ of a graph $G = (X, E)$
is the proportion of **transitive** triads, i.e., triads which are subgraphs of **triangles**:
$$
T = \frac{3 n_{\Delta}}{n_{\wedge}},
$$
where $n_{\Delta}$ is the number of triangles in $G$, and $n_{\wedge}$ is the number of triads.

By definition, $0 \leq T \leq 1$.

**Example.**

In [ ]:

```
G = nx.Graph(((1,2), (2,3), (3,1), (3,4), (3,5)))
nx.draw(G, with_labels=True, node_color='y')
```

The function `nx.triangles(G)`

returns a `python`

dictionary reporting for each node
of the graph `G`

the number of triangles it is contained in.

In [ ]:

```
print(nx.triangles(G))
```

Overall, each triangle in `G`

is thus accounted for $3$ times, once for each of its
vertices. The following sum determines this number $3 n_{\Delta}$.

In [ ]:

```
triple_nr_triangles = sum(nx.triangles(G).values())
print(triple_nr_triangles)
```

The number $n_{\wedge}$ of triads in `G`

can be determined from the graph's degree
sequence, as each node of degree $k$ is the central node of exactly
$\binom{k}{2}$ triads. (Why?)

In [ ]:

```
print(G.degree())
print({k : v * (v-1) // 2 for k, v in dict(G.degree()).items()})
nr_triads = sum([v * (v-1) // 2 for v in dict(G.degree()).values()])
print(nr_triads)
```

The transitivity $T$ of `G`

is the quotient of these two quantities, $T = 3 n_{\Delta}/n_{\wedge}$.

In [ ]:

```
print(triple_nr_triangles / nr_triads )
print(nx.transitivity(G))
```

**Definition (Clustering coefficient).**
For a node $i \in X$ of a graph $G = (X, E)$, denote by
$G_i$ the subgraph induced on the neighbours of $i$ in $G$,
and by $m(G_i)$ its number of edges.
The **node clustering coefficient** $c_i$ of node $i$ is defined
as
$$
c_i = \begin{cases}
\binom{k_i}{2}^{-1} m(G_i), & k_i \geq 2, \\
0, & \text{else.}
\end{cases}
$$
The **graph clustering coefficient** $C$ of $G$ is the
average node clustering coefficient,
$$
C = \langle c\rangle = \frac1n \sum_{i=1}^n c_i.
$$

By definition, $0 \leq c_i \leq 1$ for all nodes $i \in X$, and $0 \leq C \leq 1$.

**Example.**

In [ ]:

```
G = nx.Graph([(0,1), (0,2), (0,3), (0,4), (1,2), (1,3), (2,3), (3,4)])
nx.draw(G, with_labels=True, node_color='y')
```

In [ ]:

```
N = nx.neighbors(G, 0)
S = G.subgraph(list(N))
nx.draw(S, with_labels=True, node_color='y')
```

In [ ]:

```
nS = S.number_of_nodes()
nS_choose_2 = nS * (nS - 1) // 2
mS = S.number_of_edges()
print(nS, mS, mS / nS_choose_2 )
```

In [ ]:

```
nx.clustering(G)
```

In [ ]:

```
nx.average_clustering(G)
```

The clustering coefficient of a $G(n, p)$ random graph is $$ C = p. $$ Note that when $p(n) = \bar{k} / n$ for a fixed expected average degree $\bar{k}$ then $C = \bar{k} / n \to \infty$ for $n \to \infty$: in large random graphs the number of triangles is negligible.

In real world networks, one often observes that $C / \bar{k}$ does not depend on $n$ (as $n \to \infty$?)

Design an experiment with random graphs to verify the predicted characteristic path length.

Design an experiment with random graphs to verify the predicted graph clustering coefficient.

In [ ]:

```
```

In [ ]:

```
```