### CS4423 - Networks¶

Prof. Götz Pfeiffer
School of Mathematics, Statistics and Applied Mathematics
NUI Galway

# Assignment 2¶

Provide answers to the problems in the boxes provided.
The buttons at the top of the page can be used to create more boxes if needed. The type of box can be changed from Code to Markdown. Code boxes take (and execute) python code. Markdown boxes take (and format nicely) text input. In this way, you can provide answers, ask questions, or raise issues, in words.

Marks will be awarded for participation and engagement.

When finished, print this notebook into a pdf file and submit this to blackboard.

Deadline is next Monday at 5pm.

## Setup¶

This is a jupyter notebook. Find an environment that allows you to work with it. You can either install jupyter as a python packag on your own laptop or PC. Or you can use a suitable website on the internet, such as nbviewer and binder.

The following packages need to be loaded. In order to execute the code in a box, use the mouse or arrow keys to highlight the box and then press SHIFT-RETURN.

Should it ever happen that the notebook becomes unusable, start again with a fresh copy.

In [ ]:
import numpy as np
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt


## 1. ER Model $A$¶

**Definition (ER Model $A$: Uniform Random Graphs).** Let $n \geq 1$, let $N = \binom{n}{2}$ and let $0 \leq m \leq N$. The model $G(n, m)$ consists of the ensemble of graphs $G$ with $n$ nodes $X = \{1, 2, \dots, n\}$, and $m$ randomly selected edges, chosen uniformly from the $N$ possible edges.

Model $A$ random graphs in networkx can be generated with the function nx.gnm_random_graph(n, m), where parameter $n$ gives the number of nodes and parameter $m$ the (exact) number of edges of the graph.

In [ ]:
G = nx.gnm_random_graph(16, 15)
print("#nodes: ", G.number_of_nodes())
print("#edges: ", G.number_of_edges())
nx.draw(G, with_labels=True, node_color='y')

• Draw $6$ random graphs drawn from model $A$ with $n = 16$ nodes and $m = 15$ edges. For each graph drawn, determine its number of edges.

## 2. ER Model $B$¶

**Definition (ER Model $B$: Binomial Random Graphs).** Let $n \geq 1$, let $N = \binom{n}{2}$ and let $0 \leq p \leq 1$. The model $G(n, p)$ consists of the ensemble of graphs $G$ with $n$ nodes $X = \{1, 2, \dots, n\}$, and each of the $N$ possible edges chosen with probability $p$.

Model $B$ random graphs in networkx can be generated with the function nx.gnp_random_graph(n, p), where parameter $n$ gives the number of nodes and parameter $p \in [0, 1]$ the edge probability.

In [ ]:
G = nx.gnp_random_graph(16, 0.125)
print("#nodes: ", G.number_of_nodes())
print("#edges: ", G.number_of_edges())
nx.draw(G, with_labels=True, node_color='m')

• Draw $6$ random graphs drawn from model $B$ with $n = 16$ nodes and edge probability $p = 15/\binom{16}{2} = 0.125$. For each graph drawn, determine its number of edges.

## 3. Degree Distribution¶

The degree distribution of a graph $G = (X, E)$ is the probability distribution of the node degrees of the graph $G$, i.e. the function $p \colon \mathbb{N}_0 \to \mathbb{R}$ defined by $$p_k = \frac{n_k}{n},$$ where $n = |X|$ is the total number of nodes in $G$, and $n_k$ is the number of nodes of degree $k$. (Note hat $\sum_k p_k = 1$.)

In networkx, the numbers $n_k$ can be determined by the function nx.degree_histogram. Then python list comprehension can be used to compute the numbers $p_k$ from those. And those numbers, turned into a pandas dataframe, can be plotted nicely. (In principle, such a plot should be possible with matplotlib and without explixit reference to pandas...)

In [ ]:
G = nx.gnp_random_graph(16, 0.125)
n = G.number_of_nodes()
nn = nx.degree_histogram(G)
pp = [x/n for x in nn]
df = pd.DataFrame(pp)
df.plot()

• Create a model $B$ random graph on $n = 10000$ points, and plot its degree distribution.
In [ ]:



The degree distribution of a model $B$ random graph is known to follow a binomial distribution $\mathrm{Bin}(n-1, p)$ of the form $$p_k = \binom{n-1}{k} p^k (1-p)^{n-1-k}$$

Using the formula $$\binom{n}{k} = \frac{n \cdot (n-1) \dotsm (n-k+1)}{1 \cdot 2 \dotsm k}$$ in python, the binomial coeeficient $\binom{n}{k}$ can be computed with the following function:

In [ ]:
def binomial(n, k):
prd, top, bot = 1, n, 1
for i in range(k):
prd = (prd * top) // bot
top, bot = top - 1, bot + 1
return prd


The binomial distribution $\mathrm{Bin}(n, p)$ can then be defined as:

In [ ]:
def b_dist(n, p, k):
return binomial(n, k) * p**k * (1-p)**(n-k)


In order to compare the degree distribution of a random graph $G$ on $16$ points to the corresponding binomial distribution, one can compute and plot the values of $\mathrm{Bin}(16, p)$ for a suitable value of $p$ as follows.

In [ ]:
n = 16
p = 0.125
bb = [b_dist(n-1, 0.125, k) for k in range(n)]
df = pd.DataFrame(bb)
df.plot()

• Compute and plot the binomial distribution that corresponds to a random model $B$ graph on $n = 10000$ points with $p = 0.0015$.
In [ ]:



In the limit $n \to \infty$ (keeping the expected aaverage degree $p (n-1)$ constant), the binomial distribution $\mathrm{Bin}(n-1, p)$ is well approximated by the Poisson distribution defined by $$p_k = e^{-\lambda} \frac{\lambda^k}{k!},$$ where $\lambda = p (n-1)$.

Using the functions exp and factorial from python's math library, one can compute the Poissin distribution with the follwong python function:

In [ ]:
from math import exp, factorial
def p_dist(l, k):
return exp(-l) * l**k / factorial(k)

• Compute and plot the Poisson distribution that corresponds to a random model $B$ graph on $n = 10000$ points with $p = 0.0015$.
In [ ]:


• What do you conclude from the plots? Can you plot all the distributions (binomial, Poisson, and the actual degree distribution of a graph $G$) into a single diagram?
In [ ]:



## 4. Giant Components¶

Graphs of a certain relative size (number of edges in relation to number of nodes) have a giant component, a connected component that contains a substantial number of the nodes. For a random graph $G(n, p)$ in model $B$, the sudden appearance of a giant component can be described as a function of $p$ (keeping $n$ fixed), by measuring the size of the largest connected component.

With networkx, the connected components of a graph G can be found with the command nx.connected_components(G) and their respective sizes are determined with the command

sz = [len(c) for c in list(nx.connected_components(G))]



The python function max applied to a list of numbers finds the maximum number in the list, i.e., max(sz) is the maxiumum size in the list sz.

• For $n = 10000$, say, compute $50$, say, random graphs $G(n, p)$ for varying values of $p$, maybe equally spread out between $0$ and $1$. For each graph, determine the size of its largest connected component. Plot these sizes agains the values of $p$. Can you spot a pattern? Where does the giant component start to grow?
In [ ]: