Prof. Götz Pfeiffer

School of Mathematics, Statistics and Applied Mathematics

NUI Galway

Provide answers to the problems in the cells provided.

The buttons at the top of the page can be used to create
more cells if needed.
The type of cell can be changed from `Code` to `Markdown` (and back).
`Code` cells take (and execute) `python` code.
`Markdown` cells take (and format nicely) text input.
In this way, you can provide answers, ask questions,
or raise issues, in words.

Marks will be awarded for participation and engagement.

When finished, print this notebook into a **pdf** file and submit this to
**blackboard**.

**Deadline** is next Monday at 5pm.

This is a `jupyter` notebook. Find an environment that allows you to work with it. You can either
install `jupyter` as a `python` package on your own laptop or PC, or you can use a suitable website
on the internet, such as nbviewer or `binder`.

In order to execute the code in a cell, use the mouse or arrow keys to highlight the box and then press SHIFT-RETURN.

Should it ever happen that the notebook becomes unusable, start again with a fresh copy.

The following packages need to be loaded for this notebook to work.

In [ ]:

```
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
```

The connectome of an organism is a comprehensive map of all neural connections between the neurons in its brain. *C. elegans* is a small worm (about 1 mm long) whose neural network has been completely mapped, in work pioneered by the South African-born biologist Sydney Brenner, who was awarded a Nobel Prize in 2002 for his research on this organism.

An *undirected* connected version of this network on 279 nodes with 2287 connections is available from the
book's website and is copied here into a single file `c_elegans_undir.net`. This file is in the Pajek format,
and can be imported into this notebook with the `nx.read_pajek` command. This command
constructs a *multigraph*, which can easily be converted into a (simple) graph by applying
the `nx.Graph` constructor to it.

In [ ]:

```
G = nx.read_pajek("c_elegans_undir.net")
G = nx.Graph(G)
```

**Don't draw this graph!** It is too big to produce a meaningful picture.

Use appropriate `networkx` commands to determine:

- the number of vertices $n$ of `G`,
- the number of edges $m$ of `G`,
- the characteristic path length $L$ of `G`,
- the clustering coefficient $C$ of `G`,
- the number of triangles $n_{\Delta}$ in `G`, and
- the transitivity $T = 3 n_{\Delta} / n_{\wedge}$ of `G`.
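As a starting point, the relevant `networkx` commands might be combined as in the following sketch. To keep the cell self-contained, it is demonstrated here on the small built-in karate club graph `H`; for the assignment, apply the same commands to the worm brain graph `G` instead.

```python
import networkx as nx

# Demo graph standing in for G (Zachary's karate club: 34 nodes, 78 edges).
H = nx.karate_club_graph()

n = H.number_of_nodes()              # number of vertices
m = H.number_of_edges()              # number of edges
L = nx.average_shortest_path_length(H)   # characteristic path length
C = nx.average_clustering(H)             # (graph) clustering coefficient

# nx.triangles counts, for each node, the triangles through that node,
# so each triangle is counted three times in the sum.
n_triangles = sum(nx.triangles(H).values()) // 3

T = nx.transitivity(H)               # T = 3 * n_triangles / n_wedges

print(n, m, L, C, n_triangles, T)
```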

In [ ]:

```
```

Any graph $G$ with $n$ nodes and $m$ edges can be compared to a random $G(n, m)$
graph with the same parameters, or to a random $G(n, p)$ graph with parameter $p = m/\binom{n}{2}$.
One attribute of interest is the **degree distribution** of $G$. We know that the degree distribution
of a random $G(n, p)$ graph is binomial. How does the worm's brain compare to that?

For example, with $n = 100$ and $m = 292$, one can generate a random graph and plot its node degrees as a histogram, as follows:

In [ ]:

```
R = nx.gnm_random_graph(100, 292)
hist = nx.degree_histogram(R)
df = pd.DataFrame(hist)
df.plot.bar()
```

Does that look like a binomial distribution?

- For parameters $n$ and $m$ chosen identical to those of the worm brain graph `G`, construct a random $G(n, m)$ graph `R`.
- Determine and plot the degree histogram of `R`.
- Determine and plot the degree histogram of `G`.
- In your own words, describe the difference in appearance between the two plots (in the text cell below, the one that does not have `In [ ]:` written against it).
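One possible sketch for the first two steps, using the worm's parameters $n = 279$ and $m = 2287$ as stated above (the variable names are only suggestions); the last two lines can then be repeated with the worm brain graph `G` in place of `R`:

```python
import networkx as nx
import pandas as pd

# Random G(n, m) graph with the same parameters as the worm brain graph.
n, m = 279, 2287
R = nx.gnm_random_graph(n, m)

# Degree histogram of R, plotted as a bar chart (as in the example above).
df = pd.DataFrame(nx.degree_histogram(R))
df.plot.bar()
```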

In [ ]:

```
```

... text goes here ...

Another interesting exercise is studying how the (node) clustering coefficient of a graph $G = (X, E)$ depends on the degree of the nodes. For each value of $k$, denote by $X_k$ the set of nodes of degree $k$. Then one can define the average clustering at degree $k$ as $$ C_k = \frac 1 {|X_k|} \sum_{x \in X_k} c(x), $$ where $c(x)$ is the node clustering coefficient of vertex $x$.

Plotting $C_k$ against $k$ will need some preparation. In order to determine the sets $X_k$,
we can use the map constructed by `G.degree()`, which assigns to each node its degree,
and reverse it in such a way that each degree $k$ gets assigned the list of nodes of degree
$k$. This can be done with a general purpose `python` function that works with
any `python` dictionary `kv`.

In [ ]:

```
def reversed_dict(kv):
    reverse = {}
    for k, v in kv.items():
        reverse.setdefault(v, []).append(k)
    return reverse
```

For example, a dictionary of the degrees of a random $G(n, p)$-graph `R` with $n = 16$ and $p = 0.125$
is constructed like so.

In [ ]:

```
R = nx.gnp_random_graph(16, 0.125)
degs = dict(R.degree())
print(degs)
```

This dictionary has as *keys* the 16 nodes $0, 1, \dots, 15$,
and for each key as *value* the corresponding node degree, most of the time $1$ or $2$.
We want a dictionary that works in the opposite direction, i.e., has keys $1$ and $2$
(and whatever else occurs as node degree), and as value for key $2$, say, the list of
all nodes that have degree $2$. That's what the above function `reversed_dict` constructs:

In [ ]:

```
nodes_by_degree = reversed_dict(degs)
print(nodes_by_degree)
```

Next, we apply the map returned (as a `python` dictionary) by the `networkx`
command `nx.clustering` to the node lists in `nodes_by_degree`
to obtain a new dictionary `clustering_by_degree`
which contains lists of node clustering coefficients in place of node names, one for each node degree $k$
that occurs in the graph.

In [ ]:

```
clust = nx.clustering(R)
print(clust)
```

In [ ]:

```
clustering_by_degree = { d : [clust[x] for x in v] for d, v in nodes_by_degree.items() }
print(clustering_by_degree)
```

Finally, we can determine the average value `sum(v)/len(v)` of each list `v` in this dictionary,
and obtain a list of pairs $(k, C_k)$.

In [ ]:

```
avg_clust_degree = [ (d, sum(v)/len(v)) for d, v in clustering_by_degree.items() ]
print(avg_clust_degree)
```

Unsurprisingly, the value of $C_k$ for both $k = 0$ and $k = 1$ is $0$.
Using a `pandas` data frame, the resulting values can be plotted as a scatter plot like this:

In [ ]:

```
df = pd.DataFrame(avg_clust_degree, columns = ["k", "C_k"] )
df.plot.scatter(x = "k", y = "C_k")
```

- For a random $G(n, m)$ graph `R` of the same order and size as the worm brain network `G`, determine and plot the average degree clustering coefficients $C_k$.
- Use an additional argument `loglog = True` in the plotting command (i.e., `df.plot.scatter(x = "k", y = "C_k", loglog = True)`) to obtain a log-log plot of the same data.
- Determine and plot the average degree clustering coefficients $C_k$ for the worm brain graph `G`, again as a log-log plot.
- In your own words, describe the difference, if any, between the two data sets in the text cell below.
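The steps above can be combined into one sketch, shown here for a random graph with the worm's parameters $n = 279$ and $m = 2287$ (the same pipeline then applies to `G`, replacing `R` below). The `reversed_dict` helper is repeated so the cell is self-contained.

```python
import networkx as nx
import pandas as pd

def reversed_dict(kv):
    reverse = {}
    for k, v in kv.items():
        reverse.setdefault(v, []).append(k)
    return reverse

# Random G(n, m) graph with the worm brain parameters.
R = nx.gnm_random_graph(279, 2287)

# Group nodes by degree, look up their clustering coefficients,
# and average over each degree class to get the pairs (k, C_k).
nodes_by_degree = reversed_dict(dict(R.degree()))
clust = nx.clustering(R)
avg_clust_degree = [(d, sum(clust[x] for x in v) / len(v))
                    for d, v in nodes_by_degree.items()]

df = pd.DataFrame(avg_clust_degree, columns=["k", "C_k"])
df.plot.scatter(x="k", y="C_k", loglog=True)
```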

In [ ]:

```
```

... text goes here ...

Random graphs in the Watts-Strogatz model are obtained from a regular
$(n, d)$-circle graph by randomly rewiring all edges with a given probability $p$.
Such a graph can be generated with the command `nx.watts_strogatz_graph(n, 2*d, p)`
(note that the second argument is actually `2*d`).

In [ ]:

```
n, d, p = 16, 3, 0.16
G = nx.watts_strogatz_graph(n, 2*d, p)
nx.draw_circular(G, with_labels=True, node_color='y')
```

Watts-Strogatz random graphs are supposed to be like real world networks in the sense that they
combine relatively *short* characteristic path lengths with relatively *high* clustering coefficients.

For values $n = 1000$ and $d = 5$, produce a sequence of 50 (or so) $(n, d, p)$-WS graphs for different values of $p$ between $0$ and $1$ (including the extreme cases $p = 0$ and $p = 1$). Compute and compare their (graph) clustering coefficients and their characteristic path lengths. (Use smaller values for $n$ if $n = 1000$ turns out to be too demanding on resources.)

In your own words, in which range of values for $p$ do the generated graphs indeed have high clustering and short paths?
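The parameter sweep might be organised along the following lines. This sketch uses a smaller $n$ and fewer values of $p$ to keep the run time modest; increase them to $n = 1000$ and about $50$ values for the actual experiment. It uses `nx.connected_watts_strogatz_graph`, which retries the rewiring until the result is connected, so that the characteristic path length is always defined.

```python
import networkx as nx

# Sweep p from 0 to 1 and record (p, C, L) for a WS graph at each step.
n, d = 200, 5          # raise n to 1000 for the actual experiment
num_steps = 10         # use ~50 steps for the actual experiment
results = []
for i in range(num_steps + 1):
    p = i / num_steps
    W = nx.connected_watts_strogatz_graph(n, 2 * d, p)
    C = nx.average_clustering(W)
    L = nx.average_shortest_path_length(W)
    results.append((p, C, L))

for p, C, L in results:
    print(f"p = {p:.2f}  C = {C:.3f}  L = {L:.3f}")
```

Comparing the rows for small, intermediate, and large $p$ should reveal the range in which $C$ is still high while $L$ has already dropped.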

In [ ]:

```
```

... text goes here ...