Code cells can be executed individually (e.g. with `shift+enter`), or all at once (via the `Cell->Run All` command). Try running all cells now to verify that NetworKit has been properly built and installed.

In [1]:

```
%matplotlib inline
import matplotlib.pyplot as plt
```

In [2]:

```
cd ../../
```

In [3]:

```
from networkit import *
```

Let us start by reading a network from a file on disk: `PGPgiantcompo.graph`. In the course of this tutorial, we are going to work on the `PGPgiantcompo` network, a social network/web of trust in which nodes are PGP keys and an edge represents a signature from one key on another. It is distributed with NetworKit as a good starting point.

There is a convenient function in the top namespace which tries to guess the input format and select the appropriate reader:

In [4]:

```
G = readGraph("input/PGPgiantcompo.graph", Format.METIS)
```

The `readGraph` function tries to be an intelligent wrapper for various reader classes. In this example, it uses the `METISGraphReader`, which is located in the `graphio` submodule alongside other readers. These classes can also be used explicitly:

In [5]:

```
graphio.METISGraphReader().read("input/PGPgiantcompo.graph")
# is the same as: readGraph("input/PGPgiantcompo.graph", Format.METIS)
```

Out[5]:

It is also possible to specify the format for `readGraph()` and `writeGraph()`. Supported formats can be found via `[graphio.]Format`. However, graph formats are only supported insofar as the NetworKit::Graph can hold and use the data. Please note that not all graph formats are supported for both reading and writing.

Thus, it is possible to use NetworKit to convert graphs between formats. Let's say I need the previously read PGP graph in the Graphviz format:

In [6]:

```
graphio.writeGraph(G,"output/PGPgiantcompo.graphviz", Format.GraphViz)
```

NetworKit also provides a function to convert graphs directly:

In [7]:

```
graphio.convertGraph(Format.LFR, Format.GML, "input/example.edgelist", "output/example.gml")
```

`Graph` is the central class of NetworKit. An object of this type represents an undirected, optionally weighted network. Let us inspect several of the methods which the class provides.

In [8]:

```
n = G.numberOfNodes()
m = G.numberOfEdges()
print(n, m)
```

In [9]:

```
G.toString()
```

Out[9]:

Nodes are simply integer indices, and edges are pairs of such indices.

In [10]:

```
V = G.nodes()
print(V[:10])
E = G.edges()
print(E[:10])
```

In [11]:

```
G.hasEdge(42,11)
```

Out[11]:

This network is unweighted, meaning that each edge has the default weight of 1.

In [12]:

```
G.weight(42,11)
```

Out[12]:

Sometimes it may be interesting to take a glance at a visualization of a graph. As this is not within the scope of NetworKit, the `viztasks` module provides two convenience functions to draw graphs via NetworkX. If you have it installed, you will see usage examples throughout this guide.

It is also possible to load a graph and the results of our analytic kernels directly into Gephi, a software package for interactive graph visualization, via its streaming plugin. You may want to take a look at the GephiStreaming notebook.

The `profiling` module, introduced with version 4.0 of NetworKit, is the successor of the `properties` module. It provides a convenient way to run a selection of NetworKit's analytic kernels, and the results are further processed to show a variety of statistics. A very brief example follows. First, let's load a different graph:

In [13]:

```
astro = readGraph("input/astro-ph.graph", Format.METIS)
```

The `preset` parameter is a convenient way to choose a set of algorithms. Currently, `minimal`, `default` and `complete` can be passed.

In [14]:

```
pf = profiling.Profile.create(astro, preset="minimal")
```

The `show` function can be used to display the profile. Depending on the selection of kernels, it may take a while to produce all the plots.

In [15]:

```
pf.show()
```

The profile can also be written to file in the formats `HTML` and `LaTeX`.

In [16]:

```
pf.output("HTML",".")
```

For further customization, a `Config` object can be created and passed to `Profile.create`. Take a look at the dedicated Profiling notebook for more detailed instructions.

In [17]:

```
cc = components.ConnectedComponents(G)
cc.run()
print("number of components ", cc.numberOfComponents())
v = 0
print("component of node", v, ":", cc.componentOfNode(v))
print("map of component sizes: ", cc.getComponentSizes())
```

In [18]:

```
dd = sorted(centrality.DegreeCentrality(G).run().scores(), reverse=True)
plt.xscale("log")
plt.xlabel("degree")
plt.yscale("log")
plt.ylabel("number of nodes")
plt.plot(dd)
plt.show()
```

A *power-law degree distribution*, a characteristic feature of complex networks, would show up as a straight line from the top left to the bottom right on such a plot. As we can see, the degree distribution of the `PGPgiantcompo` network is definitely skewed, with few high-degree nodes and many low-degree nodes. But does the distribution actually obey a power law? In order to study this, we need to apply the powerlaw module. Call the following function:

In [19]:

```
import powerlaw
fit = powerlaw.Fit(dd)
```

The powerlaw coefficient can then be retrieved via:

In [20]:

```
fit.alpha
```

Out[20]:
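For intuition, the exponent estimate can be sketched in plain Python using the continuous maximum-likelihood formula of Clauset, Shalizi & Newman, $\alpha = 1 + n / \sum_i \ln(x_i / x_{\min})$. This is only a sketch of what `powerlaw.Fit` computes; the library additionally selects `xmin` automatically and handles discrete data.

```python
import math
import random

def powerlaw_alpha(xs, xmin):
    """Continuous MLE of the power-law exponent over the tail x_i >= xmin:
    alpha = 1 + n / sum(ln(x_i / xmin)). Illustration only."""
    tail = [x for x in xs if x >= xmin]
    return 1 + len(tail) / sum(math.log(x / xmin) for x in tail)

# sanity check on synthetic data with a known alpha = 2.5,
# drawn by inverse-transform sampling: x = xmin * u^(-1/(alpha-1))
rng = random.Random(0)
sample = [(1 - rng.random()) ** (-1 / 1.5) for _ in range(20000)]
print(round(powerlaw_alpha(sample, 1.0), 2))  # close to 2.5
```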

If you further want to know how well the data fit the power-law distribution, you can use the `distribution_compare` function. From the documentation of the function:

- `R` (float): Loglikelihood ratio of the two distributions' fit to the data. If greater than 0, the first distribution is preferred; if less than 0, the second distribution is preferred.
- `p` (float): Significance of `R`.

In [21]:

```
fit.distribution_compare('power_law','exponential')
```

Out[21]:
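The returned `(R, p)` pair can be turned into words with a small helper. The function name and the 0.05 threshold below are our own choices for illustration, not part of the powerlaw API.

```python
def interpret_comparison(R, p, threshold=0.05):
    """Interpret the (R, p) pair from powerlaw's distribution_compare.
    R > 0 favors the first candidate distribution, R < 0 the second;
    p gauges the significance of that preference."""
    if p > threshold:
        return "inconclusive"
    return "first distribution preferred" if R > 0 else "second distribution preferred"

print(interpret_comparison(2.5, 0.01))   # first distribution preferred
print(interpret_comparison(-1.2, 0.6))   # inconclusive
```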

In [22]:

```
globals.clustering(G)
```

Out[22]:

A simple breadth-first search from a starting node can be performed as follows:

In [23]:

```
v = 0
bfs = graph.BFS(G, v)
bfs.run()
bfsdist = bfs.getDistances()
```

`getDistances` returns the distances from `v` to the other nodes, indexed by node id. For example, we can now calculate the mean distance from the starting node to all other nodes:

In [24]:

```
sum(bfsdist) / len(bfsdist)
```

Out[24]:

As `PGPgiantcompo` is an unweighted graph, Dijkstra's algorithm yields the same result here:

In [25]:

```
dijkstra = graph.Dijkstra(G, v)
dijkstra.run()
spdist = dijkstra.getDistances()
sum(spdist) / len(spdist)
```

Out[25]:

In [26]:

```
K = readGraph("input/karate.graph",Format.METIS)
coreDec = centrality.CoreDecomposition(K)
coreDec.run()
```

Out[26]:

In [27]:

```
set(coreDec.scores())
```

Out[27]:

In [28]:

```
viztasks.drawGraph(K, nodeSizes=[(k**2)*20 for k in coreDec.scores()])
plt.show()
```

Community detection is handled by the `community` module. The module provides a top-level function to quickly perform community detection with a suitable algorithm and print some statistics about the result.

In [29]:

```
community.detectCommunities(G)
```

Out[29]:

In [30]:

```
communities = community.detectCommunities(G)
```

*Modularity* is the primary measure for the quality of a community detection solution. The value is in the range `[-0.5,1]` and usually depends both on the performance of the algorithm and the presence of distinctive community structures in the network.

In [31]:

```
community.Modularity().getQuality(communities, G)
```

Out[31]:
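For intuition, modularity can be computed by hand on a toy graph. The following plain-Python sketch implements $Q = \sum_c \left( e_c/m - (d_c/2m)^2 \right)$, where $e_c$ is the number of intra-community edges, $d_c$ the total degree of community $c$, and $m$ the edge count; NetworKit's `community.Modularity` is the optimized implementation.

```python
from collections import defaultdict

def modularity(edges, membership):
    """Toy modularity: Q = sum_c (e_c/m - (d_c/(2m))^2)."""
    m = len(edges)
    intra = defaultdict(int)   # e_c: intra-community edge counts
    deg = defaultdict(int)     # d_c: total degree per community
    for u, v in edges:
        deg[membership[u]] += 1
        deg[membership[v]] += 1
        if membership[u] == membership[v]:
            intra[membership[u]] += 1
    return sum(intra[c] / m - (deg[c] / (2 * m)) ** 2 for c in deg)

# two triangles joined by a single edge, split into their natural communities
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(round(modularity(edges, part), 3))  # → 0.357
```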

The communities are represented by a `Partition` data structure, which provides several methods for inspecting and manipulating a partition of a set of elements (which need not be the nodes of a graph).

In [32]:

```
type(communities)
```

Out[32]:

In [33]:

```
print("{0} elements assigned to {1} subsets".format(communities.numberOfElements(), communities.numberOfSubsets()))
```

In [34]:

```
print("the biggest subset has size {0}".format(max(communities.subsetSizes())))
```

The partition can also be written to a file in which line *i* contains the subset id of node *i*:

In [35]:

```
community.writeCommunities(communities, "output/communities.partition")
```

The default algorithm is *PLM*, our parallel implementation of the well-known Louvain method. It yields a high-quality solution at reasonably fast running times. Let us now apply a variation of this algorithm.

In [36]:

```
community.detectCommunities(G, algo=community.PLM(G, True))
```

Out[36]:

In [37]:

```
sizes = communities.subsetSizes()
sizes.sort(reverse=True)
ax1 = plt.subplot(2,1,1)
ax1.set_ylabel("size")
ax1.plot(sizes)
ax2 = plt.subplot(2,1,2)
ax2.set_xscale("log")
ax2.set_yscale("log")
ax2.set_ylabel("size")
ax2.plot(sizes)
plt.show()
```

In [38]:

```
c2 = communities.getMembers(2)
g2 = G.subgraphFromNodes(c2)
```

In [39]:

```
communities.subsetSizeMap()[2]
```

Out[39]:

In [40]:

```
g2.numberOfNodes()
```

Out[40]:

In [41]:

```
communities2 = community.detectCommunities(g2)
```

In [42]:

```
viztasks.drawCommunityGraph(g2,communities2)
plt.show()
```

Centrality measures are provided by the `centrality` module.

In [43]:

```
K = readGraph("input/karate.graph", Format.METIS)
```

In [44]:

```
bc = centrality.Betweenness(K)
bc.run()
```

Out[44]:

In [45]:

```
bc.ranking()[:10] # the 10 most central nodes
```

Out[45]:

Computing exact betweenness is expensive for larger networks such as `PGPgiantcompo`, so we approximate it instead, with a probabilistic guarantee that the error is no larger than an additive constant $\epsilon$.

In [46]:

```
abc = centrality.ApproxBetweenness(G, epsilon=0.1)
abc.run()
```

Out[46]:

The 10 most central nodes according to betweenness are then

In [47]:

```
abc.ranking()[:10]
```

Out[47]:
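The idea behind sampling-based approximation can be sketched in plain Python: repeatedly pick a random node pair, trace one shortest path between them, and count how often each node appears as an interior vertex. This is only an illustration in the spirit of such algorithms, not NetworKit's actual implementation.

```python
import random
from collections import deque

def approx_betweenness(adj, n_samples, seed=None):
    """Estimate, per node, the fraction of sampled shortest paths on which
    it lies as an interior vertex. adj: dict node -> list of neighbors."""
    rng = random.Random(seed)
    nodes = list(adj)
    score = {v: 0.0 for v in nodes}
    for _ in range(n_samples):
        s, t = rng.sample(nodes, 2)
        # BFS from s, recording one predecessor per discovered node
        pred = {s: None}
        q = deque([s])
        while q and t not in pred:
            u = q.popleft()
            for w in adj[u]:
                if w not in pred:
                    pred[w] = u
                    q.append(w)
        if t not in pred:
            continue  # t unreachable from s
        v = pred[t]
        while v is not None and v != s:  # count interior vertices only
            score[v] += 1.0
            v = pred[v]
    return {v: c / n_samples for v, c in score.items()}

# path graph 0-1-2-3-4: the middle node lies on the most shortest paths
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
est = approx_betweenness(adj, 2000, seed=1)
print(est[2], est[0])  # middle node scores highest; endpoints score 0
```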

In [48]:

```
# Eigenvector centrality
ec = centrality.EigenvectorCentrality(K)
ec.run()
ec.ranking()[:10] # the 10 most central nodes
```

Out[48]:

In [49]:

```
# PageRank
pr = centrality.PageRank(K, 1e-6)
pr.run()
pr.ranking()[:10] # the 10 most central nodes
```

Out[49]:

In [50]:

```
import networkx as nx
nxG = nxadapter.nk2nx(G) # convert from NetworKit.Graph to networkx.Graph
print(nx.degree_assortativity_coefficient(nxG))
```

The **Erdős–Rényi model** is the most basic random graph model, in which each edge exists with the same uniform probability. NetworKit provides an efficient generator:

In [51]:

```
ERG = generators.ErdosRenyiGenerator(1000, 0.1).generate()
profiling.Profile.create(ERG, preset="minimal").show()
```
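The model itself is easy to sketch in plain Python: include each possible edge independently with probability `p`. This naive O(n²) loop is for illustration only; NetworKit's generator is far more efficient.

```python
import random

def erdos_renyi(n, p, seed=None):
    """Naive G(n, p): each of the n*(n-1)/2 possible undirected edges
    exists independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p]

edges = erdos_renyi(200, 0.1, seed=42)
# expected number of edges is p * n * (n - 1) / 2 = 1990
print(len(edges))
```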