CS4423 - Networks

Prof. Götz Pfeiffer
School of Mathematics, Statistics and Applied Mathematics
NUI Galway

Lecture 3: A NetworkX Tutorial

NetworkX is a Python library for studying graphs and networks. NetworkX is suitable for operation on large real-world graphs: e.g., graphs in excess of 10 million nodes and 100 million edges. Using a pure-Python "dictionary of dictionaries" data structure, NetworkX is a reasonably efficient, very scalable, highly portable framework for network and social network analysis. NetworkX is free software. (Wikipedia) (github.io)

In [ ]:
import networkx as nx

Creating a graph

Create an empty graph with no nodes and no edges.

In [ ]:
G = nx.Graph()

By definition, a Graph is a collection of nodes (vertices) along with identified pairs of nodes (called edges, links, etc). In NetworkX, nodes can be any hashable object e.g. a text string, an image, an XML object, another Graph, a customized node object, etc. (Note: Python's None object should not be used as a node as it determines whether optional function arguments have been assigned in many functions.)

Nodes

The graph G can be grown in several ways. NetworkX includes many graph generator functions and facilities to read and write graphs in many formats. We start with simple manipulations. You can add one node at a time:

In [ ]:
G.add_node(1)

Now the graph G has one node and, still, zero edges. This can be seen by:

In [ ]:
G.number_of_nodes()
In [ ]:
G.order()
In [ ]:
G.number_of_edges(),  G.size()

Nodes and edges of G can be listed explicitly like so (both G.nodes() and G.edges() return iterator objects):

In [ ]:
list(G.nodes())
In [ ]:
 list(G.edges())

You can also add a list of nodes:

In [ ]:
G.add_nodes_from([2, 3])
In [ ]:
list(G.nodes()), list(G.edges())

With the same command you can in fact add any nbunch of nodes. An nbunch is any iterable container of nodes (e.g. a list as above, a set, another graph, a file, etc...). Let's use a generator to make a graph H, and then add its nodes to the graph G.

In [ ]:
H = nx.path_graph(10)
H.number_of_nodes(), H.number_of_edges()
In [ ]:
list(H.nodes()), list(H.edges())
In [ ]:
for node in H:
    print(node)
In [ ]:
G.add_nodes_from(H)
In [ ]:
list(G.nodes()), list(G.edges())

Note that G now contains the nodes of H as its own nodes of G. However, G does not contain the edges of H. The graph H could also be added as a new node to G.

In [ ]:
G.add_node(H)
In [ ]:
list(G.nodes()), list(G.edges())

The graph G now contains H as a node. This flexibility is very powerful as it allows graphs of graphs, graphs of files, graphs of functions and much more.

Edges

Let's start again with a clean sheet.

In [ ]:
G.clear()
list(G.nodes()), list(G.edges())

G can also be grown by adding one edge at a time,

In [ ]:
G.add_edge(1, 2)
e = (2, 3)
G.add_edge(*e) # unpack edge tuple*
In [ ]:
list(G.nodes()), list(G.edges())
In [ ]:
list(G.neighbors(2))  # G.neighbors(n) returns an iterator of neigboring nodes of n

by adding a list of edges,

In [ ]:
G.add_edges_from([(1, 2),(1, 3)])
In [ ]:
list(G.nodes()), list(G.edges())

or by adding any ebunch of edges. An ebunch is any iterable container of edge-tuples. An edge-tuple can be a 2-tuple of nodes or a 3-tuple with 2 nodes followed by an edge attribute dictionary, e.g. (2, 3, {'weight' : 3.1415}). Edge attributes are discussed further later ...

In [ ]:
G.add_edges_from(H.edges())
In [ ]:
list(G.nodes()), list(G.edges())

Note how nodes are automatically added if necessary.

One can demolish the graph in a similar fashion; using Graph.remove_node, Graph.remove_nodes_from, Graph.remove_edge and Graph.remove_edges_from, e.g.

In [ ]:
G.remove_node(3)
In [ ]:
list(G.nodes()), list(G.edges())

You can specify graph data in several formats.

In [ ]:
edgelist = [(0, 1), (1, 2), (2, 3)]
In [ ]:
G = nx.Graph(edgelist)
list(G.edges())
In [ ]:
H = nx.DiGraph(G)  # create a DiGraph using the connections from G
In [ ]:
list(H.edges())

Drawing graphs

NetworkX is not primarily a graph drawing package but basic drawing with Matplotlib as well as an interface to use the open source Graphviz software package are included. These are part of the networkx.drawing package and will be imported if possible.

Matplotlib's plot interface is imported as follows.

In [ ]:
import matplotlib.pyplot as plt
%matplotlib inline

To test if the import of networkx.drawing was successful draw G using one of

In [ ]:
nx.draw(G)
In [ ]:
nx.draw(G, with_labels=True)
In [ ]:
nx.draw_random(G)
In [ ]:
nx.draw_circular(G)
In [ ]:
nx.draw_spectral(G)

when drawing to an interactive display. Note that you may need to issue a Matplotlib plt.show() command if you are not using matplotlib in interactive mode.

What to use as nodes and edges

Nodes and edges are not specified as NetworkX objects. This leaves you free to use meaningful items as nodes and edges. The most common choices are numbers or strings, but a node can be any hashable object (except None), and an edge can be associated with any object x using G.add_edge(n1, n2, object=x).

For example, n1 and n2 could be protein objects from the RCSB Protein Data Bank, and x could refer to an XML record of publications detailing experimental observations of their interaction.

This power can be quite useful, but its abuse can lead to unexpected surprises unless one is familiar with Python. If in doubt, consider using convert_node_labels_to_integers to obtain a more traditional graph with integer labels.

Accessing edges

We have seen that nodes and edges of a graph G can be accessed with the methods G.nodes, G.edges, and G.neighbors. These methods return iterators that can save you from creating large lists when you are just going to iterate through them anyway.

Fast direct access to the graph data structure is also possible using subscript notation.

Warning. Do not change the returned dict--it is part of the graph data structure and direct manipulation may leave the graph in an inconsistent state.

In [ ]:
G[1]  # Warning: do not change the resulting dict
In [ ]:
list(G.neighbors(1))
In [ ]:
G.add_edge(1, 3, color='blue')
In [ ]:
G[1][3]

You can safely set the attributes of an edge using subscript notation if the edge already exists.

In [ ]:
list(G.adjacency())

Fast examination of all edges is achieved using adjacency(iterators). Note that for undirected graphs this actually looks at each edge twice.

Adding attributes to graphs, nodes, and edges

Attributes such as weights, labels, colors, or whatever Python object you like, can be attached to graphs, nodes, or edges.

Each graph, node, and edge can hold key/value attribute pairs in an associated attribute dictionary (the keys must be hashable). By default these are empty, but attributes can be added or changed using add_edge, add_node or direct manipulation of the attribute dictionaries named G.graph, G.node and G.edge for a graph G.

Graph attributes

Assign graph attributes when creating a new graph

In [ ]:
G = nx.Graph(day="Wednesday")
In [ ]:
G.graph

Or you can modify attributes later

In [ ]:
G.graph['day'] = 'Monday'
In [ ]:
G.graph

Node attributes

Add node attributes using add_node(), add_nodes_from() or G.node

In [ ]:
G.add_node(1, time='5pm')
In [ ]:
G.add_nodes_from([3], time='2pm')
In [ ]:
G.node[1]
In [ ]:
G.node[1]['room'] = 714
In [ ]:
list(G.nodes(data=True))

Note that adding a node to G.node does not add it to the graph, use G.add_node() to add new nodes.

Edge attributes

Add edge attributes using add_edge(), add_edges_from() or subscript notation.

In [ ]:
G.add_edge(1, 2, weight=4.7)
In [ ]:
G.add_edges_from([(3, 4), (4, 5)], color='red')
In [ ]:
G.add_edges_from([(1, 2, {'color': 'blue'}), (2, 3, {'weight': 8})])
In [ ]:
G[1][2]['weight'] = 4.7
In [ ]:
list(G.edges(data=True))

The special attribute 'weight' should be numeric and holds values used by algorithms requiring weighted edges.

Directed Graphs

The DiGraph class provides additional methods specific to directed edges, e.g. DiGraph.out_edges, DiGraph.in_degree, DiGraph.predecessors, DiGraph.successors etc. To allow algorithms to work with both classes easily, the directed versions of neighbors() and degree() are equivalent to successors() and the sum of in_degree() and out_degree() respectively even though that may feel inconsistent at times.

In [ ]:
DG = nx.DiGraph()
In [ ]:
DG.add_weighted_edges_from([(1, 2, 0.5), (3, 1, 0.75)])
In [ ]:
DG.out_degree(1, weight='weight')
In [ ]:
DG.degree(1, weight='weight')
In [ ]:
list(DG.successors(1))   # DG.successors(n) returns an iterator
In [ ]:
list(DG.neighbors(1))   # DG.neighbors(n) returns an iterator

Some algorithms work only for directed graphs and others are not well defined for directed graphs. Indeed the tendency to lump directed and undirected graphs together is dangerous. If you want to treat a directed graph as undirected for some measurement you should probably convert it using Graph.to_undirected or with

In [ ]:
H = nx.Graph(DG) # convert DG to undirected graph

Graph generators and graph operations

In addition to constructing graphs node-by-node or edge-by-edge, they can also be generated by

  • Applying classic graph operations, such as:

    subgraph(G, nbunch)      - induce subgraph of G on nodes in nbunch
    union(G1,G2)             - graph union
    disjoint_union(G1,G2)    - graph union assuming all nodes are different
    cartesian_product(G1,G2) - return Cartesian product graph
    compose(G1,G2)           - combine graphs identifying nodes common to both
    complement(G)            - graph complement
    create_empty_copy(G)     - return an empty copy of the same graph class
    convert_to_undirected(G) - return an undirected representation of G
    convert_to_directed(G)   - return a directed representation of G
  • Using a call to one of the classic small graphs, e.g.

In [ ]:
petersen = nx.petersen_graph()
In [ ]:
tutte = nx.tutte_graph()
In [ ]:
maze = nx.sedgewick_maze_graph()
In [ ]:
tet = nx.tetrahedral_graph()
  • Using a (constructive) generator for a classic graph, e.g.
In [ ]:
K_5 = nx.complete_graph(5)
In [ ]:
nx.draw(K_5)
In [ ]:
K_3_5 = nx.complete_bipartite_graph(3, 5)
In [ ]:
nx.draw(K_3_5)
In [ ]:
barbell = nx.barbell_graph(10, 10)
In [ ]:
lollipop = nx.lollipop_graph(10, 20)
  • Using a stochastic graph generator, e.g.
In [ ]:
er = nx.erdos_renyi_graph(100, 0.15)
In [ ]:
ws = nx.watts_strogatz_graph(30, 3, 0.1)
In [ ]:
ba = nx.barabasi_albert_graph(100, 5)
In [ ]:
red = nx.random_lobster(100, 0.9, 0.9)
  • Reading a graph stored in a file using common graph formats, such as edge lists, adjacency lists, GML, GraphML, pickle, LEDA and others.