This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.

You need the `python-apt`

package in order to build the package dependency graph. (https://pypi.python.org/pypi/python-apt/)

We also assume that this notebook is executed on a Debian system (like Ubuntu). If you don't have such a system, you can download the data *Debian* directly on the book's website. Extract it in the current directory, and start this notebook directly at step 7. (http://ipython-books.github.io)

- We import the
`apt`

module and we build the list of packages.

In [ ]:

```
import json
import apt
cache = apt.Cache()
```

- The
`graph`

dictionary will contain the adjacency list of a small portion of the dependency graph.

In [ ]:

```
graph = {}
```

- We define a function that returns the list of dependencies of a package.

In [ ]:

```
def get_dependencies(package):
if package not in cache:
return []
pack = cache[package]
ver = pack.candidate or pack.versions[0]
# We flatten the list of dependencies,
# and we remove the duplicates.
return sorted(set([item.name
for sublist in ver.dependencies
for item in sublist]))
```

- We now define a
*recursive*function that builds the dependency graph for a particular package. This function updates the`graph`

variable.

In [ ]:

```
def get_dep_recursive(package):
if package not in cache:
return []
if package not in graph:
dep = get_dependencies(package)
graph[package] = dep
for dep in graph[package]:
if dep not in graph:
graph[dep] = get_dep_recursive(dep)
return graph[package]
```

- Let's build the dependency graph for IPython.

In [ ]:

```
get_dep_recursive('ipython');
```

- Finally, let's save the adjacency list in JSON.

In [ ]:

```
with open('data/apt.json', 'w') as f:
json.dump(graph, f, indent=1)
```

- We import a few packages.

In [ ]:

```
import json
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
%matplotlib inline
```

- Let's load the adjacency list from the JSON file.

In [ ]:

```
with open('data/apt.json', 'r') as f:
graph = json.load(f)
```

- Now, we create a directed graph (
`DiGraph`

in NetworkX) from our adjacency list. We reverse the graph to get a more natural ordering.

In [ ]:

```
g = nx.DiGraph(graph).reverse()
```

- A topological sort only exists when the graph is a
**directed acyclic graph**(DAG). This means that there is no cycle in the graph, i.e. no circular dependency here. Is our graph a DAG?

In [ ]:

```
nx.is_directed_acyclic_graph(g)
```

- What are the packages responsible for the cycles? We can find it out with the
`simple_cycles`

function.

In [ ]:

```
set([cycle[0] for cycle in nx.simple_cycles(g)])
```

- Here, we can try to remove these packages. In an actual package manager, these cycles need to be carefully taken into account.

In [ ]:

```
g.remove_nodes_from(_)
```

In [ ]:

```
nx.is_directed_acyclic_graph(g)
```

- The graph is now a DAG. Let's display it first.

In [ ]:

```
ug = g.to_undirected()
deg = ug.degree()
```

In [ ]:

```
plt.figure(figsize=(4,4));
# The size of the nodes depends on the number of dependencies.
nx.draw(ug, font_size=6,
node_size=[20*deg[k] for k in ug.nodes()]);
```

- Finally, we can perform the topological sort, thereby obtaining a linear installation order satisfying all dependencies.

In [ ]:

```
nx.topological_sort(g)
```

You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).

IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).