AMLD'19 Learning and Processing over Networks

2 Network Science

In this notebook we will examine a couple of basic network properties, taking a flight route graph and a road graph as objects of study.

It will be useful to check the documentation of the networkx package as we go along, since a lot of the properties we will see can be easily computed via calls to this package.
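As a quick orientation, here is a minimal sketch of the kind of networkx calls we have in mind. The four-node graph and its node names are made up for illustration; they are not part of the flight data.

```python
import networkx as nx

# Toy undirected graph: four hypothetical nodes and four edges.
G = nx.Graph()
G.add_edges_from([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])

n_nodes = G.number_of_nodes()      # number of nodes
n_edges = G.number_of_edges()      # number of edges
degrees = dict(G.degree())         # node -> number of neighbors
density = nx.density(G)            # fraction of possible edges that are present
is_connected = nx.is_connected(G)  # True if every node can reach every other node

print(n_nodes, n_edges, degrees, density, is_connected)
```

The same calls work unchanged on larger undirected graphs.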

In [1]:
import numpy as np
import scipy as sp
import osmnx as ox
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
import collections
import utils

The data used to construct the flight route network comes from OpenFlights. Each node in the network represents an airport, and an edge is drawn between two nodes if a flight route connects the corresponding airports.

The code cell below loads the flight route network that we are going to use. The graph itself, represented as a networkx graph, is returned in the variable flight_graph. The variable pos, useful for plotting the graph against a world map, is a dictionary with airport acronyms as keys and (longitude, latitude) pairs as values.

In [2]:
routes, airports, pos, flight_graph = utils.preprocess_flight_routes()
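The internals of utils.preprocess_flight_routes are not shown here, so the following is only a sketch of how an airport graph of this kind could be assembled from a table of routes. The names toy_routes and toy_graph are hypothetical, and we assume an undirected graph with one edge per connected pair of airports.

```python
import networkx as nx
import pandas as pd

# Hypothetical miniature route table: one row per flight route.
toy_routes = pd.DataFrame(
    {
        "Source airport": ["AER", "ASF", "ASF", "CEK"],
        "Destination airport": ["KZN", "KZN", "MRV", "KZN"],
    }
)

# Each distinct airport becomes a node; each row becomes an edge.
toy_graph = nx.from_pandas_edgelist(
    toy_routes, source="Source airport", target="Destination airport"
)

print(sorted(toy_graph.nodes))
print(sorted(toy_graph.degree, key=lambda pair: pair[1], reverse=True))
```

Because nx.from_pandas_edgelist builds an nx.Graph by default, repeated routes between the same two airports collapse into a single edge.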

The routes dataframe contains information on the flight routes, such as source and destination airports, and distance. You can get a glimpse of its contents by running the cell below.

In [3]:
routes.head()
Out[3]:
   Airline  Airline ID  Source airport  Source airport ID  Destination airport  Destination airport ID  Codeshare  Stops  Equipment  Source latitude  Source longitude  Destination latitude  Destination longitude     Distance
0       2B         410             AER               2965                  KZN                    2990        NaN      0        CR2        43.449902         39.956600             55.606201              49.278702  1507.989717
1       2B         410             ASF               2966                  KZN                    2990        NaN      0        CR2        46.283298         48.006302             55.606201              49.278702  1040.943207
2       2B         410             ASF               2966                  MRV                    2962        NaN      0        CR2        46.283298         48.006302             44.225101              43.081902   449.036664
3       2B         410             CEK               2968                  KZN                    2990        NaN      0        CR2        55.305801         61.503300             55.606201              49.278702   773.126239
4       2B         410             CEK               2968                  OVB                    4078        NaN      0        CR2        55.305801         61.503300             55.012600              82.650703  1343.161122
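The Distance values look like great-circle distances in kilometres between the source and destination coordinates. We do not know which formula the preprocessing code uses, so the sketch below is only an assumption: it applies the haversine formula with a mean Earth radius of 6371 km, and the function name haversine_km is ours. For row 0 it should land close to the tabulated value, though not necessarily on it.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Coordinates of AER and KZN, copied from row 0 of the table above.
aer_kzn_km = haversine_km(43.449902, 39.956600, 55.606201, 49.278702)
print(round(aer_kzn_km, 1))
```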

Similarly, the airports dataframe contains information on the various airports in the dataset. Recall that they represent the nodes in our first graph.

In [4]:
airports.head()
Out[4]:
     Airport ID                                         Name          City           Country  ICAO   Latitude   Longitude  Altitude  Timezone  DST                    TZ     Type       Source
4
GKA           1                               Goroka Airport        Goroka  Papua New Guinea  AYGA  -6.081690  145.391998      5282      10.0    U  Pacific/Port_Moresby  airport  OurAirports
MAG           2                               Madang Airport        Madang  Papua New Guinea  AYMD  -5.207080  145.789001        20      10.0    U  Pacific/Port_Moresby  airport  OurAirports
HGU           3                 Mount Hagen Kagamuga Airport   Mount Hagen  Papua New Guinea  AYMH  -5.826790  144.296005      5388      10.0    U  Pacific/Port_Moresby  airport  OurAirports
LAE           4                               Nadzab Airport        Nadzab  Papua New Guinea  AYNZ  -6.569803  146.725977       239      10.0    U  Pacific/Port_Moresby  airport  OurAirports
POM           5  Port Moresby Jacksons International Airport  Port Moresby  Papua New Guinea  AYPY  -9.443380  147.220001       146      10.0    U  Pacific/Port_Moresby  airport  OurAirports
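A position dictionary with airport acronyms as keys and (longitude, latitude) pairs as values can be derived from a table shaped like this one. The sketch below assumes a DataFrame indexed by airport acronym with Latitude and Longitude columns; toy_airports and toy_pos are hypothetical names, and the actual construction inside utils may differ.

```python
import pandas as pd

# Hypothetical miniature airports table, indexed by airport acronym.
toy_airports = pd.DataFrame(
    {
        "Latitude": [-6.081690, -5.207080, -5.826790],
        "Longitude": [145.391998, 145.789001, 144.296005],
    },
    index=["GKA", "MAG", "HGU"],
)

# Map each acronym to (longitude, latitude): x first, then y, as plotting expects.
toy_pos = dict(
    zip(toy_airports.index, zip(toy_airports["Longitude"], toy_airports["Latitude"]))
)

print(toy_pos["GKA"])
```

Putting longitude first matters: plotting routines treat the first coordinate as the horizontal axis.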

The line of code below overlays the graph onto the world map. The flight route graph is an example of a network with a "natural" embedding (the surface of the Earth). This is not always the case, and sometimes we need to come up with our own embeddings, as we will see in the next notebook.

In [5]:
utils.display_map(flight_graph, pos)
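The helper utils.display_map is specific to this notebook and its implementation is not shown. As a rough stand-in, the sketch below draws a small graph with each node placed at its (longitude, latitude) position using plain networkx and matplotlib, without any world map in the background. The graph and coordinates are a toy subset taken from the routes table, and toy_graph and toy_pos are hypothetical names.

```python
import matplotlib.pyplot as plt
import networkx as nx

# Toy subset of the flight network: three routes between four airports.
toy_graph = nx.Graph([("AER", "KZN"), ("ASF", "KZN"), ("ASF", "MRV")])

# Node positions as (longitude, latitude), copied from the routes table.
toy_pos = {
    "AER": (39.956600, 43.449902),
    "KZN": (49.278702, 55.606201),
    "ASF": (48.006302, 46.283298),
    "MRV": (43.081902, 44.225101),
}

fig, ax = plt.subplots(figsize=(6, 4))
# Passing pos fixes each node at its geographic coordinates instead of a computed layout.
nx.draw_networkx(toy_graph, pos=toy_pos, ax=ax, node_size=300, font_size=8)
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
```

With a natural embedding like this one, no layout algorithm is needed; the coordinates alone determine the picture.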