Spatially Explicit Markov Methods

Author: Serge Rey [email protected], Wei Kang [email protected]

Introduction

This notebook introduces the Discrete Markov Chain (DMC) model and two of its variants which explicitly incorporate spatial effects. We will demonstrate the usage of these methods with an empirical study of regional income dynamics in the US. The dataset is the per capita incomes observed annually from 1929 to 2009 for the lower 48 US states.

Note that a full execution of this notebook requires pandas, matplotlib and the light-weight PySAL geovisualization package splot.

Classic Markov

giddy.markov.Markov(class_ids, classes=None, summary=True)

We start with a look at a simple example of classic DMC methods implemented in PySAL-giddy. A Markov chain may be in one of $k$ different states/classes at any point in time. These states are exhaustive and mutually exclusive. If one had a time series of remote sensing images used to develop land use classifications, then the states could be defined as the specific land use classes and interest would center on the transitions in and out of different classes for each pixel.

For example, suppose there are 5 pixels, each of which takes on one of 3 states (a,b,c) at 3 consecutive periods:

In [1]:
import numpy as np
c = np.array([['b','a','c'],['c','c','a'],['c','b','c'],['a','a','b'],['a','b','c']])

So the first pixel was in state ‘b’ in period 1, state ‘a’ in period 2, and state ‘c’ in period 3. Each pixel's trajectory (row) is assumed to possess the Markov property, meaning that the state a pixel takes on today depends only on its state in the immediately preceding period.

Let's suppose that all 5 pixels are governed by the same transition dynamics rule. That is, each trajectory is a realization of the same Discrete Markov Chain process. We can therefore pool all 5 trajectories to estimate a transition probability matrix. To do that, we utilize the Markov class in giddy:

In [2]:
import giddy
m = giddy.markov.Markov(c)
The Markov Chain is irreducible and is composed by:
1 Recurrent class (indices):
[0 1 2]
0 Transient class.
The Markov Chain has 0 absorbing state.
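
For contrast, here is a minimal sketch (with hypothetical trajectories) of a chain in which state 'c' is never left once entered, so the summary should instead report a reducible chain with an absorbing state:

c2 = np.array([['a', 'b', 'c', 'c'],
               ['b', 'a', 'c', 'c'],
               ['a', 'a', 'b', 'c']])
# 'c' only ever transitions to itself, so the summary should flag 'c' as a
# recurrent, absorbing state and {'a', 'b'} as a transient class.
m2 = giddy.markov.Markov(c2)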

You may turn off the summary of the Markov chain by passing summary=False when initializing the Markov instance.

In [3]:
m = giddy.markov.Markov(c, summary=False)

In this way, we create a Markov instance, $m$. Its attribute $classes$ gives the 3 unique classes these pixels can take on: 'a', 'b' and 'c'.

In [4]:
print(m.classes)
['a' 'b' 'c']
In [5]:
print(len(m.classes))
3

In addition to extracting the unique states as an attribute, our Markov instance also has the attribute transitions, a matrix counting the number of transitions from one state to another. Since there are 3 unique states, we have a $(3,3)$ transition matrix:

In [6]:
print(m.transitions)
[[1. 2. 1.]
 [1. 0. 2.]
 [1. 1. 1.]]

The above transition matrix indicates that, of the four pixels that began a transition interval in state ‘a’, 1 remained in that state, 2 transitioned to state ‘b’ and 1 transitioned to state ‘c’. Another attribute, $p$, gives the transition probability matrix, which is the transition dynamics rule common to all 5 pixels across the 3 periods. The maximum likelihood estimator for each element $p_{i,j}$ is shown below, where $n_{i,j}$ is the number of transitions from state $i$ to state $j$ and $k$ is the number of states (here $k=3$):

\begin{equation} \hat{p}_{i,j} = \frac{n_{i,j}}{\sum_{q=1}^k n_{i,q} } \end{equation}
In [7]:
print(m.p)
[[0.25       0.5        0.25      ]
 [0.33333333 0.         0.66666667]
 [0.33333333 0.33333333 0.33333333]]

This means that if any of the 5 pixels was in state 'c', the probability of staying in 'c' or transitioning to either of the other states ('a', 'b') in the next period is the same (0.333). If a pixel was in state 'b', there is a high probability that it takes on state 'c' in the next period, since $\hat{p}_{2,3}=0.667$.
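
As a quick sanity check (a sketch, not part of the giddy API), this estimator can be reproduced by row-normalizing the transition count matrix:

# Each row of m.transitions divided by its row total should reproduce m.p.
row_totals = m.transitions.sum(axis=1, keepdims=True)
print(np.allclose(m.transitions / row_totals, m.p))  # expected: True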

In [8]:
m.steady_state  # steady state distribution
Out[8]:
array([0.30769231, 0.28846154, 0.40384615])
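
The steady state reported above is the left eigenvector of $p$ associated with eigenvalue 1, normalized to sum to one. A minimal sketch recovering it directly with numpy:

# Left eigenvector of p for the eigenvalue closest to 1, rescaled to sum to 1.
vals, vecs = np.linalg.eig(m.p.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
print(pi / pi.sum())  # should match m.steady_state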

This simple example illustrates the basic creation of a Markov instance, but the small sample size makes it unrealistic for the more advanced features of this approach. For a larger example, we will look at an application of Markov methods to understanding regional income dynamics in the US. Here we will load in data on per capita incomes observed annually from 1929 to 2009 for the lower 48 US states:

Regional income dynamics in the US

First, we load the data on per capita incomes observed annually from 1929 to 2009 for the lower 48 US states. We use the example dataset in libpysal, which was downloaded from the US Bureau of Economic Analysis.

In [9]:
import libpysal
f = libpysal.io.open(libpysal.examples.get_path("usjoin.csv"))
pci = np.array([f.by_col[str(y)] for y in range(1929,2010)])
print(pci.shape)
(81, 48)

The first row of the array is the per capita incomes for the 48 US states for the year 1929:

In [10]:
print(pci[0, :])
[ 323  600  310  991  634 1024 1032  518  347  507  948  607  581  532
  393  414  601  768  906  790  599  286  621  592  596  868  686  918
  410 1152  332  382  771  455  668  772  874  271  426  378  479  551
  634  434  741  460  673  675]

In order to apply the classic Markov approach to this series, we first have to discretize the distribution by defining our classes. There are many ways to do this, including the quantile, equal-interval and Fisher-Jenks classification schemes. For a list of classification methods, please refer to the PySAL package mapclassify.
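
For instance, different schemes produce different class breaks. A small sketch comparing three mapclassify schemes on the 2009 cross-section (the variable y2009 is introduced here only for illustration):

import mapclassify as mc

# Upper class bounds under three schemes; each induces a different
# discretization of the income series.
y2009 = pci[-1, :]
for scheme in (mc.Quantiles, mc.EqualInterval, mc.FisherJenks):
    print(scheme.__name__, scheme(y2009, k=5).bins)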

Here we will use the quintiles of each annual income distribution to define the classes. It should be noted that using quintiles of the pooled income distribution to define the classes would result in a different interpretation of the income dynamics: quintiles of each annual income distribution (the former) reveal more of the relative income dynamics, while quintiles of the pooled income distribution (the latter) provide insights into absolute dynamics.
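
For contrast, a sketch of the pooled alternative: a single set of quintile breaks estimated from all years at once (q5_pooled is a hypothetical name, not used below):

import mapclassify as mc

# One set of breaks for the entire 81-by-48 panel, so class membership
# reflects absolute rather than relative position in the distribution.
pooled = mc.Quantiles(pci.flatten(), k=5)
q5_pooled = pooled.yb.reshape(pci.shape).transpose()
print(pooled.bins)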

In [11]:
import matplotlib.pyplot as plt
%matplotlib inline
years = range(1929,2010)
names = np.array(f.by_col("Name"))
order1929 = np.argsort(pci[0,:])
order2009 = np.argsort(pci[-1,:])
names1929 = names[order1929[::-1]]
names2009 = names[order2009[::-1]]
first_last = np.vstack((names1929,names2009))
from pylab import rcParams
rcParams['figure.figsize'] = 15,10
plt.plot(years,pci)
for i in range(48):
    plt.text(1915,54530-(i*1159), first_last[0][i],fontsize=12)
    plt.text(2010.5,54530-(i*1159), first_last[1][i],fontsize=12)
plt.xlim((years[0], years[-1]))
plt.ylim((0, 54530))
plt.ylabel(r"$y_{i,t}$",fontsize=14)
plt.xlabel('Years',fontsize=12)
plt.title('Absolute Dynamics',fontsize=18)
Out[11]:
Text(0.5, 1.0, 'Absolute Dynamics')
In [12]:
years = range(1929,2010)
rpci= (pci.T / pci.mean(axis=1)).T
names = np.array(f.by_col("Name"))
order1929 = np.argsort(rpci[0,:])
order2009 = np.argsort(rpci[-1,:])
names1929 = names[order1929[::-1]]
names2009 = names[order2009[::-1]]
first_last = np.vstack((names1929,names2009))
from pylab import rcParams
rcParams['figure.figsize'] = 15,10
plt.plot(years,rpci)
for i in range(48):
    plt.text(1915,1.91-(i*0.041), first_last[0][i],fontsize=12)
    plt.text(2010.5,1.91-(i*0.041), first_last[1][i],fontsize=12)
plt.xlim((years[0], years[-1]))
plt.ylim((0, 1.94))
plt.ylabel(r"$y_{i,t}/\bar{y}_t$",fontsize=14)
plt.xlabel('Years',fontsize=12)
plt.title('Relative Dynamics',fontsize=18)
Out[12]:
Text(0.5, 1.0, 'Relative Dynamics')
In [13]:
import mapclassify as mc
q5 = np.array([mc.Quantiles(y,k=5).yb for y in pci]).transpose()
print(q5[:, 0])
[0 2 0 4 2 4 4 1 0 1 4 2 2 1 0 1 2 3 4 4 2 0 2 2 2 4 3 4 0 4 0 0 3 1 3 3 4
 0 1 0 1 2 2 1 3 1 3 3]
In [14]:
print(f.by_col("Name"))
['Alabama', 'Arizona', 'Arkansas', 'California', 'Colorado', 'Connecticut', 'Delaware', 'Florida', 'Georgia', 'Idaho', 'Illinois', 'Indiana', 'Iowa', 'Kansas', 'Kentucky', 'Louisiana', 'Maine', 'Maryland', 'Massachusetts', 'Michigan', 'Minnesota', 'Mississippi', 'Missouri', 'Montana', 'Nebraska', 'Nevada', 'New Hampshire', 'New Jersey', 'New Mexico', 'New York', 'North Carolina', 'North Dakota', 'Ohio', 'Oklahoma', 'Oregon', 'Pennsylvania', 'Rhode Island', 'South Carolina', 'South Dakota', 'Tennessee', 'Texas', 'Utah', 'Vermont', 'Virginia', 'Washington', 'West Virginia', 'Wisconsin', 'Wyoming']

A number of things need to be noted here. First, we are relying on the classification methods in mapclassify for defining our quintiles. The class Quantiles uses quintiles ($k=5$) as the default and creates an instance with multiple attributes; the one we extract in the first line is $yb$, the class id for each observation. The second thing to note is the transpose operator, which puts the resulting array $q5$ in the structure required by Markov. Thus we see that the first spatial unit (Alabama, with an income of 323) fell in the first quintile in 1929, while the last unit (Wyoming, with an income of 675) fell in the fourth quintile.
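
A quick shape check makes the required orientation explicit: Markov expects rows to index the cross-sectional units and columns to index the time periods, which is why the transpose was needed:

print(q5.shape)  # (48, 81): 48 states (rows) by 81 annual observations (columns)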

So now we have a time series for each state of its quintile membership. For example, Colorado’s quintile time series is:

In [15]:
print(q5[4, :])
[2 3 2 2 3 2 2 3 2 2 2 2 2 2 2 2 3 2 3 2 3 2 3 3 3 2 2 3 3 3 3 3 3 3 3 3 3
 2 2 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4
 3 3 3 4 3 3 3]

indicating that it occupied the 3rd, 4th and 3rd quintiles of the distribution in the first 3 periods. To summarize the transition dynamics for all units, we instantiate a Markov object:

In [16]:
m5 = giddy.markov.Markov(q5)
The Markov Chain is irreducible and is composed by:
1 Recurrent class (indices):
[0 1 2 3 4]
0 Transient class.
The Markov Chain has 0 absorbing state.

The number of transitions between any two quintile classes could be counted:

In [17]:
print(m5.transitions)
[[729.  71.   1.   0.   0.]
 [ 72. 567.  80.   3.   0.]
 [  0.  81. 631.  86.   2.]
 [  0.   3.  86. 573.  56.]
 [  0.   0.   1.  57. 741.]]

By assuming the first-order Markov property, time homogeneity, spatial homogeneity and spatial independence, a transition probability matrix can be estimated which holds for all the 48 US states across 1929-2009:

In [18]:
print(m5.p)
[[0.91011236 0.0886392  0.00124844 0.         0.        ]
 [0.09972299 0.78531856 0.11080332 0.00415512 0.        ]
 [0.         0.10125    0.78875    0.1075     0.0025    ]
 [0.         0.00417827 0.11977716 0.79805014 0.07799443]
 [0.         0.         0.00125156 0.07133917 0.92740926]]

The fact that each of the 5 diagonal elements is larger than $0.78$ indicates the high stability of the US regional income dynamics system.

Another very important feature of the DMC model is the steady state distribution $\pi$ (also called the limiting distribution), defined by $\pi p = \pi$. The attribute $steady\_state$ gives $\pi$ as follows:

In [19]:
print(m5.steady_state)
[0.20774716 0.18725774 0.20740537 0.18821787 0.20937187]

If the distribution at $t$ is the steady state distribution shown above, then the distribution at any later period is that same distribution.
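
Both properties can be checked numerically (a sketch): $\pi$ is invariant under $p$, and powers of $p$ converge to a matrix whose rows all equal $\pi$:

pi = m5.steady_state
print(np.allclose(pi @ m5.p, pi))             # invariance: pi p = pi
print(np.linalg.matrix_power(m5.p, 1000)[0])  # any row is approximately pi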

With the transition probability matrix in hand, we can estimate the first mean passage time, which is the average number of steps required to go from one state/class to another state for the first time:

In [20]:
print(giddy.ergodic.fmpt(m5.p))
[[  4.81354357  11.50292712  29.60921231  53.38594954 103.59816743]
 [ 42.04774505   5.34023324  18.74455332  42.50023268  92.71316899]
 [ 69.25849753  27.21075248   4.82147603  25.27184624  75.43305672]
 [ 84.90689329  42.85914824  17.18082642   5.31299186  51.60953369]
 [ 98.41295543  56.36521038  30.66046735  14.21158356   4.77619083]]

Thus, for a state with income in the first quintile, it takes on average 11.5 years for it to first enter the second quintile, 29.6 to get to the third quintile, 53.4 years to enter the fourth, and 103.6 years to reach the richest quintile.
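
Relatedly, the expected sojourn time, the mean number of consecutive years a state spends in a quintile before leaving it, follows from the diagonal of the transition probability matrix as $1/(1-p_{ii})$; a short sketch:

# Expected sojourn time for each quintile: 1 / (1 - p_ii).
print(1.0 / (1.0 - m5.p.diagonal()))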

Regional context and Moran's Is

Thus far we have treated all the spatial units as independent in order to estimate the transition probabilities. This hides an implicit assumption: the movement of a spatial unit within the income distribution is independent of the movements of its neighbors and of the neighbors' positions in the distribution. But what if spatial context matters?

We could plot choropleth maps of the per capita incomes of US states to get a first impression of the spatial distribution.

In [21]:
import geopandas as gpd
import pandas as pd
In [22]:
geo_table = gpd.read_file(libpysal.examples.get_path('us48.shp'))
income_table = pd.read_csv(libpysal.examples.get_path("usjoin.csv"))
complete_table = geo_table.merge(income_table,left_on='STATE_NAME',right_on='Name')
complete_table.head()
Out[22]:
AREA PERIMETER STATE_ STATE_ID STATE_NAME STATE_FIPS_x SUB_REGION STATE_ABBR geometry Name ... 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009
0 20.750 34.956 1 1 Washington 53 Pacific WA (POLYGON ((-122.400749206543 48.22539520263672... Washington ... 31528 32053 32206 32934 34984 35738 38477 40782 41588 40619
1 45.132 34.527 2 2 Montana 30 Mtn MT POLYGON ((-111.4746322631836 44.70223999023438... Montana ... 22569 24342 24699 25963 27517 28987 30942 32625 33293 32699
2 9.571 18.899 3 3 Maine 23 N Eng ME (POLYGON ((-69.77778625488281 44.0740737915039... Maine ... 25623 27068 27731 28727 30201 30721 32340 33620 34906 35268
3 21.874 21.353 4 4 North Dakota 38 W N Cen ND POLYGON ((-98.73005676269531 45.93829727172852... North Dakota ... 25068 26118 26770 29109 29676 31644 32856 35882 39009 38672
4 22.598 22.746 5 5 South Dakota 46 W N Cen SD POLYGON ((-102.7879333496094 42.99532318115234... South Dakota ... 26115 27531 27727 30072 31765 32726 33320 35998 38188 36499

5 rows × 92 columns

In [23]:
index_year = range(1929,2010,15)
fig, axes = plt.subplots(nrows=2, ncols=3,figsize = (15,7))
for i in range(2):
    for j in range(3):
        ax = axes[i,j]
        complete_table.plot(ax=ax, column=str(index_year[i*3+j]), cmap='OrRd', scheme='quantiles', legend=True)
        ax.set_title('Per Capita Income %s Quintiles'%str(index_year[i*3+j]))
        ax.axis('off')
        leg = ax.get_legend()
        leg.set_bbox_to_anchor((0.8, 0.15, 0.16, 0.2))
plt.tight_layout()
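
As a first quantitative look at the spatial dependence suggested by these maps, one could compute Moran's I for a single cross-section. A minimal sketch, assuming the esda package and a queen contiguity weights matrix built with libpysal:

import esda
from libpysal.weights import Queen

# Queen contiguity weights from the merged table; Moran's I for 2009 incomes.
w = Queen.from_dataframe(complete_table)
mi = esda.moran.Moran(complete_table['2009'], w)
print(mi.I, mi.p_sim)  # a significantly positive I suggests spatial clustering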