Getting Started With pymcmcstat

Author(s): Paul Miles | Date Created: August 31, 2018

Note: the pymcmcstat tutorials have moved to a new location. To switch to the new index, please follow this link. Otherwise, selecting any of the tutorials listed below will take you to the appropriate new location.

Introduction

The pymcmcstat package is a Python package for running Markov Chain Monte Carlo (MCMC) simulations. It includes several Metropolis-based sampling techniques:

  • Metropolis-Hastings (MH): Primary sampling method.
  • Adaptive-Metropolis (AM): Adapts covariance matrix at specified intervals.
  • Delayed-Rejection (DR): Delays rejection by sampling from a narrower distribution. Capable of n-stage delayed rejection.
  • Delayed Rejection Adaptive Metropolis (DRAM): DR + AM
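
To make the basic idea behind these samplers concrete, here is a minimal sketch of a random-walk Metropolis-Hastings sampler for a single parameter. It uses only the Python standard library and is not the pymcmcstat implementation; the function name `metropolis_hastings`, its arguments, and the standard-normal target are assumptions chosen for illustration. AM and DR build on this same accept/reject step by adapting the proposal or by trying a second, narrower proposal after a rejection.

```python
import math
import random


def metropolis_hastings(log_post, theta0, step, nsimu, seed=0):
    """Random-walk Metropolis-Hastings for a scalar parameter (illustrative)."""
    rng = random.Random(seed)
    theta = theta0
    lp = log_post(theta)
    chain = [theta]
    accepted = 0
    for _ in range(nsimu - 1):
        # Propose a candidate from a symmetric Gaussian centered at theta.
        cand = theta + rng.gauss(0.0, step)
        lp_cand = log_post(cand)
        # Accept with probability min(1, posterior ratio).
        log_ratio = lp_cand - lp
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta, lp = cand, lp_cand
            accepted += 1
        chain.append(theta)
    return chain, accepted / (nsimu - 1)


# Hypothetical target: a standard normal log-density (up to a constant).
chain, rate = metropolis_hastings(lambda t: -0.5 * t * t, 0.0, 2.4, 20000, seed=1)
mean = sum(chain) / len(chain)
var = sum((c - mean) ** 2 for c in chain) / (len(chain) - 1)
```

The chain mean and variance should approach 0 and 1 as the chain grows; the acceptance rate depends on the proposal width `step`.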

The pymcmcstat package is a Python implementation of the MATLAB toolbox mcmcstat. The user interface is designed to be as similar to the MATLAB version as possible, but this implementation has taken advantage of certain data structure concepts more amenable to Python.

Please see the pymcmcstat homepage for more details about the development of this Python package.

Installation

The code can be found on the GitHub project page. The package is available on PyPI, and the latest version can be installed via:

pip install pymcmcstat

The master branch on GitHub typically matches the latest release on PyPI. To install the master branch directly from GitHub:

pip install git+https://github.com/prmiles/pymcmcstat.git

You can also clone the repository and run python setup.py install.

General Examples

pymcmcstat has many built-in features that allow it to be tailored to your particular problem. The examples below outline these features.


Monod

Key Features:

  • Basic MCMC settings
  • Data structure initialization
  • Constructing initial parameter covariance matrix using scipy.optimize.leastsq.
  • Chain/Pairwise-correlation panels.
  • Credible interval generation and plotting.
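
As a rough sketch of what the sum-of-squares evaluation for a Monod-type model might look like, the snippet below assumes the standard Monod form y = theta[0] * x / (theta[1] + x). The function names, the plain-list signature, and the data values are made up for illustration; in pymcmcstat the data are supplied through the MCMC data structure rather than as separate arguments.

```python
def monod(theta, x):
    """Standard Monod form (assumed): mu_max * x / (k + x)."""
    mu_max, k = theta
    return [mu_max * xi / (k + xi) for xi in x]


def sum_of_squares(theta, xdata, ydata):
    """Sum of squared residuals between observations and model response."""
    model = monod(theta, xdata)
    return sum((yi - mi) ** 2 for yi, mi in zip(ydata, model))


# Synthetic, noise-free data generated from hypothetical "true" parameters.
true_theta = (0.15, 50.0)
xdata = [10.0, 25.0, 50.0, 100.0, 200.0, 400.0]
ydata = monod(true_theta, xdata)

ss_true = sum_of_squares(true_theta, xdata, ydata)  # zero: data came from this model
ss_off = sum_of_squares((0.2, 50.0), xdata, ydata)  # positive: wrong mu_max
```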

Beetle

Key Features:

  • Sending objects within MCMC data structure.
  • Managing objects within sum-of-squares evaluation.
  • Chain/Pairwise-correlation panels.
  • Credible interval generation and plotting.

Banana

Key Features:

  • Sending class objects in MCMC data structure.
  • Defining parameter covariance matrix.
  • Pairwise correlation and generation of ellipse contours.

Algae

Key Features:

  • Using multiple data sets.
  • Solving a system of ODEs as the model response.
  • Chain/Density/Pairwise-correlation panels.
  • Generating prediction/credible intervals for multiple quantities of interest.

Viscoelasticity

Key Features:

  • Loading data from *.mat file.
  • Calling a C++ model using the ctypes package.
  • Specifying model parameters to be included in the sampling chain.
  • Plotting prediction/credible intervals with respect to time or deformation.

Landau Energy

Key Features:

  • Evaluating multidimensional functions (3-D polarization space).
  • Loading data from *.mat file.
  • Specifying model parameters to be included in the sampling chain.
  • Specifying number of observations.
  • Enhanced visualization using mcmcplot.
  • Plotting prediction/credible intervals.

Radiation Source Localization

Key Features:

  • Embedding user defined objects in the data structure.
  • Enhanced visualization using mcmcplot.
  • Specifying model parameters to be included in the sampling chain.

Running Parallel Chains

Key Features:

  • Running multiple chains simultaneously.
  • Using Gelman-Rubin chain diagnostics.
  • Enhanced visualization using mcmcplot.
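
For readers unfamiliar with the Gelman-Rubin diagnostic, the following sketch computes the potential scale reduction factor for several equal-length scalar chains using only the standard library. It is an illustration of the statistic itself, not the routine shipped with pymcmcstat; the function name `gelman_rubin` and the synthetic chains are assumptions.

```python
import math
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length scalar chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain sample variance.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    between_over_n = statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between_over_n
    return math.sqrt(pooled / within)


rng = random.Random(0)
# Three chains drawn from the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Three chains centered at different values (not converged).
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 5.0, 10.0)]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
```

Values close to 1 suggest the chains agree with one another; values well above 1 suggest they have not yet converged to the same distribution.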

Advanced Topics

These tutorials address very specific features of using the package.

Using Chain Log Files

Key Features:

  • Saving chain logs in binary and text formats.
  • Loading log files for post processing.
  • Assessing log history to ascertain status of simulation.

Setting the RNG Seed

Key Features:

  • Set seed for random number generator within pymcmcstat.
  • Produce repeatable simulation results.
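
The effect of fixing a seed can be illustrated with the standard library alone. This sketch does not use pymcmcstat's own seeding mechanism; `draw_chain` is a hypothetical stand-in for any routine that consumes random numbers.

```python
import random


def draw_chain(seed, n=5):
    """Draw n Gaussian samples from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


run_a = draw_chain(seed=42)
run_b = draw_chain(seed=42)  # same seed: identical draws
run_c = draw_chain(seed=7)   # different seed: different draws
```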

Calling Models Written in C++

Key Features:

  • Call arbitrarily complex models written in other languages (e.g., C++) using the ctypes package.
  • Generating credible/prediction intervals using C++ based model.
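
A common first step when calling compiled code through ctypes is converting Python values into C-compatible arrays. The sketch below shows that conversion; the shared-library path and the `model` function in the commented lines are hypothetical and only indicate how a call might be wired up. Note that a C++ function generally needs to be declared `extern "C"` so that ctypes can find it under an unmangled name.

```python
import ctypes


def to_c_array(values):
    """Copy a sequence of Python floats into a contiguous C double array."""
    array_type = ctypes.c_double * len(values)
    return array_type(*values)


theta = [0.5, 2.0, 10.0]
c_theta = to_c_array(theta)

# Hypothetical usage with a compiled model (names and path are assumptions):
# lib = ctypes.CDLL("./libmodel.so")
# lib.model.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_int]
# lib.model.restype = ctypes.c_double
# value = lib.model(c_theta, len(theta))
```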

Specifying Sample Variables

Key Features:

  • Specify which model parameters should be included in sampling chain.

Estimating Error Variance for Multiple Data Sets

Key Features:

  • Setting up multiple data sets in the MCMC data structure.
  • Defining a sum-of-squares function to accommodate multiple data sets.
  • Estimating a separate observation error variance for each data set.
  • Plotting prediction/credible intervals for each data set.
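
One way to picture a sum-of-squares function for multiple data sets is a function that returns one value per data set, as sketched below. The linear model, the data values, and the list-of-tuples layout are all assumptions for illustration; pymcmcstat stores data sets in its own data structure. The variance line is a simple moment estimate, SS / (n - p), shown only to make the "one variance per data set" idea concrete; it is not the update the package performs during sampling.

```python
def linear_model(theta, x):
    """Hypothetical model: slope * x + intercept."""
    slope, intercept = theta
    return [slope * xi + intercept for xi in x]


def sum_of_squares(theta, datasets):
    """Return one sum-of-squares value per data set."""
    ss = []
    for xdata, ydata in datasets:
        model = linear_model(theta, xdata)
        ss.append(sum((yi - mi) ** 2 for yi, mi in zip(ydata, model)))
    return ss


# Two made-up data sets with different numbers of observations.
datasets = [
    ([0.0, 1.0, 2.0, 3.0], [1.0, 3.5, 4.5, 7.0]),
    ([0.0, 2.0, 4.0], [1.0, 5.0, 9.0]),
]
theta = (2.0, 1.0)

ss = sum_of_squares(theta, datasets)
# Simple per-data-set variance estimate: SS / (n - p).
sigma2 = [s / (len(x) - len(theta)) for s, (x, _) in zip(ss, datasets)]
```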

Using Normal Prior Distributions

Key Features:

  • Enforcing normally distributed prior functions.
  • Defining non-linear parameter constraints via custom prior functions.
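
The sketch below shows one way a prior function could encode independent normal priors and a non-linear constraint. It assumes the prior is expressed in sum-of-squares form (minus twice the log density, up to a constant) with the signature `(theta, mu, sigma)`; both the signature and the product constraint are assumptions for illustration and should be checked against the tutorial itself.

```python
def normal_prior_ss(theta, mu, sigma):
    """Sum-of-squares form of independent Gaussian priors."""
    return sum(((t - m) / s) ** 2 for t, m, s in zip(theta, mu, sigma))


def constrained_prior_ss(theta, mu, sigma):
    """Gaussian prior plus a hypothetical constraint theta[0] * theta[1] <= 1."""
    if theta[0] * theta[1] > 1.0:
        # Infinite penalty corresponds to zero prior probability.
        return float("inf")
    return normal_prior_ss(theta, mu, sigma)


mu = [0.0, 2.0]
sigma = [0.5, 1.0]
inside = constrained_prior_ss([0.25, 2.0], mu, sigma)  # constraint satisfied
outside = constrained_prior_ss([2.0, 1.0], mu, sigma)  # constraint violated
```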

Advanced Interval Plotting

Key Features:

  • Change model, data, and interval display options when plotting credible and prediction intervals.
  • Highlights features available as of version 1.5.0.