Author(s): Paul Miles | Date Created: June 19, 2018
For the purpose of testing the MCMC simulation on a particular problem, it may be useful to check whether the results are repeatable. This can be accomplished by setting the seed for the random number generator used within pymcmcstat. This tutorial outlines how to accomplish this, and demonstrates the repeatability of the results.
First we import the required packages, define our model/sum-of-squares functions, and define a data set.
import numpy as np
from pymcmcstat.MCMC import MCMC
np.seterr(over = 'ignore')
# define test model function
def test_modelfun(xdata, theta):
m = theta[0]
b = theta[1]
nrow, ncol = xdata.shape
y = np.zeros([nrow, 1])
y[:,0] = m*xdata.reshape(nrow,) + b
return y
def test_ssfun(theta, data):
xdata = data.xdata[0]
ydata = data.ydata[0]
# eval model
ymodel = test_modelfun(xdata, theta)
# calc sos
ss = sum((ymodel[:, 0] - ydata[:, 0])**2)
return ss
# define data
nds = 100
x = np.linspace(2, 3, num=nds)
y = 2.*x + 3. + 0.1*np.random.standard_normal(x.shape)
By default, no seed is specifield. To specify a seed simply define a numeric value for the keywork argument rngseed
.
# Initialize MCMC object
mcstat = MCMC(rngseed=1234)
We set the
just like we would for any other simulation.
mcstat.data.add_data_set(x, y)
mcstat.simulation_options.define_simulation_options(
nsimu=int(5.0e3), updatesigma=True, method='dram')
# update model settings
mcstat.model_settings.define_model_settings(sos_function=test_ssfun)
mcstat.parameters.add_model_parameter(
name='m',
theta0=2.,
minimum=-10,
maximum=np.inf,
sample=True)
mcstat.parameters.add_model_parameter(
name='b',
theta0=-5.,
minimum=-10,
maximum=100,
sample=True)
# run mcmc
mcstat.run_simulation()
Sampling these parameters: name start [ min, max] N( mu, sigma^2) m: 2.00 [ -10.00, inf] N( 0.00e+00, inf) b: -5.00 [ -10.00, 100.00] N( 0.00e+00, inf) [-----------------100%-----------------] 5000 of 5000 complete in 1.5 sec
In addition, we check the last row of the chain to see where the simulation ended.
# Extract results
results = mcstat.simulation_results.results
chain = results['chain']
s2chain = results['s2chain']
sschain = results['sschain']
names = results['names']
# define burnin
burnin = 2000
# display chain statistics
mcstat.chainstats(chain[burnin:, :], results)
print('chain[-1, :] = {}'.format(chain[-1, :]))
--------------------- name : mean std MC_err tau geweke m : 2.0130 0.0320 0.0038 48.7245 0.9959 b : 2.9673 0.0800 0.0096 49.0697 0.9921 --------------------- chain[-1, :] = [2.01337637 2.95431498]
To check the repeatability we simply create a new MCMC object with the same random seed and compare the result.
# Initialize MCMC object
mcstat = MCMC(rngseed=1234)
mcstat.data.add_data_set(x, y)
mcstat.simulation_options.define_simulation_options(
nsimu=int(5.0e3), updatesigma=True, method='dram')
# update model settings
mcstat.model_settings.define_model_settings(sos_function=test_ssfun)
mcstat.parameters.add_model_parameter(
name='m',
theta0=2.,
minimum=-10,
maximum=np.inf,
sample=True)
mcstat.parameters.add_model_parameter(
name='b',
theta0=-5.,
minimum=-10,
maximum=100,
sample=True)
# run mcmc
mcstat.run_simulation()
# Extract results
results = mcstat.simulation_results.results
chain = results['chain']
s2chain = results['s2chain']
sschain = results['sschain']
names = results['names']
# define burnin
burnin = 2000
# display chain statistics
mcstat.chainstats(chain[burnin:, :], results)
print('chain[-1, :] = {}'.format(chain[-1, :]))
Sampling these parameters: name start [ min, max] N( mu, sigma^2) m: 2.00 [ -10.00, inf] N( 0.00e+00, inf) b: -5.00 [ -10.00, 100.00] N( 0.00e+00, inf) [-----------------100%-----------------] 5000 of 5000 complete in 1.4 sec --------------------- name : mean std MC_err tau geweke m : 2.0130 0.0320 0.0038 48.7245 0.9959 b : 2.9673 0.0800 0.0096 49.0697 0.9921 --------------------- chain[-1, :] = [2.01337637 2.95431498]
It is clearly seen that the results are identical to the first simulation, so the random number process is repeatable.