or the many ways to perform GLMM in python
A comparison among:
StatsModels
Theano
PyMC3(Base on Theano)
TensorFlow
Stan and pyStan
Keras with Tensorflow backend
import numpy as np
#import pandas as pd
import seaborn as sns
from theano import tensor as tt
%pylab inline
%config InlineBackend.figure_format = 'retina'
%qtconsole --colors=linux
%load_ext watermark
Populating the interactive namespace from numpy and matplotlib
%watermark -v -m -p numpy,pandas,seaborn,statsmodels,theano,pymc3,tensorflow,keras,matplotlib
/usr/local/lib/python3.5/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters
CPython 3.5.2 IPython 6.2.1 numpy 1.14.0 pandas 0.21.0 seaborn 0.8.1 statsmodels 0.8.0.dev0+abb5996 theano 1.0.0 pymc3 3.2 tensorflow 1.4.1 keras 2.1.2 matplotlib 2.1.0+224.g95805d2 compiler : GCC 5.4.0 20160609 system : Linux release : 4.4.0-104-generic machine : x86_64 processor : x86_64 CPU cores : 8 interpreter: 64bit
Using TensorFlow backend.
# load data
import pandas as pd
Tbl_beh = pd.read_csv('./behavioral_data.txt', delimiter='\t')
Tbl_beh["subj"] = Tbl_beh["subj"].astype('category')
#%% visualized data
# Draw a nested violinplot and split the violins for easier comparison
sns.set(style="ticks", palette="muted", color_codes=True)
plt.figure()
Tbl_beh['cbcond'] = pd.Series(Tbl_beh['chimera'] + '-' + Tbl_beh['identity'], index=Tbl_beh.index)
sns.violinplot(y="cbcond", x="rt", hue="group", data=Tbl_beh, split=True,
inner="quart", palette={"cp": "b", "cg": "y"})
sns.despine(left=True)
Tbl_beh
is the raw data table.
Tbl_beh.head(5)
subj | trial | chimera | identity | orientation | rt | acc | group | cbcond | |
---|---|---|---|---|---|---|---|---|---|
0 | 1 | 1 | L_R | other | upright | 754.6 | 0 | cp | L_R-other |
1 | 1 | 2 | L_L | other | upright | 1036.1 | 1 | cp | L_L-other |
2 | 1 | 3 | R_R | self | upright | 433.0 | 1 | cp | R_R-self |
3 | 1 | 4 | L_R | self | upright | 463.4 | 1 | cp | L_R-self |
4 | 1 | 5 | L_L | self | upright | 483.3 | 1 | cp | L_L-self |
#%% Compute the conditional mean of dataset
condi_sel = ['subj', 'chimera', 'identity', 'orientation']
tblmean = pd.DataFrame({'Nt' : Tbl_beh.groupby( condi_sel ).size()}).reset_index()
tblmean["subj"] = tblmean["subj"].astype('category')
tblmean['rt'] = np.asarray(Tbl_beh.groupby(condi_sel)['rt'].mean())
tblmean['ACC'] = np.asarray(Tbl_beh.groupby(condi_sel)['acc'].mean())
tblmean['group']= np.asarray(Tbl_beh.groupby(condi_sel)['group'].all())
tblmean
is the summary data table.
tblmean['cbcond'] = pd.Series(tblmean['chimera'] + '-' + tblmean['identity'],
index=tblmean.index)
## boxplot + scatter plot to show accuracy
#ax = sns.boxplot(y="cbcond", x="acc", data=tbltest,
# whis=np.inf, color="c")
## Add in points to show each observation
#sns.stripplot(y="cbcond", x="acc", data=tbltest,
# jitter=True, size=3, color=".3", linewidth=0)
plt.figure()
g1 = sns.violinplot(y="cbcond", x="rt", hue="group", data=tblmean, split=True,
inner="quart", palette={"cp": "b", "cg": "y"})
g1.set(xlim=(0, 3000))
# Make the quantitative axis logarithmic
sns.despine(trim=True)
tblmean.head(5)
subj | chimera | identity | orientation | Nt | rt | ACC | group | cbcond | |
---|---|---|---|---|---|---|---|---|---|
0 | 1 | L_L | other | inverted | 10 | 713.37 | 0.8 | cp | L_L-other |
1 | 1 | L_L | other | upright | 10 | 749.55 | 0.8 | cp | L_L-other |
2 | 1 | L_L | self | inverted | 10 | 538.64 | 0.6 | cp | L_L-self |
3 | 1 | L_L | self | upright | 10 | 629.01 | 0.9 | cp | L_L-self |
4 | 1 | L_R | other | inverted | 10 | 784.70 | 1.0 | cp | L_R-other |
Here I used the mean table as the data - it is smaller and faster to run
tbltest = tblmean
tbltest.shape
(320, 9)
import statsmodels.formula.api as smf
from patsy import dmatrices
formula = "rt ~ group*orientation*identity"
#formula = "rt ~ -1 + cbcond"
md = smf.mixedlm(formula, tbltest, groups=tbltest["subj"])
mdf = md.fit()
print(mdf.summary())
fe_params = pd.DataFrame(mdf.fe_params,columns=['LMM'])
random_effects = pd.DataFrame(mdf.random_effects)
random_effects = random_effects.transpose()
random_effects = random_effects.rename(index=str, columns={'groups': 'LMM'})
#%% Generate Design Matrix for later use
Y, X = dmatrices(formula, data=tbltest, return_type='matrix')
Terms = X.design_info.column_names
_, Z = dmatrices('rt ~ -1+subj', data=tbltest, return_type='matrix')
X = np.asarray(X) # fixed effect
Z = np.asarray(Z) # mixed effect
Y = np.asarray(Y).flatten()
nfixed = np.shape(X)
nrandm = np.shape(Z)
Mixed Linear Model Regression Results =================================================================================================== Model: MixedLM Dependent Variable: rt No. Observations: 320 Method: REML No. Groups: 20 Scale: 4502.8886 Min. group size: 16 Likelihood: -1796.2120 Max. group size: 16 Converged: Yes Mean group size: 16.0 --------------------------------------------------------------------------------------------------- Coef. Std.Err. z P>|z| [0.025 0.975] --------------------------------------------------------------------------------------------------- Intercept 689.538 21.918 31.459 0.000 646.579 732.497 group[T.cp] 151.113 37.049 4.079 0.000 78.498 223.727 orientation[T.upright] -64.602 13.160 -4.909 0.000 -90.395 -38.808 identity[T.self] -17.581 13.160 -1.336 0.182 -43.374 8.212 group[T.cp]:orientation[T.upright] -87.712 22.245 -3.943 0.000 -131.311 -44.113 group[T.cp]:identity[T.self] -67.411 22.245 -3.030 0.002 -111.010 -23.812 orientation[T.upright]:identity[T.self] -20.020 18.611 -1.076 0.282 -56.498 16.457 group[T.cp]:orientation[T.upright]:identity[T.self] 36.768 31.459 1.169 0.242 -24.890 98.426 groups RE 5119.673 27.637 ===================================================================================================
A helper function for ploting
#%% ploting function
def plotfitted(fe_params=fe_params,random_effects=random_effects,X=X,Z=Z,Y=Y):
plt.figure(figsize=(18,9))
ax1 = plt.subplot2grid((2,2), (0, 0))
ax2 = plt.subplot2grid((2,2), (0, 1))
ax3 = plt.subplot2grid((2,2), (1, 0), colspan=2)
fe_params.plot(ax=ax1)
random_effects.plot(ax=ax2)
ax3.plot(Y.flatten(),'o',color='k',label = 'Observed', alpha=.25)
for iname in fe_params.columns.get_values():
fitted = np.dot(X,fe_params[iname])+np.dot(Z,random_effects[iname]).flatten()
print("The MSE of "+iname+ " is " + str(np.mean(np.square(Y.flatten()-fitted))))
ax3.plot(fitted,lw=1,label = iname, alpha=.5)
ax3.legend(loc=0)
#plt.ylim([0,5])
plt.show()
plotfitted(fe_params=fe_params,random_effects=random_effects,X=X,Z=Z,Y=Y)
The MSE of LMM is 4150.212099971669
beta0 = np.linalg.lstsq(X,Y)
fixedpred = np.argmax(X,axis=1)
randmpred = np.argmax(Z,axis=1)
con = tt.constant(fixedpred)
sbj = tt.constant(randmpred)
import pymc3 as pm
with pm.Model() as glmm1:
# Fixed effect
beta = pm.Normal('beta', mu=0., sd=100., shape=(nfixed[1]))
# random effect
s = pm.HalfCauchy('s', 10.)
b = pm.Normal('b', mu=0., sd=s, shape=(nrandm[1]))
eps = pm.HalfCauchy('eps', 5.)
#mu_est = pm.Deterministic('mu_est',beta[con] + b[sbj])
mu_est = pm.Deterministic('mu_est', tt.dot(X,beta)+tt.dot(Z,b))
RT = pm.Normal('RT', mu_est, eps, observed=Y)
trace = pm.sample(1000, tune=1000)
pm.traceplot(trace,varnames=['beta','b','s']) #
plt.show()
fixed_pymc = pm.summary(trace, varnames=['beta'])
randm_pymc = pm.summary(trace, varnames=['b'])
fe_params['PyMC'] = pd.Series(np.asarray(fixed_pymc['mean']), index=fe_params.index)
random_effects['PyMC'] = pd.Series(np.asarray(randm_pymc['mean']),index=random_effects.index)
print(fe_params)
/usr/local/lib/python3.5/dist-packages/ipykernel_launcher.py:1: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions. To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`. """Entry point for launching an IPython kernel. Auto-assigning NUTS sampler... Initializing NUTS using jitter+adapt_diag... /home/laoj/Documents/Github/pymc3/pymc3/model.py:384: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. if not np.issubdtype(var.dtype, float): Multiprocess sampling (4 chains in 4 jobs) NUTS: [eps_log__, b, s_log__, beta] 30%|███ | 606/2000 [00:06<00:14, 99.12it/s]INFO (theano.gof.compilelock): Waiting for existing lock by process '31629' (I am process '31630') INFO (theano.gof.compilelock): To manually release the lock, delete /home/laoj/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-3.5.2-64/lock_dir 100%|██████████| 2000/2000 [00:10<00:00, 194.41it/s]
LMM PyMC Intercept 689.537735 663.211739 group[T.cp] 151.112622 158.703037 orientation[T.upright] -64.601774 -61.863246 identity[T.self] -17.581122 -14.990877 group[T.cp]:orientation[T.upright] -87.712115 -83.437449 group[T.cp]:identity[T.self] -67.411021 -63.125393 orientation[T.upright]:identity[T.self] -20.020417 -21.210995 group[T.cp]:orientation[T.upright]:identity[T.s... 36.768234 28.727198
This is a non-probabilistic model, but none-the-less uncertainty related to parameter estimation could be obtained using dropout (http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html).
Y, X = dmatrices(formula, data=tbltest, return_type='matrix')
Terms = X.design_info.column_names
_, Z = dmatrices('rt ~ -1+subj', data=tbltest, return_type='matrix')
X = np.asarray(X) # fixed effect
Z = np.asarray(Z) # mixed effect
Y = np.asarray(Y).flatten()
nfixed = np.shape(X)
nrandm = np.shape(Z)
X = X.astype(np.float32)
Z = Z.astype(np.float32)
Y = Y.astype(np.float32)
import theano
theano.config.compute_test_value = 'ignore'
def floatX(X):
return np.asarray(X, dtype=theano.config.floatX)
def init_weights(shape):
return theano.shared(floatX(np.random.randn(*shape) * 0.01))
def model(X, beta, Z, b):
return tt.sum(tt.dot(X, beta) + tt.dot(Z, b),axis=1)
def sgd(cost, params, lr=0.001):
grads = tt.grad(cost=cost, wrt=params)
updates = []
for p, g in zip(params, grads):
updates.append([p, p - g * lr])
return updates
Xtf = tt.fmatrix()
Ztf = tt.fmatrix()
y = tt.vector()
tbeta = init_weights([nfixed[1], 1])
tb = init_weights([nrandm[1], 1])
eps = init_weights([0])
y_ = model(Xtf, tbeta, Ztf,tb)
cost = tt.mean(tt.sqr(y - y_))
params = [tbeta, tb]
updates = sgd(cost, params)
train = theano.function(inputs=[Xtf, Ztf, y], outputs=cost, updates=updates, allow_input_downcast=True)
for i in range(100000):
sel = np.random.randint(0,nfixed[0],size=int(nfixed[0]/2))
batch_xs, batch_zs, batch_ys = X[sel,:],Z[sel,:],Y[sel]
train(batch_xs, batch_zs, batch_ys)
outdf = pd.DataFrame(data=tbeta.get_value(),columns=['beta'],index=Terms)
fe_params['theano'] = pd.Series(tbeta.get_value().flatten(), index=fe_params.index)
random_effects['theano'] = pd.Series(tb.get_value().flatten() ,index=random_effects.index)
print(fe_params)
LMM PyMC \ Intercept 689.537735 663.211739 group[T.cp] 151.112622 158.703037 orientation[T.upright] -64.601774 -61.863246 identity[T.self] -17.581122 -14.990877 group[T.cp]:orientation[T.upright] -87.712115 -83.437449 group[T.cp]:identity[T.self] -67.411021 -63.125393 orientation[T.upright]:identity[T.self] -20.020417 -21.210995 group[T.cp]:orientation[T.upright]:identity[T.s... 36.768234 28.727198 theano Intercept 653.040426 group[T.cp] 162.735105 orientation[T.upright] -66.495571 identity[T.self] -19.708378 group[T.cp]:orientation[T.upright] -82.978358 group[T.cp]:identity[T.self] -62.619234 orientation[T.upright]:identity[T.self] -16.524722 group[T.cp]:orientation[T.upright]:identity[T.s... 28.987585
Y, X = dmatrices(formula, data=tbltest, return_type='matrix')
Terms = X.design_info.column_names
_, Z = dmatrices('rt ~ -1+subj', data=tbltest, return_type='matrix')
X = np.asarray(X) # fixed effect
Z = np.asarray(Z) # mixed effect
Y = np.asarray(Y)
nfixed = np.shape(X)
nrandm = np.shape(Z)
X = X.astype(np.float32)
Z = Z.astype(np.float32)
Y = Y.astype(np.float32)
import tensorflow as tf
def init_weights(shape):
return tf.Variable(tf.random_normal(shape, stddev=0.01))
def model(X, beta, Z, b):
y_pred = tf.matmul(X, beta) + tf.matmul(Z, b)
#randcoef = tf.matmul(Z, b)
#Xnew = tf.transpose(X) * tf.transpose(randcoef)
#y_pred = tf.matmul(tf.transpose(Xnew), beta)
return y_pred # notice we use the same model as linear regression,
# this is because there is a baked in cost function which performs softmax and cross entropy
Xtf = tf.placeholder("float32", [None, nfixed[1]]) # create symbolic variables
Ztf = tf.placeholder("float32", [None, nrandm[1]])
y = tf.placeholder("float32", [None, 1])
beta = init_weights([nfixed[1], 1])
b = init_weights([nrandm[1], 1])
eps = init_weights([0])
y_ = model(Xtf, beta, Ztf, b)
# y_ = tf.nn.softmax(model(Xtf, beta) + model(Ztf, b) + eps)
# cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(y_), reduction_indices=[1]))
cost = tf.square(y - y_) # use square error for cost function
train_step = tf.train.GradientDescentOptimizer(0.0001).minimize(cost)
init = tf.global_variables_initializer()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
with tf.Session() as sess:
sess.run(init)
for i in range(2000):
sel = np.random.randint(0,nfixed[0],size=int(nfixed[0]/2))
batch_xs, batch_zs, batch_ys = X[sel,:],Z[sel,:],Y[sel]
sess.run(train_step, feed_dict={Xtf: batch_xs, Ztf: batch_zs, y: batch_ys})
accuracy = tf.reduce_mean(tf.cast(cost, tf.float32))
if i % 1000 == 0:
print(i,sess.run(accuracy, feed_dict={Xtf: X, Ztf: Z, y: Y}))
betaend = sess.run(beta)
bend = sess.run(b)
fe_params['tf'] = pd.Series(betaend.flatten(), index=fe_params.index)
random_effects['tf'] = pd.Series(bend.flatten(), index=random_effects.index)
print(fe_params)
0 415940.22 1000 4516.334 LMM PyMC \ Intercept 689.537735 663.211739 group[T.cp] 151.112622 158.703037 orientation[T.upright] -64.601774 -61.863246 identity[T.self] -17.581122 -14.990877 group[T.cp]:orientation[T.upright] -87.712115 -83.437449 group[T.cp]:identity[T.self] -67.411021 -63.125393 orientation[T.upright]:identity[T.self] -20.020417 -21.210995 group[T.cp]:orientation[T.upright]:identity[T.s... 36.768234 28.727198 theano tf Intercept 653.040426 650.199646 group[T.cp] 162.735105 157.798386 orientation[T.upright] -66.495571 -62.885197 identity[T.self] -19.708378 -15.955853 group[T.cp]:orientation[T.upright] -82.978358 -71.105774 group[T.cp]:identity[T.self] -62.619234 -52.372208 orientation[T.upright]:identity[T.self] -16.524722 -21.645893 group[T.cp]:orientation[T.upright]:identity[T.s... 28.987585 9.399190
model similar to brms
Y, X = dmatrices(formula, data=tbltest, return_type='matrix')
Terms = X.design_info.column_names
_, Z = dmatrices('rt ~ -1+subj', data=tbltest, return_type='matrix')
X = np.asarray(X) # fixed effect
Z = np.asarray(Z) # mixed effect
Y = np.asarray(Y).flatten()
nfixed = np.shape(X)
nrandm = np.shape(Z)
beta0 = np.linalg.lstsq(X,Y)
fixedpred = np.argmax(X,axis=1)
randmpred = np.argmax(Z,axis=1)
con = tt.constant(fixedpred)
sbj = tt.constant(randmpred)
X1 = X[0:, 1:]
X_mean = np.mean(X1, axis=0) # column means of X before centering
X_cent = X1 - X_mean
with pm.Model() as glmm2:
# temporary Intercept
temp_Intercept = pm.StudentT('temp_Intercept', nu=3, mu=0, sd=186)#
# Fixed effect
beta = pm.Normal('beta', mu = 0, sd = 100, shape=(nfixed[1]-1))
# random effect
# group-specific standard deviation
s = pm.HalfStudentT('sd_1', nu=3, sd=186)
b = pm.Normal('b', mu = 0, sd = 1, shape=(nrandm[1]))
r_1 = pm.Deterministic('r_1', s*b)
sigma = pm.HalfStudentT('sigma', nu=3, sd=186)
# compute linear predictor
mu_est = tt.dot(X_cent,beta) + temp_Intercept + r_1[sbj]
RT = pm.Normal('RT', mu_est, sigma, observed = Y)
b_Intercept = pm.Deterministic("b_Intercept", temp_Intercept - tt.sum(X_mean * beta))
trace2 = pm.sample(1000, tune=1000)
pm.traceplot(trace2, varnames=['beta', 'b_Intercept', 'r_1']) #
plt.show()
/usr/local/lib/python3.5/dist-packages/ipykernel_launcher.py:9: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions. To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`. if __name__ == '__main__': Auto-assigning NUTS sampler... INFO:pymc3:Auto-assigning NUTS sampler... Initializing NUTS using jitter+adapt_diag... INFO:pymc3:Initializing NUTS using jitter+adapt_diag... /home/laoj/Documents/Github/pymc3/pymc3/model.py:384: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. if not np.issubdtype(var.dtype, float): Multiprocess sampling (4 chains in 4 jobs) INFO:pymc3:Multiprocess sampling (4 chains in 4 jobs) NUTS: [sigma_log__, b, sd_1_log__, beta, temp_Intercept] INFO:pymc3:NUTS: [sigma_log__, b, sd_1_log__, beta, temp_Intercept] 100%|██████████| 2000/2000 [00:15<00:00, 128.54it/s]
df_summary = pm.summary(trace2, varnames=['beta'])
df_stmp = pm.summary(trace2, varnames=['b_Intercept']) #['temp_Intercept']
df_new = df_stmp.append(df_summary, ignore_index=True)
df_new.index=Terms
randm_pymc = pm.summary(trace2,varnames=['r_1'])
fe_params['PyMC2'] = pd.Series(np.asarray(df_new['mean']), index=fe_params.index)
random_effects['PyMC2'] = pd.Series(np.asarray(randm_pymc['mean']),index=random_effects.index)
print(fe_params)
LMM PyMC \ Intercept 689.537735 663.211739 group[T.cp] 151.112622 158.703037 orientation[T.upright] -64.601774 -61.863246 identity[T.self] -17.581122 -14.990877 group[T.cp]:orientation[T.upright] -87.712115 -83.437449 group[T.cp]:identity[T.self] -67.411021 -63.125393 orientation[T.upright]:identity[T.self] -20.020417 -21.210995 group[T.cp]:orientation[T.upright]:identity[T.s... 36.768234 28.727198 theano tf \ Intercept 653.040426 650.199646 group[T.cp] 162.735105 157.798386 orientation[T.upright] -66.495571 -62.885197 identity[T.self] -19.708378 -15.955853 group[T.cp]:orientation[T.upright] -82.978358 -71.105774 group[T.cp]:identity[T.self] -62.619234 -52.372208 orientation[T.upright]:identity[T.self] -16.524722 -21.645893 group[T.cp]:orientation[T.upright]:identity[T.s... 28.987585 9.399190 PyMC2 Intercept 694.862672 group[T.cp] 128.068855 orientation[T.upright] -67.127274 identity[T.self] -20.246368 group[T.cp]:orientation[T.upright] -78.328132 group[T.cp]:identity[T.self] -58.332467 orientation[T.upright]:identity[T.self] -16.419332 group[T.cp]:orientation[T.upright]:identity[T.s... 24.641072
library(lme4)
library(brms)
Tbl_beh <- read.csv("behavioral_data.txt",sep = "\t")
Tbl_beh$subj <- factor(Tbl_beh$subj)
Tbl_beh$trial <- factor(Tbl_beh$trial)
stanmodel <- make_stancode(rt ~ group*orientation*identity + (1|subj),
data = Tbl_beh,family = "normal")
standata <- make_standata(rt ~ group*orientation*identity + (1|subj),
data = Tbl_beh,family = "normal")
bayesfit <- brm(rt ~ group*orientation*identity + (1|subj),
data = Tbl_beh,family = "normal")
bayesfit$model
summary(bayesfit, waic = TRUE)
lmefit <- lmer(rt ~ group*orientation*identity + (1|subj),
data = Tbl_beh)
summary(lmefit)
stan_datadict = {}
stan_datadict['N'] = Y.shape[0]
stan_datadict['Y'] = Y
stan_datadict['K'] = X1.shape[1]
stan_datadict['X'] = X_cent
stan_datadict['X_means'] = X_mean
stan_datadict['J_1'] = randmpred + 1
stan_datadict['N_1'] = max(randmpred + 1)
stan_datadict['K_1'] = 1
stan_datadict['Z_1'] = np.ones(randmpred.shape)
stan_datadict['prior_only'] = 0
#stan_datadict.items()
stan_mdlspec_lmm = """
// This Stan code was generated with the R package 'brms'.
// We recommend generating the data with the 'make_standata' function.
functions {
}
data {
int<lower=1> N; // total number of observations
vector[N] Y; // response variable
int<lower=1> K; // number of population-level effects
matrix[N, K] X; // centered population-level design matrix
vector[K] X_means; // column means of X before centering
// data for group-specific effects of subj
int<lower=1> J_1[N];
int<lower=1> N_1;
int<lower=1> K_1;
vector[N] Z_1;
int prior_only; // should the likelihood be ignored?
}
transformed data {
}
parameters {
vector[K] b; // population-level effects
real temp_Intercept; // temporary Intercept
real<lower=0> sd_1; // group-specific standard deviation
vector[N_1] z_1; // unscaled group-specific effects
real<lower=0> sigma; // residual SD
}
transformed parameters {
// group-specific effects
vector[N_1] r_1;
vector[N] eta; // linear predictor
r_1 <- sd_1 * (z_1);
// compute linear predictor
eta <- X * b + temp_Intercept;
for (n in 1:N) {
eta[n] <- eta[n] + r_1[J_1[n]] * Z_1[n];
}
}
model {
// prior specifications
sd_1 ~ student_t(3, 0, 186);
z_1 ~ normal(0, 1);
sigma ~ student_t(3, 0, 186);
// likelihood contribution
if (!prior_only) {
Y ~ normal(eta, sigma);
}
}
generated quantities {
real b_Intercept; // population-level intercept
b_Intercept <- temp_Intercept - dot_product(X_means, b);
}
"""
import pystan
stan_fit_lmm = pystan.stan(model_code=stan_mdlspec_lmm, data=stan_datadict,
iter=3000, warmup=1000, chains=4, n_jobs=2,verbose=False)
stan_fit_lmm.plot(pars=['b_Intercept','b','z_1'])
plt.show()
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ In file included from /usr/local/lib/python3.5/dist-packages/numpy/core/include/numpy/ndarraytypes.h:1816:0, from /usr/local/lib/python3.5/dist-packages/numpy/core/include/numpy/ndarrayobject.h:18, from /usr/local/lib/python3.5/dist-packages/numpy/core/include/numpy/arrayobject.h:4, from /tmp/tmp05tpa9__/stanfit4anon_model_9f1c55ea88fca682f1405da3e975196f_4659688139688500613.cpp:599: /usr/local/lib/python3.5/dist-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp] #warning "Using deprecated NumPy API, disable it by " \ ^ In file included from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/multi_array/base.hpp:28:0, from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/multi_array.hpp:21, from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/numeric/odeint/util/multi_array_adaption.hpp:29, from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/numeric/odeint.hpp:61, from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/stan/math/prim/arr/functor/integrate_ode_rk45.hpp:17, from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/stan/math/prim/arr.hpp:38, from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/stan/math/prim/mat.hpp:298, from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/stan/math/rev/mat.hpp:12, from /usr/local/lib/python3.5/dist-packages/pystan/stan/src/stan/model/log_prob_grad.hpp:4, from /usr/local/lib/python3.5/dist-packages/pystan/stan/src/stan/model/test_gradients.hpp:7, from /usr/local/lib/python3.5/dist-packages/pystan/stan/src/stan/services/diagnose/diagnose.hpp:10, from /usr/local/lib/python3.5/dist-packages/pystan/stan_fit.hpp:22, from /tmp/tmp05tpa9__/stanfit4anon_model_9f1c55ea88fca682f1405da3e975196f_4659688139688500613.cpp:603: /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/multi_array/concept_checks.hpp: In static member function ‘static void boost::multi_array_concepts::detail::idgen_helper<N>::call(Array&, const IdxGen&, Call_Type)’: /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/multi_array/concept_checks.hpp:42:43: warning: typedef ‘index_range’ locally defined but not used [-Wunused-local-typedefs] typedef typename Array::index_range index_range; ^ /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/multi_array/concept_checks.hpp:43:37: warning: typedef ‘index’ locally defined but not used [-Wunused-local-typedefs] typedef typename Array::index index; ^ /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/multi_array/concept_checks.hpp: In static member function ‘static void boost::multi_array_concepts::detail::idgen_helper<0ul>::call(Array&, const IdxGen&, Call_Type)’: /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/multi_array/concept_checks.hpp:53:43: warning: typedef ‘index_range’ locally defined but not used [-Wunused-local-typedefs] typedef typename Array::index_range index_range; ^ /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/lib/boost_1.64.0/boost/multi_array/concept_checks.hpp:54:37: warning: typedef ‘index’ locally defined but not used [-Wunused-local-typedefs] typedef typename Array::index index; ^ /tmp/tmp05tpa9__/stanfit4anon_model_9f1c55ea88fca682f1405da3e975196f_4659688139688500613.cpp: In function ‘PyObject* __pyx_pf_71stanfit4anon_model_9f1c55ea88fca682f1405da3e975196f_4659688139688500613_2_call_sampler(PyObject*, PyObject*, PyObject*, PyObject*)’: /tmp/tmp05tpa9__/stanfit4anon_model_9f1c55ea88fca682f1405da3e975196f_4659688139688500613.cpp:9152:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] __pyx_t_12 = ((__pyx_t_9 != __pyx_v_fitptr->param_names_oi().size()) != 0); ^ In file included from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/stan/math/rev/core/operator_unary_plus.hpp:7:0, from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/stan/math/rev/core.hpp:36, from /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/stan/math/rev/mat.hpp:4, from /usr/local/lib/python3.5/dist-packages/pystan/stan/src/stan/model/log_prob_grad.hpp:4, from /usr/local/lib/python3.5/dist-packages/pystan/stan/src/stan/model/test_gradients.hpp:7, from /usr/local/lib/python3.5/dist-packages/pystan/stan/src/stan/services/diagnose/diagnose.hpp:10, from /usr/local/lib/python3.5/dist-packages/pystan/stan_fit.hpp:22, from /tmp/tmp05tpa9__/stanfit4anon_model_9f1c55ea88fca682f1405da3e975196f_4659688139688500613.cpp:603: /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/stan/math/prim/scal/fun/constants.hpp: At global scope: /usr/local/lib/python3.5/dist-packages/pystan/stan/lib/stan_math/stan/math/prim/scal/fun/constants.hpp:65:18: warning: ‘stan::math::NEGATIVE_EPSILON’ defined but not used [-Wunused-variable] const double NEGATIVE_EPSILON ^
/usr/local/lib/python3.5/dist-packages/pystan/misc.py:399: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. elif np.issubdtype(np.asarray(v).dtype, float):
Itercept = stan_fit_lmm.extract(permuted=True)['b_Intercept']
betatemp = stan_fit_lmm.extract(permuted=True)['b']
btemp = stan_fit_lmm.extract(permuted=True)['z_1']
sdtemp = stan_fit_lmm.extract(permuted=True)['sd_1']
betastan = np.hstack([Itercept.mean(axis=0), betatemp.mean(axis=0)])
bstan = btemp.mean(axis=0)*sdtemp.mean(axis=0)
fe_params['Stan'] = pd.Series(betastan, index=fe_params.index)
random_effects['Stan'] = pd.Series(bstan,index=random_effects.index)
print(fe_params)
LMM PyMC \ Intercept 689.537735 663.211739 group[T.cp] 151.112622 158.703037 orientation[T.upright] -64.601774 -61.863246 identity[T.self] -17.581122 -14.990877 group[T.cp]:orientation[T.upright] -87.712115 -83.437449 group[T.cp]:identity[T.self] -67.411021 -63.125393 orientation[T.upright]:identity[T.self] -20.020417 -21.210995 group[T.cp]:orientation[T.upright]:identity[T.s... 36.768234 28.727198 theano tf \ Intercept 653.040426 650.199646 group[T.cp] 162.735105 157.798386 orientation[T.upright] -66.495571 -62.885197 identity[T.self] -19.708378 -15.955853 group[T.cp]:orientation[T.upright] -82.978358 -71.105774 group[T.cp]:identity[T.self] -62.619234 -52.372208 orientation[T.upright]:identity[T.self] -16.524722 -21.645893 group[T.cp]:orientation[T.upright]:identity[T.s... 28.987585 9.399190 PyMC2 Stan Intercept 694.862672 688.946428 group[T.cp] 128.068855 151.710532 orientation[T.upright] -67.127274 -64.415917 identity[T.self] -20.246368 -17.650878 group[T.cp]:orientation[T.upright] -78.328132 -87.906190 group[T.cp]:identity[T.self] -58.332467 -67.130097 orientation[T.upright]:identity[T.self] -16.419332 -20.182996 group[T.cp]:orientation[T.upright]:identity[T.s... 24.641072 36.620921
import keras
from keras.models import Model
from keras.layers import Input, Add, Dense
from keras import backend as K
K.clear_session()
nb_epoch = 1000
fixedpred = np.argmax(X,axis=1)
randmpred = np.argmax(Z,axis=1)
Xinput = Input(batch_shape=(None, nfixed[1]-1))
fixed_keras = Dense(1, input_dim=nfixed[1]-1, name = 'fixedEffect')(Xinput)
Zinput = Input(batch_shape=(None, nrandm[1]))
randm_keras = Dense(1, input_dim=nrandm[1], use_bias=None, name = 'randomEffect')(Zinput)
merged = keras.layers.add([fixed_keras, randm_keras])
model = Model([Xinput,Zinput],merged)
model.compile(loss='mean_squared_error', optimizer='sgd')
from keras.callbacks import TensorBoard
# train the model
model.fit([X[:,1:], Z], Y.flatten(),
epochs=nb_epoch,
batch_size=100,
verbose=0,
shuffle=True,
)#callbacks=[TensorBoard(log_dir='/tmp/xinyi_spot_invert')])
Ypredict = model.predict([X[:,1:], Z])
betakeras = np.hstack((model.get_weights()[1], model.get_weights()[0].flatten()))
bkeras = model.get_weights()[2].flatten()
fe_params['Keras'] = pd.Series(betakeras, index=fe_params.index)
random_effects['Keras'] = pd.Series(bkeras, index=random_effects.index)
print(fe_params)
LMM PyMC \ Intercept 689.537735 663.211739 group[T.cp] 151.112622 158.703037 orientation[T.upright] -64.601774 -61.863246 identity[T.self] -17.581122 -14.990877 group[T.cp]:orientation[T.upright] -87.712115 -83.437449 group[T.cp]:identity[T.self] -67.411021 -63.125393 orientation[T.upright]:identity[T.self] -20.020417 -21.210995 group[T.cp]:orientation[T.upright]:identity[T.s... 36.768234 28.727198 theano tf \ Intercept 653.040426 650.199646 group[T.cp] 162.735105 157.798386 orientation[T.upright] -66.495571 -62.885197 identity[T.self] -19.708378 -15.955853 group[T.cp]:orientation[T.upright] -82.978358 -71.105774 group[T.cp]:identity[T.self] -62.619234 -52.372208 orientation[T.upright]:identity[T.self] -16.524722 -21.645893 group[T.cp]:orientation[T.upright]:identity[T.s... 28.987585 9.399190 PyMC2 Stan \ Intercept 694.862672 688.946428 group[T.cp] 128.068855 151.710532 orientation[T.upright] -67.127274 -64.415917 identity[T.self] -20.246368 -17.650878 group[T.cp]:orientation[T.upright] -78.328132 -87.906190 group[T.cp]:identity[T.self] -58.332467 -67.130097 orientation[T.upright]:identity[T.self] -16.419332 -20.182996 group[T.cp]:orientation[T.upright]:identity[T.s... 24.641072 36.620921 Keras Intercept 652.405823 group[T.cp] 158.866501 orientation[T.upright] -66.060951 identity[T.self] -19.977377 group[T.cp]:orientation[T.upright] -75.678688 group[T.cp]:identity[T.self] -54.360828 orientation[T.upright]:identity[T.self] -17.846439 group[T.cp]:orientation[T.upright]:identity[T.s... 15.291400
plotfitted(fe_params=fe_params,random_effects=random_effects,X=X,Z=Z,Y=Y)
The MSE of LMM is 4150.212099971669 The MSE of PyMC is 4155.068506774344 The MSE of theano is 4138.020911984854 The MSE of tf is 4166.439669462494 The MSE of PyMC2 is 4152.542000833988 The MSE of Stan is 4139.408540744592 The MSE of Keras is 4148.736587845381
It is important to note from before that, although the model fitting (i.e., regression coefficients) are not the same across different approach, the model prediction is highly similar (at least it pass the eyeball test).