RevBayes
with jupyter¶This notebook demonstrates how to run a simple RevBayes
analysis using jupyter. Clicking a cell will allow you to modify its contents. Note that some cells contain RevBayes
code and others contain Markdown
code. Pressing 'Shift+Enter
' will execute or render the code within a cell. The prompt on the left hand side reading e.g. 'In [1]:
' indicates the sequence of executed cells (where '[2]
' is executed after '[1]
').
First, we'll create filepath and filename variables for IO.
# IO
dat_fp = "example/data/"
dat_fn = dat_fp + "primates_cytb.nex"
out_fp = "example/output/"
out_fn = out_fp + "primates"
print("path to data: " + dat_fn)
print("path to output: " + out_fn)
Next, create helper variables to configure the MCMC analysis.
# MCMC settings
mvi = 1
mni = 1
n_gen = 1e3
sample_freq = floor(n_gen/1e2)
print("n_gen: " + n_gen)
print("sample_freq: " + sample_freq)
Read the NEXUS file stored in dat_fn
.
# read alignment
dat = readDiscreteCharacterData(dat_fn)
dat
Extract the alignment's dimensions.
# data dimensions
taxa = dat.taxa()
n_sites = dat.nchar()
n_taxa = taxa.size()
print("taxon names:")
print(taxa)
print("n_sites: " + n_sites)
print("n_taxa: " + n_taxa)
Let's create a simple pure birth process with unit height and birth rate, $\lambda \sim \text{Exp}(1)$. The three moves---mvNNI
, mvFNPR
, and mvNodeTimeSlideUniform
---will be used to instruct MCMC how to explore, and thus integrate over, tree space.
# create unit Yule tree
birth ~ dnExp(1)
phy ~ dnBDP(lambda=birth, mu=0., rootAge=1, taxa=taxa)
# tree moves
mv[mvi++] = mvNNI(phy, weight=n_taxa)
mv[mvi++] = mvFNPR(phy, weight=n_taxa/2)
mv[mvi++] = mvNodeTimeSlideUniform(phy, weight=n_taxa)
# print the tree's value
phy
We'll assume characters evolve according to a generalized time-reversible substitution process, with a rate matrix, Q
, determined by the base frequencies, pi
, and the exchangeability rates, er
.
# base frequencies
pi ~ dnDirichlet(rep(1,4))
mv[mvi++] = mvSimplexElementScale(pi, alpha=10, weight=4)
# exchangeability rates
er ~ dnDirichlet(rep(1,6))
mv[mvi++] = mvSimplexElementScale(er, alpha=10, weight=6)
# GTR rate matrix
Q := fnGTR(exchangeRates=er,
baseFrequencies=pi)
# print the matrix's value
Q
Finally, we'll create our phylogenetic substitution model, seq
, conditioned on the sequence data, dat
.
# build phylogenetic CTMC
seq ~ dnPhyloCTMC(tree=phy,
Q=Q,
branchRates=1.,
nSites=n_sites,
type="DNA")
# treat the simulated data as 'observed'
seq.clamp(dat)
Sample posterior tree and parameter estimates from MCMC every sample_freq
iterations.
# create monitors
mn[mni++] = mnScreen(pi, printgen=sample_freq)
mn[mni++] = mnModel(filename=out_fn+".model.log",
printgen=sample_freq)
mn[mni++] = mnFile(phy,
filename=out_fn+".tre",
printgen=sample_freq)
Build the Model
and Mcmc
objects from the model graph.
# create MCMC
mdl = model(phy)
ch = mcmc(mdl,mv,mn)
Run the MCMC chain, ch
, for n_gen
iterations. Samples from mnScreen
will be pushed to the jupyter console, but also saved in the output
directory relative to this notebook's directory.
Highlight this final cell and press 'Shift+Enter
' to start the MCMC analysis.
# run MCMC
ch.run(n_gen)