Inferring species trees with tetrad

Installing ipyrad also installs a number of analysis tools, including the program tetrad, which applies the theory of phylogenetic invariants (see Lake 1987) to infer quartet trees from a SNP alignment. It then uses the software wQMC to join the quartets into a species tree. This combined approach was first developed by Chifman and Kubatko (2015) in the software SVDQuartets.
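To make the idea of a quartet tree concrete: any four taxa can be resolved into an unrooted tree in exactly three ways, and a quartet method has to pick one of them for each sampled set of four taxa. The sketch below only enumerates those candidate resolutions. It is not tetrad's code, the taxon labels and the `quartet_splits` helper are made up for illustration, and the invariant-based scoring that chooses among the three splits is not shown.

```python
from itertools import combinations


def quartet_splits(quartet):
    """Return the three possible unrooted resolutions of four taxa."""
    a, b, c, d = quartet
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


# hypothetical taxon labels, for illustration only
taxa = ["A", "B", "C", "D", "E"]

# every set of four taxa is one quartet
quartets = list(combinations(taxa, 4))

for quartet in quartets:
    for left, right in quartet_splits(quartet):
        # print each candidate split in the form "AB|CD"
        print("{}{}|{}{}".format(*left, *right))
```

With five taxa there are five quartets and therefore fifteen candidate splits. One split is kept per quartet, and the kept quartet trees are what gets joined into the final tree.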

Required software

In [ ]:
## conda install ipyrad -c ipyrad
## conda install toytree -c eaton-lab
In [1]:
import ipyrad.analysis as ipa
import ipyparallel as ipp
import toytree

Connect to a cluster

In [2]:
## connect to a cluster
ipyclient = ipp.Client()
print("connected to {} cores".format(len(ipyclient)))
connected to 4 cores

Run tetrad

In [3]:
## initiate a tetrad object
tet = ipa.tetrad(
    name="pedic-full",
    seqfile="analysis-ipyrad/pedic-full_outfiles/pedic-full.snps.phy",
    mapfile="analysis-ipyrad/pedic-full_outfiles/pedic-full.snps.map",
    nboots=100,
    )
loading seq array [13 taxa x 14159 bp]
max unlinked SNPs per quartet (nloci): 2777
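The `nloci` value in the output above is the largest number of unlinked SNPs available, which suggests that the map file is used to draw at most one SNP per locus. Below is a minimal sketch of that kind of subsampling. It assumes each SNP column is assigned to a locus ID. The `snp_to_locus` list and the `sample_unlinked` function are hypothetical and do not reflect tetrad's internals or the actual map file layout.

```python
import random
from collections import defaultdict


def sample_unlinked(snp_to_locus, rng):
    """Pick one SNP column index at random from each locus."""
    by_locus = defaultdict(list)
    for column, locus in enumerate(snp_to_locus):
        by_locus[locus].append(column)
    # one randomly chosen column per locus, returned in column order
    return sorted(rng.choice(columns) for columns in by_locus.values())


# hypothetical mapping: SNP column -> locus id (7 SNPs on 3 loci)
snp_to_locus = [0, 0, 0, 1, 1, 2, 2]

rng = random.Random(42)
sampled = sample_unlinked(snp_to_locus, rng)
print(len(sampled), "unlinked SNPs sampled from", len(snp_to_locus), "SNPs")
```

Whatever SNPs are drawn, the number sampled can never exceed the number of loci, which is why the output reports it as a maximum.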
In [4]:
## run tetrad on the cluster
tet.run(ipyclient=ipyclient)
host compute node: [4 cores] on oud
inferring 715 induced quartet trees
[####################] 100%  initial tree | 0:00:06 |  
[####################] 100%  boot 100     | 0:01:00 |  
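The figure of 715 quartets in the output above is simply the number of ways to choose 4 samples out of the 13 in the alignment. A quick check, which also shows how fast the count grows with more samples (the larger sample sizes are illustrative and are not part of this data set):

```python
from math import comb

# number of distinct sets of four taxa among the 13 samples
ntaxa = 13
nquartets = comb(ntaxa, 4)
print(nquartets)  # 715

# the count grows roughly with the fourth power of the number of taxa
for n in (13, 50, 100):
    print(n, comb(n, 4))
```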

Plot the tree

In [8]:
## plot the resulting unrooted tree
tre = toytree.tree(tet.trees.nhx)
## keep the canvas (first item returned by draw) so it can be saved below
canvas = tre.draw(
    width=350, 
    node_labels=tre.get_node_values("support"),
    )[0]
[Figure: unrooted tree of the 13 samples, with bootstrap support shown on internal nodes (values range from 35 to 100).]
In [12]:
## save the tree as a pdf
import toyplot.pdf
toyplot.pdf.render(canvas, "analysis-tetrad/tetrad-tree.pdf")

What does tetrad do differently from SVDQuartets?

Not much, currently, though we plan to expand it. Importantly, however, the code is open source, meaning that anybody can read it and contribute to it, which is not the case for PAUP*. tetrad is also easier to install, using conda, and therefore easier to set up on an HPC cluster or local machine. It can also be parallelized across an arbitrarily large number of compute nodes while retaining a very small memory footprint.