(📗) Cookbook: ipyrad.analysis.baba

This notebook demonstrates how to use the baba module in ipyrad to test for admixture and introgression. The code is written in Python and is best implemented in a Jupyter-notebook like this one. Analyses can be split up among many computing cores. Finally, there are some simple plotting methods to accompany tests which I show below.

import packages

In [1]:
## imports
import numpy as np
import ipyrad as ip
import ipyrad.analysis.baba as baba
import ipyparallel as ipp

## print versions
print "  ipyrad v.{}".format(ip.__version__)
print "  ipyparallel v.{}".format(ipp.__version__)
print "  numpy v.{}".format(np.__version__)
  ipyrad v.0.5.15
  ipyparallel v.5.2.0
  numpy v.1.11.2

ipcluster setup to run parallel code

You can find more details about this in our [ipyparallel tutorial].

In [2]:
## start 4 engines by running the commented line below in a separate bash terminal. 
##
## ipcluster start --n=4
In [57]:
## connect to client and print connection summary
ipyclient = ipp.Client()
print ip.cluster_info(client=ipyclient)
  host compute node: [4 cores] on oud

Get default (example) 12 taxon tree

The function baba.Tree() takes a newick string as an argument, or, if no value is passed, it constructs the default 12 taxon tree shown below. The Tree Class object holds a representation of the tree, which can be used for plotting, as well as for describing a demographic model for simulating data, as we'll show below.

In [58]:
## create a Tree class object with the default tree
tre = baba.Tree()
tre.show(width=400, height=300);
012345678910abcdefghijkl

Or get any arbitrary tree

In [59]:
## pass a newick string to the Tree class object
tre = baba.Tree(newick="((a,b),c);")
tre.show(width=200, height=200);
01abc

get a tree with an admixture edge

Shown by an arrow pointing backwards from source to sink (indicating that geneflow should be entered as occurring backwards in time). This is how the program msprime will interpret it.

In [51]:
## store newick tree and a migration event list [source, sink, start, stop, rate]
newick = "(((a,b),c), (d,e));"
events = [['e', 'c', 0, 1, 1e-6], 
          ['3', '1', 1, 2, 1e-6]]

## add admixture events to tree
tre = baba.Tree(newick=newick, admix=events)
tre.show(width=250, height=250, yaxis=True);
0123abcde0123

simulate data on that tree

In [33]:
for idx in xrange(tre.verts.shape[0]-1, -1, -1):
    print idx, tre.verts[idx-1]
    row = tre.verts[idx-1]
    
    ## get time of the next event
    if row[1] == 0:
        pass
    else:
        tau = row[1]
    
8 [ 1.  0.]
7 [ 2.  0.]
6 [ 3.  0.]
5 [ 4.  0.]
4 [ 0.5  1. ]
3 [ 3.5  1. ]
2 [ 1.25  2.  ]
1 [ 2.375  3.   ]
0 [ 0.  0.]
In [43]:
## when do coalescent events occur?
who = tre.verts[:, 1] > 0
print 'when', tre.verts[who, 1]
print 'who', np.where(who)[0]

## when do migration events occur
## ...

for time in when:
    ## does migration change during the next epoch
when [ 3.  2.  1.  1.]
who [0 1 2 3]
In [13]:
for node in tre.tree.traverse("postorder"):
    ## check for migration
    if node.name in [i[0] for i in events]:
        node
        #ms.MigrationRateChange(time=0, rate=m_C_B, matrix_index=(1, 2)),
        #ms.MigrationRateChange(time=Taus[1], rate=0),
[[ 2.375  3.   ]
 [ 1.25   2.   ]
 [ 3.5    1.   ]
 [ 0.5    1.   ]
 [ 4.     0.   ]
 [ 3.     0.   ]
 [ 2.     0.   ]
 [ 1.     0.   ]
 [ 0.     0.   ]]
[[1 8]
 [0 1]
 [2 6]
 [3 4]
 [3 5]
 [2 3]
 [0 2]
 [1 7]]

Simulate data

In [ ]: