#!/usr/bin/env python # coding: utf-8 # Fungal ITS QIIME analysis tutorial # ================================== # # In this tutorial we illustrate steps for analyzing fungal ITS amplicon data using the QIIME/UNITE reference OTUs (alpha version 12_11) to compare the composition of 9 soil communities using open-reference OTU picking. More recent ITS reference databases based on UNITE are available on the [QIIME resources page](http://qiime.org/home_static/dataFiles.html). The steps in this tutorial can be generalized to work with other marker genes, such as 18S. # # We recommend working through the [Illumina Overview Tutorial](http://qiime.org/tutorials/illumina_overview_tutorial.html) before working through this tutorial, as it provides more detailed annotation of the steps in a QIIME analysis. This tutorial is intended to highlight the differences that are necessary to work with a database other than QIIME's default reference database. For ITS, we won't build a phylogenetic tree and therefore use nonphylogenetic diversity metrics. Instructions are included for how to build a phylogenetic tree if you're sequencing a non-16S, phylogenetically-informative marker gene (e.g., 18S). # # First, we obtain the tutorial data and reference database: # In[ ]: get_ipython().system('(wget ftp://ftp.microbio.me/qiime/tutorial_files/its-soils-tutorial.tgz || curl -O ftp://ftp.microbio.me/qiime/tutorial_files/its-soils-tutorial.tgz)') get_ipython().system('(wget ftp://ftp.microbio.me/qiime/tutorial_files/its_12_11_otus.tgz || curl -O ftp://ftp.microbio.me/qiime/tutorial_files/its_12_11_otus.tgz)') # Now unzip these files. # In[ ]: get_ipython().system('tar -xzf its-soils-tutorial.tgz') get_ipython().system('tar -xzf its_12_11_otus.tgz') get_ipython().system('gunzip ./its_12_11_otus/rep_set/97_otus.fasta.gz') get_ipython().system('gunzip ./its_12_11_otus/taxonomy/97_otu_taxonomy.txt.gz') # You can then view the files in each of these direcories by passing the directory name to the ``FileLinks`` function. # In[ ]: from IPython.display import FileLink, FileLinks FileLinks('its-soils-tutorial') # The ``params.txt`` file modifies some of the default parameters of this analysis. You can review those by clicking the link or by catting the file. # In[ ]: get_ipython().system('cat its-soils-tutorial/params.txt') # The parameters that differentiate ITS analysis from analysis of other amplicons are the two ``assign_taxonomy`` parameters, which are pointing to the reference collection that we just downloaded. # We're now ready to run the ``pick_open_reference_otus.py`` workflow. Discussion of these methods can be found in [Rideout et. al (2014)](https://peerj.com/articles/545/). # # Note that we pass `-r` to specify a non-default reference database. We're also passing `--suppress_align_and_tree` because we know that trees generated from ITS sequences are generally not phylogenetically informative. # In[ ]: get_ipython().system('pick_open_reference_otus.py -i its-soils-tutorial/seqs.fna -r its_12_11_otus/rep_set/97_otus.fasta -o otus/ -p its-soils-tutorial/params.txt --suppress_align_and_tree') # **Note:** If you would like to build a phylogenetic tree (e.g., if you're using a phylogentically-informative marker gene such as 18S instead of ITS), you should remove the ``--suppress_align_and_tree`` parameter from the above command and add the following lines to the parameters file: # # ``` # align_seqs:template_fp # filter_alignment:suppress_lane_mask_filter True # filter_alignment:entropy_threshold 0.10 # ``` # # After that completes (it will take a few minutes) we'll have the OTU table with taxonomy. You can review all of the files that are created by passing the path to the `index.html` file in the output directory to the `FileLink` function. # In[ ]: FileLink('otus/index.html') # You can then pass the OTU table to ``biom summarize-table`` to view a summary of the information in the OTU table. # In[ ]: get_ipython().system('biom summarize-table -i otus/otu_table_mc2_w_tax.biom') # Next, we run several core diversity analyses, including alpha/beta diversity and taxonomic summarization. We will use an even sampling depth of 353 based on the results of `biom summarize-table` above. Since we did not built a phylogenetic tree, we'll pass the `--nonphylogenetic_diversity` flag, which specifies to compute Bray-Curtis distances instead of UniFrac distances, and to use only nonphylogenetic alpha diversity metrics. # In[ ]: get_ipython().system('core_diversity_analyses.py -i otus/otu_table_mc2_w_tax.biom -o cdout/ -m its-soils-tutorial/map.txt -e 353 --nonphylogenetic_diversity') # You may see a warning issued above; this is safe to ignore. # # **Note:** If you built a phylogenetic tree, you should pass the path to that tree via `-t` and not pass ``--nonphylogenetic_diversity``. # # You can view the output of `core_diversity_analyses.py` using `FileLink`. # In[ ]: FileLink('cdout/index.html') # ## Precomputed results # # In case you're having trouble running the steps above, for example because of a broken QIIME installation, all of the output generated above has been precomputed. You can access this by running the cell below. # In[ ]: FileLinks("its-soils-tutorial/precomputed-output/")