Week 4: RNA-Seq and Gene expression analyses
[pictures of gene in different expressions]
okay not that
We will analyze a single RNA-Seq dataset a number of ways:
- Using the authors' data submitted to GEO
- Mapping to the `mm10` mouse genome using STAR and quantifying gene expression using featureCounts
- Using `kallisto` to do pseudoalignment and gene quantification
- Using simple correlation and statistics to analyze the data and find differentially expressed genes
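
As a rough illustration of the last item, the sketch below computes a log2 fold change and a Welch t statistic per gene for two groups of samples. The gene names, the expression values, the pseudocount of 1, and the helper names `log2_fold_change` and `welch_t` are all made up for this example; they are not taken from the Shalek dataset or from the course materials, and a real analysis would also need p-values and multiple-testing correction.

```python
import math
from statistics import mean, variance


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean expression (group_b over group_a).

    The pseudocount is an assumption of this sketch; it avoids dividing by zero.
    """
    return math.log2((mean(group_b) + pseudocount) / (mean(group_a) + pseudocount))


def welch_t(group_a, group_b):
    """Welch's t statistic (unequal variances) for group_b minus group_a."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_b) - mean(group_a)) / standard_error


# Hypothetical toy expression values: (control replicates, treated replicates)
expression = {
    "geneA": ([10.0, 12.0, 11.0], [95.0, 110.0, 102.0]),
    "geneB": ([50.0, 48.0, 53.0], [49.0, 52.0, 51.0]),
}

results = {}
for gene, (control, treated) in expression.items():
    results[gene] = (log2_fold_change(control, treated), welch_t(control, treated))

for gene, (lfc, t_stat) in results.items():
    print(f"{gene}: log2FC={lfc:.2f}, t={t_stat:.2f}")
```

In this toy data, `geneA` has a large fold change and a large t statistic, while `geneB` is essentially unchanged between the two groups.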
Reading List
- Dataset: [Shalek A + Satija R, et al. "Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells." Nature (2013)](papers/Shalek 2013/Nature 2013 Shalek.pdf)
- Transcriptome Alignment (STAR aligner): [Dobin A et al, "STAR: ultrafast universal RNA-seq aligner", Bioinformatics (2012)](papers/Dobin 2012/Bioinformatics 2012 Dobin.pdf)
- Gene quantification (`featureCounts` program): [Liao Y, Smyth GK, Shi W. "featureCounts: an efficient general purpose program for assigning sequence reads to genomic features." Bioinformatics (2014)](papers/Liao 2014/Bioinformatics 2014 Liao.pdf)
- Pseudoalignment and gene quantification (`kallisto` program): [Bray N et al, "Near-optimal RNA-Seq quantification", arXiv (2015)](papers/Bray 2015/2015 Bray.pdf)
- What the FPKM - Explains the difference between TPM/RPKM/FPKM units
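
To make the expression units in the last reading concrete, here is a minimal sketch that computes FPKM and TPM from raw counts and gene lengths. The gene names, counts, and lengths are hypothetical, and the functions `fpkm` and `tpm` are written for this example rather than taken from any of the tools above. The sketch uses the full gene length where real tools typically use an effective length.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of gene per million counted fragments."""
    total_fragments = sum(counts.values())
    return {
        gene: counts[gene] * 1e9 / (lengths_bp[gene] * total_fragments)
        for gene in counts
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: normalize by length first, then scale to 1e6."""
    rate = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    rate_sum = sum(rate.values())
    return {gene: rate[gene] / rate_sum * 1e6 for gene in counts}


# Hypothetical toy data: fragment counts and gene lengths in base pairs
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

fpkm_values = fpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

for gene in counts:
    print(f"{gene}: FPKM={fpkm_values[gene]:.1f}, TPM={tpm_values[gene]:.1f}")
```

TPM values always sum to one million within a sample, whereas the FPKM total varies from sample to sample, which is the main reason TPM is easier to compare across samples.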
Downloads
Plan
- Tuesday
  - RNA-Sequencing Lecture
  - Read alignment & quantification with STAR and `featureCounts` (mini-sample only)
  - Pseudoalignment with `kallisto` (mini-sample)
- Thursday
  - Viewing alignments with IGV
Optional
If you're feeling gung-ho, you can do these other analyses, which involve downloading data from GEO, making boxplots and clustered heatmaps, and calculating correlations between many samples.
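
As a small sketch of the correlation step, the code below log-transforms expression values and computes pairwise Pearson correlations between samples. The sample names and values are invented for illustration, the `log2(x + 1)` transform is an assumption of this sketch, and `pearson` is a hand-written helper so that the example needs only the standard library. With real data you would more likely load the GEO table into a dataframe and use its built-in correlation method.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical expression values: one list per sample, same gene order in each
samples = {
    "sample1": [0.0, 15.0, 120.0, 900.0],
    "sample2": [1.0, 18.0, 100.0, 1100.0],
    "sample3": [500.0, 3.0, 40.0, 2.0],
}

# Log-transform so a few highly expressed genes do not dominate the correlation
log_expr = {
    name: [math.log2(value + 1.0) for value in values]
    for name, values in samples.items()
}

names = sorted(log_expr)
corr = {(a, b): pearson(log_expr[a], log_expr[b]) for a in names for b in names}

for a in names:
    print(a, " ".join(f"{corr[(a, b)]:6.2f}" for b in names))
```

The resulting matrix is symmetric with ones on the diagonal, and it is the kind of table a clustered heatmap is drawn from.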