This document walks through the initial processing of expression array data, including the creation of QA/QC plots.
Required Files:
SIG_Array_QA_QC_Workflow.ipynb: [Download here]
array_qa_qc_functions.r: [Download here]
** Note: this notebook can also be downloaded as an R script (only the code blocks seen below will be included): [Download R script here]
Required R packages:
gdata: https://cran.r-project.org/web/packages/gdata/index.html
oligo: http://www.bioconductor.org/packages/release/bioc/html/oligo.html
pd.mogene.2.1.st: http://www.bioconductor.org/packages/release/data/annotation/html/pd.mogene.2.1.st.html
mogene21sttranscriptcluster.db: http://www.bioconductor.org/packages/release/data/annotation/html/mogene21sttranscriptcluster.db.html
Heatplus: http://www.bioconductor.org/packages/release/bioc/html/Heatplus.html
ggplot2: https://cran.r-project.org/web/packages/ggplot2/index.html
reshape2: https://cran.r-project.org/web/packages/reshape2/index.html
All code is available on GitHub: https://github.com/biodev/SIG
If you are not familiar with Jupyter Notebooks, I've created a short tutorial to get you up and running quickly. There is also plenty of documentation online.
## Load libraries and functions for array processing and QA/QC plots
source('./scripts/array_qa_qc_functions.r')
gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.
gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.
Attaching package: ‘gdata’
The following object is masked from ‘package:stats’:
    nobs
The following object is masked from ‘package:utils’:
    object.size
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB
The following object is masked from ‘package:gdata’:
    combine
The following objects are masked from ‘package:stats’:
    IQR, mad, xtabs
The following objects are masked from ‘package:base’:
    anyDuplicated, append, as.data.frame, as.vector, cbind, colnames, do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unlist, unsplit
Loading required package: oligoClasses
Welcome to oligoClasses version 1.30.0
Loading required package: Biobase
Welcome to Bioconductor
    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: Biostrings
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: IRanges
Attaching package: ‘IRanges’
The following object is masked from ‘package:gdata’:
    trim
Loading required package: XVector
================================================================================
Welcome to oligo version 1.32.0
================================================================================
Loading required package: RSQLite
Loading required package: DBI
Loading required package: AnnotationDbi
Loading required package: GenomeInfoDb
Warning message:
: multiple methods tables found for ‘dbconn’
Warning message:
: multiple methods tables found for ‘dbfile’
Attaching package: ‘AnnotationDbi’
The following objects are masked from ‘package:BiocGenerics’:
    dbconn, dbfile
Loading required package: org.Mm.eg.db
## Read the annotation spreadsheet into R
## You will have to change the directory path
annot_dir = '/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array'
setwd(annot_dir)
sample_annot = read.xls('BatPlate_Annotation_editedMM.xlsx', header=T, as.is=T, na.strings=c(""," ", "NA", "#DIV/0!"))
rownames(sample_annot) = sample_annot$ID
## Check annotation dataframe
head(sample_annot[,1:5])
| | ID | Mating | Number | RIN | Dam |
|---|---|---|---|---|---|
| 13067x16912_f67bat | 13067x16912_f67bat | 13067x16912 | 67 | 9.1 | 13067 |
| 13067x16912_f68bat | 13067x16912_f68bat | 13067x16912 | 68 | 8.8 | 13067 |
| 13067x16912_f69bat | 13067x16912_f69bat | 13067x16912 | 69 | 1.6 | 13067 |
| 13140x16680_f84bat | 13140x16680_f84bat | 13140x16680 | 84 | 8.7 | 13140 |
| 13140x16680_f86bat | 13140x16680_f86bat | 13140x16680 | 86 | 9.3 | 13140 |
| 13140x16680_f95bat | 13140x16680_f95bat | 13140x16680 | 95 | 9 | 13140 |
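Assigning the ID column as row names only works when every ID is present and unique; R raises an error otherwise. The following is a minimal sketch of a check that could be run before the assignment. The data frame `toy_annot` and its values are made up for illustration and are not part of the workflow.

```r
## Toy annotation table; the column names mirror those shown above,
## but the values are hypothetical
toy_annot = data.frame(
  ID = c("s1_f1bat", "s1_f2bat", "s1_f2bat"),
  Mating = c("s1", "s1", "s1"),
  stringsAsFactors = FALSE
)

## Find IDs that appear more than once, and count missing or empty IDs
dup_ids = unique(toy_annot$ID[duplicated(toy_annot$ID)])
missing_ids = sum(is.na(toy_annot$ID) | toy_annot$ID == "")

## Only assign row names when the IDs are usable as keys
ids_ok = length(dup_ids) == 0 && missing_ids == 0
if (ids_ok) {
  rownames(toy_annot) = toy_annot$ID
}
```

Here `dup_ids` lists the offending IDs, so they can be corrected in the spreadsheet before the annotation is used further.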
## Set directory where .CEL files are located, and get the list of files
## You will have to change the directory path
cel_dir = '/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data'
cel_files = list.celfiles(cel_dir)
## Check file names
cel_files[1:3]
## Create sample names (these must match the annotation file)
sample_names = sub("_2\\.CEL$", "", cel_files)
## Check sample names
sample_names[1:5]
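The pattern used to strip the file suffix is a regular expression, so an unescaped `.` matches any character. A small sketch comparing a loose pattern with an escaped, anchored one; the file names in `toy_files` are hypothetical, and the third one is contrived to show the difference.

```r
## Hypothetical file names; the last one is contrived to expose the difference
toy_files = c("13067x16912_f67bat_2.CEL", "MAQC_2.CEL", "A_2xCEL_f1bat_2.CEL")

## Unescaped pattern: "." matches any character, and gsub() replaces every match
loose = gsub("_2.CEL", "", toy_files)

## Escaped and anchored pattern: only a literal "_2.CEL" at the end is removed
strict = sub("_2\\.CEL$", "", toy_files)
```

For file names like those in this dataset both patterns give the same result; the stricter form only matters if a name happens to contain a similar substring elsewhere.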
## Subset sample annotation dataframe
sample_annot = sample_annot[sample_names,]
length(sample_names) == dim(sample_annot)[1]
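One caveat about the length comparison above: indexing a data frame by a row name that does not exist returns a row of NAs rather than an error, so the row count equals the number of names even when some samples have no annotation. A minimal sketch with made-up data (`toy_annot` and `toy_names` are hypothetical) that also shows a stricter check:

```r
## Toy annotation table keyed by sample ID (values are made up)
toy_annot = data.frame(ID = c("a", "b", "c"), RIN = c(9.1, 8.8, 8.7),
                       stringsAsFactors = FALSE)
rownames(toy_annot) = toy_annot$ID

## "z" has no annotation row
toy_names = c("a", "b", "z")

## Indexing by an unknown row name yields a row of NAs instead of failing,
## so the row count always equals length(toy_names)
subset_annot = toy_annot[toy_names, ]
same_length = length(toy_names) == nrow(subset_annot)

## A stricter check: list the names that have no match in the annotation
unmatched = setdiff(toy_names, rownames(toy_annot))
```

An empty `unmatched` vector is the condition worth confirming before building the phenoData object.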
## Create a phenoData object
phenoData = new("AnnotatedDataFrame", data=sample_annot)
phenoData
An object of class 'AnnotatedDataFrame'
  rowNames: 13067x16912_f67bat 13067x16912_f68bat ... MAQC (96 total)
  varLabels: ID Mating ... OnArray (28 total)
  varMetadata: labelDescription
## Load the raw expression data
raw.exprs = read.celfiles(file.path(cel_dir, cel_files), pkgname="pd.mogene.2.1.st",
                          sampleNames=sample_names, phenoData=phenoData)
Platform design info loaded.
Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/13067x16912_f67bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/13067x16912_f68bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/13067x16912_f69bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/13140x16680_f84bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/13140x16680_f86bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/13140x16680_f95bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16188x3252_f281bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16188x3252_f282bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16188x3252_f283bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16188x8005_f2bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16188x8005_f3bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16188x8005_f4bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16211x13140_f106bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16211x13140_f108bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16211x13140_f119bat_2.CEL Reading in : 
/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16211x16557_f11bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16211x16557_f4bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16211x16557_f8bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16441x8024_f90bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16441x8024_f94bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16441x8024_f95bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16557x13067_f90bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16557x13067_f91bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16557x13067_f98bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16912x5489_f138bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16912x5489_f139bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/16912x5489_f140bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/18042x3032_f18bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/18042x3032_f19bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/18042x3032_f3bat_2.CEL Reading in : 
/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3032x16188_f133bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3032x16188_f145bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3032x16188_f148bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3032x16441_f39bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3032x16441_f40bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3032x16441_f41bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3154x16012_f59bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3154x16012_f62bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3154x16012_f63bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3252x8002_f27bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3252x8002_f50bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3252x8002_f51bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3609x5489_f124bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3609x5489_f125bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/3609x5489_f126bat_2.CEL Reading in : 
/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/477x16912_f64bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/477x16912_f65bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/477x16912_f78bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/5119x8018_f74bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/5346x16768_f33bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/5346x16768_f34bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/5346x16768_f37bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/5489x16557_f149bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/5489x16557_f150bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/5489x16557_f161bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/5489x16557_f162bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8002x3032_f256bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8002x3032_f257bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8002x3032_f258bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8002x8010_f15bat_2.CEL Reading in : 
/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8002x8010_f18bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8002x8010_f19bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8005x8002_f117bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8005x8002_f118bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8005x8002_f119bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8008x8016_f117bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8008x8016_f118bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8008x8016_f119bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8026x5080_f67bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8026x5080_f90bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8026x5080_f91bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8033x5346_f104bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8033x5346_f107bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8033x5346_f99bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8034x8048_f29bat_2.CEL Reading in : 
/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8034x8048_f5bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8034x8048_f6bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8042x16513_f34bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8042x16513_f39bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8042x16513_f78bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8043x8008_f11bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8043x8008_f5bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8043x8008_f6bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8048x15155_f79bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8048x15155_f80bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8048x15155_f93bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8048x8026_f106bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8048x8026_f79bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8048x8026_f87bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8049x8010_f152bat_2.CEL Reading in : 
/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8049x8010_f153bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8049x8010_f154bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8056x8033_f83bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8056x8033_f84bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/8056x8033_f90bat_2.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/Bat_Virus_Array/data/MAQC_2.CEL
Warning message: In read.celfiles(file.path(cel_dir, cel_files), pkgname = "pd.mogene.2.1.st", : 'channel' automatically added to varMetadata in phenoData.
head(pData(phenoData)[,1:5])
| | ID | Mating | Number | RIN | Dam |
|---|---|---|---|---|---|
| 13067x16912_f67bat | 13067x16912_f67bat | 13067x16912 | 67 | 9.1 | 13067 |
| 13067x16912_f68bat | 13067x16912_f68bat | 13067x16912 | 68 | 8.8 | 13067 |
| 13067x16912_f69bat | 13067x16912_f69bat | 13067x16912 | 69 | 1.6 | 13067 |
| 13140x16680_f84bat | 13140x16680_f84bat | 13140x16680 | 84 | 8.7 | 13140 |
| 13140x16680_f86bat | 13140x16680_f86bat | 13140x16680 | 86 | 9.3 | 13140 |
| 13140x16680_f95bat | 13140x16680_f95bat | 13140x16680 | 95 | 9 | 13140 |
## Save raw expression to file in same directory as .CEL files
## This may be used as input for DE and Pathway analysis
save(raw.exprs, file=file.path(cel_dir, 'bat_virus_raw_exprs_2-FEB-2016.rda'))
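An object saved with `save()` is restored under its original name with `load()`. A minimal round-trip sketch follows; `toy_exprs` stands in for `raw.exprs`, and the temporary file path is hypothetical.

```r
## Stand-in object; in the workflow this would be raw.exprs
toy_exprs = matrix(1:6, nrow = 2,
                   dimnames = list(c("p1", "p2"), c("s1", "s2", "s3")))

## Save to a temporary file (a hypothetical path, used only for this sketch)
rda_file = file.path(tempdir(), "toy_exprs.rda")
save(toy_exprs, file = rda_file)

## Remove the object, then restore it; load() returns the restored object names
rm(toy_exprs)
restored = load(rda_file)
```

Because `load()` recreates the object under the name it was saved with, a later analysis script would find `raw.exprs` in its workspace after loading the .rda file.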
## Create un-normalized ExpressionSet
bgcor.exprs = rma(raw.exprs, normalize=FALSE, target="core")
Background correcting
Calculating Expression
## Create normalized ExpressionSet
norm.exprs = rma(raw.exprs, normalize=TRUE, target="core")
Background correcting
Normalizing
Calculating Expression
## Check normalized expression matrix
exprs(norm.exprs)[1:5,1:5]
| | 13067x16912_f67bat | 13067x16912_f68bat | 13067x16912_f69bat | 13140x16680_f84bat | 13140x16680_f86bat |
|---|---|---|---|---|---|
| 17200001 | 5.195653 | 5.984647 | 4.490499 | 6.040004 | 6.567711 |
| 17200003 | 5.475113 | 5.261995 | 4.716498 | 5.583019 | 5.503679 |
| 17200005 | 4.240997 | 4.450219 | 6.114470 | 4.240607 | 4.375452 |
| 17200007 | 5.104637 | 4.446430 | 5.243653 | 4.569253 | 5.240228 |
| 17200009 | 5.577562 | 5.216354 | 6.269951 | 6.487609 | 5.293501 |
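The normalization step of RMA is quantile normalization, which makes the distribution of values identical across arrays. The sketch below illustrates the idea in base R on a made-up matrix. It is a simplified illustration, not the implementation used by oligo (for example, ties are handled naively here), and `quantile_normalize` is a hypothetical helper name.

```r
## Toy log2 intensity matrix: rows are probesets, columns are samples (made-up values)
toy = matrix(c(5, 2, 3, 4,
               4, 1, 4, 2,
               3, 4, 6, 8),
             nrow = 4,
             dimnames = list(paste0("p", 1:4), c("s1", "s2", "s3")))

quantile_normalize = function(x) {
  ## Rank values within each sample (ties broken by order of appearance)
  ranks = apply(x, 2, rank, ties.method = "first")
  ## Reference distribution: mean of the sorted values across samples
  reference = rowMeans(apply(x, 2, sort))
  ## Replace each value by the reference value at its rank
  normalized = apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) = dimnames(x)
  normalized
}

toy_norm = quantile_normalize(toy)
```

After this step every column of `toy_norm` contains the same set of values, which is why boxplots of normalized data line up across samples.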
## Save normalized expression to file (optional)
#save(norm.exprs, file="bat_virus_array_normalized.rda")
describe(make.boxplot)
This function creates a boxplot of the raw or normalized expression values. If RIN values are available a subplot will be added.
Parameters:
use.exprs: An ExpressionSet returned by the rma() function.
type: The type of plot to create, either 'raw' or 'norm'.
make.pdf: A logical indicating if a PDF should be created.
base.name: The base filename of the PDF to create (default is 'test').
order.by: A character vector containing the column names that will be used to order the samples (default is c('Mating', 'Sex', 'Number')).
color.by: The column name used to assign colors to the samples (default is 'Mating').
highlight.names: A character vector containing sample names (default=NULL). Can be used to highlight samples in the plot.
...: Additional parameters can be passed to format the x-axis labels.
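To make the order.by and color.by parameters concrete, here is a minimal base R sketch of ordering samples by several annotation columns and assigning one color per group. It only illustrates the idea and is an assumption about the general approach, not the code inside make.boxplot; `toy_annot` and its values are made up.

```r
## Toy annotation using the column names from the calls in this notebook (values made up)
toy_annot = data.frame(
  ID = c("s1", "s2", "s3", "s4"),
  Mating = c("B", "A", "B", "A"),
  Number = c(2, 7, 1, 3),
  stringsAsFactors = FALSE
)

## Order samples by several columns at once, like order.by = c("Mating", "Number")
order_by = c("Mating", "Number")
sample_order = do.call(order, unname(as.list(toy_annot[, order_by])))
ordered_ids = toy_annot$ID[sample_order]

## One color per level of the color.by column, arranged in plotting order
color_by = "Mating"
groups = factor(toy_annot[[color_by]])
palette_colors = rainbow(nlevels(groups))
sample_colors = palette_colors[as.integer(groups)][sample_order]
```

Samples from the same mating end up adjacent and share a color, which makes batch-like shifts between groups easier to spot in a boxplot.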
## Boxplot of un-normalized expression values
make.boxplot(bgcor.exprs, type = 'raw', order.by=c("Mating", "Number"), color.by="Mating", make.pdf=F)
Warning message: In make.boxplot(bgcor.exprs, type = "raw", order.by = c("Mating", : NAs introduced by coercion
## Boxplot of normalized expression values
make.boxplot(norm.exprs, type = 'norm', order.by=c("Mating", "Number"), color.by="Mating", make.pdf=F)
Warning message: In make.boxplot(norm.exprs, type = "norm", order.by = c("Mating", : NAs introduced by coercion
describe(plot.bac.spikes)
This function creates a bacterial spike plot.
Parameters:
use.exprs: An ExpressionSet returned by the rma() function.
pgf.file: A probe group file for the array.
make.pdf: A logical indicating if a PDF should be created.
base.name: The base filename of the PDF to create (default is 'test').
## The probe group file from Affy for the array
pg_file = '/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/MoGene-2_1-st.pgf'
## Create bacterial spike plot
plot.bac.spikes(norm.exprs, pg_file, make.pdf=F)
Loading required package: affxparser
describe(plot.polya.spikes)
This function creates a polyA spike plot.
Parameters:
use.exprs: An ExpressionSet returned by the rma() function.
pgf.file: A probe group file for the array.
plot.type: A string indicating the type of control probesets for the array (default is 'AFFX-r2-Bs').
make.pdf: A logical indicating if a PDF should be created.
base.name: The base filename of the PDF to create (default is 'test').
## Create polyA spike plot
plot.polya.spikes(norm.exprs, pg_file, make.pdf=F)
describe(make.heatmap)
This function creates an annotated heatmap for expression data.
Parameters:
use.exprs: An ExpressionSet returned by the rma() function.
cut.dist: The height at which to cut the dendrogram (default is NULL; no cutting).
num.genes: A number indicating the number of most variable genes to include. If NULL, all genes will be included (default is 1000).
base.factors: A character vector containing column names that will be used to annotate the heatmap (default is c('Sex')).
rin.breaks: A numeric vector indicating the break points for discretizing the RIN scores (default is c(0, 5, 7, 10)).
make.pdf: A logical indicating if a PDF should be created.
base.name: The base filename of the PDF to create (default is 'test').
## Create annotated heatmap (don't include MAQC in heatmap)
make.heatmap(norm.exprs[, norm.exprs$ID != 'MAQC'], base.factors=c('Sex', 'D4_percent'), make.pdf=F)
Warning message: In make.heatmap(norm.exprs[, norm.exprs$ID != "MAQC"], base.factors = c("Sex", : NAs introduced by coercion
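Two of the make.heatmap parameters described above, num.genes and rin.breaks, can be illustrated in base R. The sketch below selects the most variable rows of a made-up matrix and bins RIN scores at the default break points. Whether make.heatmap ranks genes by variance or uses `cut()` internally is an assumption; the source only states what the parameters mean.

```r
## Toy expression matrix: 5 genes x 4 samples (made-up values)
toy = rbind(g1 = c(5, 5, 5, 5),
            g2 = c(1, 9, 1, 9),
            g3 = c(4, 5, 4, 5),
            g4 = c(2, 6, 2, 6),
            g5 = c(7, 7, 7, 8))

## Keep the num_genes rows with the largest variance across samples
num_genes = 2
gene_var = apply(toy, 1, var)
top_genes = names(sort(gene_var, decreasing = TRUE))[seq_len(num_genes)]
toy_top = toy[top_genes, , drop = FALSE]

## Discretize RIN scores using the default break points c(0, 5, 7, 10);
## the last value is made up so that every bin is represented
rin = c(9.1, 8.8, 1.6, 6.5)
rin_class = cut(rin, breaks = c(0, 5, 7, 10))
```

Binning RIN this way turns a continuous quality score into a small number of categories that can be shown as a color bar alongside the heatmap.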
## Load MAQC Annotations
maqc_annot_dir = '/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/MAQC'
maqc_annot = read.xls(file.path(maqc_annot_dir, 'maqc_annotation.xlsx'), header=T, as.is=T, na.strings=c(""," ", "NA", "#DIV/0!"))
rownames(maqc_annot) = maqc_annot$ID
## Set directory where MAQC .CEL files are located, and get the list of files
## You will have to change the directory path
maqc_cel_dir = '/Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/MAQC'
maqc_cel_files = list.celfiles(maqc_cel_dir)
## Check file names
maqc_cel_files[1:3]
## Create sample names (these must match the annotation file)
maqc_names = sub("\\.CEL$", "", maqc_cel_files)
## Create a phenoData object
maqcData = new("AnnotatedDataFrame", data=maqc_annot)
maqcData
An object of class 'AnnotatedDataFrame'
  rowNames: Ferris_BAT_MAQC_1_4_16 Ferris_FLU_MAQC_6_25_14 ... Lund_WNV_MAQC_1_2_15 (5 total)
  varLabels: ID Type ... Date.Downloaded (5 total)
  varMetadata: labelDescription
## Load the MAQC raw expression data
maqc.raw.exprs = read.celfiles(file.path(maqc_cel_dir, maqc_cel_files), pkgname="pd.mogene.2.1.st",
                               sampleNames=maqc_names, phenoData=maqcData)
Platform design info loaded.
Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/MAQC/Ferris_BAT_MAQC_1_4_16.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/MAQC/Ferris_FLU_MAQC_6_25_15.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/MAQC/Gale_WNV_MAQC_11_7_14.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/MAQC/Gale_WNV_MAQC_9_17_15.CEL Reading in : /Users/mooneymi/Documents/MyDocuments/SystemsImmunogenetics/Expression/MAQC/Lund_WNV_MAQC_1_2_15.CEL
Warning message: In read.celfiles(file.path(maqc_cel_dir, maqc_cel_files), pkgname = "pd.mogene.2.1.st", : 'channel' automatically added to varMetadata in phenoData.
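In the output above, the annotation row names include `Ferris_FLU_MAQC_6_25_14`, while the CEL file list contains `Ferris_FLU_MAQC_6_25_15.CEL`. The source does not say which of the two is correct. A minimal sketch of how such a mismatch could be detected with `match()` follows. It uses only the three row names visible in the printed output (the other two are elided there), so the vectors are an illustrative subset rather than the full data.

```r
## Row names visible in the AnnotatedDataFrame output (the elided ones are omitted)
annot_ids = c("Ferris_BAT_MAQC_1_4_16", "Ferris_FLU_MAQC_6_25_14",
              "Lund_WNV_MAQC_1_2_15")

## Sample names derived from the corresponding CEL files in the log
file_names = c("Ferris_BAT_MAQC_1_4_16", "Ferris_FLU_MAQC_6_25_15",
               "Lund_WNV_MAQC_1_2_15")

## match() gives the annotation row for each file, or NA when there is none
idx = match(file_names, annot_ids)
no_annotation = file_names[is.na(idx)]
```

When `no_annotation` is empty, `idx` can also be used to reorder the annotation rows so that they follow the order of the CEL files.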
## Create un-normalized MAQC ExpressionSet
maqc.bgcor.exprs = rma(maqc.raw.exprs, normalize=FALSE, target="core")
Background correcting
Calculating Expression
## Create normalized MAQC ExpressionSet
maqc.norm.exprs = rma(maqc.raw.exprs, normalize=TRUE, target="core")
Background correcting
Normalizing
Calculating Expression
## Boxplot of un-normalized MAQC expression values
make.boxplot(maqc.bgcor.exprs, type = 'raw', order.by=c("Date.Downloaded"), color.by="Date.Downloaded", make.pdf=F)
Warning message: In make.boxplot(maqc.bgcor.exprs, type = "raw", order.by = c("Date.Downloaded"), : RIN values were not found in the sample annotations.
## Boxplot of normalized MAQC expression values
make.boxplot(maqc.norm.exprs, type = 'norm', order.by=c("Date.Downloaded"), color.by="Date.Downloaded", make.pdf=F)
Warning message: In make.boxplot(maqc.norm.exprs, type = "norm", order.by = c("Date.Downloaded"), : RIN values were not found in the sample annotations.