HTML reports for a set of regions

If you wish, you can view this vignette online here.

regionReport (Collado-Torres, Jaffe, and Leek, 2015) creates HTML reports styled with knitrBootstrap (Hester, 2013) for a set of regions such as derfinder (Collado-Torres, Frazee, Love, Irizarry, et al., 2015) results.

This package includes a basic exploration for a general set of genomic regions which can be easily customized to include the appropriate conclusions and/or further exploration of the results. Such a report can be generated using renderReport(). regionReport has a separate template for running a basic exploration analysis of derfinder results by using derfinderReport(). Both reports are written in R Markdown format and include all the code for making the plots and explorations in the report itself. For both reports, regionReport relies on knitr (Xie, 2014), rmarkdown (Allaire, Cheng, Xie, McPherson, et al., 2015), and knitrBootstrap (Hester, 2013) for generating the report.

Using regionReport

The plots in regionReport are powered by derfinderPlot (Collado-Torres, Jaffe, and Leek, 2015), ggbio (Yin, Cook, and Lawrence, 2012), and ggplot2 (Wickham, 2009).

Examples

The regionReport supplementary website regionReportSupp has examples of using regionReport with results from DiffBind, bumphunter, and derfinder. These represent different uses of regionReport for results from ChIP-seq, methylation, and RNA-seq data. In particular, the DiffBind example illustrates how to expand a basic report created with renderReport().

General case

For a general use case, you first have to identify a set of genomic regions of interest and store it as a GRanges object. In a typical workflow you will have some variables measured for each of the regions, such as p-values and scores. renderReport() uses the set of regions and three main arguments:

Other parameters control the name of the report, where it'll be located, the transcripts database used to annotate the nearest genes, graphical parameters, etc.

Here is a short example of how to use renderReport(). Note that we are using regions produced by derfinder just for convenience sake. You can also run this example by using example('renderReport', 'regionReport', ask=FALSE).

## Load derfinder
library('derfinder')
regions <- genomeRegions$regions

## Assign chr length
library('GenomicRanges')
seqlengths(regions) <- c('chr21' = 48129895)

## The output will be saved in the 'derfinderReport-example' directory
dir.create('renderReport-example', showWarnings = FALSE, recursive = TRUE)

## Generate the HTML report
report <- renderReport(regions, 'Example run', pvalueVars = c(
    'Q-values' = 'qvalues', 'P-values' = 'pvalues'), densityVars = c(
    'Area' = 'area', 'Mean coverage' = 'meanCoverage'), 
    significantVar = regions$qvalues <= 0.05, nBestRegions = 20,
    outdir = 'renderReport-example')

derfinder case

Run derfinder

Prior to using regionReport::derfinderReport() you must use derfinder to analyze a specific data set. While there are many ways to do so, we recommend using analyzeChr() with the same prefix argument. Then merging the results with mergeResults().

Below, we run derfinder for the example data included in the package. The steps are:

  1. Load derfinder
  2. Create a directory where we'll store the results
  3. Generate the pre-requisites for the models to use with the example data
  4. Generate the statistical models
  5. Analyze the example data for chr21
  6. Merge the results (only one chr in this case, but in practice there'll be more)
## Load derfinder
library('derfinder')

## The output will be saved in the 'report' directory
dir.create('report', showWarnings = FALSE, recursive = TRUE)

The following code runs derfinder.

## Save the current path
initialPath <- getwd()
setwd(file.path(initialPath, 'report'))

## Generate output from derfinder

## Collapse the coverage information
collapsedFull <- collapseFullCoverage(list(genomeData$coverage), 
verbose=TRUE)

## Calculate library size adjustments
sampleDepths <- sampleDepth(collapsedFull, probs=c(0.5), nonzero=TRUE, 
verbose=TRUE)

## Build the models
group <- genomeInfo$pop
adjustvars <- data.frame(genomeInfo$gender)
models <- makeModels(sampleDepths, testvars=group, adjustvars=adjustvars)

## Analyze chromosome 21
analysis <- analyzeChr(chr='21', coverageInfo=genomeData, models=models, 
cutoffFstat=1, cutoffType='manual', seeds=20140330, groupInfo=group, 
mc.cores=1, writeOutput=TRUE, returnOutput=TRUE)

## Save the stats options for later
optionsStats <- analysis$optionsStats

## Change the directory back to the original one
setwd(initialPath)

For convenience, we have included the derfinder results as part of regionReport. Note that the above functions are routinely checked as part of derfinder.

## Copy previous results
file.copy(system.file(file.path('extdata', 'chr21'), package='derfinder', 
mustWork=TRUE), 'report', recursive=TRUE)
## [1] TRUE

Next, proceed to merging the results.

## Merge the results from the different chromosomes. In this case, there's 
## only one: chr21
mergeResults(chrs = 'chr21', prefix = 'report',
    genomicState = genomicState$fullGenome)
## 2015-06-09 22:36:18 mergeResults: Saving options used
## 2015-06-09 22:36:18 Loading chromosome chr21
## Neither 'cutoffFstatUsed' nor 'optionsStats' were supplied, so the FWER calculation step will be skipped.
## 2015-06-09 22:36:18 mergeResults: Saving fullNullSummary
## 2015-06-09 22:36:18 mergeResults: Re-calculating the p-values
## 2015-06-09 22:36:18 mergeResults: Saving fullRegions
## 2015-06-09 22:36:18 mergeResults: assigning genomic states
## 2015-06-09 22:36:18 annotateRegions: counting
## 2015-06-09 22:36:18 annotateRegions: annotating
## 2015-06-09 22:36:18 mergeResults: Saving fullAnnotatedRegions
## 2015-06-09 22:36:18 mergeResults: Saving fullFstats
## 2015-06-09 22:36:18 mergeResults: Saving fullTime

Create report

Once the derfinder output has been generated and merged, use derfinderReport() to create the HTML report.

## Load derfindeReport
library('regionReport')

## Generate the HTML report
report <- derfinderReport(prefix='report', browse=FALSE,
    nBestRegions=15, makeBestClusters=TRUE, outdir='html',
    fullCov=list('21'=genomeDataRaw$coverage), optionsStats=optionsStats)

Once the output is generated, you can browse the report from R using browseURL() as shown below.

## Browse the report
browseURL(report)

You can compare the resulting report with the pre-compiled report using the following code.

browseURL(system.file(file.path('basicExploration', 'basicExploration.html'),  
    package = 'regionReport', mustWork = TRUE))

Notes

Note that the reports require an active Internet connection to render correctly.

The report is self-explanatory and will change some of the text depending on the input options.

If the report is taking too long to compile (say more than 3 hours), you might want to consider setting nBestCluters to a small number or even set makeBestClusters to FALSE.

Advanced arguments

If you are interested in using the advanced arguments, use derfinder::advancedArg() as shown below:

## URLs to advanced arguemtns
derfinder::advancedArg('derfinderReport', package = 'regionReport',
    browse = FALSE)
## Set browse = TRUE if you want to open them in your browser

In particular, you might be interested in specifying the output_format argument in either renderReport() or derfinderReport(). For example, setting output_format = 'pdf_document' will generate a PDF file instead. However, you will lose interactivity for toggling hiding/showing code and the tables will be static.

Reproducibility

This package was made possible thanks to:

Code for creating the vignette

## Create the vignette
library('knitrBootstrap') 

knitrBootstrapFlag <- packageVersion('knitrBootstrap') < '1.0.0'
if(knitrBootstrapFlag) {
    ## CRAN version
    system.time(knit_bootstrap('regionReport.Rmd', chooser=c('boot', 'code'), show_code = TRUE))
    unlink('regionReport.md')
} else {
    ## GitHub version
    library('rmarkdown')
    system.time(render('regionReport.Rmd',
        'knitrBootstrap::bootstrap_document'))
}
## Note: if you prefer the knitr version use:
# library('rmarkdown')
# system.time(render('regionReport.Rmd', 'html_document'))

## Extract the R code
library('knitr')
knit('regionReport.Rmd', tangle = TRUE)

## Copy report output to be distributed with the package for comparison 
## purposes
if(gsub('.*/', '', getwd()) == 'realVignettes') {
    file.copy(file.path('report', 'html', 'basicExploration.html'),
        file.path('..', '..', 'inst', 'basicExploration',
            'basicExploration.html'), overwrite=TRUE)
} else {
    file.copy(file.path('report', 'html', 'basicExploration.html'),
        file.path('..', 'inst', 'basicExploration', 'basicExploration.html'),
            overwrite=TRUE)
}

       
## Clean up
file.remove('regionReportRef.bib')
#unlink('regionReport_files', recursive=TRUE)
unlink('report', recursive = TRUE)

Date the vignette was generated.

## [1] "2015-06-09 22:37:02 PDT"

Wallclock time spent generating the vignette.

## Time difference of 52.559 secs

R session information.

## Session info -----------------------------------------------------------------------------------------------------------
##  setting  value                                   
##  version  R version 3.2.1 beta (2015-06-08 r68489)
##  system   x86_64, linux-gnu                       
##  ui       X11                                     
##  language en_US:                                  
##  collate  C                                       
##  tz       <NA>
## Packages ---------------------------------------------------------------------------------------------------------------
##  package           * version  date       source        
##  AnnotationDbi       1.30.1   2015-06-10 Bioconductor  
##  BSgenome            1.36.0   2015-06-10 Bioconductor  
##  Biobase             2.28.0   2015-06-10 Bioconductor  
##  BiocGenerics        0.14.0   2015-06-10 Bioconductor  
##  BiocParallel        1.2.2    2015-06-10 Bioconductor  
##  Biostrings          2.36.1   2015-06-10 Bioconductor  
##  DBI                 0.3.1    2014-09-24 CRAN (R 3.2.1)
##  Formula             1.2-1    2015-04-07 CRAN (R 3.2.1)
##  GGally              0.5.0    2014-12-02 CRAN (R 3.2.1)
##  GenomeInfoDb        1.4.0    2015-06-10 Bioconductor  
##  GenomicAlignments   1.4.1    2015-06-10 Bioconductor  
##  GenomicFeatures     1.20.1   2015-06-10 Bioconductor  
##  GenomicFiles        1.4.0    2015-06-10 Bioconductor  
##  GenomicRanges       1.20.5   2015-06-10 Bioconductor  
##  Hmisc               3.16-0   2015-04-30 CRAN (R 3.2.1)
##  IRanges             2.2.4    2015-06-10 Bioconductor  
##  MASS                7.3-40   2015-03-21 CRAN (R 3.2.1)
##  Matrix              1.2-1    2015-06-01 CRAN (R 3.2.1)
##  OrganismDbi         1.10.0   2015-06-10 Bioconductor  
##  RBGL                1.44.0   2015-06-10 Bioconductor  
##  RColorBrewer        1.1-2    2014-12-07 CRAN (R 3.2.1)
##  RCurl               1.95-4.6 2015-04-24 CRAN (R 3.2.1)
##  RJSONIO             1.3-0    2014-07-28 CRAN (R 3.2.1)
##  RSQLite             1.0.0    2014-10-25 CRAN (R 3.2.1)
##  Rcpp                0.11.6   2015-05-01 CRAN (R 3.2.1)
##  RefManageR          0.8.45   2015-01-09 CRAN (R 3.2.1)
##  Rsamtools           1.20.4   2015-06-10 Bioconductor  
##  S4Vectors           0.6.0    2015-06-10 Bioconductor  
##  VariantAnnotation   1.14.1   2015-06-10 Bioconductor  
##  XML                 3.98-1.2 2015-05-31 CRAN (R 3.2.1)
##  XVector             0.8.0    2015-06-10 Bioconductor  
##  acepack             1.3-3.3  2014-11-24 CRAN (R 3.2.1)
##  bibtex              0.4.0    2014-12-31 CRAN (R 3.2.1)
##  biomaRt             2.24.0   2015-06-10 Bioconductor  
##  biovizBase          1.16.0   2015-06-10 Bioconductor  
##  bitops              1.0-6    2013-08-17 CRAN (R 3.2.1)
##  bumphunter          1.8.0    2015-06-10 Bioconductor  
##  cluster             2.0.1    2015-01-31 CRAN (R 3.2.1)
##  codetools           0.2-11   2015-03-10 CRAN (R 3.2.1)
##  colorspace          1.2-6    2015-03-11 CRAN (R 3.2.1)
##  curl                0.8      2015-06-06 CRAN (R 3.2.1)
##  derfinder         * 1.2.0    2015-06-10 Bioconductor  
##  derfinderHelper     1.2.0    2015-06-10 Bioconductor  
##  derfinderPlot       1.2.0    2015-06-10 Bioconductor  
##  devtools          * 1.8.0    2015-05-09 CRAN (R 3.2.1)
##  dichromat           2.0-0    2013-01-24 CRAN (R 3.2.1)
##  digest              0.6.8    2014-12-31 CRAN (R 3.2.1)
##  doRNG               1.6      2014-03-07 CRAN (R 3.2.1)
##  evaluate            0.7      2015-04-21 CRAN (R 3.2.1)
##  foreach             1.4.2    2014-04-11 CRAN (R 3.2.1)
##  foreign             0.8-63   2015-02-20 CRAN (R 3.2.1)
##  formatR             1.2      2015-04-21 CRAN (R 3.2.1)
##  futile.logger       1.4.1    2015-04-20 CRAN (R 3.2.1)
##  futile.options      1.0.0    2010-04-06 CRAN (R 3.2.1)
##  ggbio               1.16.0   2015-06-10 Bioconductor  
##  ggplot2             1.0.1    2015-03-17 CRAN (R 3.2.1)
##  git2r               0.10.1   2015-05-07 CRAN (R 3.2.1)
##  graph               1.46.0   2015-06-10 Bioconductor  
##  gridExtra           0.9.1    2012-08-09 CRAN (R 3.2.1)
##  gtable              0.1.2    2012-12-05 CRAN (R 3.2.1)
##  highr               0.5      2015-04-21 CRAN (R 3.2.1)
##  htmltools           0.2.6    2014-09-08 CRAN (R 3.2.1)
##  httr                0.6.1    2015-01-01 CRAN (R 3.2.1)
##  iterators           1.0.7    2014-04-11 CRAN (R 3.2.1)
##  knitcitations     * 1.0.6    2015-05-26 CRAN (R 3.2.1)
##  knitr               1.10.5   2015-05-06 CRAN (R 3.2.1)
##  knitrBootstrap    * 0.9.0    2013-10-17 CRAN (R 3.2.1)
##  lambda.r            1.1.7    2015-03-20 CRAN (R 3.2.1)
##  lattice             0.20-31  2015-03-30 CRAN (R 3.2.1)
##  latticeExtra        0.6-26   2013-08-15 CRAN (R 3.2.1)
##  locfit              1.5-9.1  2013-04-20 CRAN (R 3.2.1)
##  lubridate           1.3.3    2013-12-31 CRAN (R 3.2.1)
##  magrittr            1.5      2014-11-22 CRAN (R 3.2.1)
##  markdown            0.7.7    2015-04-22 CRAN (R 3.2.1)
##  matrixStats         0.14.0   2015-02-14 CRAN (R 3.2.1)
##  memoise             0.2.1    2014-04-22 CRAN (R 3.2.1)
##  mgcv                1.8-6    2015-03-31 CRAN (R 3.2.1)
##  munsell             0.4.2    2013-07-11 CRAN (R 3.2.1)
##  nlme                3.1-120  2015-02-20 CRAN (R 3.2.1)
##  nnet                7.3-9    2015-02-11 CRAN (R 3.2.1)
##  pkgmaker            0.22     2014-05-14 CRAN (R 3.2.1)
##  plyr                1.8.2    2015-04-21 CRAN (R 3.2.1)
##  proto               0.3-10   2012-12-22 CRAN (R 3.2.1)
##  qvalue              2.0.0    2015-06-10 Bioconductor  
##  regionReport      * 1.2.1    2015-06-10 Bioconductor  
##  registry            0.2      2012-01-24 CRAN (R 3.2.1)
##  reshape             0.8.5    2014-04-23 CRAN (R 3.2.1)
##  reshape2            1.4.1    2014-12-06 CRAN (R 3.2.1)
##  rmarkdown           0.6.1    2015-05-07 CRAN (R 3.2.1)
##  rngtools            1.2.4    2014-03-06 CRAN (R 3.2.1)
##  rpart               4.1-9    2015-02-24 CRAN (R 3.2.1)
##  rtracklayer         1.28.4   2015-06-10 Bioconductor  
##  rversions           1.0.1    2015-06-06 CRAN (R 3.2.1)
##  scales              0.2.4    2014-04-22 CRAN (R 3.2.1)
##  stringi             0.4-1    2014-12-14 CRAN (R 3.2.1)
##  stringr             1.0.0    2015-04-30 CRAN (R 3.2.1)
##  survival            2.38-1   2015-02-24 CRAN (R 3.2.1)
##  whisker             0.3-2    2013-04-28 CRAN (R 3.2.1)
##  xml2                0.1.1    2015-06-02 CRAN (R 3.2.1)
##  xtable              1.7-4    2014-09-12 CRAN (R 3.2.1)
##  zlibbioc            1.14.0   2015-06-10 Bioconductor

Bibliography

This vignette was generated using knitrBootstrap (Hester, 2013) with knitr (Xie, 2014) and rmarkdown (Allaire, Cheng, Xie, McPherson, et al., 2015) running behind the scenes.

Citations made with knitcitations (Boettiger, 2015).

[1] J. Allaire, J. Cheng, Y. Xie, J. McPherson, et al. rmarkdown: Dynamic Documents for R. R package version 0.6.1. 2015. URL: http://CRAN.R-project.org/package=rmarkdown.

[2] S. Arora, M. Morgan, M. Carlson and H. Pages. GenomeInfoDb: Utilities for manipulating chromosome and other 'seqname' identifiers. R package version 1.4.0. 2015.

[3] B. Auguie. gridExtra: functions in Grid graphics. R package version 0.9.1. 2012. URL: http://CRAN.R-project.org/package=gridExtra.

[4] C. Boettiger. knitcitations: Citations for Knitr Markdown Files. R package version 1.0.6. 2015. URL: http://CRAN.R-project.org/package=knitcitations.

[5] M. Carlson. TxDb.Hsapiens.UCSC.hg19.knownGene: Annotation package for TxDb object(s). R package version 3.1.2. 2015.

[6] L. Collado-Torres, A. C. Frazee, M. I. Love, R. A. Irizarry, et al. “derfinder: Software for annotation-agnostic RNA-seq differential expression analysis”. In: bioRxiv (2015). DOI: 10.1101/015370. URL: http://www.biorxiv.org/content/early/2015/02/19/015370.abstract.

[7] L. Collado-Torres, A. E. Jaffe and J. T. Leek. derfinderPlot: Plotting functions for derfinder. https://github.com/lcolladotor/derfinderPlot - R package version 1.2.0. 2015. URL: http://www.bioconductor.org/packages/release/bioc/html/derfinderPlot.html.

[8] L. Collado-Torres, A. E. Jaffe and J. T. Leek. regionReport: Generate HTML reports for exploring a set of regions. https://github.com/leekgroup/regionReport - R package version 1.2.1. 2015. URL: http://www.bioconductor.org/packages/release/bioc/html/regionReport.html.

[9] J. Hester. knitrBootstrap: Knitr Bootstrap framework. R package version 0.9.0. 2013. URL: http://CRAN.R-project.org/package=knitrBootstrap.

[10] A. E. Jaffe, P. Murakami, H. Lee, J. T. Leek, et al. “Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies”. In: International journal of epidemiology 41.1 (2012), pp. 200–209. DOI: 10.1093/ije/dyr238.

[11] M. Lawrence, W. Huber, H. Pagès, P. Aboyoun, et al. “Software for Computing and Annotating Genomic Ranges”. In: PLoS Computational Biology 9 (8 2013). DOI: 10.1371/journal.pcbi.1003118. URL: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118}.

[12] E. Neuwirth. RColorBrewer: ColorBrewer Palettes. R package version 1.1-2. 2014. URL: http://CRAN.R-project.org/package=RColorBrewer.

[13] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2015. URL: http://www.R-project.org/.

[14] S. Urbanek and J. Horner. Cairo: R graphics device using cairo graphics library for creating high-quality bitmap (PNG, JPEG, TIFF), vector (PDF, SVG, PostScript) and display (X11 and Win32) output. R package version 1.5-6. 2014. URL: http://CRAN.R-project.org/package=Cairo.

[15] H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, 2009. ISBN: 978-0-387-98140-6. URL: http://had.co.nz/ggplot2/book.

[16] H. Wickham and W. Chang. devtools: Tools to Make Developing R Packages Easier. R package version 1.8.0. 2015. URL: http://CRAN.R-project.org/package=devtools.

[17] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. ISBN 978-1466561595. Chapman and Hall/CRC, 2014. URL: http://www.crcpress.com/product/isbn/9781466561595.

[18] T. Yin, D. Cook and M. Lawrence. “ggbio: an R package for extending the grammar of graphics for genomic data”. In: Genome Biology 13.8 (2012), p. R77.

[19] T. Yin, M. Lawrence and D. Cook. biovizBase: Basic graphic utilities for visualization of genomic data. R package version 1.16.0. 2015.