---
title: "Introduction to regionReport"
author: "L Collado-Torres"
date: "`r doc_date()`"
package: "`r pkg_ver('regionReport')`"
output: 
  BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{Introduction to regionReport}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}  
---


HTML reports for a set of regions
=================================

```{r vignetteSetup, echo = FALSE, message = FALSE, warning = FALSE}
## Track time spent on making the vignette
startTimeVignette <- Sys.time()

## Bib setup
library('knitcitations')

## Load knitcitations with a clean bibliography
cleanbib()
cite_options(hyperlink = 'to.doc', citation_format = 'text', style = 'html')
# Note links won't show for now due to the following issue
# https://github.com/cboettig/knitcitations/issues/63

## Write bibliography information
bibs <- c(knitcitations = citation('knitcitations'), 
    derfinder = citation('derfinder')[1], 
    regionReport = citation('regionReport')[1],
    knitrBootstrap = citation('knitrBootstrap'),
    BiocStyle = citation('BiocStyle'),
    ggbio = citation('ggbio'),
    ggplot2 = citation('ggplot2'),
    knitr = citation('knitr')[3],
    rmarkdown = citation('rmarkdown'),
    R = citation(),
    IRanges = citation('IRanges'),
    devtools = citation('devtools'),
    GenomeInfoDb = citation('GenomeInfoDb'),
    GenomicRanges = citation('GenomicRanges'),
    biovizBase = citation('biovizBase'),
    bumphunter = citation('bumphunter'),
    TxDb.Hsapiens.UCSC.hg19.knownGene = citation('TxDb.Hsapiens.UCSC.hg19.knownGene'),
    derfinderPlot = citation('derfinderPlot')[1],
    grid = citation('grid'),
    gridExtra = citation('gridExtra'),
    mgcv = citation('mgcv'),
    RColorBrewer = citation('RColorBrewer'),
    Cairo = citation('Cairo'),
    whikser = citation('whisker'),
    bumphunter = citation('bumphunter')[1]
)

write.bibtex(bibs,
    file = 'regionReportRef.bib')
bib <- read.bibtex('regionReportRef.bib')

## Assign short names
names(bib) <- names(bibs)
```


`r Biocpkg('regionReport')` `r citep(bib[['regionReport']])` creates HTML reports 
styled with `r CRANpkg('knitrBootstrap')` `r citep(bib[['knitrBootstrap']])` for 
a set of regions such as `r Biocpkg('derfinder')` `r citep(bib[['derfinder']])` results.


This package includes a basic exploration for a general set of genomic regions which can be easily customized to include the appropriate conclusions and/or further exploration of the results. Such a report can be generated using `renderReport()`. `r Biocpkg('regionReport')` has a separate template for running a basic exploration analysis of `r Biocpkg('derfinder')` 
results by using `derfinderReport()`. Both reports are written in [R Markdown](http://rmarkdown.rstudio.com/)
format and include all the code for making the plots and explorations in the report itself. For both reports, `r Biocpkg('regionReport')` relies on
`r CRANpkg('knitr')` `r citep(bib[['knitr']])`, `r CRANpkg('rmarkdown')` 
`r citep(bib[['rmarkdown']])`, and `r CRANpkg('knitrBootstrap')` 
`r citep(bib[['knitrBootstrap']])` for generating the report. 


# Using `r Biocpkg('regionReport')`

The plots in `r Biocpkg('regionReport')` are powered by `r Biocpkg('derfinderPlot')` `r citep(bib[['derfinderPlot']])`, `r Biocpkg('ggbio')` `r citep(bib[['ggbio']])`, and `r CRANpkg('ggplot2')` `r citep(bib[['ggplot2']])`.

## Examples

The `r Biocpkg('regionReport')` supplementary website [regionReportSupp](http://leekgroup.github.io/regionReportSupp/) has examples of using `r Biocpkg('regionReport')` with results from `r Biocpkg('DiffBind')` and `r Biocpkg('derfinder')`. Included as a vignette, this package also has an example using a small data set derived from `r Biocpkg('bumphunter')`. These represent different uses of `r Biocpkg('regionReport')` for results from ChIP-seq, methylation, and RNA-seq data. In particular, the `r Biocpkg('DiffBind')` example illustrates how to expand a basic report created with `renderReport()`.

## General case

For a general use case, you first have to identify a set of genomic regions of interest and store it as a `GRanges` object. In a typical workflow you will have some variables measured for each of the regions, such as p-values and scores. `renderReport()` uses the set of regions and three main arguments:

* `pvalueVars`: this is a character vector (named optionally) with the names of the variables that are bound between 0 and 1, such as p-values. For each of these variables, `renderReport()` explores the distribution by chromosome, the overall distribution, and makes a table with commonly used cutoffs.
* `densityVars`: is another character vector (named optionally) with another set of variables you wish to explore by making density graphs. This is commonly used for scores and other similar numerical variables.
* `significantVar`: is a logical vector separating the regions into by whether they are statistically significant. For example, this information is used to explore the width of all the regions and compare it the significant ones.

Other parameters control the name of the report, where it'll be located, the transcripts database used to annotate the nearest genes, graphical parameters, etc.

Here is a short example of how to use `renderReport()`. Note that we are using regions produced by `r Biocpkg('derfinder')` just for convenience sake. You can also run this example by using `example('renderReport', 'regionReport', ask=FALSE)`.

```{r, eval = FALSE}
## Load derfinder
library('derfinder')
regions <- genomeRegions$regions

## Assign chr length
library('GenomicRanges')
seqlengths(regions) <- c('chr21' = 48129895)

## The output will be saved in the 'derfinderReport-example' directory
dir.create('renderReport-example', showWarnings = FALSE, recursive = TRUE)

## Generate the HTML report
report <- renderReport(regions, 'Example run', pvalueVars = c(
    'Q-values' = 'qvalues', 'P-values' = 'pvalues'), densityVars = c(
    'Area' = 'area', 'Mean coverage' = 'meanCoverage'), 
    significantVar = regions$qvalues <= 0.05, nBestRegions = 20,
    outdir = 'renderReport-example')
```


## `r Biocpkg('derfinder')` case

### Run `r Biocpkg('derfinder')`

Prior to using `regionReport::derfinderReport()` you must use `r Biocpkg('derfinder')` to analyze a specific data set. While there are many ways to do so, we recommend using __analyzeChr()__ with the same _prefix_ argument. Then merging the results with __mergeResults()__.

Below, we run `r Biocpkg('derfinder')` for the example data included in the package. The steps are:

1. Load derfinder
1. Create a directory where we'll store the results
1. Generate the pre-requisites for the models to use with the example data
1. Generate the statistical models
1. Analyze the example data for chr21
1. Merge the results (only one chr in this case, but in practice there'll be more)

```{r loadDerfinder}
## Load derfinder
library('derfinder')

## The output will be saved in the 'report' directory
dir.create('report', showWarnings = FALSE, recursive = TRUE)
```

The following code runs `r Biocpkg('derfinder')`.

```{r runDerfinderFake, eval=FALSE}
## Save the current path
initialPath <- getwd()
setwd(file.path(initialPath, 'report'))

## Generate output from derfinder

## Collapse the coverage information
collapsedFull <- collapseFullCoverage(list(genomeData$coverage), 
verbose=TRUE)

## Calculate library size adjustments
sampleDepths <- sampleDepth(collapsedFull, probs=c(0.5), nonzero=TRUE, 
verbose=TRUE)

## Build the models
group <- genomeInfo$pop
adjustvars <- data.frame(genomeInfo$gender)
models <- makeModels(sampleDepths, testvars=group, adjustvars=adjustvars)

## Analyze chromosome 21
analysis <- analyzeChr(chr='21', coverageInfo=genomeData, models=models, 
cutoffFstat=1, cutoffType='manual', seeds=20140330, groupInfo=group, 
mc.cores=1, writeOutput=TRUE, returnOutput=TRUE)

## Save the stats options for later
optionsStats <- analysis$optionsStats

## Change the directory back to the original one
setwd(initialPath)
```

For convenience, we have included the `r Biocpkg('derfinder')` results as part of 
`r Biocpkg('regionReport')`. Note that the above functions are routinely checked as part
of `r Biocpkg('derfinder')`.


```{r runDerfinderReal}
## Copy previous results
file.copy(system.file(file.path('extdata', 'chr21'), package='derfinder', 
mustWork=TRUE), 'report', recursive=TRUE)
```

Next, proceed to merging the results.

```{r mergeResults}
## Merge the results from the different chromosomes. In this case, there's 
## only one: chr21
mergeResults(chrs = 'chr21', prefix = 'report',
    genomicState = genomicState$fullGenome)
```



### Create report

Once the `r Biocpkg('derfinder')` output has been generated and merged, use
__derfinderReport()__ to create the HTML report.


```{r loadLib, message=FALSE}
## Load derfindeReport
library('regionReport')
```


```{r createReportFake, eval=FALSE}
## Generate the HTML report
report <- derfinderReport(prefix='report', browse=FALSE,
    nBestRegions=15, makeBestClusters=TRUE, outdir='html',
    fullCov=list('21'=genomeDataRaw$coverage), optionsStats=optionsStats)
```


```{r createReportReal, echo=FALSE, message=FALSE}
## Generate the HTML report in a clean environment
library('devtools')

cat("## Generate the report in an isolated environment
## This helps avoids conflicts with generating the vignette
library(derfinder)
library(regionReport)

## Load optionsStats
load(file.path('report', 'chr21', 'optionsStats.Rdata'))

## Create report
report <- derfinderReport(prefix='report', browse=FALSE,
    nBestRegions=15, makeBestClusters=TRUE, outdir='html',
    fullCov=list('21'=genomeDataRaw$coverage), optionsStats=optionsStats)

## Clean up
file.remove('derfinderReport-isolated.R')
", file='derfinderReport-isolated.R')
clean_source('derfinderReport-isolated.R', quiet=TRUE)

```

Once the output is generated, you can browse the report from `R` using 
__browseURL()__ as shown below.

```{r vignetteBrowse, eval=FALSE}
## Browse the report
browseURL(report)
```

You can compare the resulting report with the pre-compiled report using the 
following code.

```{r openIncludedReport, eval=FALSE}
browseURL(system.file(file.path('basicExploration', 'basicExploration.html'),  
    package = 'regionReport', mustWork = TRUE))
```

### Notes

Note that the reports require an active Internet connection to render correctly.

The report is self-explanatory and will change some of the text depending on the
input options.

If the report is taking too long to compile (say more than 3 hours), you might
want to consider setting _nBestCluters_ to a small number or even set 
_makeBestClusters_ to `FALSE`.


# Advanced arguments

If you are interested in using the advanced arguments, use `derfinder::advancedArg()` as shown below:

```{r 'advancedArg'}
## URLs to advanced arguemtns
derfinder::advancedArg('derfinderReport', package = 'regionReport',
    browse = FALSE)
## Set browse = TRUE if you want to open them in your browser
```

In particular, you might be interested in specifying the `output_format` argument in either `renderReport()` or `derfinderReport()`. For example, setting `output_format = 'pdf_document'` will generate a PDF file instead. However, you will lose interactivity for toggling hiding/showing code and the tables will be static.


# Reproducibility

This package was made possible thanks to:

* R `r citep(bib[['R']])`
* `r Biocpkg('BiocStyle')` `r citep(bib[['BiocStyle']])`
* `r Biocpkg('derfinder')` `r citep(bib[['derfinder']])`
* `r Biocpkg('derfinderPlot')` `r citep(bib[['derfinderPlot']])`
* `r CRANpkg('devtools')` `r citep(bib[['devtools']])`
* `r Biocpkg('GenomeInfoDb')` `r citep(bib[['GenomeInfoDb']])`
* `r Biocpkg('GenomicRanges')` `r citep(bib[['GenomicRanges']])`
* `r Biocpkg('ggbio')` `r citep(bib[['ggbio']])`
* `r CRANpkg('ggplot2')` `r citep(bib[['ggplot2']])`
* `r CRANpkg('grid')` `r citep(bib[['grid']])`
* `r CRANpkg('gridExtra')` `r citep(bib[['gridExtra']])`
* `r Biocpkg('IRanges')` `r citep(bib[['IRanges']])`
* `r CRANpkg('knitcitations')` `r citep(bib[['knitcitations']])`
* `r CRANpkg('knitr')` `r citep(bib[['knitr']])`
* `r CRANpkg('knitrBootstrap')` `r citep(bib[['knitrBootstrap']])`
* `r CRANpkg('mgcv')` `r citep(bib[['mgcv']])`
* `r CRANpkg('RColorBrewer')` `r citep(bib[['RColorBrewer']])`
* `r CRANpkg('rmarkdown')` `r citep(bib[['rmarkdown']])`
* `r Biocpkg('biovizBase')` `r citep(bib[['biovizBase']])`
* `r CRANpkg('Cairo')` `r citep(bib[['Cairo']])`
* `r Biocannopkg('TxDb.Hsapiens.UCSC.hg19.knownGene')` `r citep(bib[['TxDb.Hsapiens.UCSC.hg19.knownGene']])`
* `r CRANpkg('whisker')` `r citep(bib[['whisker']])`
* `r Biocpkg('bumphunter')` `r citep(bib[['bumphunter']])`

Code for creating the vignette

```{r createVignette, eval=FALSE}
## Create the vignette
library('rmarkdown')
system.time(render('regionReport.Rmd', 'BiocStyle::html_document'))

## Extract the R code
library('knitr')
knit('regionReport.Rmd', tangle = TRUE)

## Copy report output to be distributed with the package for comparison 
## purposes
if(gsub('.*/', '', getwd()) == 'realVignettes') {
    file.copy(file.path('report', 'html', 'basicExploration.html'),
        file.path('..', '..', 'inst', 'basicExploration',
            'basicExploration.html'), overwrite=TRUE)
} else {
    file.copy(file.path('report', 'html', 'basicExploration.html'),
        file.path('..', 'inst', 'basicExploration', 'basicExploration.html'),
            overwrite=TRUE)
}
```

```{r createVignette2}
## Clean up
file.remove('regionReportRef.bib')
#unlink('regionReport_files', recursive=TRUE)
unlink('report', recursive = TRUE)
```

Date the vignette was generated.

```{r vignetteReproducibility1, echo=FALSE}
## Date the report was generated
Sys.time()
```

Wallclock time spent generating the vignette.

```{r vignetteReproducibility2, echo=FALSE}
## Processing time in seconds
totalTimeVignette <- diff(c(startTimeVignette, Sys.time()))
round(totalTimeVignette, digits=3)
```

`R` session information.

```{r vignetteReproducibility3, echo=FALSE}
## Session info
library('devtools')
options(width = 120)
session_info()
```

# Bibliography

This vignette was generated using `r CRANpkg('BiocStyle')` `r citep(bib[['BiocStyle']])`
with `r CRANpkg('knitr')` `r citep(bib[['knitr']])` and `r CRANpkg('rmarkdown')` `r citep(bib[['rmarkdown']])` running behind the scenes.

Citations made with `r CRANpkg('knitcitations')` `r citep(bib[[1]])`.

```{r vignetteBiblio, results='asis', echo=FALSE, warning = FALSE}
## Print bibliography
bibliography()
```