---
title: "alevinQC"
author: "Charlotte Soneson"
date: "`r Sys.Date()`"
package: "alevinQC"
output: 
    BiocStyle::html_document
bibliography: bibliography.bib
abstract: >
    `r Biocpkg("alevinQC")` reads output files from alevin and 
    generates summary reports.
vignette: >
    %\VignetteIndexEntry{alevinQC}
    %\VignetteEncoding{UTF-8}  
    %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
---

```{r v1, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = TRUE,
  crop = NULL
)
```

# Introduction

The purpose of the `r Biocpkg("alevinQC")` package is to generate a summary QC
report based on the output of an
[alevin](https://salmon.readthedocs.io/en/latest/alevin.html)
[@Srivastava2019-fz] run. The QC report can be generated as a html or pdf file,
or launched as a shiny application.

# Installation

`alevinQC` can be installed using the `BiocManager` CRAN package.

```{r install, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("alevinQC")
```

After installation, load the package into the R session.

```{r v2, message=FALSE}
library(alevinQC)
```

Note that in order to process output from Salmon v0.14 or later, you need 
Alevin v1.1 or later. 

# Assumed output directory structure

For more information about running alevin, we refer to the
[documentation](https://salmon.readthedocs.io/en/latest/alevin.html). When
invoked, alevin generates several output files in the specified output
directory. `r Biocpkg("alevinQC")` assumes that this structure is retained, and
will return an error if it isn't  - thus, it is not recommended to move or
rename the output files from alevin. `r Biocpkg("alevinQC")` assumes that the
following files (in the indicated structure) are available in the provided
`baseDir` (note that currently, in order to generate the full set of files,
alevin must be invoked with the `--dumpFeatures` flag).

For alevin versions before 0.14:

```{r v3a, eval=FALSE, results='asis'}
baseDir
  |--alevin
  |    |--featureDump.txt
  |    |--filtered_cb_frequency.txt
  |    |--MappedUmi.txt
  |    |--quants_mat_cols.txt
  |    |--quants_mat_rows.txt
  |    |--quants_mat.gz
  |    |--raw_cb_frequency.txt
  |    |--whitelist.txt
  |--aux_info
  |    |--meta_info.json
  |--cmd_info.json
```

For alevin version 0.14 and later:

```{r v3b, eval=FALSE, results='asis'}
baseDir
  |--alevin
  |    |--featureDump.txt
  |    |--raw_cb_frequency.txt
  |    |--whitelist.txt  (depending on how alevin was run)
  |--aux_info
  |    |--meta_info.json
  |    |--alevin_meta_info.json
  |--cmd_info.json
```

# Check that all required alevin files are available

The report generation functions (see below) will check that all the required
files are available in the provided base directory. However, you can also call
the function `checkAlevinInputFiles()` to run the check manually. If one or more
files are missing, the function will raise an error indicating the missing
file(s).

```{r v4}
baseDir <- system.file("extdata/alevin_example_v0.14", package = "alevinQC")
checkAlevinInputFiles(baseDir = baseDir)
```

# Generate QC report

The `alevinQCReport()` function generates the QC report from the alevin output.
Depending on the file extension of the `outputFile` argument, and the value of
`outputFormat`, the function can generate either an html report or a pdf report.

```{r v5, message=FALSE, warning=FALSE}
outputDir <- tempdir()
alevinQCReport(baseDir = baseDir, sampleId = "testSample", 
               outputFile = "alevinReport.html", 
               outputFormat = "html_document",
               outputDir = outputDir, forceOverwrite = TRUE)
```

# Create shiny app

In addition to static reports, `r Biocpkg("alevinQC")` can also generate a shiny
application, containing the same summary figures as the pdf and html reports.

```{r v7, message=FALSE}
app <- alevinQCShiny(baseDir = baseDir, sampleId = "testSample")
```

Once created, the app can be launched using the `runApp()` function from the 
`r CRANpkg("shiny")` package.

```{r v8, eval=FALSE}
shiny::runApp(app)
```

# Generate individual plots

The individual plots included in the QC reports can also be independently
generated. To do so, we must first read the alevin output into an R object.

```{r v9}
alevin <- readAlevinQC(baseDir = baseDir)
```

The resulting list contains four entries:

- `cbTable`: a `data.frame` with various inferred characteristics of the
individual cell barcodes.
- `summaryTables`: a list of `data.frame`s with summary information about the
full data set, the initial set of whitelisted cells and the final set of
whitelisted cells, respectively.
- `versionTable`: a `matrix` with information about the invokation of alevin.
- `type`: a `character` scalar indicating how alevinQC interpreted the alevin
output directory.

```{r v10}
head(alevin$cbTable)
```

```{r v11}
knitr::kable(alevin$summaryTables$fullDataset)
knitr::kable(alevin$summaryTables$initialWhitelist)
knitr::kable(alevin$summaryTables$finalWhitelist)
```

```{r v12}
knitr::kable(alevin$versionTable)
```

The plots can now be generated using the dedicated plotting functions provided
with `r Biocpkg("alevinQC")` (see the help file for the respective function for
more information).

```{r v13, warning=FALSE}
plotAlevinKneeRaw(alevin$cbTable)
plotAlevinBarcodeCollapse(alevin$cbTable)
plotAlevinQuant(alevin$cbTable)
plotAlevinKneeNbrGenes(alevin$cbTable)
```

# Session info

```{r v14}
sessionInfo()
```

# References