---
title: "scNMT Mouse Gastrulation"
author: 
  - name: "Marcel Ramos Pérez"
date: "`r BiocStyle::doc_date()`"
vignette: |
  %\VignetteIndexEntry{scNMT Mouse Gastrulation}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
output:
    BiocStyle::html_document:
      toc_float: true
package: SingleCellMultiModal
bibliography: ../inst/REFERENCES.bib
---

# Installation

```{r,eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("SingleCellMultiModal")
```

## Load packages

```{r,include=TRUE,results="hide",message=FALSE,warning=FALSE}
library(SingleCellMultiModal)
library(MultiAssayExperiment)
```

# scNMT: single-cell nucleosome, methylation and transcription sequencing

The dataset was graciously provided by @Argelaguet2019-et.

Scripts used to process the raw data were written and maintained by Argelaguet
and colleagues and reside on GitHub:
https://github.com/rargelaguet/scnmt_gastrulation

For more information on the protocol, see @Clark2018-qg.

## Dataset lookup

The user can see the available datasets by using the `dry.run` argument:

```{r}
scNMT("mouse_gastrulation", mode = "*", version = "1.0.0", dry.run = TRUE)
```

Or by simply running the `scNMT` function with defaults:

```{r}
scNMT("mouse_gastrulation", version = "1.0.0")
```

## Data versions

A more recent release of the 'mouse_gastrulation' dataset was provided
by Argelaguet and colleagues. This dataset includes additional cells that
did not pass the original quality metrics as imposed for the version `1.0.0`
dataset.

Use the `version` argument to indicate the newer dataset version
(i.e., `2.0.0`):

```{r}
scNMT("mouse_gastrulation", version = '2.0.0', dry.run = TRUE)
```

## Downloading the data

To obtain the data, we can use the `mode` argument to indicate specific
datasets using 'glob' patterns that will match the outputs above. For example,
if we would like to have all 'genebody' datasets for all available assays,
we would use `*_genebody` as an input to `mode`.

```{r,message=FALSE}
nmt <- scNMT("mouse_gastrulation", mode = c("*_DHS", "*_cgi", "*_genebody"),
    version = "1.0.0", dry.run = FALSE)
nmt
```

## Checking the cell metadata

Included in the `colData` `DataFrame` within the `MultiAssayExperiment` class
are the variables `cellID`, `stage`, `lineage10x_2`, and `stage_lineage`.
To extract this `DataFrame`, one has to use `colData` on the
`MultiAssayExperiment` object:

```{r}
colData(nmt)
```

## Exploring the data structure

Check row annotations:

```{r}
rownames(nmt)
```

The `sampleMap` is a graph representation of the relationships between cells
and 'assay' datasets:

```{r}
sampleMap(nmt)
```

Take a look at the cell identifiers or barcodes across assays:

```{r}
colnames(nmt)
```

## Chromatin Accessibility (acc_*)

See the accessibilty levels (as proportions) for DNase Hypersensitive Sites:

```{r}
head(assay(nmt, "acc_DHS"))[, 1:4]
```

## DNA Methylation (met_*)

See the methylation percentage / proportion:

```{r}
head(assay(nmt, "met_DHS"))[, 1:4]
```

For protocol information, see the references below.

# sessionInfo

```{r}
sessionInfo()
```

# References