---
title: "Converting a gDR-generated MultiAssayExperiment object into a PharmacoSet" 
author:
- name: Jermiah Joseph
  email: jermiah.joseph@uhn.ca
- name: Bartosz Czech
  email: bartosz.w.czech@gmail.com
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Converting a gDR-generated MultiAssayExperiment object into a PharmacoSet}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE
)
```

```{r setup, include=FALSE}
library(PharmacoGx)
library(gDRimport)
library(MultiAssayExperiment)
```



## Introduction

The `PharmacoGx` package is a popular tool from the Bioconductor project within the field of bioinformatics and computational biology. 
Whereas the `gDR` package is a valuable tool that provides the framework for the import, processing, analysism and visualization of 
high-dimensional drug response data, the `PharmacoGx` package provides functionality to containerize processed drug response and genomic data 
to perform pharmacogenomic analyses. The data structure used by `PharmacoGx` is described as a `PharamcoSet` class, inherited from `CoreGx::CoreSet`. 
Inspired by the `SummarizedExperiment` class, a new data structure class called the `TreatmentResponseExperiment` was developed to store drug response 
data within the `PharmacoSet`. 

In collaboration with the BHKLab, we have developed functionality to convert between the `gDR`-generated `MultiAssayExperiment (MAE)` object and the `PharmacoSet` object 
(see the ConvertingPharmacoSetToGDR vignette to convert from a PharmacoSet to gDR object).


# Loading the gDR-generated MAE object
This workflow assumes that you have already generated a `MAE` object using the `gDR` package. We will load in a `MAE` object generated using 
the data from the [KW. Song et al, 2022](https://doi.org/10.1158/2159-8290.CD-21-0072) study. This `MAE` contains two drugs (`treatments`) and 7 cell lines (`samples`).  
``` {r loadMae}
(mae <- qs::qread(
    system.file("extdata", "kyung2022mae", "MAE_E2.qs", package = "gDRimport")
  )
)
```

# Converting the MAE object into a PharmacoSet object
First we will view the experiments and assays within the MultiAssayExperiment object. 
``` {r viewAssays}
gDRutils::MAEpply(mae, assays)
```

We will now convert the `MAE` object into a `PharmacoSet` object. We need to provide a name for the data object, in this example
we will use the name of the study.
``` {r, warning=FALSE}
pset <- convert_MAE_to_PSet(mae, pset_name="Kyung2022")
```

We can now view the `PharmacoSet` object. 
``` {r viewPharmacoSet}
pset
```

We can view information about the drugs and cell lines in the `PharmacoSet` object using the `treatment` and `sample` slots:

``` {r}
head(treatmentInfo(pset))

head(sampleInfo(pset))

```


# TreatmentResponseExperiment Object
The `TreatmentResponseExperiment` is a new class that was developed to store drug response data within the `PharmacoSet` object.
This vignette will cover some basics about this class, however for more information, please refer to the `PharmacoSet` vignette 
[TreatmentResponseExperiment](https://bioconductor.org/packages/release/bioc/vignettes/CoreGx/inst/doc/TreatmentResponseExperiment.html)****.

## Row and Column Names
```{r rownames, echo=TRUE}
tre <- treatmentResponse(pset)
head(rownames(tre))
head(rowData(tre))
```

```{r colnames, echo=TRUE}
head(colnames(tre))
head(colData(tre))
```

## `data.table` Subsetting

In addition to regex queries, a `TreatmentResponseExperiment` object supports
arbitrarily complex subset queries using the `data.table` API. To access this
API, you will need to use the `.` function, which allows you to pass raw R
expressions to be evaluated inside the `i` and `j` arguments for
`dataTable[i, j]`.

For example if we want to subset to rows where the cell line is "CL131891_EFM-19_Breast_EFM-19_unknown_64" and drug is "G02967907_GDC-0077_PI3K-A_168".

```{r , echo=TRUE}
tre[
  .(treatmentid=="G02967907_GDC-0077_PI3K-A_168"), # query on row
  .(sampleid=="CL131891_EFM-19_Breast_EFM-19_unknown_64") # query on column
]
```

We can also invert matches or subset on other columns in `rowData` or `colData`:

```{r echo=TRUE}
tre[
  .(treatmentid!="G02967907_GDC-0077_PI3K-A_168"), # query on row
  .(sampleid!="CL131891_EFM-19_Breast_EFM-19_unknown_64") # query on column
]
```


## assays
The assays can be viewed through the `assays` function. Each assay has a column "data_type" which references which `SummarizedExperiment` from
the `MAE` the data corresponds with. 

``` {r viewTREAssays}
lapply(assays(tre), head)
```


``` {r viewAssayNames}
assayNames(tre)
```


``` {r viewAssay}
head(assay(tre, "Metrics"),3)
```

# References
1. Song, K. W., Edgar, K. A., Hanan, E. J., Hafner, M., Oeh, J., Merchant, M., ... & Friedman, L. S. (2022). 
   RTK-dependent inducible degradation of mutant PI3K drives GDC-0077 (Inavolisib) efficacy. Cancer discovery, 12(1), 204-219.


```{r sessionInfo}
sessionInfo()
```