---
title: "Spatial heterogeneity with SPIAT"
author: "Yuzhou Feng"
date: "`r Sys.Date()`"
output:
  BiocStyle::html_document:
    self_contained: yes
    toc_float: true
    toc_depth: 4
package: "`r pkg_ver('SPIAT')`"
bibliography: "`r file.path(system.file(package='SPIAT', 'vignettes'), 'introduction.bib')`"    
vignette: >
  %\VignetteIndexEntry{Spatial heterogeneity with SPIAT}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  markdown: 
    wrap: 72
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r message=FALSE}
library(SPIAT)
```

Cell colocalisation metrics allow capturing a dominant spatial pattern
in an image. However, patterns are unlikely to be distributed evenly in
a tissue, but rather there will be spatial heterogeneity of patterns.
To measure this, SPIAT splits the image into smaller images (either
using a grid or concentric circles around a reference cell population),
followed by calculation of a spatial metric of a pattern of interest
(e.g. cell colocalisation, entropy), and then measures the Prevalence
and Distinctiveness of the pattern.

In this vignette we will use an inForm data file that's already been
formatted for SPIAT with `format_image_to_spe()`, which we can load with
`data()`. We will use `define_celltypes()` to define the cells with certain 
combinations of markers.

```{r}
data("simulated_image")

# define cell types
formatted_image <- define_celltypes(
    simulated_image, 
    categories = c("Tumour_marker","Immune_marker1,Immune_marker2", 
                   "Immune_marker1,Immune_marker3", 
                   "Immune_marker1,Immune_marker2,Immune_marker4", "OTHER"), 
    category_colname = "Phenotype", 
    names = c("Tumour", "Immune1", "Immune2", "Immune3", "Others"),
    new_colname = "Cell.Type")
```



# Localised Entropy

Entropy in spatial analysis refers to the balance in the number of cells
of distinct populations. An entropy score can be obtained for an entire
image. However, the entropy of one image does not provide us spatial
information of the image.

```{r}
calculate_entropy(formatted_image, cell_types_of_interest = c("Immune1","Immune2"), 
                  feature_colname = "Cell.Type")
```

We therefore propose the concept of Localised Entropy which calculates
an entropy score for a predefined local region. These local regions can
be calculated as defined in the next two sections.

## Fishnet grid

One approach to calculate localised metric is to split the image into
fishnet grid squares. For each grid square, `grid_metrics()` calculates
the metric for that square and visualise the raster image. Users can
choose any metric as the localised metric. Here we use entropy as an
example.

For cases where the localised metric is not symmetrical (requires
specifying a target and reference cell type), the first item in the
vector used for `cell_types_of_interest` marks the reference cells and
the second item the target cells. Here we are using Entropy, which is
symmetrical, so we can use any order of cell types in the input.

```{r, out.width = "70%"}
data("defined_image")
grid <- grid_metrics(defined_image, FUN = calculate_entropy, n_split = 20,
                     cell_types_of_interest=c("Tumour","Immune3"), 
                     feature_colname = "Cell.Type")
```

After calculating the localised entropy for each of the grid squares, we
can apply metrics like percentages of grid squares with patterns
(Prevalence) and Moran's I (Distinctiveness).

For the Prevalence, we need to select a threshold over which grid
squares are considered 'positive' for the pattern. The selection of
threshold depends on the pattern and metric the user chooses to find the
localised pattern. Here we chose 0.75 for entropy because 0.75 is
roughly the entropy of two cell types when their ratio is 1:5 or 5:1.

```{r}
calculate_percentage_of_grids(grid, threshold = 0.75, above = TRUE)
```

```{r}
calculate_spatial_autocorrelation(grid, metric = "globalmoran")
```

## Gradients (based on concentric circles)

We can use the `compute_gradient()` function to calculate metrics (entropy, mixing
score, percentage of cells within radius, marker intensity) for a range
of radii from reference cells. Here, an increasing circle is drawn
around each cell of the reference cell type and the desired score is
calculated for cells within each circle.

The first item in the vector used for `cell_types_of_interest` marks the
reference cells and the second item the target cells. Here, Immune1
cells are reference cells and Immune2 are target cells.

```{r}
gradient_positions <- c(30, 50, 100)
gradient_entropy <- 
  compute_gradient(defined_image, radii = gradient_positions, 
                   FUN = calculate_entropy,  cell_types_of_interest = c("Immune1","Immune2"),
                   feature_colname = "Cell.Type")
length(gradient_entropy)
head(gradient_entropy[[1]])
```

The `compute_gradient()` function outputs the numbers cells within each
radii for each reference cell. The output is formatted as a list of
data.frames, one for each specified radii. In each data.frame, the rows
show the reference cells. The last column of the data.frame is the
entropy calculated for cells in the circle of the reference cell. Users
can then an average score or another aggregation metric to report the
results.

An alternative approach is to combine the results of all the circles
(rather than have one for each individual reference cell). Here, for
each radii, we simultaneously identify all the cells in the circles
surrounding each reference cell, and calculate a single entropy score.
We have created a specific function for this -
`entropy_gradient_aggregated()`. The output of this function is an overall
entropy score for each radii.

```{r}
gradient_pos <- seq(50, 500, 50) ##radii
gradient_results <- entropy_gradient_aggregated(defined_image, cell_types_of_interest = c("Immune3","Tumour"),
                                                feature_colname = "Cell.Type", radii = gradient_pos)
# plot the results
plot(1:10,gradient_results$gradient_df[1, 3:12])
```

# You can access the vignettes for other modules of SPIAT here:

 - [Overview of SPIAT](SPIAT.html)
 - [Data reading and formatting](data_reading-formatting.html)
 - [Quality control and visualisation](quality-control_visualisation.html)
 - [Basic analysis](basic_analysis.html)
 - [Cell colocalisation](cell-colocalisation.html)
 - [Tissue structure](tissue-structure.html)
 - [Cellular neighborhood](neighborhood.html)
 
# Reproducibility

```{r}
sessionInfo()
```

# Author Contributions

AT, YF, TY, ML, JZ, VO, MD are authors of the package code. MD and YF
wrote the vignette. AT, YF and TY designed the package.