---
title: Saving arrays to artifacts and back again
author:
- name: Aaron Lun
  email: infinite.monkeys.with.keyboards@gmail.com
package: alabaster.matrix
date: "Revised: November 28, 2023"
output:
  BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{Saving and loading arrays}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, echo=FALSE}
library(BiocStyle)
self <- Githubpkg("ArtifactDB/alabaster.matrix")
knitr::opts_chunk$set(error=FALSE, warning=FALSE, message=FALSE)
```

# Overview 

The `r self` package implements methods to save matrix-like objects to file artifacts and load them back into R.
Check out the `r Githubpkg("ArtifactDB/alabaster.base")` package for more details on the motivation and the **alabaster** framework.

# Quick start

Given an array-like object, we can use `saveObject()` to save it inside a staging directory:

```{r}
library(Matrix)
y <- rsparsematrix(1000, 100, density=0.05)

library(alabaster.matrix)
tmp <- tempfile()
saveObject(y, tmp)

list.files(tmp, recursive=TRUE)
```

We then load it back into our R session with `readObject()`.
This creates an HDF5-backed S4 array that can be easily coerced into the desired format, e.g., a `dgCMatrix`.

```{r}
roundtrip <- readObject(tmp)
class(roundtrip)
```
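The coercion mentioned above can be sketched as follows.
This is a minimal example that assumes the loaded object supports the usual `as()` coercion to `dgCMatrix`;
the names `sp`, `sp.dir` and `sp.back` are hypothetical and only used for illustration.

```{r}
library(Matrix)
library(alabaster.matrix)

# Hypothetical small example, kept separate from the objects used elsewhere.
sp <- rsparsematrix(50, 20, density=0.1)
sp.dir <- tempfile()
saveObject(sp, sp.dir)

# Coerce the file-backed result into an in-memory sparse matrix.
sp.back <- as(readObject(sp.dir), "dgCMatrix")
class(sp.back)

# The values should match the original, up to numerical tolerance.
all.equal(as.matrix(sp.back), as.matrix(sp))
```

Coercing pulls the data into memory, so for large matrices it may be preferable to keep working with the file-backed object where possible.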

This process is supported for all base arrays, `r CRANpkg("Matrix")` objects and `r Biocpkg("DelayedArray")` objects.
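For instance, the same round trip should work for an ordinary base array with more than two dimensions.
The sketch below uses hypothetical names (`arr`, `arr.dir`, `arr.back`) and assumes that the loaded object can be realized in memory with `as.array()`.

```{r}
library(alabaster.matrix)

# A small 3-dimensional base array with arbitrary values.
arr <- array(runif(24), dim=c(2, 3, 4))

arr.dir <- tempfile()
saveObject(arr, arr.dir)

# Load it back and inspect the result.
arr.back <- readObject(arr.dir)
class(arr.back)
dim(arr.back)

# Realize the loaded array in memory and compare to the original.
all.equal(as.array(arr.back), arr)
```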

# Saving delayed operations

For `DelayedArray`s, we may instead choose to save the delayed operations themselves to file. 
This creates an HDF5 file following the [**chihaya**](https://ltla.github.io/chihaya) format, containing the delayed operations rather than the results of their evaluation.

```{r}
library(DelayedArray)
y <- DelayedArray(rsparsematrix(1000, 100, 0.05))
y <- log1p(abs(y) / 1:100) # adding some delayed ops.

tmp <- tempfile()
saveObject(y, tmp, DelayedArray.preserve.ops=TRUE)

# Inspecting the HDF5 file reveals many delayed operations:
rhdf5::h5ls(file.path(tmp, "array.h5"))

# And indeed, we can recover those same operations.
readObject(tmp)
```

This allows users to avoid evaluation of the operations when saving objects,
which may improve efficiency, e.g., by avoiding loss of sparsity or casting to a larger type.
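One way to check that the operations were preserved rather than evaluated is to inspect the tree of delayed operations in the reloaded object.
The sketch below assumes that `showtree()` from `r Biocpkg("DelayedArray")` can be applied to the object returned by `readObject()`;
the names `z`, `ops.dir` and `z.back` are hypothetical.

```{r}
library(Matrix)
library(DelayedArray)
library(alabaster.matrix)

# A small sparse matrix with a couple of delayed operations on top.
z <- DelayedArray(rsparsematrix(200, 50, density=0.1))
z <- log1p(abs(z))

ops.dir <- tempfile()
saveObject(z, ops.dir, DelayedArray.preserve.ops=TRUE)
z.back <- readObject(ops.dir)

# Show the delayed operations carried by the reloaded object.
showtree(z.back)

# Evaluating both objects should give the same values.
all.equal(as.matrix(z.back), as.matrix(z))
```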

# Session information {-}

```{r}
sessionInfo()
```