--- title: Saving arrays to artifacts and back again author: - name: Aaron Lun email: infinite.monkeys.with.keyboards@gmail.com package: alabaster.matrix date: "Revised: September 22, 2021" output: BiocStyle::html_document vignette: > %\VignetteIndexEntry{Saving and loading arrays} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo=FALSE} library(BiocStyle) self <- Githubpkg("ArtifactDB/alabaster.matrix") knitr::opts_chunk$set(error=FALSE, warning=FALSE, message=FALSE) ``` # Overview The `r self` package implements methods to save matrix-like objects to file artifacts and load them back into R. Check out the `r Githubpkg("ArtifactDB/alabaster.base")` for more details on the motivation and the **alabaster** framework. # Quick start Given an array-like object, we can use `stageObject()` to save it inside a staging directory: ```{r} library(Matrix) y <- rsparsematrix(1000, 100, density=0.05) library(alabaster.matrix) tmp <- tempfile() dir.create(tmp) meta <- stageObject(y, tmp, "my_sparse_matrix") library(alabaster.base) .writeMetadata(meta, tmp) list.files(tmp, recursive=TRUE) ``` We then load it back into our R session with `loadObject()`. This creates a HDF5-backed `DelayedArray` that can be easily coerced into the desired format, e.g., a `dgCMatrix`. ```{r} meta <- acquireMetadata(tmp, "my_sparse_matrix/matrix.h5") roundtrip <- loadObject(meta, tmp) class(roundtrip) ``` This process is supported for all base arrays, `r CRANpkg("Matrix")` objects and `r Biocpkg("DelayedArray")` objects. # Saving delayed operations For `DelayedArray`s, we may instead choose to save the delayed operations themselves to file, using the `r Githubpkg("LTLA/chihaya")` package. This creates a HDF5 file following the [**chihaya**](https://ltla.github.io/chihaya) format, containing the delayed operations rather than the results of their evaluation. ```{r} library(DelayedArray) y <- DelayedArray(rsparsematrix(1000, 100, 0.05)) y <- log1p(abs(y) / 1:100) # adding some delayed ops. preserveDelayedOperations(TRUE) meta <- stageObject(y, tmp, "delayed") .writeMetadata(meta, tmp) meta <- acquireMetadata(tmp, "delayed/delayed.h5") roundtrip <- loadObject(meta, tmp) class(roundtrip) ``` However, it is probably best to avoid preserving delayed operations for file-backed `DelayedArray`s if you want the artifacts to be re-usable on different filesystems. For example, `HDF5Array`s will be saved with a reference to an absolute file path, which will not be portable. # Session information {-} ```{r} sessionInfo() ```