---
title: "Tables"
output: rmarkdown::html_vignette
execute:
  echo: true
  warning: false
  message: false
  cache: false
vignette: >
  %\VignetteIndexEntry{a01_tables}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, echo=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = FALSE,
  fig.width = 7.2,
  fig.height = 5
)
options(rmarkdown.html_vignette.check_title = FALSE)

library(visOmopResults)
```

# Introduction

The **visOmopResults** package provides user-friendly tools for creating publication-ready tables and plots. This vignette documents the package's table functions: (1) the high-level table helpers and (2) the lower-level `format*()` functions they rely on.

Supported table types: `<tibble>` (data.frame), `<gt>`, `<flextable>`, `<tinytable>`, `<datatable>` (DT), and `<reactable>`. These table types work in R Markdown, Quarto, Shiny, and other contexts.

To list the table types supported by the package, use `tableType()`:

```{r}
tableType()
```

Although the package primarily targets the `<summarised_result>` class (see the `omopgenerics` package for details), most functions work with any `data.frame`.

## Overview of table functions

There are two main categories of table functions:

- **Main table functions**: high-level functions that return a completely formatted table object: `visTable()` and `visOmopTable()`.
- **Formatting functions**: lower-level functions (the `format*()` family) that format a `data.frame` or `<summarised_result>` step by step in a pipeline, giving finer control.

This vignette first shows the main functions and then explains the formatting building blocks, so you can compose more complex table workflows and understand the advanced options of the main functions.

# Main table functions

The high-level helpers are convenient wrappers built on top of the `format*()` functions. They accept a `result` (a `data.frame` or `<summarised_result>`) and return a rendered table object.

## `visTable()` — format any data.frame

`visTable()` formats an arbitrary `data.frame`. Basic features include renaming columns, hiding columns, grouping rows, and choosing the output table type via `type`. 

To show an example, we'll use the `penguins` dataset from `palmerpenguins`.

```{r}
library(visOmopResults)
library(palmerpenguins)
library(dplyr)
library(tidyr)

x <- penguins |>
  filter(!is.na(sex) & year == 2008) |>
  select(!"body_mass_g") |>
  summarise(across(ends_with("mm"), ~mean(.x)), .by = c("species", "island", "sex"))
head(x)
```

Here `visTable()` quickly produces a `gt` table in which the `sex` column is used for grouping, the columns are given readable names, and the `year` column is hidden:

```{r}
visTable(
  result = x,
  groupColumn = c("sex"),
  rename = c(
    "Bill length (mm)" = "bill_length_mm",
    "Bill depth (mm)" = "bill_depth_mm",
    "Flipper length (mm)" = "flipper_length_mm"
  ),
  type = "gt",
  hide = "year"
)
```

If the estimates in a `data.frame` are arranged into the three standard columns (`estimate_name`, `estimate_type`, and `estimate_value`), `visTable()` can also format them. This includes rounding values (to 2 decimals in the example below), combining estimates with `estimateName`, and moving columns into the table header with `header`:

```{r}
# Transforming to estimate columns
x <- x |>
  pivot_longer(
    cols = ends_with("_mm"),
    names_to = "estimate_name",
    values_to = "estimate_value"
  ) |>
  mutate(estimate_type = "numeric")

# Use estimateName and header 
visTable(
  result = x,
  estimateName = c(
    "Bill length - Bill depth (mm)" = "<bill_length_mm> - <bill_depth_mm>",
    "Flipper length (mm)" = "<flipper_length_mm>"
  ),
  header = c("species", "island"),
  groupColumn = "sex",
  type = "gt",
  hide = c("year", "estimate_type")
)
```

We can obtain the same table with `flextable` or `tinytable`; the `flextable` version is shown below:

```{r}
visTable(
  result = x,
  estimateName = c(
    "Bill length - Bill depth (mm)" = "<bill_length_mm> - <bill_depth_mm>",
    "Flipper length (mm)" = "<flipper_length_mm>"
  ),
  header = c("species", "island"),
  groupColumn = "sex",
  type = "flextable",
  hide = c("year", "estimate_type")
)
```

We can also produce a similar interactive table with `datatable`:

```{r}
visTable(
  result = x,
  estimateName = c(
    "Bill length - Bill depth (mm)" = "<bill_length_mm> - <bill_depth_mm>",
    "Flipper length (mm)" = "<flipper_length_mm>"
  ),
  header = c("species", "island"),
  groupColumn = "sex",
  type = "datatable",
  hide = c("year", "estimate_type")
)
```

However, `reactable` currently supports only single-level headers. To use this type, we reduce the header to one level: instead of a two-level header, we group by two columns:

```{r}
visTable(
  result = x,
  estimateName = c(
    "Bill length - Bill depth (mm)" = "<bill_length_mm> - <bill_depth_mm>",
    "Flipper length (mm)" = "<flipper_length_mm>"
  ),
  header = c("island"),
  groupColumn = c("species", "sex"),
  type = "reactable",
  hide = c("year", "estimate_type")
)
```


## `visOmopTable()` — specialized for `<summarised_result>`

`visOmopTable()` builds on `visTable()` with behavior tuned to `<summarised_result>` objects:

- The result is processed with `splitAll()` (see `omopgenerics`) internally, so column names passed to arguments must match the split output.
- `settingsColumn` lets you use settings metadata as table columns and use them in `header`, `rename`, or `groupColumn`.
- `header` can accept special values (e.g. `"strata"` to show variables in `strata_name` and `strata_level`, or `"settings"` to show all `settingsColumn` values).
- `result_id` and `estimate_type` are hidden internally.
- If the input was processed with `omopgenerics::suppress()`, suppressed estimates can be shown as `NA` or as the placeholder `<{minCellCount}` via the `showMinCellCount` argument.
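
Because column names passed to the table arguments must match the split output, it can help to inspect that output before building a table. The following is a minimal sketch, assuming that `omopgenerics::splitAll()` mirrors the splitting `visOmopTable()` performs internally; the object name `splitResult` is only illustrative.

```{r}
library(visOmopResults)

# Split the name-level column pairs of a mock <summarised_result>
# into one plain column per variable
splitResult <- mockSummarisedResult() |>
  omopgenerics::splitAll()

# These names are the candidates for `header`, `groupColumn`,
# `rename`, and `hide`
colnames(splitResult)
```

Any column listed here, plus the columns requested through `settingsColumn`, can then be referenced by name in the table arguments.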

Example using a mock `<summarised_result>`:

```{r}
result <- mockSummarisedResult() |>
  filter(strata_name == "age_group &&& sex")

# A flextable table with a few estimate formats
visOmopTable(
  result = result,
  estimateName = c(
    "N (%)" = "<count> (<percentage>%)",
    "N" = "<count>",
    "Mean (SD)" = "<mean> (<sd>)"
  ),
  header = c("package_name", "age_group"),
  groupColumn = c("cohort_name", "sex"),
  settingsColumn = "package_name",
  type = "flextable"
)
```

Example showing suppressed values (process the input with `suppress()`, then display the suppressed estimates with `showMinCellCount = TRUE`):

```{r}
result |>
  suppress(minCellCount = 1000000) |>
  visOmopTable(
    estimateName = c(
      "N (%)" = "<count> (<percentage>%)",
      "N" = "<count>",
      "Mean (SD)" = "<mean> (<sd>)"
    ),
    header = c("group"),
    groupColumn = c("strata"),
    hide = c("cdm_name"),
    showMinCellCount = TRUE,
    type = "flextable"
  )
```


## Using `.options` 

The main table functions (high-level) do not expose every granular formatting argument directly. Instead, you can pass further customisation through the `.options` list. To inspect available options and defaults:

```{r}
tableOptions()
```

These options originate from the lower-level formatting functions — see the following section to better understand how to use `.options`.
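
As a rough sketch of how this looks in practice, options are passed as a named list. The option names used here (`na`, `title`, `decimals`) are taken from the arguments of the formatting functions covered in this vignette; the values and the object name `optionsTable` are arbitrary choices for illustration.

```{r}
library(visOmopResults)

optionsTable <- visOmopTable(
  result = mockSummarisedResult(),
  estimateName = c(
    "N (%)" = "<count> (<percentage>%)",
    "Mean (SD)" = "<mean> (<sd>)"
  ),
  header = "strata",
  groupColumn = "cohort_name",
  type = "gt",
  .options = list(
    na = "-",                # text shown in empty cells
    title = "Mock results",  # table title
    decimals = c(integer = 0, numeric = 1, percentage = 1)
  )
)
optionsTable
```

Options that are not listed in `.options` keep the defaults reported by `tableOptions()`.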

Additionally, both `visTable()` and `visOmopTable()` have a `style` argument, which customises the appearance of the table using either a built-in style or one you provide. To learn more, see the [vignette](https://darwin-eu.github.io/visOmopResults/articles/a05_style.html) on styles.

# Formatting functions (the `format*()` family)

Use these functions in a pipeline to prepare your data before rendering. Typical pipeline order:

1. (Optional, for `<summarised_result>` objects) Split the object (`splitAll()`)
2. Handle suppressed values (`formatMinCellCount()`)
3. Format estimate values (`formatEstimateValue()`)
4. Combine and rename estimates (`formatEstimateName()`)
5. Prepare headers (`formatHeader()`)
6. Render the table (`formatTable()`)

The sections below illustrate steps 2 to 6 in turn.

## `formatMinCellCount()` — suppressed estimates

If estimates were suppressed with `omopgenerics::suppress()`, use `formatMinCellCount()` to mark which cells were suppressed (the function differentiates suppressed cells from `NA`).

```{r}
result <- result |> formatMinCellCount()
```
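
To see the effect of this step, the input needs to contain suppressed estimates. The sketch below is one way to try it out, assuming `omopgenerics::suppress()` is applied to the mock result first; `suppressedResult` is an illustrative name, and the very large `minCellCount` is chosen only so that counts get suppressed.

```{r}
library(visOmopResults)
library(dplyr)

# Suppress counts below an (artificially large) threshold, then mark them
suppressedResult <- mockSummarisedResult() |>
  omopgenerics::suppress(minCellCount = 1000000) |>
  formatMinCellCount()

# Inspect how the suppressed counts are now represented
suppressedResult |>
  filter(estimate_name == "count") |>
  distinct(estimate_value)
```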

## `formatEstimateValue()` — numeric formatting

Control decimal places and separators per `estimate_type` (integer, numeric, percentage and proportion) or per `estimate_name` (any estimate name in your results). 

In the example we distinguish by `estimate_type`:

```{r}
result <- result |>
  formatEstimateValue(
    decimals = c(integer = 0, numeric = 4, percentage = 2),
    decimalMark = ".",
    bigMark = ","
  )
```
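
Decimals can instead be set per `estimate_name`. A minimal sketch, assuming the mock result contains the estimates `count`, `percentage`, `mean`, and `sd` (the names used in the `estimateName` examples of this vignette); `byName` is an illustrative object name.

```{r}
library(visOmopResults)
library(dplyr)

# Set the number of decimals for each estimate name individually
byName <- mockSummarisedResult() |>
  formatEstimateValue(
    decimals = c(count = 0, percentage = 1, mean = 1, sd = 3),
    decimalMark = ".",
    bigMark = ","
  )

# Compare the formatting of the mean and sd estimates
byName |>
  filter(estimate_name %in% c("mean", "sd")) |>
  select("variable_name", "estimate_name", "estimate_value") |>
  head()
```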

## `formatEstimateName()` — combine and order estimates

Create composite estimate displays (e.g. "N (%)") and control ordering/retention of unformatted rows:

```{r}
result <- result |>
  formatEstimateName(
    estimateName = c(
      "N (%)" = "<count> (<percentage>%)",
      "N" = "<count>",
      "Mean (SD)" = "<mean> (<sd>)"
    ),
    keepNotFormatted = TRUE,
    useFormatOrder = FALSE
  )
```

If `keepNotFormatted = FALSE`, rows whose estimate name does not appear between `<>` in `estimateName` are dropped. The argument `useFormatOrder` controls whether estimates follow the order in which they are mentioned in `estimateName` (`TRUE`) or the order in the input table (`FALSE`).
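
The effect of dropping unformatted rows can be checked on a fresh mock result. This is a small sketch under the assumption that the mock result contains `mean` and `sd` estimates alongside others; `onlyMeanSd` is an illustrative name.

```{r}
library(visOmopResults)

# Keep only the estimates that are part of a requested format
onlyMeanSd <- mockSummarisedResult() |>
  formatEstimateName(
    estimateName = c("Mean (SD)" = "<mean> (<sd>)"),
    keepNotFormatted = FALSE,
    useFormatOrder = TRUE
  )

# Estimate names remaining after formatting
unique(onlyMeanSd$estimate_name)
```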


## `formatHeader()` — multi-level headers

Create multi-level column headers using up to three levels: custom `header` labels, `header_name` (derived from column names), and `header_level` (derived from column values). Use `delim` to set a delimiter for multi-line headers.

```{r}
result <- result |>
  mutate(across(c("strata_name", "strata_level"), ~ gsub("&&&", "and", .x))) |>
  formatHeader(
    header = c("Stratifications", "strata_name", "strata_level"),
    delim = "\n",
    includeHeaderName = FALSE,
    includeHeaderKey = TRUE
  )
```

It is important to set `includeHeaderKey = TRUE` for styling in the next step, since the style differentiates between `header`, `header_name`, and `header_level`.
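
To see what `formatHeader()` produces, it can be useful to look at the column names of its output, where the header information is encoded. A minimal sketch on a fresh mock result follows (`headerResult` is an illustrative name); the exact form of the keys in the names is left for you to inspect.

```{r}
library(visOmopResults)

# Move the strata columns into the header, keeping the header keys
headerResult <- mockSummarisedResult() |>
  formatHeader(
    header = c("Stratifications", "strata_name", "strata_level"),
    delim = "\n",
    includeHeaderName = FALSE,
    includeHeaderKey = TRUE
  )

# Header levels are joined by `delim` inside the new column names
colnames(headerResult)
```

The same `delim` value should later be passed to `formatTable()` so that these names can be separated back into header rows.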


## `formatTable()` — render the final table

`formatTable()` converts the prepared `data.frame` into a final `gt`, `flextable`, `tinytable`, `datatable`, or `reactable` object. It accepts options such as `na`, `title`, `subtitle`, `caption`, `groupColumn`, `groupAsColumn`, `groupOrder`, and `merge`.

Example pipeline:

```{r}
result <- result |>
  splitGroup() |>
  splitAdditional() |>
  select(!c("result_id", "estimate_type", "cdm_name"))

result |>
  formatTable(
    type = "gt",
    delim = "\n",
    na = "-",
    title = "My formatted table!",
    subtitle = "Created with the `visOmopResults` R package.",
    caption = NULL,
    groupColumn = "cohort_name",
    groupAsColumn = FALSE,
    groupOrder = c("cohort2", "cohort1"),
    merge = "variable_name"
  )
```

# Notes & recommendations

- Use `visTable()` for quick formatting of generic `data.frame` inputs. Use `visOmopTable()` when working with `<summarised_result>` objects to leverage the specialized behavior (automatic splitting, settings columns, suppressed-value handling).
- When you need fine-grained control over formatting, build a pipeline using the `format*()` functions and finish with `formatTable()`.
- Visit the [vignette](https://darwin-eu.github.io/visOmopResults/articles/a05_style.html) on styles to learn how to use the built-in styles and create your own.