Package 'PSpower'


Type: Package
Title: Sample Size and Power for Propensity Score Weighted Estimators
Version: 2.0.0
Description: Computes sample size and power for causal inference studies that use propensity score (PS) weighting. Supports continuous, binary, and time-to-event (survival) outcomes under four estimands: average treatment effect (ATE), average treatment effect on the treated (ATT), average treatment effect on the controls (ATC), and average treatment effect on the overlap population (ATO). For continuous and binary outcomes, the asymptotic variance of the Hajek inverse probability weighting estimator is derived under a logit-normal propensity score model, approximated by a Beta distribution matched through the Bhattacharyya overlap coefficient. For survival outcomes, the asymptotic variance of the propensity-score-weighted partial likelihood estimator is used for randomized trials and observational studies. The Schoenfeld formula is also available for randomized trial settings.
License: GPL-3
Encoding: UTF-8
Depends: R (≥ 3.5.0)
RoxygenNote: 7.3.3
Imports: stats
Suggests: ggplot2, knitr, rmarkdown, testthat (≥ 3.0.0)
VignetteBuilder: knitr
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2026-05-14 06:40:00 UTC; chengxin
Author: Bo Liu [aut, cre], Chengxin Yang [aut], Fan Li [aut]
Maintainer: Bo Liu <bo.liu1997@gmail.com>
Repository: CRAN
Date/Publication: 2026-05-14 08:40:08 UTC

Bhattacharyya overlap coefficient for propensity score distributions

Description

Two calling conventions are provided:

From propensity scores (empirical):

Supply ps and Z. The empirical formula is

\hat\phi = \frac{\mathrm{E}[\sqrt{e(1-e)}]}{\sqrt{r(1-r)}},

where r = \mathrm{E}[Z]. This is the sample mean of \sqrt{e_i(1-e_i)} divided by \sqrt{\hat r(1-\hat r)}.

From Beta parameters (analytical):

Supply a and b. Under the \mathrm{Beta}(a,b) approximation,

\phi = \exp\!\Bigl[ \log\Gamma(a+\tfrac12) - \tfrac12\log a - \log\Gamma(a) + \log\Gamma(b+\tfrac12) - \tfrac12\log b - \log\Gamma(b) \Bigr].
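Both formulas can be evaluated directly in base R. The sketch below only illustrates the two expressions above and is not the package's implementation; the helper names phi_empirical and phi_beta are hypothetical. The last lines cross-check the analytical formula against numerical integration of E[\sqrt{e(1-e)}] under \mathrm{Beta}(2, 3).

```r
# Empirical overlap coefficient: mean of sqrt(e_i (1 - e_i))
# divided by sqrt(r_hat (1 - r_hat)), with r_hat = mean(Z).
phi_empirical <- function(ps, Z) {
  r_hat <- mean(Z)
  mean(sqrt(ps * (1 - ps))) / sqrt(r_hat * (1 - r_hat))
}

# Analytical overlap coefficient under Beta(a, b), computed on the log scale.
phi_beta <- function(a, b) {
  exp(lgamma(a + 0.5) - 0.5 * log(a) - lgamma(a) +
      lgamma(b + 0.5) - 0.5 * log(b) - lgamma(b))
}

# Cross-check phi_beta against direct numerical integration for Beta(2, 3).
a <- 2
b <- 3
r <- a / (a + b)
num <- integrate(function(e) sqrt(e * (1 - e)) * dbeta(e, a, b),
                 lower = 0, upper = 1, rel.tol = 1e-10)$value
phi_numeric <- num / sqrt(r * (1 - r))
all.equal(phi_beta(a, b), phi_numeric, tolerance = 1e-6)
```

When every propensity score equals the treatment proportion (for example ps = 0.5 with half the units treated), phi_empirical returns exactly 1, which matches the reading of \phi = 1 as complete overlap.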

Usage

overlap_coef(ps = NULL, Z = NULL, a = NULL, b = NULL)

Arguments

ps

Numeric vector of estimated propensity scores e_i = \Pr(Z_i = 1 \mid X_i), all in (0, 1). Required when a and b are not supplied.

Z

Integer or numeric vector of treatment indicators (Z_i \in \{0, 1\}), the same length as ps. Required when ps is supplied.

a

Shape parameter a > 0 of the Beta distribution. Supply together with b to use the analytical formula.

b

Shape parameter b > 0 of the Beta distribution. Supply together with a to use the analytical formula.

Details

Computes the Bhattacharyya overlap coefficient \phi, a scalar measure of propensity score overlap between the treatment and control groups. Values close to 1 indicate near-complete overlap (little confounding); values well below 1 indicate poor overlap.

Value

A list with components:

phi

The overlap coefficient \hat\phi.

r

Treatment proportion: mean(Z) (empirical) or a / (a + b) (analytical).

References

Chengxin Yang, Bo Liu, and Fan Li. Sample size and power calculations for causal inference with time-to-event outcomes. arXiv preprint arXiv:2605.10088 (2026).

Bo Liu, Chengxin Yang, and Fan Li. Sample size and power calculations for causal inference with continuous and binary outcomes. Annals of Statistics (2026).

See Also

power_ps, power_cox

Examples

# From propensity scores
set.seed(1)
n  <- 500
X  <- rnorm(n)
ps <- plogis(0.5 * X)
Z  <- rbinom(n, 1, ps)
overlap_coef(ps = ps, Z = Z)

# From Beta parameters
overlap_coef(a = 2, b = 3)


Sample size and power for PS-weighted marginal Cox model

Description

The required sample size is

N = V\,(z_{1-\alpha} + z_{\beta})^2 \,/\, \tau_0^2,

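where \tau_0 = \log(\text{HR}) is the target log hazard ratio, V is the asymptotic variance of the estimator, \alpha is the significance level, \beta is the target power, and z_q is the q-th quantile of the standard normal distribution. The formula is shown for the default one-sided test; for a two-sided test, z_{1-\alpha} is replaced by z_{1-\alpha/2}.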

Randomized trial — robust sandwich variance (method = "robust"):

V_{RCT} = \frac{(\lambda_1 + \lambda_0)^2 \bigl[r\lambda_0^2 d_1 + (1-r)\lambda_1^2 d_0\bigr]}{d^2},

where \lambda_1 = \sqrt{r/(1-r)}\,e^{\tau_0/2}, \lambda_0 = 1/\lambda_1, and d = r\,d_1 + (1-r)\,d_0.

Randomized trial — Schoenfeld formula (method = "schoenfeld"):

V_{Sch} = \frac{1}{r(1-r)\,d}.

Note: the Schoenfeld formula is derived under a null effect and may underestimate or overestimate the required sample size at non-null effects.
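The two randomized-trial variances and the sample size formula can be written out in a few lines of base R. This is a minimal sketch of the expressions above, not the package's code: the helper names cox_rct_variance and cox_sample_size are hypothetical, the one-sided critical value z_{1-\alpha} is used, and N is left unrounded because this page does not say how power_cox rounds it.

```r
# Asymptotic variance V for a randomized trial (hypothetical helper).
cox_rct_variance <- function(tau0, r, d1, d0 = d1,
                             method = c("robust", "schoenfeld")) {
  method <- match.arg(method)
  d <- r * d1 + (1 - r) * d0                  # overall event rate
  if (method == "schoenfeld") {
    return(1 / (r * (1 - r) * d))
  }
  lambda1 <- sqrt(r / (1 - r)) * exp(tau0 / 2)
  lambda0 <- 1 / lambda1
  (lambda1 + lambda0)^2 *
    (r * lambda0^2 * d1 + (1 - r) * lambda1^2 * d0) / d^2
}

# N = V (z_{1-alpha} + z_beta)^2 / tau0^2 (one-sided, not rounded).
cox_sample_size <- function(V, tau0, power, sig_level = 0.05) {
  V * (qnorm(1 - sig_level) + qnorm(power))^2 / tau0^2
}

tau0  <- log(0.6)
V_rob <- cox_rct_variance(tau0, r = 0.5, d1 = 0.8, method = "robust")
V_sch <- cox_rct_variance(tau0, r = 0.5, d1 = 0.8, method = "schoenfeld")
c(robust     = cox_sample_size(V_rob, tau0, power = 0.8),
  schoenfeld = cox_sample_size(V_sch, tau0, power = 0.8))
```

At \tau_0 = 0 with d_1 = d_0, the robust expression reduces algebraically to 1 / \{r(1-r)\,d\}, so the two methods agree under the null and differ only at non-null effects.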

Observational study — inverse probability weights (ATE), robust sandwich variance (study_type = "obs", estimand = "ATE"):

V_{obs} = \frac{(\lambda_1+\lambda_0)^2}{d^2} \left[r^2\lambda_0^2 d_1\,\frac{a+b-1}{a-1} + (1-r)^2\lambda_1^2 d_0\,\frac{a+b-1}{b-1}\right],

where a and b are the Beta distribution parameters determined by (r, \phi). The formula requires \min(a, b) > 1.
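The following base R sketch shows one way to evaluate V_{obs}. It assumes that (a, b) are matched to (r, \phi) by requiring a / (a + b) = r and setting the analytical overlap coefficient of \mathrm{Beta}(a, b) equal to \phi; the package's actual matching routine is not shown on this page, so treat this rule and the helper names phi_beta, beta_from_r_phi and cox_obs_ate_variance as assumptions.

```r
# Analytical overlap coefficient under Beta(a, b).
phi_beta <- function(a, b) {
  exp(lgamma(a + 0.5) - 0.5 * log(a) - lgamma(a) +
      lgamma(b + 0.5) - 0.5 * log(b) - lgamma(b))
}

# Assumed matching rule: a / (a + b) = r and phi_beta(a, b) = phi.
# Solve for the concentration k = a + b on the log scale.
beta_from_r_phi <- function(r, phi) {
  f <- function(log_k) {
    phi_beta(r * exp(log_k), (1 - r) * exp(log_k)) - phi
  }
  k <- exp(uniroot(f, interval = c(-10, 15), tol = 1e-10)$root)
  c(a = r * k, b = (1 - r) * k)
}

# V_obs for the ATE with inverse probability weights.
cox_obs_ate_variance <- function(tau0, r, d1, d0 = d1, a, b) {
  stopifnot(min(a, b) > 1)
  d <- r * d1 + (1 - r) * d0
  lambda1 <- sqrt(r / (1 - r)) * exp(tau0 / 2)
  lambda0 <- 1 / lambda1
  (lambda1 + lambda0)^2 / d^2 *
    (r^2 * lambda0^2 * d1 * (a + b - 1) / (a - 1) +
     (1 - r)^2 * lambda1^2 * d0 * (a + b - 1) / (b - 1))
}

ab <- beta_from_r_phi(r = 0.5, phi = 0.9)
V_obs <- cox_obs_ate_variance(log(0.6), r = 0.5, d1 = 0.8,
                              a = ab[["a"]], b = ab[["b"]])
V_obs
```

As a and b grow with a / (a + b) = r held fixed, (a+b-1)/(a-1) tends to 1/r and (a+b-1)/(b-1) tends to 1/(1-r), so V_{obs} approaches V_{RCT}. Perfect overlap therefore recovers the randomized-trial variance.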

Observational study — overlap weights (ATO) or treated population weights (ATT) (study_type = "obs", estimand = "ATO" or "ATT"):

N = \kappa_{DE} \times N_{RCT},

where N_{RCT} uses V_{RCT} above and \kappa_{DE} is a design effect estimated by Monte Carlo simulation from the Beta approximation of propensity scores.

Usage

power_cox(
  effect_size,
  r,
  d1,
  d0 = NULL,
  phi = NULL,
  study_type = "obs",
  estimand = "ATE",
  method = "robust",
  sig_level = 0.05,
  power = NULL,
  sample_size = NULL,
  test = "one-sided",
  n_mc = 1e+06
)

Arguments

effect_size

Log hazard ratio \tau_0 = \log(\text{HR}). Negative values indicate benefit (lower hazard in group 1). Scalar or vector.

r

Treatment proportion r = \Pr(Z = 1), in (0, 1). Scalar or vector.

d1

Event rate in group 1 (treated), in (0, 1]. Scalar or vector.

d0

Event rate in group 0 (control), in (0, 1]. If NULL (default), set equal to d1. Scalar or vector.

phi

Overlap coefficient \phi \in (0, 1). Required when study_type = "obs"; ignored for "rct". Rule of thumb: < 0.80 very poor, [0.80, 0.90) poor, [0.90, 0.95) moderate, \ge 0.95 good. Scalar or vector.

study_type

"obs" (observational study, default) or "rct" (randomized trial).

estimand

Target estimand. "ATE" (average treatment effect, uses inverse probability weights), "ATO" (overlap population, uses overlap weights), or "ATT" (group 1 population, uses weights for the treated). Ignored when study_type = "rct". Scalar or character vector.

method

Variance approximation method. "robust" (default) for the robust sandwich variance; "schoenfeld" for the classical Schoenfeld formula. "schoenfeld" is only available when study_type = "rct". Scalar or character vector.

sig_level

Significance level \alpha. Default 0.05.

power

Target power \beta. Provide for sample size calculation; mutually exclusive with sample_size.

sample_size

Total sample size N. Provide for power calculation; mutually exclusive with power.

test

"one-sided" (default) or "two-sided".

n_mc

Number of Monte Carlo samples used to estimate the design effect when estimand is "ATO" or "ATT". Default 1e6.

Details

Computes the required sample size or the achieved power for the propensity-score-weighted partial likelihood estimator of the marginal hazard ratio in a Cox proportional hazards model.
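Power at a fixed N follows from inverting the sample size formula N = V (z_{1-\alpha} + z_{\beta})^2 / \tau_0^2. The sketch below is a normal-approximation illustration with a hypothetical helper name, power_from_n. Its two-sided branch simply replaces \alpha by \alpha/2 and ignores the opposite rejection tail, which is an assumption about, not a description of, what power_cox does.

```r
# Power for a given total sample size N and asymptotic variance V.
power_from_n <- function(N, V, tau0, sig_level = 0.05,
                         test = c("one-sided", "two-sided")) {
  test <- match.arg(test)
  alpha <- if (test == "two-sided") sig_level / 2 else sig_level
  pnorm(sqrt(N / V) * abs(tau0) - qnorm(1 - alpha))
}

# Round trip: the N that targets 80% power returns a power of 0.8.
V    <- 5              # illustrative variance value
tau0 <- log(0.6)
N    <- V * (qnorm(0.95) + qnorm(0.8))^2 / tau0^2
power_from_n(N, V, tau0)
```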

Value

An object of class "power_cox", a list containing:

call

The matched call.

calculation

"sample_size" or "power".

result

A data frame with one row per scenario and columns for every design parameter plus the computed sample_size or power.

settings

A list with sig_level, power, sample_size, and test.

n_scenarios

Number of rows in result.

d0_set_equal

Logical; TRUE when d0 was not specified and set equal to d1.

References

Chengxin Yang, Bo Liu, and Fan Li. Sample size and power calculations for causal inference with time-to-event outcomes. arXiv preprint arXiv:2605.10088 (2026).

See Also

power_ps, overlap_coef

Examples

# RCT sample size, robust variance
power_cox(effect_size = log(0.6), r = 0.5, d1 = 0.8, study_type = "rct",
          power = 0.8)

# Observational study, ATE
power_cox(effect_size = log(0.6), r = 0.5, d1 = 0.8, phi = 0.9,
          study_type = "obs", estimand = "ATE", power = 0.8)

# Sensitivity over phi and estimand
power_cox(effect_size = log(0.6), r = 0.5, d1 = 0.8,
          phi = c(0.9, 0.95), estimand = c("ATE", "ATO"),
          power = 0.8)


Sample size and power for PS-weighted average treatment effect estimators

Description

The required sample size is

N = \bar{V}\,(z_{1-\alpha/2} + z_{\beta})^2 \,/\, \tilde{\tau}^2,

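where \tilde{\tau} is the standardized effect size, \bar{V} is the asymptotic variance of the Hajek estimator, \alpha is the significance level, and \beta is the target power. The formula is shown for the default two-sided test; for a one-sided test, z_{1-\alpha/2} is replaced by z_{1-\alpha}.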

For the ATE estimand, \bar{V} has the closed form

\bar{V} = 2\!\left\{1 + \bigl(\rho^2\sigma_e^2+1\bigr) \exp\!\bigl(\sigma_e^2/2\bigr)\cosh(\mu_e)\right\},

where (\mu_e,\,\sigma_e^2) are uniquely determined by (r,\phi).
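The closed form is easy to evaluate once (\mu_e, \sigma_e^2) are known. The sketch below takes them as inputs and does not reproduce the mapping from (r, \phi), which is not given on this page. It reads \mu_e and \sigma_e^2 as the mean and variance of the logit propensity score, which is an assumption, and the helper names vbar_ate and ps_sample_size are hypothetical.

```r
# Closed-form asymptotic variance for the ATE.
vbar_ate <- function(mu_e, sigma2_e, rho2 = 0) {
  2 * (1 + (rho2 * sigma2_e + 1) * exp(sigma2_e / 2) * cosh(mu_e))
}

# N = Vbar (z_{1-alpha/2} + z_beta)^2 / effect_size^2 (two-sided, not rounded).
ps_sample_size <- function(Vbar, effect_size, power, sig_level = 0.05) {
  Vbar * (qnorm(1 - sig_level / 2) + qnorm(power))^2 / effect_size^2
}

# Degenerate case sigma2_e = 0 with mu_e = logit(r):
# the formula reduces to 1 / (r (1 - r)).
r  <- 0.5
V0 <- vbar_ate(mu_e = qlogis(r), sigma2_e = 0)
N0 <- ps_sample_size(V0, effect_size = 0.2, power = 0.8)
c(Vbar = V0, N = N0)
```

With \sigma_e^2 = 0 the variance is 2\{1 + \cosh(\mu_e)\} = 1 / \{r(1-r)\}, the familiar two-sample factor, so the sample size in this limit is that of a randomized comparison with allocation proportion r. Larger \sigma_e^2 or \rho^2 inflate \bar{V} and therefore N.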

For the ATT, ATC, and ATO estimands, \bar{V} is computed by numerical integration of the same variance expression with the corresponding tilting function h(e). A custom tilting function may also be supplied.

For binary outcomes, the estimand is the risk difference; the same formula applies with S^2 = \mathrm{Var}(Y(0)) estimated from a linear probability model.

Usage

power_ps(
  effect_size,
  r,
  phi,
  rho2 = 0,
  estimand = "ATE",
  sig_level = 0.05,
  power = NULL,
  sample_size = NULL,
  test = "two-sided"
)

Arguments

effect_size

Standardized effect size \tilde{\tau} = \tau / S, where \tau is the treatment effect and S = \sqrt{\mathrm{Var}(Y(0))}. Scalar or vector.

r

Treatment proportion r = \Pr(Z = 1), in (0, 1). Scalar or vector.

phi

Overlap coefficient \phi \in (0, 1), measuring covariate similarity between groups. Rule of thumb: < 0.80 very poor, [0.80, 0.90) poor, [0.90, 0.95) moderate, \ge 0.95 good. Scalar or vector.

rho2

Confounding coefficient \rho^2 \in [0, 1), the squared correlation between the potential outcome and the propensity score linear predictor. Bounded above by the R^2 of regressing the outcome on covariates. Sensitivity analysis over \rho^2 \in [0, 0.05) is recommended. Default 0. Scalar or vector.

estimand

Target estimand. One of "ATE" (average treatment effect), "ATT" (average treatment effect for group 1), "ATC" (average treatment effect for group 0), "ATO" (average treatment effect for the overlap population), or a custom tilting function h(e) (must be a scalar function, not vectorized with other parameters). Scalar or character vector.

sig_level

Significance level \alpha. Default 0.05.

power

Target power \beta. Provide for sample size calculation; mutually exclusive with sample_size.

sample_size

Total sample size N. Provide for power calculation; mutually exclusive with power.

test

"two-sided" (default) or "one-sided".

Details

Computes the required sample size or the achieved power for the propensity-score-weighted Hajek estimator of a weighted average treatment effect (WATE) with continuous or binary outcomes.
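For orientation, the Hajek estimator of a WATE can be sketched in base R using the tilting functions conventionally associated with each estimand in the balancing-weights literature: h(e) = 1 for ATE, e for ATT, 1 - e for ATC, and e(1 - e) for ATO. The code is an illustration on simulated data and is not part of the package; the names tilt and hajek_wate and all simulation settings are hypothetical.

```r
# Conventional tilting functions h(e) for each estimand.
tilt <- list(
  ATE = function(e) rep(1, length(e)),
  ATT = function(e) e,
  ATC = function(e) 1 - e,
  ATO = function(e) e * (1 - e)
)

# Hajek (normalized) weighted difference in means, with weights
# h(e) / e for treated units and h(e) / (1 - e) for controls.
hajek_wate <- function(Y, Z, ps, h) {
  w1 <- h(ps) / ps
  w0 <- h(ps) / (1 - ps)
  sum(w1 * Z * Y) / sum(w1 * Z) -
    sum(w0 * (1 - Z) * Y) / sum(w0 * (1 - Z))
}

# Simulated data with a constant treatment effect of 0.5.
set.seed(1)
n  <- 5000
X  <- rnorm(n)
ps <- plogis(0.5 * X)
Z  <- rbinom(n, 1, ps)
Y  <- 1 + 0.5 * Z + X + rnorm(n)

est <- sapply(tilt, function(h) hajek_wate(Y, Z, ps, h))
est
```

Because the simulated effect is constant, all four estimands equal 0.5 and the estimates should differ only by sampling noise. With heterogeneous effects they would target different populations.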

Value

An object of class "power_ps", a list containing:

call

The matched call.

calculation

"sample_size" or "power".

result

A data frame with one row per scenario (all combinations of vector inputs) and columns for every design parameter plus the computed sample_size or power.

settings

A list with sig_level, power, sample_size, and test.

n_scenarios

Number of rows in result.

rho2_is_default

Logical; TRUE when rho2 was left at its default value of 0.
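The "all combinations of vector inputs" layout of result can be pictured with expand.grid. This is only an illustration of the scenario grid for r = c(0.3, 0.5, 0.7) and estimand = c("ATE", "ATO"); how power_ps builds and orders its rows internally is not documented here and is an assumption.

```r
# Scenario grid: every combination of the vector-valued inputs.
scenarios <- expand.grid(
  effect_size = 0.2,
  r = c(0.3, 0.5, 0.7),
  phi = 0.9,
  rho2 = 0,
  estimand = c("ATE", "ATO"),
  stringsAsFactors = FALSE
)
nrow(scenarios)  # 6 rows: 3 values of r times 2 estimands
```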

References

Bo Liu, Chengxin Yang, and Fan Li. Sample size and power calculations for causal inference in observational studies. Annals of Statistics (2026), forthcoming.

See Also

power_cox, overlap_coef

Examples

# Sample size for ATE, scalar inputs
power_ps(effect_size = 0.2, r = 0.5, phi = 0.9, power = 0.8)

# Power at a fixed N
power_ps(effect_size = 0.2, r = 0.5, phi = 0.9, sample_size = 250)

# Sensitivity over r and estimand (vector inputs)
power_ps(effect_size = 0.2, r = c(0.3, 0.5, 0.7), phi = 0.9,
         estimand = c("ATE", "ATO"), power = 0.8)