| Type: | Package |
| Title: | Sample Size and Power for Propensity Score Weighted Estimators |
| Version: | 2.0.0 |
| Description: | Computes sample size and power for causal inference studies that use propensity score (PS) weighting. Supports continuous, binary, and time-to-event (survival) outcomes under four estimands: average treatment effect (ATE), average treatment effect on the treated (ATT), average treatment effect on the controls (ATC), and average treatment effect on the overlap population (ATO). For continuous and binary outcomes, the asymptotic variance of the Hajek inverse probability weighting estimator is derived under a logit-normal propensity score model, approximated by a Beta distribution matched through the Bhattacharyya overlap coefficient. For survival outcomes, the asymptotic variance of the propensity-score-weighted partial likelihood estimator is used for randomized trials and observational studies. The Schoenfeld formula is also available for randomized trial settings. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| Depends: | R (≥ 3.5.0) |
| RoxygenNote: | 7.3.3 |
| Imports: | stats |
| Suggests: | ggplot2, knitr, rmarkdown, testthat (≥ 3.0.0) |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| NeedsCompilation: | no |
| Packaged: | 2026-05-14 06:40:00 UTC; chengxin |
| Author: | Bo Liu [aut, cre], Chengxin Yang [aut], Fan Li [aut] |
| Maintainer: | Bo Liu <bo.liu1997@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-05-14 08:40:08 UTC |
Bhattacharyya overlap coefficient for propensity score distributions
Description
Two calling conventions are provided:
- From propensity scores (empirical): Supply ps and Z. The empirical formula is \hat\phi = \frac{\mathrm{E}[\sqrt{e(1-e)}]}{\sqrt{r(1-r)}}, where r = \mathrm{E}[Z]. This is the sample mean of \sqrt{e_i(1-e_i)} divided by \sqrt{\hat r(1-\hat r)}.
- From Beta parameters (analytical): Supply a and b. Under the \mathrm{Beta}(a,b) approximation, \phi = \exp\!\Bigl[ \log\Gamma(a+\tfrac12) - \tfrac12\log a - \log\Gamma(a) + \log\Gamma(b+\tfrac12) - \tfrac12\log b - \log\Gamma(b) \Bigr].
Usage
overlap_coef(ps = NULL, Z = NULL, a = NULL, b = NULL)
Arguments
| ps | Numeric vector of estimated propensity scores |
| Z | Integer or numeric vector of treatment indicators |
| a | Shape parameter |
| b | Shape parameter |
Details
Computes the Bhattacharyya overlap coefficient \phi, a scalar
measure of propensity score overlap between the treatment and control
groups. Values close to 1 indicate near-complete overlap (little
confounding); values well below 1 indicate poor overlap.
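Both calling conventions can be reproduced in a few lines of base R. The sketch below is not the package's implementation: phi_empirical and phi_beta are hypothetical helper names, and the numerical-integration cross-check is added only to illustrate that the closed form equals \mathrm{E}[\sqrt{e(1-e)}] / \sqrt{r(1-r)} under \mathrm{Beta}(a,b) with r = a/(a+b).

```r
# Hypothetical helpers (not part of the package): overlap coefficient phi.

# Empirical version: sample mean of sqrt(e_i (1 - e_i)) divided by
# sqrt(r_hat (1 - r_hat)), with r_hat = mean(Z).
phi_empirical <- function(ps, Z) {
  r_hat <- mean(Z)
  mean(sqrt(ps * (1 - ps))) / sqrt(r_hat * (1 - r_hat))
}

# Analytical version under Beta(a, b), evaluated on the log scale.
phi_beta <- function(a, b) {
  exp(lgamma(a + 0.5) - 0.5 * log(a) - lgamma(a) +
      lgamma(b + 0.5) - 0.5 * log(b) - lgamma(b))
}

# Cross-check the closed form against direct numerical integration.
a <- 2; b <- 3
r <- a / (a + b)
num <- integrate(function(e) sqrt(e * (1 - e)) * dbeta(e, a, b), 0, 1)$value
phi_numeric <- num / sqrt(r * (1 - r))
phi_closed  <- phi_beta(a, b)

# Empirical estimate from simulated propensity scores.
set.seed(1)
ps <- plogis(0.5 * rnorm(500))
Z  <- rbinom(500, 1, ps)
phi_hat <- phi_empirical(ps, Z)
```

Working with lgamma rather than gamma keeps the analytical form stable for large shape parameters, where the Gamma function itself would overflow.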
Value
A list with components:
- phi: The overlap coefficient \hat\phi.
- r: Treatment proportion: mean(Z) (empirical) or a / (a + b) (analytical).
References
Chengxin Yang, Bo Liu, and Fan Li. Sample size and power calculations for causal inference with time-to-event outcomes. arXiv preprint arXiv:2605.10088 (2026).
Bo Liu, Chengxin Yang, and Fan Li. Sample size and power calculations for causal inference with continuous and binary outcomes. Annals of Statistics (2026).
Examples
# From propensity scores
set.seed(1)
n <- 500
X <- rnorm(n)
ps <- plogis(0.5 * X)
Z <- rbinom(n, 1, ps)
overlap_coef(ps = ps, Z = Z)
# From Beta parameters
overlap_coef(a = 2, b = 3)
Sample size and power for PS-weighted marginal Cox model
Description
The required sample size is
N = V\,(z_{1-\alpha} + z_{\beta})^2 \,/\, \tau_0^2,
where \tau_0 = \log(\text{HR}) is the target log hazard ratio and
V is the asymptotic variance of the estimator.
Randomized trial — robust sandwich variance (method = "robust"):
V_{RCT} = \frac{(\lambda_1 + \lambda_0)^2
\bigl[r\lambda_0^2 d_1 + (1-r)\lambda_1^2 d_0\bigr]}{d^2},
where \lambda_1 = \sqrt{r/(1-r)}\,e^{\tau_0/2},
\lambda_0 = 1/\lambda_1, and d = r\,d_1 + (1-r)\,d_0.
Randomized trial — Schoenfeld formula (method = "schoenfeld"):
V_{Sch} = \frac{1}{r(1-r)\,d}.
Note: the Schoenfeld formula is derived under a null effect and may underestimate or overestimate the required sample size at non-null effects.
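The two randomized-trial variances can be compared directly. The following base R sketch transcribes the formulas above; v_rct, v_schoenfeld, and n_required are hypothetical helper names rather than package functions, and z_{\beta} is read as the standard normal quantile at the target power, which is an assumption about the notation.

```r
# Hypothetical helpers (not the package's internals), transcribing the
# formulas above. tau0 is the log hazard ratio, r the treatment proportion,
# d1 and d0 the event rates in the treated and control groups.
v_rct <- function(tau0, r, d1, d0 = d1) {
  lambda1 <- sqrt(r / (1 - r)) * exp(tau0 / 2)
  lambda0 <- 1 / lambda1
  d <- r * d1 + (1 - r) * d0
  (lambda1 + lambda0)^2 * (r * lambda0^2 * d1 + (1 - r) * lambda1^2 * d0) / d^2
}

v_schoenfeld <- function(r, d1, d0 = d1) {
  d <- r * d1 + (1 - r) * d0
  1 / (r * (1 - r) * d)
}

# N = V (z_{1-alpha} + z_beta)^2 / tau0^2 for a one-sided test,
# with z_beta taken as qnorm(power) (assumed reading of the notation).
n_required <- function(V, tau0, sig_level = 0.05, power = 0.8) {
  V * (qnorm(1 - sig_level) + qnorm(power))^2 / tau0^2
}

tau0 <- log(0.6)
n_robust <- n_required(v_rct(tau0, r = 0.5, d1 = 0.8), tau0)
n_sch    <- n_required(v_schoenfeld(r = 0.5, d1 = 0.8), tau0)

# At tau0 = 0 with d1 = d0, the robust variance reduces to Schoenfeld's.
v_null <- v_rct(0, r = 0.3, d1 = 0.8)
v_sch  <- v_schoenfeld(r = 0.3, d1 = 0.8)
```

In this particular scenario (hazard ratio 0.6, equal allocation, equal event rates) the robust variance exceeds the Schoenfeld variance, so the Schoenfeld formula gives the smaller sample size.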
Observational study — inverse probability weights (ATE),
robust sandwich variance (study_type = "obs", estimand = "ATE"):
V_{obs} = \frac{(\lambda_1+\lambda_0)^2}{d^2}
\left[r^2\lambda_0^2 d_1\,\frac{a+b-1}{a-1}
+ (1-r)^2\lambda_1^2 d_0\,\frac{a+b-1}{b-1}\right],
where a, b > 1 are Beta distribution parameters determined by
(r, \phi). Requires \min(a, b) > 1.
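As a rough illustration of how (r, \phi) could pin down (a, b), the base R sketch below fixes the Beta mean at r and solves numerically for the concentration a + b that reproduces \phi. This matching rule is an assumption made for illustration, and beta_from_r_phi and v_obs_ate are hypothetical names; the package may determine the parameters differently.

```r
# Hypothetical helpers (not the package's internals).

# Analytical overlap coefficient under Beta(a, b).
phi_beta <- function(a, b) {
  exp(lgamma(a + 0.5) - 0.5 * log(a) - lgamma(a) +
      lgamma(b + 0.5) - 0.5 * log(b) - lgamma(b))
}

# Assumed matching rule: fix the Beta mean at r and solve for the
# concentration s = a + b that reproduces the target phi.
beta_from_r_phi <- function(r, phi) {
  f <- function(s) phi_beta(r * s, (1 - r) * s) - phi
  s <- uniroot(f, interval = c(1e-3, 1e4), tol = 1e-10)$root
  c(a = r * s, b = (1 - r) * s)
}

# V_obs for the ATE, transcribed from the formula above.
v_obs_ate <- function(tau0, r, d1, d0 = d1, phi) {
  ab <- beta_from_r_phi(r, phi)
  a <- ab[["a"]]; b <- ab[["b"]]
  stopifnot(min(a, b) > 1)  # the formula requires min(a, b) > 1
  lambda1 <- sqrt(r / (1 - r)) * exp(tau0 / 2)
  lambda0 <- 1 / lambda1
  d <- r * d1 + (1 - r) * d0
  (lambda1 + lambda0)^2 / d^2 *
    (r^2 * lambda0^2 * d1 * (a + b - 1) / (a - 1) +
     (1 - r)^2 * lambda1^2 * d0 * (a + b - 1) / (b - 1))
}

ab    <- beta_from_r_phi(r = 0.5, phi = 0.9)
v_obs <- v_obs_ate(log(0.6), r = 0.5, d1 = 0.8, phi = 0.9)
```

As a + b grows with a/(a+b) = r held fixed, the factors (a+b-1)/(a-1) and (a+b-1)/(b-1) tend to 1/r and 1/(1-r), so V_{obs} approaches V_{RCT}; poorer overlap inflates the variance above the randomized-trial value.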
Observational study — overlap weights (ATO) or
treated population weights (ATT)
(study_type = "obs", estimand = "ATO" or "ATT"):
N = \kappa_{DE} \times N_{RCT},
where N_{RCT} uses V_{RCT} above and \kappa_{DE} is a
design effect estimated by Monte Carlo simulation from the Beta
approximation of propensity scores.
Usage
power_cox(
effect_size,
r,
d1,
d0 = NULL,
phi = NULL,
study_type = "obs",
estimand = "ATE",
method = "robust",
sig_level = 0.05,
power = NULL,
sample_size = NULL,
test = "one-sided",
n_mc = 1e+06
)
Arguments
| effect_size | Log hazard ratio |
| r | Treatment proportion |
| d1 | Event rate in group 1 (treated), in |
| d0 | Event rate in group 0 (control), in |
| phi | Overlap coefficient |
| study_type | |
| estimand | Target estimand. |
| method | Variance approximation method. |
| sig_level | Significance level |
| power | Target power |
| sample_size | Total sample size |
| test | |
| n_mc | Number of Monte Carlo samples used to estimate the design effect for |
Details
Computes the required sample size or the achieved power for the propensity-score-weighted partial likelihood estimator of the marginal hazard ratio in a Cox proportional hazards model.
Value
An object of class "power_cox", a list containing:
- call: The matched call.
- calculation: "sample_size" or "power".
- result: A data frame with one row per scenario and columns for every design parameter plus the computed sample_size or power.
- settings: A list with sig_level, power, sample_size, and test.
- n_scenarios: Number of rows in result.
- d0_set_equal: Logical; TRUE when d0 was not specified and set equal to d1.
References
Chengxin Yang, Bo Liu, and Fan Li. Sample size and power calculations for causal inference with time-to-event outcomes. arXiv preprint arXiv:2605.10088 (2026).
Examples
# RCT sample size, robust variance
power_cox(effect_size = log(0.6), r = 0.5, d1 = 0.8, study_type = "rct",
power = 0.8)
# Observational study, ATE
power_cox(effect_size = log(0.6), r = 0.5, d1 = 0.8, phi = 0.9,
study_type = "obs", estimand = "ATE", power = 0.8)
# Sensitivity over phi and estimand
power_cox(effect_size = log(0.6), r = 0.5, d1 = 0.8,
phi = c(0.9, 0.95), estimand = c("ATE", "ATO"),
power = 0.8)
Sample size and power for PS-weighted average treatment effect estimators
Description
The required sample size is
N = \bar{V}\,(z_{1-\alpha/2} + z_{\beta})^2 \,/\, \tilde{\tau}^2,
where \tilde{\tau} is the standardized effect size and
\bar{V} is the asymptotic variance of the Hajek estimator.
For the ATE estimand, \bar{V} has the closed form
\bar{V} = 2\!\left\{1 + \bigl(\rho^2\sigma_e^2+1\bigr)
\exp\!\bigl(\sigma_e^2/2\bigr)\cosh(\mu_e)\right\},
where (\mu_e,\,\sigma_e^2) are uniquely determined by (r,\phi).
For the ATT, ATC, and ATO estimands, \bar{V} is computed
by numerical integration of the same variance expression with the
corresponding tilting function h(e). A custom tilting function may
also be supplied.
For binary outcomes, the estimand is the risk difference; the same formula
applies with S^2 = \mathrm{Var}(Y(0)) estimated from a linear
probability model.
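The ATE closed form is simple enough to evaluate by hand. The base R sketch below uses hypothetical helpers v_bar_ate and n_required, which are not package functions. It takes (\mu_e, \sigma_e^2) as given inputs, because the mapping from (r, \phi) is not reproduced here, and it assumes they are the mean and variance of the logit propensity score under the logit-normal model named in the package description. z_{\beta} is again read as the standard normal quantile at the target power.

```r
# Hypothetical helpers (not the package's internals).

# Closed-form variance for the ATE estimand.
v_bar_ate <- function(mu_e, sigma_e2, rho2 = 0) {
  2 * (1 + (rho2 * sigma_e2 + 1) * exp(sigma_e2 / 2) * cosh(mu_e))
}

# N = V_bar (z_{1-alpha/2} + z_beta)^2 / tau_tilde^2 for a two-sided test,
# with z_beta taken as qnorm(power) (assumed reading of the notation).
n_required <- function(v_bar, effect_size, sig_level = 0.05, power = 0.8) {
  v_bar * (qnorm(1 - sig_level / 2) + qnorm(power))^2 / effect_size^2
}

# Degenerate case mu_e = 0, sigma_e2 = 0: no variation in the propensity score.
v0 <- v_bar_ate(mu_e = 0, sigma_e2 = 0)
n0 <- ceiling(n_required(v0, effect_size = 0.2))

# A more dispersed propensity score with some confounding inflates the variance.
v1 <- v_bar_ate(mu_e = 0, sigma_e2 = 1, rho2 = 0.1)
```

With \mu_e = 0 and \sigma_e^2 = 0 the variance is 4 and the formula gives a total of 785 for a standardized effect of 0.2 at 80% power, which matches the usual two-arm calculation with equal allocation. Any increase in \sigma_e^2 or \rho^2 raises \bar{V} and therefore N.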
Usage
power_ps(
effect_size,
r,
phi,
rho2 = 0,
estimand = "ATE",
sig_level = 0.05,
power = NULL,
sample_size = NULL,
test = "two-sided"
)
Arguments
| effect_size | Standardized effect size |
| r | Treatment proportion |
| phi | Overlap coefficient |
| rho2 | Confounding coefficient |
| estimand | Target estimand. One of |
| sig_level | Significance level |
| power | Target power |
| sample_size | Total sample size |
| test | |
Details
Computes the required sample size or the achieved power for the propensity-score-weighted Hajek estimator of a weighted average treatment effect (WATE) with continuous or binary outcomes.
Value
An object of class "power_ps", a list containing:
- call: The matched call.
- calculation: "sample_size" or "power".
- result: A data frame with one row per scenario (all combinations of vector inputs) and columns for every design parameter plus the computed sample_size or power.
- settings: A list with sig_level, power, sample_size, and test.
- n_scenarios: Number of rows in result.
- rho2_is_default: Logical; TRUE when rho2 was left at its default value of 0.
References
Bo Liu, Chengxin Yang, and Fan Li. Sample size and power calculations for causal inference in observational studies. Annals of Statistics (2026), forthcoming.
Examples
# Sample size for ATE, scalar inputs
power_ps(effect_size = 0.2, r = 0.5, phi = 0.9, power = 0.8)
# Power at a fixed N
power_ps(effect_size = 0.2, r = 0.5, phi = 0.9, sample_size = 250)
# Sensitivity over r and estimand (vector inputs)
power_ps(effect_size = 0.2, r = c(0.3, 0.5, 0.7), phi = 0.9,
estimand = c("ATE", "ATO"), power = 0.8)