Simulation based calibration for OncoBayes2

Mon May 2 10:29:16 2022

This report documents the results of a simulation based calibration (SBC) run for OncoBayes2. TODO

The calibration data presented here were generated at the time and with the OncoBayes2 git version listed below:

## Created:  2022-02-02 15:28:28 UTC
## git hash: a5cb76361091022cdd8f6b0158b92bcf719b76d0
## MD5:      8471f3d58cd0b32772d2907b73e1361e

The MD5 hash of the calibration data file presented here must match the above listed MD5:

##                    calibration.rds 
## "8471f3d58cd0b32772d2907b73e1361e"

Introduction

Simulation based calibration (SBC) checks a self-consistency property that must hold for any Bayesian analysis with proper priors, so passing it is a necessary condition for a correctly working implementation. The details are presented in Talts et al. (see https://arxiv.org/abs/1804.06788).

Self-consistency of any Bayesian analysis with a proper prior:

\[ p(\theta) = \iint \mbox{d}\tilde{y} \, \mbox{d}\tilde{\theta} \, p(\theta|\tilde{y}) \, p(\tilde{y}|\tilde{\theta}) \, p(\tilde{\theta}) \] \[ \Leftrightarrow p(\theta) = \iint \mbox{d}\tilde{y} \, \mbox{d}\tilde{\theta} \, p(\theta,\tilde{y},\tilde{\theta}) \]

SBC procedure:

Repeat \(s=1, ..., S\) times:

  1. Sample from the prior \[\tilde{\theta} \sim p(\theta)\]

  2. Sample fake data \[\tilde{y} \sim p(y|\tilde{\theta})\]

  3. Obtain \(L\) posterior samples \[\{\theta_1, ..., \theta_L\} \sim p(\theta|\tilde{y})\]

  4. Calculate the rank \(r_s\) of the prior draw \(\tilde{\theta}\) with respect to the posterior sample \(\{\theta_1, ..., \theta_L\} \sim p(\theta|\tilde{y})\). The rank falls into the range \([0,L]\), which gives \(L+1\) possible ranks, and is calculated as \[r_s = \sum_{l=1}^L \mathbb{I}[ \theta_l < \tilde{\theta}]\]
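The four steps above can be sketched in R on a toy problem. The conjugate normal model below is an assumption made purely for illustration (it is not one of the OncoBayes2 models); it is chosen because its posterior is available in closed form, so exact posterior draws replace the MCMC sampler. The names `sbc_rank`, `S`, `L` and `n` are hypothetical.

```r
# Toy SBC run for a conjugate normal model (illustration only):
#   theta ~ Normal(0, 1),  y_i | theta ~ Normal(theta, 1),  i = 1, ..., n
set.seed(20220502)

S <- 1000   # number of SBC replications (kept small for the sketch)
L <- 63     # posterior draws per replication
n <- 5      # observations per fake data set

sbc_rank <- function() {
  theta_tilde <- rnorm(1, mean = 0, sd = 1)          # step 1: prior draw
  y_tilde <- rnorm(n, mean = theta_tilde, sd = 1)    # step 2: fake data
  # Closed-form posterior: Normal(sum(y) / (n + 1), 1 / (n + 1))
  post_mean <- sum(y_tilde) / (n + 1)
  post_sd <- sqrt(1 / (n + 1))
  theta_post <- rnorm(L, mean = post_mean, sd = post_sd)  # step 3: L draws
  sum(theta_post < theta_tilde)                      # step 4: rank in 0..L
}

ranks <- replicate(S, sbc_rank())
range(ranks)
```

Because the posterior draws here are exact, the resulting ranks should look uniform over 0, ..., L; with an MCMC sampler the same loop would instead probe the sampler and the model code.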

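If the model is calibrated, the \(S\) ranks are uniformly distributed over the \(L+1\) possible values \(0, ..., L\). When the ranks are grouped into \(B\) equally sized bins, the count in each bin has a binomial distribution with \(S\) trials and probability \[p(r \in \mbox{Any Bin}) = \frac{1}{B}.\]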
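As a rough guide to how much the bin counts may fluctuate, the following sketch computes the expected count and a central 95% interval for a single bin. It assumes uniform ranks grouped into `B` equally sized bins, so that each count is Binomial(S, 1/B); the values of `S` and `B` are taken from the reference run description in this report, and the variable names are hypothetical.

```r
S <- 1e4   # number of replications
B <- 64    # number of equally sized bins (assumed for this sketch)

# Expected count per bin under uniform ranks
expected_count <- S / B

# Central 95% interval for the count in one bin, Binomial(S, 1/B)
count_interval <- qbinom(c(0.025, 0.975), size = S, prob = 1 / B)

expected_count
count_interval
```

Counts in a rank histogram that fall well outside such an interval for many bins would hint at miscalibration; a few excursions are expected by chance.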

Model description TODO

The fake data simulation function returns ... TODO. Please refer to the sbc_tools.R and make_reference_rankhist.R R programs for the implementation details.

The reference runs are created with \(L=1023\) posterior draws for each replication and a total of \(S=10^4\) replications are run per case. For the evaluation here the results are reduced to \(B=L'+1=64\) bins to ensure a sufficiently large sample size per bin.
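The reduction from \(L+1\) possible ranks to \(B\) bins can be sketched with integer division, assuming that \(L+1\) is a multiple of \(B\) so that every bin covers the same number of ranks. This is a minimal illustration with hypothetical variable names, not the code from sbc_tools.R.

```r
L <- 1023   # posterior draws per replication
B <- 64     # number of bins after reduction

bin_width <- (L + 1) / B              # ranks covered by each bin
stopifnot(bin_width == round(bin_width))

ranks <- 0:L                          # every possible rank exactly once
bins <- ranks %/% bin_width           # coarse rank in 0..(B - 1)

# Each bin should contain the same number of original ranks
ranks_per_bin <- table(bins)
unique(as.vector(ranks_per_bin))
```

Choosing \(L+1\) as a power of two makes this kind of even reduction possible for any smaller power-of-two number of bins.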

SBC results

Sampler Diagnostics Overview

data_scenario        N  total_divergent  min_ess  max_Rhat  total_large_Rhat  min_lp_ess_bulk  min_lp_ess_tail
combo2_EX        10000                0      454     1.011                 0              117              164
combo2_EXNEX     10000                0       32     1.081                 0              145              151
combo3_EXNEX     10000                0       12     1.216                 1               78              217
log2bayes_EXNEX  10000                0      122     1.016                 0              124              237

Large Rhat is defined as exceeding \(1.2\).
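The threshold rule can be applied directly to the max_Rhat column of the overview table. The sketch below only re-uses the four values reported above; the variable names are hypothetical.

```r
# max_Rhat per data scenario, copied from the overview table
max_rhat <- c(
  combo2_EX       = 1.011,
  combo2_EXNEX    = 1.081,
  combo3_EXNEX    = 1.216,
  log2bayes_EXNEX = 1.016
)

rhat_threshold <- 1.2

# Scenarios in which at least one fit exceeded the threshold
flagged <- names(max_rhat)[max_rhat > rhat_threshold]
flagged
```

This flags only combo3_EXNEX, in line with the single large Rhat counted for that scenario.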

\(\chi^2\) Statistic, Model 1: Single-agent logistic regression

param statistic df p.value
beta_group[A,I(log(drug_A/1)),intercept] 23.354 31 0.836
beta_group[A,I(log(drug_A/1)),log_slope] 20.819 31 0.917
beta_group[B,I(log(drug_A/1)),intercept] 16.774 31 0.982
beta_group[B,I(log(drug_A/1)),log_slope] 23.859 31 0.816
beta_group[C,I(log(drug_A/1)),intercept] 41.146 31 0.105
beta_group[C,I(log(drug_A/1)),log_slope] 35.322 31 0.271
mu_log_beta[I(log(drug_A/1)),intercept] 18.554 31 0.962
mu_log_beta[I(log(drug_A/1)),log_slope] 18.893 31 0.957
tau_log_beta[STRAT,I(log(drug_A/1)),intercept] 39.014 31 0.153
tau_log_beta[STRAT,I(log(drug_A/1)),log_slope] 20.672 31 0.920
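A statistic of this kind can be obtained from binned ranks with a Pearson goodness-of-fit test against equal bin probabilities. The sketch below assumes that this standard test is what the tables report, which is not confirmed by the report itself. It uses simulated uniform bin indices in place of real SBC output, and sets `B = 32` only because \(B - 1\) then matches the 31 degrees of freedom shown in the tables.

```r
set.seed(1)
S <- 1e4   # replications
B <- 32    # bins, chosen so that df = B - 1 = 31 as in the tables

# Simulated bin indices standing in for the binned ranks of one parameter
bin_index <- sample.int(B, size = S, replace = TRUE)
bin_counts <- tabulate(bin_index, nbins = B)

# chisq.test() on a count vector tests against equal probabilities by default
gof <- chisq.test(bin_counts)
gof$statistic
gof$parameter   # degrees of freedom
gof$p.value

# A p-value follows from the statistic and df via the upper chi-square tail
p_from_table <- pchisq(45.830, df = 31, lower.tail = FALSE)
p_from_table
```

Small p-values indicate bin counts that deviate from uniformity more than expected by chance.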

\(\chi^2\) Statistic, Model 2: Double combination, fully exchangeable

param statistic df p.value
beta_group[A,I(log(drug_A/1)),intercept] 20.525 31 0.924
beta_group[A,I(log(drug_A/1)),log_slope] 32.525 31 0.392
beta_group[A,I(log(drug_B/1)),intercept] 23.245 31 0.840
beta_group[A,I(log(drug_B/1)),log_slope] 24.518 31 0.789
beta_group[B,I(log(drug_A/1)),intercept] 33.318 31 0.355
beta_group[B,I(log(drug_A/1)),log_slope] 25.427 31 0.748
beta_group[B,I(log(drug_B/1)),intercept] 38.221 31 0.174
beta_group[B,I(log(drug_B/1)),log_slope] 30.605 31 0.486
beta_group[C,I(log(drug_A/1)),intercept] 35.334 31 0.271
beta_group[C,I(log(drug_A/1)),log_slope] 26.899 31 0.677
beta_group[C,I(log(drug_B/1)),intercept] 42.854 31 0.076
beta_group[C,I(log(drug_B/1)),log_slope] 28.621 31 0.589
eta_group[A,I(drug_A/1 * drug_B/1)] 28.838 31 0.578
eta_group[B,I(drug_A/1 * drug_B/1)] 26.016 31 0.721
eta_group[C,I(drug_A/1 * drug_B/1)] 29.837 31 0.526
mu_eta[I(drug_A/1 * drug_B/1)] 33.990 31 0.326
mu_log_beta[I(log(drug_A/1)),intercept] 33.728 31 0.337
mu_log_beta[I(log(drug_A/1)),log_slope] 25.190 31 0.759
mu_log_beta[I(log(drug_B/1)),intercept] 39.085 31 0.151
mu_log_beta[I(log(drug_B/1)),log_slope] 21.805 31 0.889
tau_eta[STRAT,I(drug_A/1 * drug_B/1)] 40.525 31 0.118
tau_log_beta[STRAT,I(log(drug_A/1)),intercept] 31.955 31 0.419
tau_log_beta[STRAT,I(log(drug_A/1)),log_slope] 18.880 31 0.957
tau_log_beta[STRAT,I(log(drug_B/1)),intercept] 38.266 31 0.173
tau_log_beta[STRAT,I(log(drug_B/1)),log_slope] 29.824 31 0.526

\(\chi^2\) Statistic, Model 3: Double combination, EXchangeable/NonEXchangeable model

param statistic df p.value
beta_group[A,I(log(drug_A/1)),intercept] 22.829 31 0.855
beta_group[A,I(log(drug_A/1)),log_slope] 34.042 31 0.323
beta_group[A,I(log(drug_B/1)),intercept] 24.115 31 0.806
beta_group[A,I(log(drug_B/1)),log_slope] 25.184 31 0.759
beta_group[B,I(log(drug_A/1)),intercept] 18.861 31 0.957
beta_group[B,I(log(drug_A/1)),log_slope] 41.914 31 0.091
beta_group[B,I(log(drug_B/1)),intercept] 20.672 31 0.920
beta_group[B,I(log(drug_B/1)),log_slope] 31.462 31 0.443
beta_group[C,I(log(drug_A/1)),intercept] 28.986 31 0.570
beta_group[C,I(log(drug_A/1)),log_slope] 16.250 31 0.986
beta_group[C,I(log(drug_B/1)),intercept] 25.299 31 0.754
beta_group[C,I(log(drug_B/1)),log_slope] 29.811 31 0.527
eta_group[A,I(drug_A/1 * drug_B/1)] 30.483 31 0.492
eta_group[B,I(drug_A/1 * drug_B/1)] 45.830 31 0.042
eta_group[C,I(drug_A/1 * drug_B/1)] 30.714 31 0.481
mu_eta[I(drug_A/1 * drug_B/1)] 34.867 31 0.289
mu_log_beta[I(log(drug_A/1)),intercept] 39.328 31 0.145
mu_log_beta[I(log(drug_A/1)),log_slope] 37.094 31 0.208
mu_log_beta[I(log(drug_B/1)),intercept] 27.059 31 0.669
mu_log_beta[I(log(drug_B/1)),log_slope] 20.992 31 0.912
tau_eta[STRAT,I(drug_A/1 * drug_B/1)] 22.528 31 0.866
tau_log_beta[STRAT,I(log(drug_A/1)),intercept] 36.141 31 0.241
tau_log_beta[STRAT,I(log(drug_A/1)),log_slope] 23.539 31 0.829
tau_log_beta[STRAT,I(log(drug_B/1)),intercept] 28.960 31 0.571
tau_log_beta[STRAT,I(log(drug_B/1)),log_slope] 27.264 31 0.659

\(\chi^2\) Statistic, Model 4: Triple combination, EX/NEX model

param statistic df p.value
beta_group[A,I(log(drug_A/1)),intercept] 23.078 31 0.846
beta_group[A,I(log(drug_A/1)),log_slope] 32.550 31 0.390
beta_group[A,I(log(drug_B/1)),intercept] 28.307 31 0.605
beta_group[A,I(log(drug_B/1)),log_slope] 16.864 31 0.982
beta_group[A,I(log(drug_C/1)),intercept] 27.296 31 0.657
beta_group[A,I(log(drug_C/1)),log_slope] 46.285 31 0.038
beta_group[B,I(log(drug_A/1)),intercept] 27.725 31 0.635
beta_group[B,I(log(drug_A/1)),log_slope] 34.662 31 0.297
beta_group[B,I(log(drug_B/1)),intercept] 22.214 31 0.876
beta_group[B,I(log(drug_B/1)),log_slope] 39.091 31 0.151
beta_group[B,I(log(drug_C/1)),intercept] 35.136 31 0.278
beta_group[B,I(log(drug_C/1)),log_slope] 33.184 31 0.361
beta_group[C,I(log(drug_A/1)),intercept] 31.302 31 0.451
beta_group[C,I(log(drug_A/1)),log_slope] 44.806 31 0.052
beta_group[C,I(log(drug_B/1)),intercept] 29.920 31 0.521
beta_group[C,I(log(drug_B/1)),log_slope] 20.403 31 0.927
beta_group[C,I(log(drug_C/1)),intercept] 32.595 31 0.388
beta_group[C,I(log(drug_C/1)),log_slope] 27.923 31 0.625
eta_group[A,I(drug_A/1 * drug_B/1 * drug_C/1)] 21.242 31 0.905
eta_group[A,I(drug_A/1 * drug_B/1)] 30.253 31 0.504
eta_group[A,I(drug_A/1 * drug_C/1)] 24.979 31 0.769
eta_group[A,I(drug_B/1 * drug_C/1)] 17.907 31 0.971
eta_group[B,I(drug_A/1 * drug_B/1 * drug_C/1)] 27.578 31 0.643
eta_group[B,I(drug_A/1 * drug_B/1)] 27.392 31 0.652
eta_group[B,I(drug_A/1 * drug_C/1)] 29.018 31 0.568
eta_group[B,I(drug_B/1 * drug_C/1)] 36.538 31 0.227
eta_group[C,I(drug_A/1 * drug_B/1 * drug_C/1)] 18.701 31 0.960
eta_group[C,I(drug_A/1 * drug_B/1)] 34.176 31 0.318
eta_group[C,I(drug_A/1 * drug_C/1)] 20.538 31 0.924
eta_group[C,I(drug_B/1 * drug_C/1)] 28.538 31 0.593
mu_eta[I(drug_A/1 * drug_B/1 * drug_C/1)] 23.053 31 0.847
mu_eta[I(drug_A/1 * drug_B/1)] 34.976 31 0.285
mu_eta[I(drug_A/1 * drug_C/1)] 16.595 31 0.984
mu_eta[I(drug_B/1 * drug_C/1)] 24.288 31 0.799
mu_log_beta[I(log(drug_A/1)),intercept] 21.075 31 0.910
mu_log_beta[I(log(drug_A/1)),log_slope] 24.320 31 0.797
mu_log_beta[I(log(drug_B/1)),intercept] 25.222 31 0.758
mu_log_beta[I(log(drug_B/1)),log_slope] 37.722 31 0.189
mu_log_beta[I(log(drug_C/1)),intercept] 36.640 31 0.223
mu_log_beta[I(log(drug_C/1)),log_slope] 29.862 31 0.524
tau_eta[STRAT,I(drug_A/1 * drug_B/1 * drug_C/1)] 21.312 31 0.903
tau_eta[STRAT,I(drug_A/1 * drug_B/1)] 29.587 31 0.539
tau_eta[STRAT,I(drug_A/1 * drug_C/1)] 45.664 31 0.043
tau_eta[STRAT,I(drug_B/1 * drug_C/1)] 19.475 31 0.946
tau_log_beta[STRAT,I(log(drug_A/1)),intercept] 38.144 31 0.176
tau_log_beta[STRAT,I(log(drug_A/1)),log_slope] 29.056 31 0.566
tau_log_beta[STRAT,I(log(drug_B/1)),intercept] 26.995 31 0.672
tau_log_beta[STRAT,I(log(drug_B/1)),log_slope] 25.050 31 0.765
tau_log_beta[STRAT,I(log(drug_C/1)),intercept] 34.086 31 0.321
tau_log_beta[STRAT,I(log(drug_C/1)),log_slope] 46.528 31 0.036
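When reading the four tables together, it helps to keep the number of tests in mind. The tables list 10, 25, 25 and 50 parameters, and four of the p-values are below 0.05. The sketch below compares this with what uniform p-values would produce, under the simplifying assumption that the tests are independent. That assumption is unlikely to hold exactly, since parameters of the same model share the same simulated data sets.

```r
n_tests <- 10 + 25 + 25 + 50   # parameters tested across the four models
alpha <- 0.05
n_below <- 4                   # p-values below alpha in the tables above

# Expected number of p-values below alpha if all tests are calibrated
expected_below <- n_tests * alpha

# Probability of seeing at least n_below such p-values by chance,
# treating the tests as independent Bernoulli(alpha) events
prob_at_least <- pbinom(n_below - 1, size = n_tests, prob = alpha,
                        lower.tail = FALSE)

expected_below
prob_at_least
```

Under these assumptions a handful of p-values below 0.05 is unremarkable and does not by itself point to a calibration problem.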

Session Info

## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] tools     stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] ggplot2_3.3.5    broom_0.7.9      tidyr_1.1.3      dplyr_1.0.8     
## [5] assertthat_0.2.1 knitr_1.33       rmarkdown_2.11  
## 
## loaded via a namespace (and not attached):
##  [1] pillar_1.6.2     bslib_0.3.1      compiler_4.1.0   jquerylib_0.1.4 
##  [5] highr_0.9        digest_0.6.29    jsonlite_1.7.2   evaluate_0.14   
##  [9] lifecycle_1.0.1  tibble_3.1.3     gtable_0.3.0     pkgconfig_2.0.3 
## [13] rlang_1.0.1      cli_3.1.1        DBI_1.1.2        yaml_2.2.1      
## [17] xfun_0.25        fastmap_1.1.0    withr_2.4.3      stringr_1.4.0   
## [21] generics_0.1.0   vctrs_0.3.8      sass_0.4.0       grid_4.1.0      
## [25] tidyselect_1.1.1 glue_1.6.1       R6_2.5.1         fansi_0.5.0     
## [29] purrr_0.3.4      magrittr_2.0.1   scales_1.1.1     backports_1.2.1 
## [33] ellipsis_0.3.2   htmltools_0.5.2  colorspace_2.0-2 utf8_1.2.2      
## [37] stringi_1.7.3    munsell_0.5.0    crayon_1.4.2