In this vignette, we report the results of a parameter recovery study
for the tva_recovery data set. We need the packages
tibble, tidyr, and dplyr for data
wrangling, ggplot2 for plotting, RStanTVA for
the actual analyses, and brms for specifying priors.
library(tibble)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(ggplot2)
library(RStanTVA)
#> Loading required package: rstan
#> Loading required package: StanHeaders
#>
#> rstan version 2.32.7 (Stan version 2.32.2)
#> For execution on a local, multicore CPU with excess RAM we recommend calling
#> options(mc.cores = parallel::detectCores()).
#> To avoid recompilation of unchanged Stan programs, we recommend calling
#> rstan_options(auto_write = TRUE)
#> For within-chain threading using `reduce_sum()` or `map_rect()` Stan functions,
#> change `threads_per_chain` option:
#> rstan_options(threads_per_chain = 1)
#>
#> Attaching package: 'rstan'
#> The following object is masked from 'package:tidyr':
#>
#> extract
library(brms)
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.22.0). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#>
#> Attaching package: 'brms'
#> The following objects are masked from 'package:RStanTVA':
#>
#> fixef, ranef
#> The following object is masked from 'package:rstan':
#>
#> loo
#> The following object is masked from 'package:stats':
#>
#> ar
We want to make use of the parallelization built into Stan by activating the corresponding option in RStan:
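# Run the MCMC chains in parallel on all available cores, as recommended by
# the rstan startup message above:
options(mc.cores = parallel::detectCores())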
Each trial is simulated assuming partial report in TVA with a global processing capacity \(C\), top-down selectivity \(\alpha\), hemispheric bias \(r\), probit-distributed memory capacity \(K\) (parameters \(\mu_K\) and \(\sigma_K\)), and Gaussian sensory thresholds \(t_0\) (parameters \(\mu_0\) and \(\sigma_0\)).
For easier interpretation, we will later on only look at the expected \(K\) values instead of the distributional parameters \(\mu_K\) and \(\sigma_K\). The expected value (mean) can be derived from \((\mu_K,\sigma_K)\) samples like this:
meanprobitK <- Vectorize(function(mK, sigmaK, nS) {
  # CDF of the latent normal at the possible scores 1..nS
  p <- pnorm(1:nS, mK, sigmaK)
  # bin probabilities P(K = k) for k = 0..nS (CDF clipped to 0 and 1 at the boundaries)
  lps <- c(0, p)
  ups <- c(p, 1)
  # expected value: sum of k * P(K = k)
  sum(0:nS * (ups - lps))
}, c("mK", "sigmaK"))
The same is often done for \(t_0\), where the expected value of a Gaussian \(t_0\) is simply \(\mu_0\).
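Explicitly, meanprobitK computes
\[ \mathbb{E}[K]=\sum_{k=0}^{n_S}k\,\bigl(\Phi(k+1;\mu_K,\sigma_K)-\Phi(k;\mu_K,\sigma_K)\bigr), \]
where \(\Phi\) is the normal CDF, with the boundary conventions \(\Phi(0;\cdot,\cdot)=0\) and \(\Phi(n_S+1;\cdot,\cdot)=1\) so that the probabilities sum to one over the admissible values \(0,\dots,n_S\).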
Note: For an interactive demonstration of simulated TVA trials, see the
ShinyTVA app that ships with the package.
Processing speed \(C\) differs between the “low” and “high” conditions so that \(C_\textrm{low}=80\) and \(C_\textrm{high}=100\). There are no true differences between the conditions for any of the other model parameters.
Each of the 50 subjects has an individual parameter configuration, consisting of a subject-level intercept \(a\) and a condition effect \(b\) per model parameter, all of which were randomly sampled from a multivariate normal distribution:
\[ \begin{pmatrix} a_{\log C}\\ b_{\log C}\\ a_{\log\alpha}\\ b_{\log\alpha}\\ a_{\mu_K}\\ b_{\mu_K}\\ a_{\log\sigma_K}\\ b_{\log\sigma_K}\\ a_{\log\sigma_0}\\ b_{\log\sigma_0}\\ a_{\mu_0}\\ b_{\mu_0}\\ a_{\textrm{logit}(r)}\\ b_{\textrm{logit}(r)} \end{pmatrix}\sim\mathcal{N}_{14}\left(\begin{pmatrix} \log 80\\\log\frac{100}{80}\\\log 0.7\\0\\3.5\\0\\-0.7\\0\\2\\0\\20\\0\\0\\0 \end{pmatrix},\begin{pmatrix} 0.04&-0.002&-0.012&0&0&0&0&0&0&0&0&0&0&0\\-0.002&0.0025&0&0&0&0&0&0&0&0&0&0&0&0\\-0.012&0&0.04&0&0&0&0&0&0&0&0&0&0&0\\0&0&0&0.0025&0&0&0&0&0&0&0&0&0&0\\0&0&0&0&0.5625&0&0.015&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&0&0&0&0&0&0\\0&0&0&0&0.015&0&0.01&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&0.04&0&0&0&0&0\\0&0&0&0&0&0&0&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&0&0&100&0&0&0\\0&0&0&0&0&0&0&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&0&0&0&0&0.01&0\\0&0&0&0&0&0&0&0&0&0&0&0&0&0 \end{pmatrix}\right) \]
The data frame tva_recovery contains simulated CombiTVA
trials for 50 subjects (234 trials per subject, i.e. 11,700 trials in total).
Half of each subject's trials were performed in a “low” condition
and the other half in a “high” condition:
data("tva_recovery")
head(tva_recovery)
#> # A tibble: 6 × 9
#> # Groups: subject [1]
#> subject trial T condition nD S[,1] [,2] D[,1] R[,1] true_values$t[,1]
#> <int> <int> <dbl> <fct> <int> <int> <int> <int> <int> <dbl>
#> 1 1 1 200 low 0 1 1 0 1 -6.37
#> 2 1 2 150 high 3 1 1 1 0 240.
#> 3 1 3 16.6 low 0 1 1 0 0 136.
#> 4 1 4 50 low 0 1 1 0 0 302.
#> 5 1 5 33.3 high 0 1 1 0 0 29.7
#> 6 1 6 200 low 0 1 1 0 0 36.2
#> # ℹ 14 more variables: S[3:6] <int>, D[2:6] <int>, R[2:6] <int>,
#> # true_values$t[2:6] <dbl>, true_values$v <dbl[,6]>, $t0 <dbl>, $K <dbl>,
#> # $C <dbl>, $alpha <dbl>, $w <dbl[,2]>, $mu0 <dbl>, $sigma0 <dbl>, $mK <dbl>,
#> #   $sK <dbl>
The package also contains the true values used for the simulation (including the subject-level parameters):
data("tva_recovery_true_params")
str(tva_recovery_true_params)
#> List of 5
#> $ b : Named num [1:14] 4.382 0.223 -0.357 0 3.5 ...
#> ..- attr(*, "names")= chr [1:14] "C_Intercept" "C_conditionhigh" "alpha_Intercept" "alpha_conditionhigh" ...
#> $ s_subject : Named num [1:14] 0.2 0.05 0.2 0.05 0.75 0 0.1 0 0.2 0 ...
#> ..- attr(*, "names")= chr [1:14] "C_Intercept" "C_conditionhigh" "alpha_Intercept" "alpha_conditionhigh" ...
#> $ r_subject : num [1:14, 1:14] 1 -0.2 -0.3 0 0 0 0 0 0 0 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:14] "C_Intercept" "C_conditionhigh" "alpha_Intercept" "alpha_conditionhigh" ...
#> .. ..$ : chr [1:14] "C_Intercept" "C_conditionhigh" "alpha_Intercept" "alpha_conditionhigh" ...
#> $ z_subject : num [1:50, 1:14] 0.02 -0.114 1.22 -0.345 0.664 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : NULL
#> .. ..$ : chr [1:14] "C_Intercept" "C_conditionhigh" "alpha_Intercept" "alpha_conditionhigh" ...
#> $ coef_subject: tibble [100 × 9] (S3: tbl_df/tbl/data.frame)
#> ..$ subject : int [1:100] 1 1 2 2 3 3 4 4 5 5 ...
#> ..$ condition: Factor w/ 2 levels "high","low": 1 2 1 2 1 2 1 2 1 2 ...
#> ..$ C : Named num [1:100] 104.7 80.3 93.5 78.2 135.3 ...
#> .. ..- attr(*, "names")= chr [1:100] "high" "low" "high" "low" ...
#> ..$ alpha : Named num [1:100] 0.487 0.499 0.578 0.579 0.855 ...
#> .. ..- attr(*, "names")= chr [1:100] "high" "low" "high" "low" ...
#> ..$ w : num [1:100, 1:2] 0.506 0.506 0.471 0.471 0.511 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. .. ..$ : chr [1:100] "high" "low" "high" "low" ...
#> .. .. ..$ : NULL
#> ..$ mu0 : Named num [1:100] 5.64 5.64 13.71 13.71 22.44 ...
#> .. ..- attr(*, "names")= chr [1:100] "high" "low" "high" "low" ...
#> ..$ sigma0 : Named num [1:100] 8.2 8.2 7.4 7.4 6.77 ...
#> .. ..- attr(*, "names")= chr [1:100] "high" "low" "high" "low" ...
#> ..$ mK : Named num [1:100] 2.43 2.43 1.65 1.65 3.86 ...
#> .. ..- attr(*, "names")= chr [1:100] "high" "low" "high" "low" ...
#> ..$ sK : Named num [1:100] 0.529 0.529 0.497 0.497 0.511 ...
#> .. ..- attr(*, "names")= chr [1:100] "high" "low" "high" "low" ...
These can be conveniently reshaped into a long-format data frame for later comparison with the fitted values:
true_params <- tva_recovery_true_params$coef_subject
true_params_narrow <- true_params %>%
  mutate(
    `italic(w)[lambda]` = w[,2],
    `italic(K)` = meanprobitK(mK, sK, 6)
  ) %>%
  select(-w, -mK, -sK) %>%
  rename(`italic(t)[0]` = mu0, `sigma[0]` = sigma0, `italic(C)` = C) %>%
  pivot_longer(-c(subject, condition), names_to = "param", values_to = "true_value")
head(true_params_narrow)
#> # A tibble: 6 × 4
#> subject condition param true_value
#> <int> <fct> <chr> <dbl>
#> 1 1 high italic(C) 105.
#> 2 1 high alpha 0.487
#> 3 1 high italic(t)[0] 5.64
#> 4 1 high sigma[0] 8.20
#> 5 1 high italic(w)[lambda] 0.494
#> 6 1 high italic(K) 1.94
true_params_narrow2 <- bind_rows(
true_params_narrow %>% filter(condition == "low" & param != "italic(C)") %>% select(-condition),
true_params_narrow %>% filter(param == "italic(C)") %>% transmute(subject, param = sprintf("%s[%s]", param, condition), true_value)
)
head(true_params_narrow2)
#> # A tibble: 6 × 3
#> subject param true_value
#> <int> <chr> <dbl>
#> 1 1 alpha 0.499
#> 2 1 italic(t)[0] 5.64
#> 3 1 sigma[0] 8.20
#> 4 1 italic(w)[lambda] 0.494
#> 5 1 italic(K) 1.94
#> 6 2 alpha 0.579
Note: The R script that generates tva_recovery
is located in the package source at
data-raw/tva_recovery.R.
For a single trial, given the effective exposure duration \(\tau=t-t_0\) of the set of displayed items \(S\), and predicted individual processing rates \(\mathbf v\) for those items, the probability of a memory report \(M\) is given as:
\[ P_{\mathrm{M}}\left(M\mid\tau,S,K,\mathbf{v}\right) =\begin{cases} 1 & \textrm{if }M=\emptyset\textrm{ and }K=0\\ 1 & \textrm{if }M=\emptyset\textrm{ and }\tau=0\\ \bar{F}\left(\tau;\sum_{z\in S}v_{z}\right) & \textrm{if }M=\emptyset\textrm{ and }K>0\textrm{ and }\tau>0\\ \prod_{y\in M}F\left(\tau;v_{y}\right)\prod_{z\in S\setminus M}\bar{F}\left(\tau;v_{z}\right) & \textrm{if }0<\left|M\right|<K\leq\left|S\right|\textrm{ and }\tau>0\\ \prod_{y\in M}F\left(\tau;v_{y}\right)\prod_{z\in S\setminus M}\bar{F}\left(\tau;v_{z}\right) & \textrm{if }0<\left|M\right|=\left|S\right|\leq K\textrm{ and }\tau>0\\ \sum_{x\in M}\int_{0}^{\tau}f\left(t;v_{x}\right)\prod_{y\in M\setminus\left\{ x\right\} }F\left(t;v_{y}\right)\prod_{z\in S\setminus M}\bar{F}\left(t;v_{z}\right)\mathrm{d}t & \textrm{if }0<\left|M\right|=K<\left|S\right|\textrm{ and }\tau>0\\ 0 & \textrm{otherwise} \end{cases}\ \]
For partial reports (i.e., when the display set contains distractors \(S_{{\rm D}}\) as well as targets \(S_{{\rm T}}\)), we need to iterate over all potential subsets of \(S_{{\rm D}}\) with at most \(K-\left|R_{{\rm T}}\right|\) elements, i.e. the restricted power set \(\mathcal{P}_{\leq K-\left|R_{{\rm T}}\right|}\left(S_{{\rm D}}\right)\), that could have made it into VSTM before \(\tau\), in addition to the actually reported target items \(R_\mathrm{T}\):
\[ P_{{\rm PR}}\left(R_{{\rm T}}\mid\tau,S_{{\rm T}},S_{{\rm D}},K,\mathbf{v}\right) =\begin{cases} \sum_{R_{{\rm D}}\in\mathcal{P}_{\leq K-\left|R_{{\rm T}}\right|}\left(S_{{\rm D}}\right)}P_{{\rm M}}\left(R_{{\rm T}}\cup R_{{\rm D}}\mid\tau,S_{{\rm T}}\cup S_{{\rm D}},K,\mathbf{v}\right) & \textrm{if }S_{{\rm D}}\neq\emptyset\textrm{ and }\tau\geq0\\ P_{{\rm M}}\left(R_{{\rm T}}\mid\tau,S_{{\rm T}},K,\mathbf{v}\right) & \textrm{if }S_{{\rm D}}=\emptyset\textrm{ and }\tau\geq0\\ 0 & \textrm{otherwise} \end{cases} \]
Note that \(P_{{\rm PR}}\) simplifies to \(P_{\mathrm{M}}\) if there are no distractors (\(S_{{\rm D}}=\emptyset\)), i.e. when the trial is effectively a whole-report trial.
Being based on RStan, RStanTVA can be used in different ways: with maximum-likelihood estimation (MLE) or with Bayesian inference, and at different levels of hierarchical complexity (single-condition, single-participant/fixed-effects, fully hierarchical/mixed-effects). Hence, we can also recover the model parameters in several different ways.
Using MLE, we can define four different inference models:
The most straightforward model uses no regularization through priors
(priors = FALSE) and estimates a single value for each model
parameter across all conditions:
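# A sketch of such a baseline specification (the name m_ml and the reuse of
# the structural arguments from m_nested_reg below are our assumptions):
m_ml <- stantva_model(
  locations = 6,
  task = "pr",
  regions = list(left = 1:3, right = 4:6),
  w_mode = "regions",
  t0_mode = "gaussian",
  K_mode = "probit",
  priors = FALSE
)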
By defining a regression model for \(\log
C\), which includes an intercept and a treatment effect of “high”
vs. “low” (log(C) ~ 1 + condition), we can model both
experimental conditions in a single model:
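# A sketch (the name m_ml_nested is ours); identical to m_nested_reg further
# below, except that priors are disabled:
m_ml_nested <- stantva_model(
  formula = list(log(C) ~ 1 + condition),
  locations = 6,
  task = "pr",
  regions = list(left = 1:3, right = 4:6),
  w_mode = "regions",
  t0_mode = "gaussian",
  K_mode = "probit",
  priors = FALSE
)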
Omitting the priors = FALSE argument falls back to the
default priors = NULL, which uses vaguely informative
default priors for all parameters. The model is otherwise identical to
the baseline RStanTVA-ML model above:
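# A sketch (the name m_ml_default is ours); identical to the hypothetical
# m_ml above, except that priors = FALSE is omitted (i.e. priors = NULL):
m_ml_default <- stantva_model(
  locations = 6,
  task = "pr",
  regions = list(left = 1:3, right = 4:6),
  w_mode = "regions",
  t0_mode = "gaussian",
  K_mode = "probit"
)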
When using custom regularizing priors, a parameter that is described by a regression formula, such as \(C\) here, needs priors for both the intercept and the slope coefficients:
priors <-
prior(normal(0,.5),dpar=C)+
prior(normal(4.5,.6),dpar=C,coef=Intercept)
m_nested_reg <- stantva_model(
formula = list(log(C) ~ 1 + condition),
locations = 6,
task = "pr",
regions = list(left = 1:3, right = 4:6),
w_mode = "regions",
t0_mode = "gaussian",
K_mode = "probit",
save_log_lik = FALSE,
priors = priors
)
For brevity, we only report results of the RStanTVA-MLRN fits here, since those have been found to be the most reliable (see manuscript). The model can be fitted to the individual data sets (one per participant) as follows:
fit_mle_nested_reg <- lapply(1:50, function(i) {
  d <- tva_recovery %>% filter(subject == i)
  # retry until the optimizer converges (return code 0)
  repeat {
    p <- optimizing(m_nested_reg, d)
    if(p$return_code == 0L) break
  }
  tibble(
    subject = i,
    param = c("italic(C)[low]","italic(C)[high]","alpha","italic(w)[lambda]","italic(t)[0]","sigma[0]","italic(K)"),
    fitted_value = c(exp(p$par["C_Intercept"]), exp(p$par["C_Intercept"]+p$par["C_conditionhigh"]),
                     p$par[c("alpha","r[2]","mu0","sigma0")], meanprobitK(p$par["mK"],p$par["sK"],6)),
    converged = p$return_code == 0L
  )
}) %>% bind_rows() %>% left_join(true_params_narrow2)
The critical function above is
optimizing(m_nested_reg, d), which uses MLE to fit
m_nested_reg to the subject-level subset
d.
We can, in principle, fit the exact same models as above
using Bayesian inference instead of MLE, simply by replacing the
optimizing(...) call with sampling(...).
The following code thus fits the RStanTVA-MLRN
(m_nested_reg) model to the subject-level subsets
d using HMC sampling in Stan:
fit_bayesian_nested <- lapply(1:50, function(i) {
d <- tva_recovery %>% filter(subject == i)
sf <- sampling(m_nested_reg, d)
p1 <- predict(sf, data.frame(subject = i, condition = "low"), c("C","alpha", "r", "mu0","sigma0","mK","sK"))
p2 <- predict(sf, data.frame(subject = i, condition = "high"), "C")
tibble(
`italic(C)[low]` = t(p1$C),
`italic(C)[high]` = t(p2$C),
`alpha` = t(p1$alpha),
`italic(w)[lambda]` = t(p1$r[,2,1]),
`italic(t)[0]` = t(p1$mu0),
`sigma[0]` = t(p1$sigma0),
`italic(K)` = t(meanprobitK(p1$mK,p1$sK,6))
) %>%
pivot_longer(everything(), names_to = "param", values_to = "samples") %>%
rowwise(param) %>%
reframe(
  subject = i,
  posterior_sd = sd(samples),
  fitted_value = median(samples),
  cri = t(quantile(samples, c(.025, .975))),
  converged = Rhat(matrix(samples, ncol = sf@sim$chains)) < 1.1
)
}) %>%
bind_rows() %>%
left_join(true_params_narrow2)
Note that we must use the method
predict(fit, data, parameters) to predict the TVA parameter
values from the fitted parameters. This returns a posterior
distribution for each row (subject) in data and each
specified parameter. We then calculate medians, 95% credible intervals
(CrIs), posterior SDs, and \(\hat R\) as summaries of the parameter
estimates and their convergence.
For hierarchical inference, we define mixed-effects regressions for the model parameters \(C\), \(\log\alpha\), \(\mu_K\), \(\mu_0\), and \(\textrm{logit}(r)\), while keeping \(\sigma_0\) and \(\sigma_K\) fixed across all subjects:
m_hierarchical <- stantva_model(
formula = list(
C ~ 1 + condition + (1 + condition | subject),
log(alpha) ~ 1 + (1 | subject),
mK ~ 1 + (1 | subject),
mu0 ~ 1 + (1 | subject),
log(r) ~ 1 + (1 | subject)
),
locations = 6,
task = "pr",
regions = list(left = 1:3, right = 4:6),
C_mode = "equal",
w_mode = "regions",
t0_mode = "gaussian",
K_mode = "probit",
priors =
prior(normal(90, 20/sqrt2()), coef = Intercept, dpar = C) +
prior(normal(0, 10/sqrt2()), dpar = C) +
prior(gamma(2, 2/20 * sqrt2()), class = sd, coef = Intercept, dpar = C) +
prior(gamma(2, 2/10 * sqrt2()), class = sd, dpar = C) +
prior(normal(-0.4, 0.6/sqrt2()), coef = Intercept, dpar = alpha) +
prior(normal(0, 0.3/sqrt2()), dpar = alpha) +
prior(gamma(2, 2/0.6 * sqrt2()), class = sd, coef = Intercept, dpar = alpha) +
prior(gamma(2, 2/0.3 * sqrt2()), class = sd, dpar = alpha) +
prior(normal(0, 0.1/sqrt2()), coef = Intercept, dpar = r) +
prior(normal(0, 0.05/sqrt2()), dpar = r) +
prior(gamma(2, 2/0.1 * sqrt2()), class = sd, coef = Intercept, dpar = r) +
prior(gamma(2, 2/0.05 * sqrt2()), class = sd, dpar = r) +
prior(normal(3.5, 0.5/sqrt2()), coef = Intercept, dpar = mK) +
prior(normal(0, 0.25/sqrt2()), dpar = mK) +
prior(gamma(2, 2/0.5 * sqrt2()), class = sd, coef = Intercept, dpar = mK) +
prior(gamma(2, 2/0.25 * sqrt2()), class = sd, dpar = mK) +
prior(normal(20, 15/sqrt2()), coef = Intercept, dpar = mu0) +
prior(normal(0, 7.5/sqrt2()), dpar = mu0) +
prior(gamma(2, 2/15 * sqrt2()), class = sd, coef = Intercept, dpar = mu0) +
prior(gamma(2, 2/7.5 * sqrt2()), class = sd, dpar = mu0) +
prior(normal(0.5, 0.05/sqrt2()), coef = Intercept, dpar = sK) +
prior(normal(0, 0.025/sqrt2()), dpar = sK) +
prior(gamma(2, 2/0.05 * sqrt2()), class = sd, coef = Intercept, dpar = sK) +
prior(gamma(2, 2/0.025 * sqrt2()), class = sd, dpar = sK) +
prior(normal(20, 15/sqrt2()), coef = Intercept, dpar = sigma0) +
prior(normal(0, 7.5/sqrt2()), dpar = sigma0) +
prior(gamma(2, 2/15 * sqrt2()), class = sd, coef = Intercept, dpar = sigma0) +
prior(gamma(2, 2/7.5 * sqrt2()), class = sd, dpar = sigma0)
)
We can then simply fit the hierarchical model
m_hierarchical to the entire data set
tva_recovery without subsetting:
fg <- sampling(m_hierarchical, tva_recovery)
fit_bayesian_hierarchical_globals <-
{
p1 <- predict(fg, tibble(subject = 1:50, condition = "low"), c("C","alpha", "r", "mu0","sigma0","mK","sK"))
p2 <- predict(fg, tibble(subject = 1:50, condition = "high"), "C")
tibble(
subject = 1:50,
`italic(C)[low]` = t(p1$C),
`italic(C)[high]` = t(p2$C),
`alpha` = t(p1$alpha),
`italic(w)[lambda]` = t(p1$r[,2,]),
`italic(t)[0]` = t(p1$mu0),
`sigma[0]` = t(p1$sigma0),
`italic(K)` = t(matrix(meanprobitK(p1$mK,p1$sK,6),ncol=50))
)
} %>%
pivot_longer(-subject, names_to = "param", values_to = "samples") %>%
rowwise(c(subject,param)) %>%
  reframe(
    posterior_sd = sd(samples),
    fitted_value = median(samples),
    cri = t(quantile(samples, c(.025, .975))),
    converged = Rhat(t(samples)) < 1.1
  ) %>%
  left_join(true_params_narrow2)
Finally, we combine the fits from all three methods and plot the recovered values against the true values:
figdat <- bind_rows(
fit_mle_nested_reg %>% add_column(method = "MLRN"),
fit_bayesian_hierarchical_globals %>% add_column(method = "HB"),
fit_bayesian_nested %>% add_column(method = "NB")
) %>%
mutate(method = factor(method, levels = c("MLRN","NB","HB")))
fig <- ggplot(figdat) +
theme_minimal() +
theme(text = element_text(family = "sans", size = 10)) +
labs(x = "True value", y = "Recovered value") +
facet_wrap(method~param, ncol = 7, scales = "free", labeller = function(x) label_parsed(list(sprintf("%s~(\"%s\")", x$param, x$method)))) +
geom_abline(linetype = "dashed") +
geom_linerange(aes(x=true_value,ymin=cri[,1],ymax=cri[,2]), color = "gray50", linewidth = 0.2) +
geom_point(aes(x=true_value,y=fitted_value), size = 0.5, color = "blue") +
geom_point(aes(x=vx,y=vy), color = "transparent", figdat %>% group_by(param) %>% reframe(vx = range(true_value), vy = range(c(fitted_value,cri), na.rm = TRUE)))
print(fig)