Pairs of R scripts for fitting models to data, with the corresponding model definitions

Aim of this vignette

[Figure: Radiograph]

Provides the following:

Notation

By \(\Phi()\), we mean the cumulative distribution function of the Gaussian distribution with mean 0 and variance 1.

Single reader and single modality

Example data:

An R object named d

Recall that the data object \(d\), which appears later in the R console, has the following format:

Confidence level          Number of hits    Number of false alarms
3 = definitely present    \(H_{3}=97\)      \(F_{3}=1\)
2 = equivocal             \(H_{2}=32\)      \(F_{2}=14\)
1 = questionable          \(H_{1}=31\)      \(F_{1}=74\)

Further, let \(N_L\) and \(N_I\) denote the number of lesions and the number of images, respectively. In the data d, \(N_L=259\) and \(N_I=57\).
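
As a small illustration, the counts in the table can be stored in an R list and turned into cumulative fractions. This is only a sketch: the object name d_sketch and the field names c, h, f, NL, NI and C are assumptions made here for illustration, not necessarily the format the package expects.

```r
# Hypothetical container for the example counts; all names are illustrative.
d_sketch <- list(
  c  = c(3, 2, 1),      # confidence levels, highest first
  h  = c(97, 32, 31),   # hits H_3, H_2, H_1
  f  = c(1, 14, 74),    # false alarms F_3, F_2, F_1
  NL = 259,             # number of lesions
  NI = 57,              # number of images
  C  = 3                # number of confidence levels
)

# Cumulative counts: ratings at or above each confidence level.
cum_h <- cumsum(d_sketch$h)
cum_f <- cumsum(d_sketch$f)

tpf <- cum_h / d_sketch$NL   # true positive fraction, per lesion
fpf <- cum_f / d_sketch$NI   # false positive fraction, per image

print(data.frame(level = d_sketch$c, TPF = round(tpf, 3), FPF = round(fpf, 3)))
```

Note that the per-image false positive fraction can exceed 1, because a single image may carry several false alarms.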

For multiple-reader, multiple-case data, such hits and false alarms are counted for each modality and reader; we omit the details here because they are complex.

False positive fraction per image

R script

In this R script, the function rstan::sampling() runs implicitly, so a Stan file is used. Below, I explain the model described in the Stan file used by the above code.

Implemented model

First, for simplicity, we state the definition of the Bayesian model without any explanation or proof:

\[\begin{eqnarray*}
H_{c} & \sim & \text{Binomial}(p_{c}, N_{L}), \text{ for $c=1,2,...,C$.}\\
F_{c} & \sim & \text{Poisson}((\lambda_{c} - \lambda_{c+1}) \times N_{I}), \text{ for $c=1,2,...,C-1$.}\\
\lambda_{c} & = & -\log \Phi(z_{c}), \text{ for $c=1,2,...,C$.}\\
p_{c} & = & \Phi\left(\frac{z_{c+1}-\mu}{\sigma}\right) - \Phi\left(\frac{z_{c}-\mu}{\sigma}\right), \text{ for $c=1,2,...,C-1$.}\\
p_{C} & = & 1 - \Phi\left(\frac{z_{C}-\mu}{\sigma}\right),\\
F_{C} & \sim & \text{Poisson}((\lambda_{C} - 0) N_{I}),\\
dz_{c} = z_{c+1} - z_{c} & \sim & \text{Uniform}(0,\infty), \text{ for $c=1,2,...,C-1$.}\\
\mu & \sim & \text{Uniform}(-\infty,\infty),\\
\sigma & \sim & \text{Uniform}(0,\infty).
\end{eqnarray*}\]

Our model has parameters \(z_{1}, dz_1, dz_2, \cdots, dz_{C-1}\), \(\mu\), and \(\sigma\). The notation \(\text{Uniform}(-\infty,100000)\) denotes the improper uniform distribution whose support is the unbounded interval \((-\infty,100000)\).
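
To make the deterministic part of this model concrete, the following sketch evaluates the hit rates \(p_c\) and the Poisson means for given parameter values, using only base R. The function name froc_rates and the parameter values are assumptions chosen for illustration; this is not the code in the Stan file.

```r
# Sketch: hit rates p_c and Poisson means for the per-image model.
# z must be increasing thresholds z_1 < ... < z_C (index c = 1 is the lowest level).
froc_rates <- function(z, mu, sigma, NI) {
  stopifnot(length(z) >= 1, all(diff(z) > 0), sigma > 0, NI > 0)
  lambda <- -log(pnorm(z))               # lambda_c = -log Phi(z_c)
  lambda_next <- c(lambda[-1], 0)        # lambda_{c+1}, with 0 after the top level
  u <- pnorm((z - mu) / sigma)           # Phi((z_c - mu) / sigma)
  p <- c(u[-1], 1) - u                   # p_c for c < C, and p_C = 1 - u_C
  list(p = p, fp_mean = (lambda - lambda_next) * NI)
}

# Hypothetical parameter values, with N_I = 57 as in the example data.
z <- c(-0.5, 0.5, 1.5)
out <- froc_rates(z, mu = 1, sigma = 1.5, NI = 57)
print(out)
```

Because both differences telescope, the hit rates sum to \(1-\Phi((z_1-\mu)/\sigma)\) and the Poisson means sum to \(\lambda_1 N_I\), which gives a quick sanity check on any implementation.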

False positive fraction per lesion

R script

Next, we show the model fitted by the following code:

Implemented model

\[\begin{eqnarray*}
H_{c} & \sim & \text{Binomial}(p_{c}, N_{L}), \text{ for $c=1,2,...,C$.}\\
F_{c} & \sim & \text{Poisson}((\lambda_{c} - \lambda_{c+1}) \times N_{L}), \text{ for $c=1,2,...,C-1$.}\\
\lambda_{c} & = & -\log \Phi(z_{c}), \text{ for $c=1,2,...,C$.}\\
p_{c} & = & \Phi\left(\frac{z_{c+1}-\mu}{\sigma}\right) - \Phi\left(\frac{z_{c}-\mu}{\sigma}\right), \text{ for $c=1,2,...,C-1$.}\\
p_{C} & = & 1 - \Phi\left(\frac{z_{C}-\mu}{\sigma}\right),\\
F_{C} & \sim & \text{Poisson}((\lambda_{C} - 0) N_{L}),\\
dz_{c} = z_{c+1} - z_{c} & \sim & \text{Uniform}(0,\infty), \text{ for $c=1,2,...,C-1$.}\\
\mu & \sim & \text{Uniform}(-\infty,\infty),\\
\sigma & \sim & \text{Uniform}(0,\infty).
\end{eqnarray*}\]

Our model has parameters \(z_{1}, dz_1, dz_2, \cdots, dz_{C-1}\), \(\mu\), and \(\sigma\). The notation \(\text{Uniform}(-\infty,100000)\) denotes the improper uniform distribution whose support is the unbounded interval \((-\infty,100000)\).

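The second model differs from the first model in the Poisson part: the Poisson rates are scaled by the number of lesions \(N_L\) rather than the number of images \(N_I\).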
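
The difference between the two models can be sketched as a single log-likelihood with a switch. This is a minimal illustration under stated assumptions: the function name froc_loglik, the argument per_lesion, and the parameter values are hypothetical, and the sketch assumes that the same factor (\(N_I\) or \(N_L\)) multiplies every Poisson rate, including the one for the highest confidence level.

```r
# Sketch: log-likelihood of the single-reader model, per image or per lesion.
# h and f are ordered by confidence level c = 1, ..., C (lowest level first).
froc_loglik <- function(z, mu, sigma, h, f, NL, NI, per_lesion = FALSE) {
  stopifnot(all(diff(z) > 0), sigma > 0, length(h) == length(z),
            length(f) == length(z))
  u <- pnorm((z - mu) / sigma)
  p <- c(u[-1], 1) - u                       # hit rates p_1, ..., p_C
  lambda <- -log(pnorm(z))                   # lambda_c = -log Phi(z_c)
  scale <- if (per_lesion) NL else NI        # assumed common scaling factor
  fp_mean <- (lambda - c(lambda[-1], 0)) * scale
  sum(dbinom(h, size = NL, prob = p, log = TRUE)) +
    sum(dpois(f, lambda = fp_mean, log = TRUE))
}

# Example counts from the table, reordered so that c = 1 comes first.
h <- c(31, 32, 97)
f <- c(74, 14, 1)
NL <- 259
NI <- 57
z <- c(-0.5, 0.5, 1.5)   # hypothetical thresholds

ll_image  <- froc_loglik(z, mu = 1, sigma = 1.5, h, f, NL, NI, per_lesion = FALSE)
ll_lesion <- froc_loglik(z, mu = 1, sigma = 1.5, h, f, NL, NI, per_lesion = TRUE)
print(c(per_image = ll_image, per_lesion = ll_lesion))
```

The binomial term is identical in both calls; only the Poisson term changes, which is why the two values differ even though the parameters are the same.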

Multiple reader and multiple case

R script

Here, dd is a dataset with multiple readers and multiple modalities; it is neither shown nor explained here, for simplicity.

The R object fit is a fitted model object.

Next, we show the Bayesian model used by the above code.

Implemented model

\[\begin{eqnarray*}
H_{c,m,r} & \sim & \text{Binomial}(p_{c,m,r}, N_L),\\
F_{c,m,r} & \sim & \text{Poisson}((\lambda_{c} - \lambda_{c+1}) N_L),\\
\lambda_{c} & = & -\log \Phi(z_{c}),\\
p_{c,m,r} & := & \Phi\left(\frac{z_{c+1}-\mu_{m,r}}{\sigma_{m,r}}\right) - \Phi\left(\frac{z_{c}-\mu_{m,r}}{\sigma_{m,r}}\right),\\
p_{C,m,r} & := & 1 - \Phi\left(\frac{z_{C}-\mu_{m,r}}{\sigma_{m,r}}\right),\\
F_{C,m,r} & \sim & \text{Poisson}((\lambda_{C} - 0) N_I),\\
A_{m,r} & := & \Phi\left(\frac{\mu_{m,r}/\sigma_{m,r}}{\sqrt{(1/\sigma_{m,r})^2+1}}\right),\\
A_{m,r} & \sim & \text{Normal}(A_{m}, \sigma_{r}^2),\\
dz_c & := & z_{c+1} - z_{c},\\
dz_c, \sigma_{m,r} & \sim & \text{Uniform}(0,\infty),\\
z_{c} & \sim & \text{Uniform}(-\infty,100000),\\
A_{m} & \sim & \text{Uniform}(0,1).
\end{eqnarray*}\]

Our new model has parameters \(z_{1}, dz_1, dz_2, \cdots, dz_{C-1}\), \(A_{m}\), \(\sigma_{r}\), \(\mu_{m,r}\), and \(\sigma_{m,r}\).
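
The quantity \(A_{m,r}\) is a deterministic function of \(\mu_{m,r}\) and \(\sigma_{m,r}\), so it can be evaluated directly. The sketch below does this in base R for a small grid of modalities and readers; the function name auc_from_params and the numerical values are assumptions for illustration only.

```r
# Sketch: A_{m,r} = Phi( (mu/sigma) / sqrt((1/sigma)^2 + 1) ).
auc_from_params <- function(mu, sigma) {
  stopifnot(all(sigma > 0))
  pnorm((mu / sigma) / sqrt((1 / sigma)^2 + 1))
}

# Hypothetical values for 2 modalities (rows) and 3 readers (columns).
mu    <- matrix(c(0.5, 1.0, 0.8, 1.2, 0.6, 0.9), nrow = 2)
sigma <- matrix(c(1.0, 2.0, 1.5, 1.0, 0.5, 1.2), nrow = 2)

A <- auc_from_params(mu, sigma)
print(round(A, 3))
```

For \(\sigma_{m,r}>0\) the argument of \(\Phi\) simplifies algebraically to \(\mu_{m,r}/\sqrt{1+\sigma_{m,r}^2}\), so \(A_{m,r}=0.5\) when \(\mu_{m,r}=0\) and \(A_{m,r}\) increases with \(\mu_{m,r}\).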

The following sections are redundant and need not be read

Divergent transitions (no need to read this section)

In the above model, the author writes down the highest confidence level separately, because otherwise it causes divergent transitions. In the past, I wrote the model under the assumption that the highest threshold \(z_{C+1}\) is a very large number (theoretically infinity) whose prior is a uniform distribution with very large support; this caused divergent transitions in almost all MCMC iterations. I also use the target formulation to avoid Jacobian warnings; such warnings are difficult for me to resolve, or even to understand what they say and how to overcome them.

One may think my models are very complex. In fact, many people will hate my explanation. One reason my model is so complex is that the FROC statistical model itself is very complex and cannot be understood without effort from the reader. Furthermore, the description of my model was changed to avoid the divergent transition issues, which makes my model more complex still. I also had to overcome Jacobian issues, whose formulation is not intuitive to me.

I am very tired of explaining my model, and I have also lost my ability to do so :’-D

These models were made two years ago, from May to August 2017.

I briefly explain the divergent transitions in my model. From a theoretical perspective, it is natural to introduce the highest decision threshold as a very large number that is a parameter of the model. Unfortunately, its prior is then a uniform distribution with very large support. Once such a theoretically infinite parameter is introduced into a numerical program, it causes divergent transitions.

Please read my paper

This vignette is the most important one. Readers who wish to understand the theory should read my paper.

My paper cannot be uploaded to arXiv, since I cannot find anyone to endorse it.

I also …

include in this package other hierarchical models, but I do not exhibit all of them here. My health is very bad and I am tired, sorry.

Future research direction

ANOVA model

In the FROC model, we need to … my aches stop me from writing … goodbye!!