Theory of FROC for Bayesian context on a single reader and single modality

Issei Tsunoda

2019-05-04

Step by step explanation of the FROC theory

Heuristic and intuitive explanation

I have read several papers and books that explain FROC models, but I think none of them explains the theory of the FROC model successfully. My own explanation is probably not sufficient either, but I attempt to explain it slowly and heuristically. I hope this brief review of the FROC theory helps the users of this package.

The FROC statistical model is quite complex, so we approach the theory slowly and do not begin with a rigorous treatment.

To show what FROC is, we first recall ROC analysis.

Word

In Radiological context,

We also use the word

Note that

Note that

ROC task (trial)

ROC analysis aims to evaluate the ability to classify a dichotomous outcome, such as diseased or non-diseased.

That is, in each trial the reader can choose between only two answers, so each trial generates exactly one hit or one false alarm.

In the FROC paradigm, the reader can mark several locations in each trial, which can lead to many hits and many false alarms per trial.

\(\dagger\) : Gold standard: it means that each shadow in the radiographs can be identified as diseased or not.

FROC task (trial)

\(\dagger\) : Gold standard: it means that each shadow in the radiographs can be identified as diseased or not.

\({\dagger \dagger }\): Hit \(H_c\) means True Positive (TP)

\({\dagger \dagger \dagger}\): False alarm \(F_c\) means False Positive (FP)

FROC data

Now, we get the following FROC dataset after the FROC trial.

| Confidence Level | No. of Hits | No. of False alarms |
|---|---|---|
| 5 = definitely present | \(H_{5}\) | \(F_{5}\) |
| 4 = probably present | \(H_{4}\) | \(F_{4}\) |
| 3 = equivocal | \(H_{3}\) | \(F_{3}\) |
| 2 = probably absent | \(H_{2}\) | \(F_{2}\) |
| 1 = questionable | \(H_{1}\) | \(F_{1}\) |

The form of FROC data is quite similar to that of ROC data.

However, they differ when we sum the false alarms. In the ROC context, the sum of all false alarms cannot exceed the number of trials (images), whereas in the FROC case this can occur frequently.

Note that in both the ROC and FROC cases, the sum of hits over all confidence levels cannot be greater than the number of targets (lesions, nodules).
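The two remarks about totals can be checked on a dataset with a few lines of code. The following is only a minimal sketch in Python: the function name `summarize_counts` and the numbers are hypothetical, chosen for illustration, and the counts are listed from confidence level 5 down to level 1 as in the table.

```python
def summarize_counts(h, f, NL, NI):
    """Sum hits and false alarms over all confidence levels.

    h, f : hits and false alarms per confidence level
    NL   : number of lesions, NI : number of images
    """
    total_hits = sum(h)
    total_false_alarms = sum(f)
    # Hits can never exceed the number of lesions, in ROC and FROC alike.
    if total_hits > NL:
        raise ValueError("total hits cannot exceed the number of lesions")
    return {
        "total_hits": total_hits,
        "total_false_alarms": total_false_alarms,
        # In FROC data this flag may well be True; in ROC data it cannot be.
        "false_alarms_exceed_images": total_false_alarms > NI,
    }


# Hypothetical FROC counts, for illustration only.
h = [10, 20, 30, 5, 5]
f = [1, 4, 18, 30, 47]
summary = summarize_counts(h, f, NL=100, NI=50)
print(summary)
```

Here the false alarms (100) exceed the number of images (50), which is possible only in the FROC setting.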

Outline of the FROC theory

In the ROC paradigm, the reader is allowed only two choices for each image: the image is diseased or non-diseased. In an FROC task, on the other hand, the reader can mark every suspicious location in a single image. So, roughly speaking, the FROC statistical model is characterized by the rates of hits \(H\) and false alarms \(F\). Assume that there are \(N_I\) images containing \(N_L\) lesions in total, where lesions are the targets that readers should detect. Then, roughly speaking, that is, without the bi-normal assumption, the FROC model is as follows:

\[H \sim \text{Binomial}(p, N_L )\\ F \sim \text{Poisson}(\lambda N_I) \] where \(p = \mathbb{E}[H]/N_L\) and \(\lambda = \mathbb{E}[F]/N_I\). We call \(p\) the hit rate and \(\lambda\) the false alarm rate. This model means that each lesion generates a hit with probability \(p\), and that false alarms arise in each image at rate \(\lambda\).
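Since \(p = \mathbb{E}[H]/N_L\) and \(\lambda = \mathbb{E}[F]/N_I\), replacing the expectations by observed counts gives simple plug-in estimates of the two rates. The sketch below is in Python; the function name `rough_froc_rates` and the counts are hypothetical, and this is not the estimation method of the package, only an illustration of the rough model.

```python
def rough_froc_rates(hits, false_alarms, n_lesions, n_images):
    """Plug-in estimates of the hit rate p and the false alarm rate lambda."""
    if not 0 <= hits <= n_lesions:
        raise ValueError("hits must lie between 0 and the number of lesions")
    if false_alarms < 0 or n_images <= 0:
        raise ValueError("invalid false alarm count or number of images")
    p_hat = hits / n_lesions             # observed H / N_L
    lambda_hat = false_alarms / n_images  # observed F / N_I
    return p_hat, lambda_hat


# Hypothetical totals, for illustration only.
p_hat, lambda_hat = rough_froc_rates(
    hits=80, false_alarms=150, n_lesions=100, n_images=50
)
print(p_hat, lambda_hat)
```

Note that the estimate of \(\lambda\) is not a probability and may exceed 1, because a single image can carry several false alarms.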

Unfortunately, the FROC model is a little more complicated than this rough model, because the reader reports each suspicious location together with a confidence level selected from the levels 1,2,…,\(C\) defined beforehand.

To visualize the FROC model, we use the notion of the FROC curve, which is an analogue of the ROC curve. Recall that the ROC curve is a curve in the plane whose \(x\) coordinate is the false alarm rate and whose \(y\) coordinate is the hit rate. In the above notation, we can say that

\[\text{Pre FROC curve }(t) := (x(t),y(t)) = (\lambda, p(\lambda))\]

From the definition, we can easily see that the larger \(p\) is, the further upward the FROC curve shifts.

More precise FROC theory

In this section, we introduce new random variables for classifying shadows as diseased or non-diseased. Let \(Y_l\) be a random variable associated with each lesion \(l=1,2,...,N_L\) and \(X_i\) be a random variable associated with each image \(i=1,2,...,N_I\).

We assume that these random variables \(X_i,Y_l\) cannot be observed from the radiographs; to emphasize this unobservable property, we call them latent variables. We make the following two assumptions.

If \(X_i>t\) or \(Y_l >t\), then the reader thinks it is diseased with confidence \(z(t)\), where \(z(t)\) is some function that is monotonic with respect to \(t\). Thus these variables determine the decisions of readers. In this context, the latent Gaussian variable is called the decision variable.

These variables decide whether the reader judges a shadow to be positive, according to whether their values exceed a certain threshold. Thus they are also called decision variables.

Pre FROC curve

Define \(y(t):=\mathbb{P}[Y_l>t]\) and \(x(t):=\mathbb{P}[X_i>t]\).

To calculate \(y(t):=\mathbb{P}[Y_l>t]\) and \(x(t):=\mathbb{P}[X_i>t]\) more explicitly, we further assume that

\[X_i \sim \text{Normal}(0,1) \\ Y_l \sim \text{Normal}(\mu,\sigma^2)\] which is sometimes called the bi-normal assumption. Note that the further \(\mu\) lies above the mean of \(X_i\) (that is, 0), the better the observer performance.

Please keep in mind that the parameters \(\mu,\sigma^2\) are part of the parameters of our model, which should be estimated.

Since \(\frac{Y_l -\mu}{\sigma}\) follows the standard Gaussian distribution, we can calculate the probability of the event \(\frac{Y_l -\mu}{\sigma} > \frac{t -\mu}{\sigma}\), and it follows that

\[y(t):=\mathbb{P}[Y_l>t] = \mathbb{P}\left[\frac{Y_l -\mu}{\sigma} > \frac{t -\mu}{\sigma}\right] = 1-\Phi\left( \frac{t -\mu}{\sigma}\right) \]

On the other hand, \[x(t):=\mathbb{P}[X_i>t] = 1-\Phi( t),\] thus \[t = t(x) = \Phi ^{-1} (1-x(t) ).\] Substituting this, we get \[y(t) = 1-\Phi( \frac{t -\mu}{\sigma}) = 1-\Phi( \frac{ \Phi ^{-1} (1-x(t) ) -\mu}{\sigma}).\] To relate \(1-x(t)\) to the false alarm rate, recall that \(x(t):=\mathbb{P}[X_i>t]\), so \(1-x(t)=\mathbb{P}[X_i<t]\). Until now, we have used the parameter \(t\). From here on, we use the parameter \(\lambda\) instead of \(t\).
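The substitution of \(t = \Phi^{-1}(1-x)\) into \(y(t)\) can be checked numerically. The following Python sketch uses `statistics.NormalDist` from the standard library for \(\Phi\) and \(\Phi^{-1}\); the function names and the values of \(\mu\), \(\sigma\) and \(t\) are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian: cdf is Phi, inv_cdf is Phi^{-1}


def x_of_t(t):
    """x(t) = P[X_i > t] = 1 - Phi(t)."""
    return 1.0 - STD_NORMAL.cdf(t)


def y_of_t(t, mu, sigma):
    """y(t) = P[Y_l > t] = 1 - Phi((t - mu) / sigma)."""
    return 1.0 - STD_NORMAL.cdf((t - mu) / sigma)


def y_of_x(x, mu, sigma):
    """Eliminate t: y = 1 - Phi((Phi^{-1}(1 - x) - mu) / sigma)."""
    t = STD_NORMAL.inv_cdf(1.0 - x)
    return y_of_t(t, mu, sigma)


# Hypothetical parameter values, for illustration only.
mu, sigma, t = 1.5, 1.0, 0.5
x = x_of_t(t)
y_direct = y_of_t(t, mu, sigma)
y_via_x = y_of_x(x, mu, sigma)
print(x, y_direct, y_via_x)
```

Both routes give the same \(y\) up to rounding, which is all the substitution claims.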

We consider the function \(t=t(\lambda)\) such that \[1-x(t)= 1-x( t(\lambda) ) = e^{-\lambda}= \mathbb{P}[F =0].\] We then define \(\xi(\lambda) = x(t(\lambda))\) and \(\eta(\lambda) = y(t(\lambda))\), and we define the FROC curve as the pair \((\xi(\lambda), \eta(\lambda))\).

We now combine the confidence levels with the above results.

In FROC data, hits and false alarms are counted for each confidence level.

We have introduced the notations \(N_L\), \(N_I\), \(H_c\), \(F_c\), \(C\). In the R console, these are represented by NL, NI, h, f, C.

If \(C=5\), then the dataset for FROC analysis is as follows:

| Confidence Level | No. of Hits | No. of False alarms |
|---|---|---|
| 5 = definitely present | \(H_{5}\) | \(F_{5}\) |
| 4 = probably present | \(H_{4}\) | \(F_{4}\) |
| 3 = equivocal | \(H_{3}\) | \(F_{3}\) |
| 2 = probably absent | \(H_{2}\) | \(F_{2}\) |
| 1 = questionable | \(H_{1}\) | \(F_{1}\) |

We divide the real line by thresholds \(z_1 < z_2 < .... < z_C\). Combining these thresholds with the parameters of the bi-normal assumption \(\mu,\sigma^2\), we obtain our model parameters. Other model parameters, for example \(\lambda_c\), can be deduced from the thresholds by \(\lambda _c = -\log \Phi({z_c})\).

We assume that the hit rates are given by \[p_{c} := \Phi (\frac{z_{c +1}-\mu}{\sigma})-\Phi (\frac{z_{c}-\mu}{\sigma})=\mathbb{P}[z_c <Y_l<z_{c+1}],\] where, for \(c=C\), we read \(z_{C+1}\) as \(+\infty\).
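The hit rates \(p_c\) are differences of Gaussian probabilities at consecutive thresholds, so they are easy to compute. The following Python sketch assumes the convention \(z_{C+1}=+\infty\) for the highest confidence level; the function name `hit_rates` and all numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def hit_rates(z, mu, sigma):
    """Return [p_1, ..., p_C] for increasing thresholds z = [z_1, ..., z_C].

    Assumption: z_{C+1} is taken to be +infinity, so that
    p_C = 1 - Phi((z_C - mu) / sigma).
    """
    phi = NormalDist().cdf
    upper = list(z[1:]) + [math.inf]  # z_2, ..., z_C, +infinity
    return [
        phi((hi - mu) / sigma) - phi((lo - mu) / sigma)
        for lo, hi in zip(z, upper)
    ]


# Hypothetical thresholds and bi-normal parameters, for illustration only.
z = [-0.5, 0.0, 0.5, 1.0, 1.5]
mu, sigma = 1.0, 1.2
p = hit_rates(z, mu, sigma)
print(p)
```

The rates sum to \(\mathbb{P}[Y_l > z_1]\), the probability that a lesion is marked at any confidence level, which is smaller than 1 because a lesion with \(Y_l < z_1\) is missed.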

We also define the sequence \(\lambda_1 > \lambda_2 > .... > \lambda_C\) such that \[F_c+...+F_C \sim \text{Poisson}(\lambda_cN_I),\] where \(N_I\) is the number of images. The rates decrease in \(c\) because each cumulative sum \(F_c+...+F_C\) contains fewer terms as \(c\) grows. This assumption is called the Poisson assumption in the FROC context. The following condition combines the bi-normal assumption and the Poisson assumption:

\[ \mathbb{P}[X_i < z_1]= \mathbb{P}[F_1+F_2+F_3+...+F_C =0]\\ \mathbb{P}[X_i < z_2]= \mathbb{P}[F_2+F_3+...+F_C =0]\\ \vdots\\ \mathbb{P}[X_i < z_c]= \mathbb{P}[F_c+...+F_C =0]\\ \vdots\\ \mathbb{P}[X_i < z_C]= \mathbb{P}[ F_C =0] \] where the false alarms on the right-hand side are counted on a single image (\(N_I=1\)). This is equivalent to \[\Phi(z_c) = e^{-\lambda_c}\]

for all \(c\). Thus we get a correspondence between the Poisson rates \(\lambda_c\) and the thresholds \(z_c\).
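The correspondence \(\Phi(z_c) = e^{-\lambda_c}\) can be written in both directions: \(\lambda_c = -\log \Phi(z_c)\) and \(z_c = \Phi^{-1}(e^{-\lambda_c})\). The following Python sketch illustrates it; the function names and threshold values are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def thresholds_to_rates(z):
    """lambda_c = -log Phi(z_c) for each threshold z_c."""
    return [-math.log(STD_NORMAL.cdf(zc)) for zc in z]


def rates_to_thresholds(lam):
    """z_c = Phi^{-1}(exp(-lambda_c)) for each rate lambda_c > 0."""
    return [STD_NORMAL.inv_cdf(math.exp(-lc)) for lc in lam]


# Hypothetical increasing thresholds, for illustration only.
z = [-0.5, 0.0, 0.5, 1.0, 1.5]
lam = thresholds_to_rates(z)
z_back = rates_to_thresholds(lam)
print(lam)
print(z_back)
```

With increasing thresholds, the computed rates decrease in \(c\), and mapping them back recovers the thresholds up to rounding.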

Recall that the parameter \(t\) of the FROC curve ranges over the values of the Gaussian random variable. Writing \(z\) instead of \(t\), the curve is \[y(t) =y(z)= 1-\Phi( \frac{t -\mu}{\sigma}) = 1-\Phi( \frac{ \Phi ^{-1} (1-x(t) ) -\mu}{\sigma})\\ = 1-\Phi( \frac{ \Phi ^{-1} (1-x(z) ) -\mu}{\sigma})\\ = 1-\Phi( \frac{ \Phi ^{-1} (e^{-\lambda} ) -\mu}{\sigma}) \] In FROC analysis, the FROC curve is parameterized by \(\lambda\).

Thus we obtain the FROC curve as follows.

Precise Definition of the FROC curve

\[ y(\lambda)= 1-\Phi( \frac{ \Phi ^{-1} (e^{-\lambda} ) -\mu}{\sigma})\\ x(\lambda) = \lambda \]
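To see the shape of this curve, one can evaluate \(y(\lambda)\) on a grid of \(\lambda\) values. The following Python sketch does this; the function names, the grid, and the values of \(\mu\) and \(\sigma\) are hypothetical and serve only as an illustration of the formula.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def froc_y(lam, mu, sigma):
    """y(lambda) = 1 - Phi((Phi^{-1}(exp(-lambda)) - mu) / sigma)."""
    if lam <= 0:
        raise ValueError("lambda must be positive")
    t = STD_NORMAL.inv_cdf(math.exp(-lam))  # threshold matching this rate
    return 1.0 - STD_NORMAL.cdf((t - mu) / sigma)


def froc_curve(lambdas, mu, sigma):
    """Points (x, y) = (lambda, y(lambda)) of the FROC curve."""
    return [(lam, froc_y(lam, mu, sigma)) for lam in lambdas]


# Hypothetical grid and parameters, for illustration only.
curve = froc_curve([0.1, 0.5, 1.0, 2.0], mu=1.5, sigma=1.0)
for x, y in curve:
    print(x, y)
```

The \(y\) values increase with \(\lambda\) and stay below 1, while the \(x\) coordinate is unbounded, unlike the ROC curve.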

Note that the FROC curve can be interpreted as the curve of the expectations of the pair of FPF and TPF, where FPF = False Positive Fraction and TPF = True Positive Fraction. These terms are widely used in ROC theory, so we omit their definitions.

From a mathematical point of view, the particular expression is not important; what matters most is the property it expresses. So, for the definition of the FROC curve, the above equations are not important, and the reader may forget them! I also forget the expressions. However, we should never forget the property that the FROC curve is the pair of expectations of the FPF and TPF. That is, we can write \[ \mathbb{E}\left[ \frac{\sum_{c' =c}^C H_{c'}}{N_L} \right] = y(\lambda_c) \\ \mathbb{E}\left[ \frac{\sum_{c' =c}^C F_{c'}}{N_I} \right] = x(\lambda_c) \]

I think this equality is more important than the expression used to define the FROC curve.

To tell the truth, I prefer the term cumulative false positives per image to FPF. It is my own term, but its meaning is more obvious and clearer than FPF. Likewise, I prefer the term cumulative true positives per lesion. These two terms are abbreviated as CFP and CTP in my paper.
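The empirical CFP and CTP are obtained from the data by summing the counts from the highest confidence level downward and dividing by the number of images and lesions, respectively. The following Python sketch assumes that the counts are listed from level \(C\) down to level 1; the function name `cumulative_rates` and the data are hypothetical.

```python
from itertools import accumulate


def cumulative_rates(h, f, NL, NI):
    """Return (CFP, CTP), each listed from level C down to level 1.

    Assumption: h and f are ordered from the highest confidence level
    to the lowest, so running sums give sum_{c'=c}^{C} of the counts.
    """
    ctp = [s / NL for s in accumulate(h)]  # cumulative true positives per lesion
    cfp = [s / NI for s in accumulate(f)]  # cumulative false positives per image
    return cfp, ctp


# Hypothetical FROC counts, for illustration only.
h = [10, 20, 30, 5, 5]
f = [1, 4, 18, 30, 47]
cfp, ctp = cumulative_rates(h, f, NL=100, NI=50)
for x, y in zip(cfp, ctp):
    print(x, y)
```

Plotting the pairs (CFP, CTP) gives the empirical points that the FROC curve is expected to pass near.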

Prior for the monotonicity of thresholds

The thresholds \(z_1, z_2, ...., z_C\) should satisfy the monotonicity condition \(z_1 < z_2 < .... < z_C\).

To do so, we use the prior \[z_2 - z_1 \sim \text{Uniform}(0,\infty) \\ z_3 - z_2 \sim \text{Uniform}(0,\infty) \\ \vdots\\ z_C - z_{C-1} \sim \text{Uniform}(0,\infty) \] where Uniform\((0,\infty)\) denotes the improper prior whose support is the interval \((0,\infty)\).
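One way to read this prior is as a reparameterization: the thresholds are built from \(z_1\) and strictly positive increments, so monotonicity holds by construction. The following Python sketch shows only this construction, not the sampling itself; the function name `thresholds_from_increments` and the numbers are hypothetical.

```python
from itertools import accumulate


def thresholds_from_increments(z1, dz):
    """Build z_1 < z_2 < ... < z_C from z_1 and increments dz_c = z_{c+1} - z_c."""
    if any(d <= 0 for d in dz):
        raise ValueError("every increment must be strictly positive")
    # Running sum starting at z1: z1, z1 + dz[0], z1 + dz[0] + dz[1], ...
    return list(accumulate(dz, initial=z1))


# Hypothetical starting threshold and increments, for illustration only.
z = thresholds_from_increments(-0.5, [0.5, 0.5, 0.5, 0.5])
print(z)
```

Any choice of positive increments yields valid thresholds, which is why a prior supported on \((0,\infty)\) for the differences enforces the monotonicity condition.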

What I want is …

a simple explanation of the FROC theory. But this explanation does not achieve it. Sorry …

The explanation in this vignette is redundant. I will shorten it in the future.

References:

I think the following paper is sufficient to understand my paper or this package: