Theory of FROC in a Bayesian context for a single reader and a single modality

Issei Tsunoda

2019-05-28

#Step by step explanation of the FROC theory

#Heuristic and intuitive explanation

I have read several papers and books that explain FROC models. However, I think none of them explains the theory of the FROC model successfully, and I think my own explanation is not sufficient either. Still, I attempt to explain it slowly and heuristically. I hope this brief review of the FROC theory helps the users of this package.

The FROC statistical model is very complex, so we learn the theory slowly and do not begin in a strict, rigorous manner.

To show what FROC is, we first recall ROC analysis.

Word

In Radiological context,

We also use the word

Note that

Note that

ROC task (trial)

ROC analysis aims to evaluate how well a reader classifies a dichotomous outcome, such as disease or non-disease.

That is, in each trial the reader can choose between only two answers. So each trial generates at most one hit or one false alarm.

In the FROC paradigm, the reader can give multiple answers in each trial, which can lead to many hits and many false alarms per trial.

\(\dagger\) : Gold standard : It means that each shadow in the radiographs can be classified as diseased or not diseased.

FROC task (trial)

\(\dagger\) : Gold standard : It means that each shadow in the radiographs can be classified as diseased or not diseased.

\({\dagger \dagger }\): Hit \(H_c\) means True Positive (TP)

\({\dagger \dagger \dagger}\): False alarm \(F_c\) means False Positive (FP)

FROC data

Now, we get the following FROC dataset after the FROC trial.

| Confidence Level | No. of Hits | No. of False alarms |
|---|---|---|
| 5 = definitely present | \(H_{5}\) | \(F_{5}\) |
| 4 = probably present | \(H_{4}\) | \(F_{4}\) |
| 3 = equivocal | \(H_{3}\) | \(F_{3}\) |
| 2 = probably absent | \(H_{2}\) | \(F_{2}\) |
| 1 = questionable | \(H_{1}\) | \(F_{1}\) |

The form of FROC data is quite similar to that of ROC data.

They differ, however, when we sum the false alarms. In the ROC context, the total number of false alarms cannot exceed the number of trials (images), whereas in the FROC case this happens frequently.

Note that in both the ROC and the FROC case, the sum of hits over all confidence levels cannot be greater than the number of targets (lesions, nodules).
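The two constraints above can be checked on a concrete dataset. The following is a minimal sketch in Python; the counts, the variable names and the helper `check_froc_data` are hypothetical and chosen only for illustration.

```python
# Hypothetical FROC dataset with 5 confidence levels (illustrative numbers only).
NL = 100                  # number of lesions
NI = 40                   # number of images
h = [50, 30, 11, 5, 1]    # hits for confidence levels 5, 4, 3, 2, 1
f = [1, 4, 18, 21, 25]    # false alarms for confidence levels 5, 4, 3, 2, 1


def check_froc_data(h, f, NL):
    """Return True when h and f are equally long lists of non-negative
    counts and the total number of hits does not exceed NL."""
    same_length = len(h) == len(f)
    non_negative = all(count >= 0 for count in h + f)
    return same_length and non_negative and sum(h) <= NL


print(check_froc_data(h, f, NL))   # the hit constraint holds for this data
print(sum(f) > NI)                 # false alarms may exceed the number of images
```

Note that the check deliberately places no upper bound on the false alarms, because the FROC paradigm allows more false alarms than images.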

Outline of the FROC theory

In the ROC paradigm, the reader is allowed only two choices for each image: the image is diseased or it is non-diseased. In an FROC task, on the other hand, the reader can mark every suspicious location in a single image. So, roughly speaking, the FROC statistical model is characterized by the rates of hits \(H\) and false alarms \(F\). Assume that there are \(N_I\) images containing \(N_L\) lesions in total, where lesions are the targets that readers should detect. Then, roughly speaking, that is, without the bi-normal assumption, the FROC model is as follows:

\[H \sim \text{Binomial}(p, N_L )\\ F \sim \text{Poisson}(\lambda N_I) \] where \(p = \mathbb{E}[H]/N_L\) and \(\lambda = \mathbb{E}[F]/N_I\). We call \(p\) the hit rate and \(\lambda\) the false alarm rate. This model means that each lesion generates a hit with probability \(p\), and that each image generates false alarms at the rate \(\lambda\).
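To make the rough model concrete, it can be simulated. The sketch below is written in Python with the standard library only and is not the package's implementation. The values of `p`, `lam`, `NL`, `NI` and the helper names are hypothetical. The Poisson draw uses Knuth's multiplication method, which is adequate for the small per-image rate assumed here.

```python
import math
import random


def draw_hits(p, NL, rng):
    """Binomial draw: each of the NL lesions is hit independently with probability p."""
    return sum(1 for _ in range(NL) if rng.random() < p)


def draw_poisson(mean, rng):
    """Poisson draw by Knuth's multiplication method (suitable for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def draw_false_alarms(lam, NI, rng):
    """Sum of independent per-image Poisson(lam) draws, which is Poisson(lam * NI)."""
    return sum(draw_poisson(lam, rng) for _ in range(NI))


rng = random.Random(2019)             # fixed seed for reproducibility
p, lam, NL, NI = 0.7, 0.6, 100, 50    # hypothetical parameter values
reps = 2000

# Averaging over many simulated trials should recover p and lam approximately.
mean_hit_rate = sum(draw_hits(p, NL, rng) for _ in range(reps)) / (reps * NL)
mean_fa_rate = sum(draw_false_alarms(lam, NI, rng) for _ in range(reps)) / (reps * NI)
print(round(mean_hit_rate, 2), round(mean_fa_rate, 2))
```

The simulated averages illustrate the relations \(p = \mathbb{E}[H]/N_L\) and \(\lambda = \mathbb{E}[F]/N_I\).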

Unfortunately, the FROC model is a little more complicated than this rough model, since the reader reports each suspicious location together with a confidence level selected from the predefined levels 1,2,…,\(C\).

To visualize the FROC model, we use the notion of the FROC curve, which is an analogue of the ROC curve. Recall that the ROC curve is a curve in the plane whose \(x\) coordinate is the false alarm rate and whose \(y\) coordinate is the hit rate. In the above notation, we can say that

\[\text{Pre FROC curve }(t) := (x(t),y(t)) = (\lambda, p(\lambda))\]

From the definition we can easily see that the larger \(p\) is, the further the FROC curve shifts upward.

More precise FROC theory

In this section, we introduce new random variables for classifying shadows as diseased or non-diseased. Let \(Y_l\) be a random variable associated with each lesion \(l=1,2,...,N_L\) and \(X_i\) a random variable associated with each image \(i=1,2,...,N_I\).

We assume that these random variables \(X_i,Y_l\) cannot be observed from radiographs; to emphasize this unobservable nature we call them latent variables. We make the following two assumptions.

If \(X_i>t\) or \(Y_l >t\), then the reader judges the shadow to be diseased with confidence \(z(t)\), where \(z(t)\) is some function that is monotonic with respect to \(t\). Thus these variables determine the decisions of readers.

The reader marks a shadow as positive exactly when the corresponding variable is greater than a certain threshold. For this reason the latent Gaussian variables are also called decision variables.

Pre FROC curve

Define \(y(t):=\mathbb{P}[Y_l>t]\) and \(x(t):=\mathbb{P}[X_i>t]\).

To calculate \(y(t)\) and \(x(t)\) more explicitly, we further assume that

\[X_i \sim \text{Normal}(0,1) \\ Y_l \sim \text{Normal}(\mu,\sigma^2)\] , which is sometimes called the bi-normal assumption. Note that the larger \(\mu\) is relative to the mean of \(X_i\) (that is, 0), the better the observer performance.

Please keep in mind that \(\mu\) and \(\sigma^2\) are parameters of our model and have to be estimated.

Since \(\frac{Y_l -\mu}{\sigma}\) follows the standard Gaussian distribution, whose cumulative distribution function we denote by \(\Phi\), we can calculate the probability of the event \(\frac{Y_l -\mu}{\sigma} > \frac{t -\mu}{\sigma}\), and it follows that

\[y(t):=\mathbb{P}[Y_l>t] = \mathbb{P}[\frac{Y_l -\mu}{\sigma} > \frac{t -\mu}{\sigma}] = 1-\Phi( \frac{t -\mu}{\sigma}) \]

On the other hand, \[x(t):=\mathbb{P}[X_i>t] = 1-\Phi( t),\] thus \[t = t(x) = \Phi ^{-1} (1-x(t) ).\] Substituting this, we get \[y(t) = 1-\Phi( \frac{t -\mu}{\sigma}) = 1-\Phi( \frac{ \Phi ^{-1} (1-x(t) ) -\mu}{\sigma})\] To relate \(1-x(t)\) to the false alarm rate, recall that \(x(t)=\mathbb{P}[X_i>t]\), so \(1-x(t)=\mathbb{P}[X_i<t]\). Until now we have used the parameter \(t\). From here on we use the parameter \(\lambda\) instead of \(t\).
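The substitution \(t = \Phi^{-1}(1-x)\) can be checked numerically. The following Python sketch uses `statistics.NormalDist` from the standard library for \(\Phi\) and \(\Phi^{-1}\). The values of `mu` and `sigma` are hypothetical, and the function names are chosen only for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()   # provides Phi (cdf) and its inverse (inv_cdf)


def x_of_t(t):
    """x(t) = P[X_i > t] for X_i ~ Normal(0, 1)."""
    return 1.0 - std_normal.cdf(t)


def y_of_t(t, mu, sigma):
    """y(t) = P[Y_l > t] for Y_l ~ Normal(mu, sigma^2)."""
    return 1.0 - std_normal.cdf((t - mu) / sigma)


def y_of_x(x, mu, sigma):
    """y written as a function of x by substituting t = Phi^{-1}(1 - x)."""
    t = std_normal.inv_cdf(1.0 - x)
    return 1.0 - std_normal.cdf((t - mu) / sigma)


mu, sigma = 1.5, 1.2   # hypothetical bi-normal parameters
for t in (-1.0, 0.0, 0.5, 2.0):
    # Both columns should agree up to floating-point error.
    print(t, y_of_t(t, mu, sigma), y_of_x(x_of_t(t), mu, sigma))
```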

We consider the function \(t=t(\lambda)\) such that \[1-x(t)= 1-x( t(\lambda) ) = e^{-\lambda}= \mathbb{P}[F =0],\] where \(F\) here denotes the number of false alarms on a single image, distributed as Poisson with rate \(\lambda\). So, we define \(\xi(\lambda) = x(t(\lambda))\) and \(\eta(\lambda) = y(t(\lambda))\). We define the FROC curve as the pair \((\xi(\lambda), \eta(\lambda))\).

We now combine the confidence levels with the above results.

In FROC data, hits and false alarms are counted separately for each confidence level.

We have introduced the notations \(N_L\), \(N_I\), \(H_c\), \(F_c\), \(C\). In the R console, these notations are represented by NL, NI, h, f, C.

If \(C=5\), then the dataset for FROC analysis is as follows:

| Confidence Level | No. of Hits | No. of False alarms |
|---|---|---|
| 5 = definitely present | \(H_{5}\) | \(F_{5}\) |
| 4 = probably present | \(H_{4}\) | \(F_{4}\) |
| 3 = equivocal | \(H_{3}\) | \(F_{3}\) |
| 2 = probably absent | \(H_{2}\) | \(F_{2}\) |
| 1 = questionable | \(H_{1}\) | \(F_{1}\) |

We partition the real line by the thresholds \(z_1 < z_2 < .... < z_C\). These thresholds, together with the parameters \(\mu,\sigma^2\) of the bi-normal assumption, are our model parameters. Other quantities, for example \(\lambda_c\), can be deduced from the thresholds by \(\lambda _c = -\log \Phi({z_c})\).

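We assume that the hits with confidence level \(c\) are generated at the rate \[p_{c} := \Phi (\frac{z_{c +1}-\mu_{}}{\sigma_{}})-\Phi (\frac{z_{c}-\mu_{}}{\sigma_{}})=\mathbb{P}[z_c <Y_l<z_{c+1}], \] where we use the convention \(z_{C+1} = +\infty\) for the highest level \(c = C\).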
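The hit rates \(p_c\) can be computed directly from the thresholds. The Python sketch below assumes the convention \(z_{C+1} = +\infty\) for the highest confidence level, so that \(p_C = 1-\Phi((z_C-\mu)/\sigma)\). The threshold values, `mu`, `sigma` and the helper name `hit_rates` are hypothetical.

```python
from statistics import NormalDist


def hit_rates(z, mu, sigma):
    """Return [p_1, ..., p_C], where p_c = P[z_c < Y_l < z_{c+1}]
    for Y_l ~ Normal(mu, sigma^2), assuming z_{C+1} = +infinity."""
    latent = NormalDist(mu, sigma)
    # survival[c] = P[Y_l > z_c]; the appended 0.0 stands for P[Y_l > +infinity].
    survival = [1.0 - latent.cdf(z_c) for z_c in z] + [0.0]
    return [survival[c] - survival[c + 1] for c in range(len(z))]


z = [-0.5, 0.2, 0.9, 1.6, 2.3]   # hypothetical thresholds z_1 < ... < z_5
mu, sigma = 1.0, 1.5             # hypothetical bi-normal parameters
p = hit_rates(z, mu, sigma)
print(p)
print(sum(p))                    # equals P[Y_l > z_1], which is less than 1
```

The rates do not sum to 1: the remaining probability \(\mathbb{P}[Y_l < z_1]\) corresponds to lesions that the reader does not mark at all.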

We also define the sequence \(\lambda_1 > \lambda_2 > .... > \lambda_C\) such that \[F_c+...+F_C \sim \text{Poisson}(\lambda_cN_I)\] where \(N_I\) is the number of images. This assumption is called the Poisson assumption in the FROC context. The following conditions combine the bi-normal assumption and the Poisson assumption:

\[ \mathbb{P}[X_i < z_1]= \mathbb{P}[F_1+F_2+F_3+...+F_C =0]\\ \mathbb{P}[X_i < z_2]= \mathbb{P}[F_2+F_3...+F_C =0]\\ .....\\ \mathbb{P}[X_i < z_c]= \mathbb{P}[F_c+...+F_C =0]\\ ...\\ \mathbb{P}[X_i < z_C]= \mathbb{P}[ F_C =0]\\ \] which is equivalent to \[\Phi(z_c) = e^{-\lambda_c}\]

for all \(c\). Thus we get a correspondence between the Poisson rate \(\lambda\) and the threshold \(z\).
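The correspondence \(\Phi(z_c) = e^{-\lambda_c}\) can be written as a pair of mutually inverse functions. The following Python sketch uses hypothetical threshold values, and the function names are chosen only for this illustration. Since \(\Phi\) is increasing, larger thresholds give smaller rates.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()


def rate_from_threshold(z_c):
    """lambda_c = -log Phi(z_c)."""
    return -math.log(std_normal.cdf(z_c))


def threshold_from_rate(lam_c):
    """z_c = Phi^{-1}(exp(-lambda_c)), the inverse of rate_from_threshold."""
    return std_normal.inv_cdf(math.exp(-lam_c))


z = [-0.5, 0.2, 0.9, 1.6, 2.3]                   # hypothetical thresholds
lam = [rate_from_threshold(z_c) for z_c in z]
print(lam)                                       # decreasing as z increases
print([threshold_from_rate(lam_c) for lam_c in lam])   # recovers z up to rounding
```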

Recall that the parameter \(t\) of the curve ranges over the values of the Gaussian random variables. Writing \(z\) instead of \(t\) and using \(1-x(z) = e^{-\lambda}\), the curve becomes \[y(t) =y(z)= 1-\Phi( \frac{t -\mu}{\sigma}) = 1-\Phi( \frac{ \Phi ^{-1} (1-x(t) ) -\mu}{\sigma})\\ = 1-\Phi( \frac{ \Phi ^{-1} (1-x(z) ) -\mu}{\sigma})\\ = 1-\Phi( \frac{ \Phi ^{-1} (e^{-\lambda} ) -\mu}{\sigma})\\ \] In FROC analysis, the FROC curve is parametrized by \(\lambda\).

Thus we obtain the FROC curve as follows.

Precise Definition of the FROC curve

\[ y(\lambda)= 1-\Phi( \frac{ \Phi ^{-1} (e^{-\lambda} ) -\mu}{\sigma})\\ x(\lambda) = \lambda \]
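The definition above can be evaluated on a grid of \(\lambda\) values to obtain points for plotting. This is a minimal Python sketch, not the package's plotting code. The parameter values and the grid are hypothetical.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()


def froc_curve(lam, mu, sigma):
    """y(lambda) = 1 - Phi((Phi^{-1}(exp(-lambda)) - mu) / sigma), for lambda > 0."""
    z = std_normal.inv_cdf(math.exp(-lam))
    return 1.0 - std_normal.cdf((z - mu) / sigma)


mu, sigma = 1.0, 1.5                       # hypothetical bi-normal parameters
grid = [0.05 * k for k in range(1, 41)]    # lambda from 0.05 to 2.0
points = [(lam, froc_curve(lam, mu, sigma)) for lam in grid]
for lam, y in points[::10]:
    print(round(lam, 2), y)
```

As a sanity check, for \(\mu = 0\) and \(\sigma = 1\) the formula reduces to \(y(\lambda) = 1 - e^{-\lambda}\), because the lesion and image variables then have the same distribution.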

Note that the FROC curve can be interpreted as the curve of the expectations of the pair (FPF, TPF), where we use the two abbreviations FPF = False Positive Fraction and TPF = True Positive Fraction. These terms are widely used in ROC theory, and thus we omit their definitions.

In mathematical philosophy, the particular expression is not important. The most important thing about a mathematical expression is the property it states. So, for the definition of the FROC curve, the above equations are not important, and the reader may forget them. I also forget the expressions. However, we should never forget the property that the FROC curve is the pair of expectations of the FPF and the TPF. That is, we can write \[ \mathbb{E}\left[ \frac{\sum_{c' =c}^C H_{c'}}{N_L} \right] = y(\lambda_c) \\ \mathbb{E}\left[ \frac{\sum_{c' =c}^C F_{c'}}{N_I} \right] = x(\lambda_c) \]

I think this equality is more important than the expression that defines the FROC curve.
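The property that the FROC curve passes through the expected cumulative rates can be verified numerically: the expected cumulative hits per lesion at level \(c\) are \(p_c + ... + p_C\), which should equal \(y(\lambda_c)\), while the expected cumulative false alarms per image are \(\lambda_c = x(\lambda_c)\) by the Poisson assumption. The Python sketch below assumes the convention \(z_{C+1}=+\infty\), and all parameter values are hypothetical.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()
mu, sigma = 1.0, 1.5              # hypothetical bi-normal parameters
z = [-0.5, 0.2, 0.9, 1.6, 2.3]    # hypothetical thresholds z_1 < ... < z_C
C = len(z)

# Hit rates p_c = P[z_c < Y_l < z_{c+1}], with z_{C+1} = +infinity (assumed convention).
survival = [1.0 - std_normal.cdf((z_c - mu) / sigma) for z_c in z] + [0.0]
p = [survival[c] - survival[c + 1] for c in range(C)]

# False alarm rates lambda_c = -log Phi(z_c).
lam = [-math.log(std_normal.cdf(z_c)) for z_c in z]


def froc_y(rate):
    """y(lambda) from the definition of the FROC curve."""
    return 1.0 - std_normal.cdf((std_normal.inv_cdf(math.exp(-rate)) - mu) / sigma)


expected_ctp = [sum(p[c:]) for c in range(C)]   # E[sum_{c' >= c} H_{c'}] / N_L
curve_y = [froc_y(rate) for rate in lam]        # y(lambda_c)
for c in range(C):
    print(c + 1, expected_ctp[c], curve_y[c])
```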

To tell the truth, I prefer the term cumulative false positives per image to FPF. It is my own term, but its meaning is more obvious and clearer than that of FPF. Likewise, I prefer the term cumulative true positives per lesion. In my paper, these two terms are abbreviated as CFP and CTP.
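The empirical CFP and CTP can be computed from the counts by accumulating from the highest confidence level downward. The following Python sketch uses a hypothetical dataset in which the counts are listed from confidence level 5 down to level 1.

```python
from itertools import accumulate

NL, NI = 100, 40          # hypothetical numbers of lesions and images
h = [50, 30, 11, 5, 1]    # hits for confidence levels 5, 4, 3, 2, 1
f = [1, 4, 18, 21, 25]    # false alarms for confidence levels 5, 4, 3, 2, 1

# Running totals from the highest confidence level, divided by NL or NI.
ctp = [total / NL for total in accumulate(h)]   # cumulative true positives per lesion
cfp = [total / NI for total in accumulate(f)]   # cumulative false positives per image

for level, (x, y) in zip(range(5, 0, -1), zip(cfp, ctp)):
    print(level, x, y)
```

Plotting the pairs (CFP, CTP) gives the empirical points that the FROC curve is expected to pass near.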

Prior for the monotonicity of thresholds

The thresholds \(z_1, z_2, ...., z_C\) should satisfy the monotonicity condition \(z_1 < z_2 < .... < z_C\).

To enforce it, we use the prior \[z_2 - z_1 \sim \text{Uniform}(0,\infty) \\ z_3 - z_2 \sim \text{Uniform}(0,\infty) \\ \vdots \\ z_C - z_{C-1} \sim \text{Uniform}(0,\infty) \\ \] where Uniform\((0,\infty)\) denotes the improper prior whose support is the interval \((0,\infty)\).
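One way to see why this prior guarantees monotonicity is to parametrize the thresholds by \(z_1\) and the positive differences \(z_{c+1}-z_c\). The Python sketch below illustrates only this parametrization, not the package's actual sampler. The helper name `thresholds_from_increments` and the numbers are hypothetical.

```python
from itertools import accumulate


def thresholds_from_increments(z1, dz):
    """Build z_1 < z_2 < ... < z_C from z_1 and the C - 1 positive
    differences dz[c] = z_{c+2} - z_{c+1}."""
    if any(step <= 0 for step in dz):
        raise ValueError("all differences must be strictly positive")
    # accumulate(..., initial=z1) yields z1, z1 + dz[0], z1 + dz[0] + dz[1], ...
    return list(accumulate(dz, initial=z1))


z = thresholds_from_increments(-0.5, [0.5, 0.25, 1.0, 0.75])
print(z)
```

Any choice of positive differences yields an increasing sequence, so a prior supported on \((0,\infty)\) for each difference never produces disordered thresholds.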

What I want is …

a simple explanation of the FROC theory. But this explanation does not achieve that. Sorry …

The explanation in this vignette is redundant. I will shorten it in the future.

##References:

I think the following paper is sufficient to understand my paper or this package: