
Specificity

Consider first the training data for the model, that is, the set of images to which NRR is applied. Without loss of generality, each training image can be treated as a single point in an $n$-dimensional image space. A statistical model is then a probability density function (pdf) $p(\mathbf{z})$ defined on this space.

To be specific, let $\{\mathbf{I}_{i}: i=1,\ldots, \mathcal{N}\}$ denote the $\mathcal{N}$ images of the training set, considered as points in image space, and let $p(\mathbf{z})$ be the probability density function of the model.

A quantitative measure of the specificity $S$ of the model is defined, with respect to the training set $\mathcal{I} = \{\mathbf{I}_{i}\}$, as follows:

\begin{displaymath}
S_{\lambda}(\mathcal{I};p) \doteq \int p(\mathbf{z}) \, \min_{i}\left(\vert\mathbf{z}-\mathbf{I}_{i}\vert\right)^{\lambda} \: d\mathbf{z},
\end{displaymath} (6.1)

where $\vert\cdot\vert$ is a distance on image space (see Section 6.3), raised to some positive power $\lambda$ (for the remainder of this chapter, only the case $\lambda = 1$ will be considered). That is, for each point $\mathbf{z}$ in image space, the nearest neighbour to this point in the training set is found. The nearest-neighbour distances, raised to the power $\lambda$, are then weighted by the pdf $p(\mathbf{z})$. Greater specificity is indicated by smaller values of $S$, and vice versa. In Figure [*], diagrammatic examples of models with differing specificity are given.
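As a minimal illustration of the integrand in (6.1), the nearest-neighbour term $\min_{i}\left(\vert\mathbf{z}-\mathbf{I}_{i}\vert\right)^{\lambda}$ can be sketched in a few lines of Python/NumPy. The array layout and the use of a Euclidean distance are assumptions made purely for this example; any distance on image space (Section 6.3) could be substituted.

\begin{verbatim}
import numpy as np

def nn_term(z, training_images, lam=1.0):
    """Nearest-neighbour term min_i |z - I_i|^lambda for a single point z.

    z               : 1-D array, one point in n-dimensional image space
    training_images : 2-D array, one training image I_i per row
    lam             : the exponent lambda (lambda = 1 in this chapter)
    """
    # Euclidean distance is used here only for illustration.
    dists = np.linalg.norm(training_images - z, axis=1)
    return dists.min() ** lam
\end{verbatim}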

The integral in equation [*] can be approximated using a Monte-Carlo method. A large random set of images $\{ \mathbf{I}_{\mu}:\, \mu=1,\ldots \mathcal{M}\}$ is generated, having the same distribution as the model pdf $p(\mathbf{z})$. The estimate of the specificity ([*]) is:

\begin{displaymath}
S_{\lambda}(\mathcal{I};p) \approx \frac{1}{\mathcal{M}} \sum_{\mu=1}^{\mathcal{M}} \min_{i}\left(\vert\mathbf{I}_{i}-\mathbf{I}_{\mu}\vert\right)^{\lambda},
\end{displaymath} (6.2)

with standard error:
\begin{displaymath}
\sigma_{S} = \frac{SD_{\mu}\left\{\min_{i}\{\vert\mathbf{I}_{i}-\mathbf{I}_{\mu}\vert^{\lambda}\}\right\}}{\sqrt{\mathcal{M}-1}},
\end{displaymath} (6.3)

where $SD_{\mu}$ denotes the standard deviation over the set of $\mathcal{M}$ measurements indexed by $\mu$. Note that this definition of $S$ does not require the space of images to be constructed explicitly; one only needs to be able to define distances between images. This is discussed in Section [*] below.
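The Monte-Carlo estimate (6.2) and its standard error (6.3) can be sketched as follows, assuming the training images and the $\mathcal{M}$ model samples are available as rows of NumPy arrays and again using a Euclidean stand-in for the image-space distance; the function and argument names are illustrative only.

\begin{verbatim}
import numpy as np

def specificity(training_images, model_samples, lam=1.0):
    """Monte-Carlo estimate of specificity and its standard error.

    training_images : (N, n) array, the N training images as points in image space
    model_samples   : (M, n) array, M images drawn from the model pdf p(z)
    lam             : the exponent lambda (lambda = 1 in this chapter)

    Returns (S, sigma_S).
    """
    nn_terms = np.empty(len(model_samples))
    for mu, sample in enumerate(model_samples):
        # min_i |I_i - I_mu|^lambda, Euclidean distance standing in for
        # the image-space distance of Section 6.3
        dists = np.linalg.norm(training_images - sample, axis=1)
        nn_terms[mu] = dists.min() ** lam

    S = nn_terms.mean()                                     # equation (6.2)
    sigma_S = nn_terms.std() / np.sqrt(len(nn_terms) - 1)   # equation (6.3)
    return S, sigma_S
\end{verbatim}

With lam = 1, this reproduces the case considered in the remainder of the chapter.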

It is worth adding that, although the images generated by a more specific model will lie close to the training data, the training and synthetic images form two entirely separate sets, which ensures that the estimate is not biased.
