Our approach to the assessment of NRR relies on the close
relationship between registration and statistical model building,
and extends the work of Davies et al. on evaluating shape models
[6]. We note that NRR of a set of images establishes the
dense correspondence which is required to build a combined appearance
model. Given the correct correspondence, the model provides a concise
description of the training set. As the correspondence is degraded,
the model also degrades in terms of its ability to reconstruct images
of the same class that are not in the training set, and its ability
to synthesise only new images similar to those in the training set. If we
represent training images and those synthesised by the model as points
in a high-dimensional space, the clouds formed by the training and
synthetic images ideally overlap fully (see Fig. 2). The two clouds
can be inter-connected to form a graph. Given a measure of the distance
between images (see next section), graph entropy and its standard
errors [11] can be defined as follows:
(4) |
(5) |
The quantities in these expressions are a large set of images sampled from the model, the distance between two images, and the standard deviation.
Entropy is used as a measure of model compactness. A good model generates images that are similar to those in its training set, so inter-image distances are small and entropy values are low.
Entropy estimates the distance between images generated by the model and their closest neighbours in the training set, but it can also estimate the mean distance between images in the training set and their closest neighbours in the synthesised set. The approach is illustrated diagrammatically in Fig. 3.
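The two nearest-neighbour comparisons described above can be illustrated with a minimal sketch. It is not a reconstruction of Eqs. (4) and (5); it only computes the mean distance from each image in one set to its closest neighbour in the other set, with a standard error taken as the sample standard deviation divided by the square root of the sample size. The Euclidean distance is an assumed stand-in for the image distance defined in the next section, and the names `euclidean` and `mean_nearest_distance`, as well as the toy data, are hypothetical.

```python
import math
import statistics


def euclidean(a, b):
    """Assumed stand-in for the image distance defined in the next section."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_nearest_distance(sources, targets, distance=euclidean):
    """Mean distance from each source image to its closest target image.

    Returns (mean, standard_error). The standard error is the sample
    standard deviation of the nearest-neighbour distances over sqrt(n).
    """
    nearest = [min(distance(s, t) for t in targets) for s in sources]
    mean = statistics.fmean(nearest)
    if len(nearest) > 1:
        std_err = statistics.stdev(nearest) / math.sqrt(len(nearest))
    else:
        std_err = 0.0  # a single sample carries no spread information
    return mean, std_err


# Toy "images" flattened to 2-vectors (illustrative data only).
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synthetic = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.0)]

# Synthetic images against their closest training neighbours.
synth_to_train = mean_nearest_distance(synthetic, training)
# Training images against their closest synthetic neighbours.
train_to_synth = mean_nearest_distance(training, synthetic)

print("synthetic -> training:", synth_to_train)
print("training -> synthetic:", train_to_synth)
```

In this toy case the outlying synthetic point at (3, 0) raises the synthetic-to-training mean without affecting the training-to-synthetic mean, which suggests why the two directions are worth reporting separately.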
Fig. 2. Training set and model synthesis in hyperspace.
Fig. 3. The model evaluation framework. Each image in the training set is compared against every image generated by the model.