The key idea is that a good NRR produces correspondences that build a good model. If good NRR results lead to a good model, then evaluating the registration reduces to evaluating the model built from it.
Previous work on shape models used measures of model specificity and generalisation. Chapter 4, which focuses on shapes, shows that the quality of a shape model can be evaluated directly from a set of points alone. This is not the case for models of both shape and intensity, because the relative significance of the two components must be weighed and the components then combined in some way. The contribution here is to introduce Generalisation and Specificity as measures for evaluating appearance models. Both depend on generating a distribution of images using the model, then comparing this distribution with the distribution of training images. In turn, these two measures become metrics that distinguish a good NRR from a poorer one.
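The comparison between model-generated images and training images can be sketched as follows. This is a minimal illustration and not necessarily the formulation used in later chapters. It assumes that each image is flattened to a vector, that the distance between two images is Euclidean, and that each measure is a mean nearest-neighbour distance. The function names and the toy data are hypothetical.

```python
import math


def nearest_distance(image, others):
    """Euclidean distance from one image vector to its nearest neighbour in `others`."""
    return min(math.dist(image, other) for other in others)


def specificity(generated, training):
    """Mean distance from each model-generated image to its nearest training image.

    Under the assumptions above, a low value means the model only
    produces images that resemble the training set.
    """
    return sum(nearest_distance(g, training) for g in generated) / len(generated)


def generalisation(generated, training):
    """Mean distance from each training image to its nearest model-generated image.

    Under the assumptions above, a low value means the model can
    reproduce the whole range of images seen in training.
    """
    return sum(nearest_distance(t, generated) for t in training) / len(training)


# Toy data: three "training images" and two "generated images",
# each reduced to a 2-element vector purely for illustration.
training = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
generated = [[0.0, 0.0], [3.0, 4.0]]

spec = specificity(generated, training)    # every generated image matches a training image
gen = generalisation(generated, training)  # one training image is not covered by the samples
print(spec, gen)
```

In this toy case the Specificity value is zero because every generated image coincides with a training image, while the Generalisation value is non-zero because one training image has no nearby generated image. In practice the generated set would be sampled from the appearance model itself, and a different image distance could be substituted for the Euclidean one.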