One powerful method for modelling anatomy is the appearance model, introduced by Edwards et al. [4] as a natural successor to shape models [3]. The method requires a sufficiently large set of data that is representative of a population and ideally spans its full variability. From such data, appearance models learn what characterises inter-subject or intra-subject changes and determine the relative prominence of the main characteristics. They are therefore able to identify changes and derive a model that encapsulates change, all in a data-driven manner.
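As a minimal sketch of what "determining the prominence of the main characteristics" can mean in practice, the toy example below applies principal component analysis to two-dimensional samples. It is an illustration only, not the construction of Edwards et al. [4]: real appearance models work on high-dimensional shape and texture vectors, and the function name `principal_modes_2d` and the sample values are assumptions made here.

```python
import math


def principal_modes_2d(samples):
    """Return (mean, eigenvalues, first_mode) for a list of 2-D samples.

    Toy stand-in for the linear statistical analysis behind a model of
    variation; uses the closed-form eigen-decomposition of a 2x2 matrix.
    """
    n = len(samples)
    mean_x = sum(s[0] for s in samples) / n
    mean_y = sum(s[1] for s in samples) / n

    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((s[0] - mean_x) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - mean_y) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mean_x) * (s[1] - mean_y) for s in samples) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    lam1, lam2 = half_trace + radius, half_trace - radius

    # Eigenvector belonging to the larger eigenvalue.
    if abs(sxy) > 1e-12:
        vx, vy = lam1 - syy, sxy
    elif sxx >= syy:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return (mean_x, mean_y), (lam1, lam2), (vx / norm, vy / norm)


# Hypothetical samples that vary only along the direction (1, 2).
samples = [(-1.0, -2.0), (0.0, 0.0), (1.0, 2.0)]
mean, eigenvalues, first_mode = principal_modes_2d(samples)

# Fraction of the total variance captured by the first mode.
explained = eigenvalues[0] / sum(eigenvalues)
```

Here the eigenvalues rank the modes by the amount of variance each explains, which is one common way to quantify the prominence of a characteristic in a data-driven model.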
Non-rigid image registration is ubiquitously used as the basis for the analysis of medical images. The results of registration can be used for structural analysis, atlas matching, and analysis of change. Methods for obtaining a registration are well established and quite uniform in nature: pairs of images are warped so that they appear more similar. The similarity leads to overlap, which allows corresponding structures to be identified. This problem is complementary to that of modelling groups of images. Statistical models of a group of images need dense correspondence to be defined across the group, and non-rigid registration provides exactly that.
Ever since appearance models emerged, attempts have been made to reproduce and improve them. To name a few such efforts, Stegmann [5] built 4-dimensional cardiac appearance models and Rueckert et al. [14] derived statistical deformation models from several registrations of the brain. Models have been built in a variety of ways, and what is still lacking is the ability to compare them. Experience shows that attempts to distinguish between them by eye are hopeless. More recently, appearance models were built automatically using piece-wise affine registration [16]. In that particular case, evaluation of the models enabled evaluation of the registration algorithms.
The approach we present is based on the observation that one of the things one can do with a registered set of images is build a statistical model. We therefore propose that the quality of a registration can be measured in terms of the quality of the statistical model built from it.
The idea of evaluating models has already been demonstrated successfully. Davies et al. [2] explored the evaluation of shape models and ultimately developed a robust framework for the task. This paper outlines a principled approach to the evaluation of appearance models, which is a challenging task since their complexity is far higher. The approach is shown to be reliable in the evaluation of brain models and, more importantly, it is then used to evaluate registration algorithms, from which appearance models can be derived. So far, registration has been evaluated by deliberately warping an image and observing whether the registration algorithm recovers the original. The alternative has been to make use of ground truth; in the method we present, no such knowledge is needed. Evaluation of registration requires only the registered data, and the entire process is automatic.
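For contrast with the proposed approach, the earlier style of evaluation by a deliberately applied warp can be sketched as follows. This is a toy illustration under strong assumptions: a 1-D signal stands in for an image, a circular integer shift stands in for a non-rigid warp, and an exhaustive search stands in for a registration algorithm. All function names and values are hypothetical.

```python
def shift_signal(signal, shift):
    """Circularly shift a 1-D signal by an integer number of samples."""
    n = len(signal)
    return [signal[(i - shift) % n] for i in range(n)]


def sum_squared_difference(a, b):
    """Dissimilarity between two equally long signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def register_by_shift(reference, moving, max_shift):
    """Find the shift that, applied to `moving`, best matches `reference`."""
    best_shift, best_cost = 0, float("inf")
    for candidate in range(-max_shift, max_shift + 1):
        cost = sum_squared_difference(reference, shift_signal(moving, candidate))
        if cost < best_cost:
            best_shift, best_cost = candidate, cost
    return best_shift


# Deliberately deform a known signal with a known transformation.
reference = [0, 0, 1, 3, 7, 3, 1, 0, 0, 0]
known_shift = 2
moving = shift_signal(reference, known_shift)

# A perfect registration undoes the deformation, so the recovered
# shift should be the negative of the one that was applied.
recovered_shift = register_by_shift(reference, moving, max_shift=4)
recovery_error = abs(recovered_shift + known_shift)
```

The recovery error can only be computed because the applied deformation is known in advance, which is precisely the knowledge that a model-based evaluation does not require.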