Defining Specificity and Generalisation in this way rests on an assumption: the space must be of a form that permits meaningful distances to be measured. One must also distinguish between a measure that accounts for too broad a space and one that is centred only on the training set; generalisation, for example, might not properly reflect the distribution of the training images.
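As a concrete illustration, the two measures can be sketched as nearest-neighbour distances in image space. This is a minimal sketch under stated assumptions: the function names are hypothetical, images are treated as flat pixel vectors, and plain Euclidean distance stands in for whatever distance the space actually supports.

```python
import numpy as np

def specificity(training, synthetic):
    """Mean distance from each synthetic image to its nearest
    training image (hypothetical helper): low values suggest the
    model only generates images close to real ones."""
    return np.mean([np.min(np.linalg.norm(training - s, axis=1))
                    for s in synthetic])

def generalisation(training, synthetic):
    """Mean distance from each training image to its nearest
    synthetic image (hypothetical helper): low values suggest the
    model can represent every training example."""
    return np.mean([np.min(np.linalg.norm(synthetic - t, axis=1))
                    for t in training])

# Toy data: rows are images flattened to 100-pixel vectors.
rng = np.random.default_rng(0)
training = rng.normal(size=(50, 100))    # 50 "training" images
synthetic = rng.normal(size=(200, 100))  # 200 model-sampled images

print(specificity(training, synthetic))
print(generalisation(training, synthetic))
```

The asymmetry between the two functions mirrors the caveat above: specificity probes whether the model strays into too broad a space, while generalisation probes whether it covers the training distribution.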
This is likely to be acceptable when the sets under consideration are large. Unlike synthetic images, additional training images cannot be generated or obtained; but provided the training set is sufficiently large, as it is in subsequent chapters, the difference in size between the two sets can blur this gap to a degree. The method is therefore best applied where the distribution of the training set is fairly consistent, with few outliers, and where the training set is large.
Another important point is that although the measured specificity and generalisation increase as a function of misregistration, it does not follow that they actually measure misregistration. This caveat applies here, although, as I will show later, an empirical relationship can be demonstrated between ground-truth-based assessment and the method presented in this chapter: at least for this plausible data, there is a monotonic relationship between the degree of registration and these measures. That the relationship holds for this particular data does not guarantee that it holds for all data; what I can conclude rests on strong empirical evidence for one class of images, registered in one particular way.
Roy Schestowitz 2010-04-05