Our approach to model evaluation is based on directly measuring key properties of the model. To be effective, a model must be able to generate a broad range of examples of the class of images that has been modelled. We refer to this as Generalisation ability. Although this property is necessary, it is not sufficient. We also require that the model generates only examples that are consistent with the class of images modelled. We refer to this as Specificity. We define both measures by comparing the distribution of training images with the distribution of images generated using the model. An overview of the approach is given in Figure 2. Any image can be considered as a point in a high-dimensional space (defined by its intensity values). The training set forms a cloud of points in this space. If we sample from the model, we generate a second cloud of points in the same space. For an ideal model, the two clouds are coincident. We define Generalisation and Specificity in terms of, respectively, the distance from each training image to the nearest model-generated image, and the distance from each model-generated image to the nearest training image. We discuss the choice of an appropriate distance metric in section 3.3.
[Figure: hyperspace_example.png]
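The two nearest-neighbour distances described above can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes that each image is flattened to a vector of intensity values, that the per-image distances are summarised by their mean, and that a Euclidean distance stands in for the metric chosen in section 3.3. The function names (`euclidean`, `mean_nearest_distance`, `generalisation`, `specificity`) are hypothetical.

```python
import math
from typing import Callable, Sequence

# Assumption: an image is flattened to a vector of intensity values.
Image = Sequence[float]
Metric = Callable[[Image, Image], float]


def euclidean(a: Image, b: Image) -> float:
    """Placeholder metric (assumption); section 3.3 discusses the actual choice."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_nearest_distance(sources: Sequence[Image],
                          targets: Sequence[Image],
                          dist: Metric = euclidean) -> float:
    """Mean, over `sources`, of the distance to the closest point in `targets`.

    Using the mean as the summary statistic is an assumption of this sketch.
    """
    return sum(min(dist(s, t) for t in targets) for s in sources) / len(sources)


def generalisation(training: Sequence[Image],
                   generated: Sequence[Image],
                   dist: Metric = euclidean) -> float:
    """Distance from each training image to the nearest model-generated image."""
    return mean_nearest_distance(training, generated, dist)


def specificity(training: Sequence[Image],
                generated: Sequence[Image],
                dist: Metric = euclidean) -> float:
    """Distance from each model-generated image to the nearest training image."""
    return mean_nearest_distance(generated, training, dist)


# Toy two-pixel "images", purely for illustration.
training = [(0.0, 0.0), (4.0, 0.0)]
generated = [(0.0, 0.0), (0.0, 3.0)]

print(generalisation(training, generated))  # 2.0
print(specificity(training, generated))     # 1.5
```

In this sketch both quantities are distances, so smaller values indicate that the two clouds of points lie closer together, and both are zero when the clouds coincide. The brute-force search costs one distance evaluation per pair of training and generated images, which is adequate for illustration only.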