Appearance Model Evaluation

We evaluate a model by measuring its key properties directly. This
approach follows the work of Davies et al. [6], who defined
specificity and generalisation ability for shape models. To be
effective, a model
needs the ability to generate a broad range of examples of the
class of images that have been modelled. We refer to this as
*Generalisation* ability. Although this property is
necessary, it is not sufficient. We also require that the model
generate only examples that are consistent with the class of
images modelled. We refer to this as *Specificity*. We
define both of these measures by comparing the distribution of
training images and the distribution of images generated using the
model. An overview of the approach is given in Figure
2. Any image can be considered as a point in a
high-dimensional space (defined by its intensity values). The
training set forms a cloud of points in such a space. If we
sample from the model, we generate a second cloud of points in
this space. For an ideal model, the two clouds are coincident. We
define *Generalisation* and *Specificity* in terms of
the distance from each training image to the nearest
model-generated image, and the distance from each model-generated
image to the nearest training image, respectively. We discuss the
choice of an appropriate distance metric in Section
3.3.
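
The two nearest-neighbour measures described above can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the text: each image is flattened to a vector of intensity values, Euclidean distance stands in for the metric discussed in Section 3.3, and the per-image distances are summarised by their mean. The function names are hypothetical.

```python
import math


def nearest_distances(points, others):
    """For each vector in `points`, distance to its nearest vector in `others`.

    Euclidean distance is a placeholder for the metric chosen in Section 3.3.
    """
    return [min(math.dist(p, q) for q in others) for p in points]


def generalisation(training, generated):
    """Mean distance from each training image to the nearest generated image.

    Averaging over images is an assumption of this sketch.
    """
    d = nearest_distances(training, generated)
    return sum(d) / len(d)


def specificity(training, generated):
    """Mean distance from each generated image to the nearest training image.

    Averaging over images is an assumption of this sketch.
    """
    d = nearest_distances(generated, training)
    return sum(d) / len(d)


# Toy "images" with two intensity values each.
training = [(0.0, 0.0), (1.0, 0.0)]
# The model reproduces both training images but also generates an outlier.
generated = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]

g = generalisation(training, generated)
s = specificity(training, generated)
```

In this toy case every training image has an exact match among the generated images, so the generalisation distance is zero, while the outlier at (5, 0) lies far from any training image and raises the specificity distance. Under these assumptions, smaller values indicate a better model on both measures, and the example shows why good generalisation alone is not sufficient.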