Specificity and Generalisation

A good model of some training set of data should possess several properties. Firstly, the model should be able to effectively extrapolate and interpolate from the training data, to produce a range of images from the same general class as those seen in the training set. We will call this generalisation ability. Conversely, the model should not produce images which cannot be considered as valid examples of the class of object imaged. That is, a model built from brain images should only generate images which could be considered as valid images of possible brains. We will call this the specificity of the model.

In previous work, quantitative measures of specificity and generalisation were used to evaluate shape models [Davies]. Here we present an extension of these quantitative measures to image models.

Consider first the training data for our model, that is, the set of images which were the input to our NRR algorithm. Without loss of generality, each training image can be considered as a single point in image space (see the figure below). A statistical model is then a probability density function $p(z)$ defined on this space.

Figure: The model evaluation framework. A model is constructed from the training set and then images are generated from the model. The training set of images and the set generated by the model can be viewed as clouds of points in image space.

To be specific, let $\{I_i : i = 1, \dots, N\}$ denote the $N$ images of the training set when considered as points in image space. Let $p(z)$ be the probability density function of the model.

We then define our basic quantitative measure of the specificity $S_\lambda$ of the model with respect to the training set $\mathcal{I} = \{I_i\}$ as follows:

$$ S_\lambda(\mathcal{I}; p) = \int p(z) \, \min_i \left( |z - I_i| \right)^\lambda \, dz \tag{6} $$

where $|\cdot|$ is a distance on image space, raised to some positive power $\lambda$. That is, for each point $z$ in image space, we find its nearest neighbour in the training set, and integrate the powers of these nearest-neighbour distances, weighted by the pdf $p(z)$. Smaller values of $S_\lambda$ indicate greater specificity, and larger values indicate lesser specificity. The figure below gives diagrammatic examples of cases with varying specificity.

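Figure: Training set (points) and model pdf (shading) in image space. Left: a model which is specific, but not general. Right: a model which is general, but not specific.

The integral in equation (6) is approximated using a Monte-Carlo method. A large random set of images $\{I_\mu : \mu = 1, \dots, M\}$ is generated, having the same distribution as the model pdf $p(z)$. The estimate of the specificity (6) is:

$$ S_\lambda(\mathcal{I}; p) \approx \frac{1}{M} \sum_{\mu=1}^{M} \min_i \left( |I_i - I_\mu| \right)^\lambda \tag{7} $$

with standard error:

$$ \sigma_S = \frac{\mathrm{SD}\left\{ \min_i \left( |I_i - I_\mu| \right)^\lambda \right\}}{\sqrt{M - 1}} \tag{8} $$

where the minimum is taken over the training images $I_i$, and $\mathrm{SD}$ is the standard deviation of the set of nearest-neighbour measurements taken over all values of $\mu$.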
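
The Monte-Carlo estimate (7) and its standard error (8) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: images are flattened to vectors, the Euclidean distance stands in for whichever image distance is actually chosen, and the random `samples` array is a placeholder for images drawn from the model pdf. The function names `nearest_distances` and `specificity` are hypothetical.

```python
import numpy as np

def nearest_distances(queries, references):
    """Distance from each row of `queries` to its nearest row of `references`.

    Assumption: Euclidean distance on flattened images.
    """
    diffs = queries[:, None, :] - references[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

def specificity(training, samples, lam=1.0):
    """Monte-Carlo specificity estimate and its standard error."""
    d = nearest_distances(samples, training) ** lam
    M = len(samples)
    # np.std defaults to the population standard deviation (ddof=0),
    # which is then divided by sqrt(M - 1).
    return d.mean(), d.std() / np.sqrt(M - 1)

rng = np.random.default_rng(0)
training = rng.normal(size=(20, 16))   # 20 toy "images" of 16 pixels each
samples = rng.normal(size=(500, 16))   # placeholder for model-generated images
S, S_err = specificity(training, samples)
print(f"specificity = {S:.3f} +/- {S_err:.3f}")
```

The pairwise-difference array has shape (M, N, pixels), so for realistic image sizes the distances would need to be computed in batches.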

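The measure of generalisation is then defined in an analogous manner:

$$ G_\lambda(\mathcal{I}; p) = \frac{1}{N} \sum_{i=1}^{N} \min_\mu \left( |I_i - I_\mu| \right)^\lambda \tag{9} $$

with standard error:

$$ \sigma_G = \frac{\mathrm{SD}\left\{ \min_\mu \left( |I_i - I_\mu| \right)^\lambda \right\}}{\sqrt{N - 1}} \tag{10} $$

That is, for each of the $N$ members $I_i$ of the training set, we compute the distance to its nearest neighbour in the sample set $\{I_\mu\}$. Large values of $G_\lambda$ correspond to model distributions which do not cover the training set and have poor generalisation ability, whereas small values of $G_\lambda$ indicate models with better generalisation ability.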
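
As a rough illustration of equations (9) and (10), the sketch below computes the generalisation measure for two toy sample sets: one collapsed around a single training image and one spread like the training data. The data, the Euclidean distance, and the function name `generalisation` are all assumptions made for the example, not part of the method described here.

```python
import numpy as np

def generalisation(training, samples, lam=1.0):
    """Generalisation measure and its standard error.

    Assumption: Euclidean distance on flattened images.
    """
    diffs = training[:, None, :] - samples[None, :, :]
    # For each training image, distance to the nearest sample, raised to lam.
    d = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1) ** lam
    N = len(training)
    return d.mean(), d.std() / np.sqrt(N - 1)

rng = np.random.default_rng(1)
training = rng.normal(size=(30, 4))    # 30 toy "images" of 4 pixels each

# Samples concentrated on one training image: they do not cover the training set.
narrow = training[0] + 0.01 * rng.normal(size=(400, 4))
# Samples spread like the training data: they cover the training set.
broad = rng.normal(size=(400, 4))

G_narrow, _ = generalisation(training, narrow)
G_broad, _ = generalisation(training, broad)
print(f"narrow model: G = {G_narrow:.3f}, broad model: G = {G_broad:.3f}")
```

The collapsed sample set leaves most training images far from any sample, so it gives the larger value, which corresponds to poorer generalisation ability.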

We note here that both measures can be further extended, by considering the sum of distances to k-nearest-neighbours, rather than just to the single nearest-neighbour. However, in what follows, we restrict ourselves to just the single nearest-neighbour case.
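
For completeness, one possible reading of the k-nearest-neighbour extension is sketched below. It assumes that the power is applied to each of the k smallest distances before summing, and that the distance is Euclidean; neither detail is specified above, and the helper name `knn_distance_sum` is hypothetical. With `k=1` it reduces to the single nearest-neighbour case used in the rest of this section.

```python
import numpy as np

def knn_distance_sum(queries, references, k=1, lam=1.0):
    """Sum of the k smallest distances (each raised to lam) per query row.

    Assumptions: Euclidean distance; the power is applied before summing.
    """
    diffs = queries[:, None, :] - references[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    k_smallest = np.sort(dist, axis=1)[:, :k]
    return (k_smallest ** lam).sum(axis=1)

rng = np.random.default_rng(2)
training = rng.normal(size=(20, 4))
samples = rng.normal(size=(200, 4))

# Specificity-style average over samples, generalisation-style over training images.
S_k = knn_distance_sum(samples, training, k=3).mean()
G_k = knn_distance_sum(training, samples, k=3).mean()
```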

Roy Schestowitz 2007-03-11