A good model of some training set of data should
possess several properties. Firstly, the model
should be able to effectively extrapolate and
interpolate from the training data, to produce a
range of images from the same general class as
those seen in the training set. We will call this
generalisation ability. Conversely, the
model should not produce images which cannot be
considered as valid examples of the class of
object imaged. That is, a model built from brain
images should only generate images which could be
considered as valid images of possible brains. We
will call this the specificity of the
model.
In previous work, quantitative measures of specificity and generalisation were used
to evaluate shape models [Davies]. Here we
present an extension of these
measures.
Consider first the training data for our model, that is, the set
of images which were the input to our NRR algorithm. Without loss
of generality, each training image can be considered as a single
point in image space (see the figure below). A statistical
model is then a probability density function $p(\mathbf{z})$ defined on
this space.
Figure: The model evaluation framework. A model is constructed from the training set,
and images are then generated from the model. The training set of
images and the set generated by the model can be viewed as clouds
of points in image space.
To be specific, let $\{I_i : i = 1, \ldots, n\}$
denote the $n$ images of the training set when
considered as points in image space. Let $p(\mathbf{z})$
be the probability density function of the model.
We then define our basic quantitative measure of
the specificity $S_\lambda$ of the model with
respect to the training set
$\mathcal{I} = \{I_i\}$ as follows:

$$S_\lambda(\mathcal{I}; p) = \int p(\mathbf{z}) \, \min_i \left( |\mathbf{z} - I_i| \right)^\lambda \, d\mathbf{z},$$

where $|\cdot|$ is a distance on image space,
raised to some positive power $\lambda$. That is,
for each point $\mathbf{z}$ in image space, we find the
nearest neighbour to this point in the training
set, and sum the powers of the nearest-neighbour
distances, weighted by the pdf $p(\mathbf{z})$. Smaller values
of $S_\lambda$ indicate greater specificity, and larger values
indicate lesser specificity. The figure below gives
diagrammatic examples
of cases with varying specificity.
Figure: Training set (points) and model pdf
(shading) in image space. Left: A model
which is specific, but not general. Right:
A model which is general, but not specific.

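The integral in the definition of specificity is approximated using a Monte Carlo method. A large
random set of images
$\{I_\mu : \mu = 1, \ldots, M\}$ is
generated, having the same distribution as the model pdf $p(\mathbf{z})$.
The estimate of the specificity is:

$$S_\lambda(\mathcal{I}; p) \approx \frac{1}{M} \sum_{\mu=1}^{M} \min_i \left( |I_i - I_\mu| \right)^\lambda,$$

with standard error:

$$\sigma_S = \frac{\mathrm{SD}_\mu \left\{ \min_i \left( |I_i - I_\mu| \right)^\lambda \right\}}{\sqrt{M - 1}},$$

where
$\mathrm{SD}_\mu$ is the standard
deviation of the set of measurements over the set
of values of $\mu$.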
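The Monte Carlo estimate described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `euclidean` and `specificity` and the parameter `power` are hypothetical, images are assumed to be flat sequences of pixel values, and the Euclidean distance stands in for whichever distance on image space is actually chosen. The standard error is taken here as the population standard deviation divided by the square root of the sample count minus one.

```python
import math
import statistics


def euclidean(a, b):
    """Euclidean distance between two images stored as flat pixel sequences.

    Assumption: any other distance on image space could be passed instead.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def specificity(training, samples, power=1.0, dist=euclidean):
    """Monte Carlo estimate of specificity, with its standard error.

    training: the training images.
    samples:  images generated from the model (at least two).
    power:    the positive power applied to each nearest-neighbour distance.
    """
    # For each generated image: distance to its nearest training image,
    # raised to the chosen power.
    nearest = [min(dist(s, t) for t in training) ** power for s in samples]
    m = len(nearest)
    estimate = sum(nearest) / m
    # Population SD over sqrt(m - 1); this equals the sample SD over sqrt(m).
    std_error = statistics.pstdev(nearest) / math.sqrt(m - 1)
    return estimate, std_error


# Toy usage with two-pixel "images" (hypothetical data, not from the text).
training_set = [(0, 0), (10, 0)]
generated_set = [(0, 1), (10, 2), (3, 0)]
print(specificity(training_set, generated_set))
```

In practice the generated set would be far larger than this toy example, so that the standard error becomes small relative to the estimate.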
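The measure of generalisation is then defined in
an analogous manner:

$$G_\lambda(\mathcal{I}; p) = \frac{1}{n} \sum_{i=1}^{n} \min_\mu \left( |I_i - I_\mu| \right)^\lambda,$$

with standard error:

$$\sigma_G = \frac{\mathrm{SD}_i \left\{ \min_\mu \left( |I_i - I_\mu| \right)^\lambda \right\}}{\sqrt{n - 1}}.$$

That is, for each member $I_i$ of the training set,
we compute the distance to the
nearest neighbour in the sample set
$\{I_\mu\}$.
Large values of $G_\lambda$ correspond to model
distributions which do not cover the training set
and have poor generalisation ability, whereas
small values of $G_\lambda$ indicate models
with better generalisation ability.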
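The generalisation measure swaps the roles of the two point sets: the nearest-neighbour search runs from each training image into the generated sample set. A minimal sketch follows, with the same caveats as for any illustration of this kind: the names `euclidean`, `generalisation` and `power` are hypothetical, images are assumed to be flat pixel sequences, and the Euclidean distance is only a stand-in for the chosen distance on image space.

```python
import math
import statistics


def euclidean(a, b):
    """Euclidean distance between two images stored as flat pixel sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def generalisation(training, samples, power=1.0, dist=euclidean):
    """Generalisation measure and its standard error.

    training: the training images (at least two).
    samples:  images generated from the model.
    power:    the positive power applied to each nearest-neighbour distance.
    """
    # For each training image: distance to its nearest generated image,
    # raised to the chosen power.
    nearest = [min(dist(t, s) for s in samples) ** power for t in training]
    n = len(nearest)
    estimate = sum(nearest) / n
    # Population SD over sqrt(n - 1); this equals the sample SD over sqrt(n).
    std_error = statistics.pstdev(nearest) / math.sqrt(n - 1)
    return estimate, std_error


# Toy usage with two-pixel "images" (hypothetical data, not from the text).
training_set = [(0, 0), (10, 0)]
generated_set = [(0, 1), (10, 2), (3, 0)]
print(generalisation(training_set, generated_set))
```

A training image that lies far from every generated image contributes a large term, which is how poor coverage of the training set shows up as a large value of the measure.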
We note here that both measures can be further
extended, by considering the sum of distances to
the $k$ nearest neighbours, rather than just to the
single nearest neighbour. However, in what
follows, we restrict ourselves to just the single
nearest-neighbour case.
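For illustration only, the extension to several neighbours might look like the sketch below. The text does not say whether the power is applied to each distance before summing or to the sum itself; this sketch assumes the former. The names `euclidean`, `knn_distance_sum` and `mean_knn_measure` are hypothetical, and the Euclidean distance is again a placeholder.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two images stored as flat pixel sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_distance_sum(point, reference, k=1, power=1.0, dist=euclidean):
    """Sum of the powered distances from `point` to its k nearest
    neighbours in `reference`. With k=1 this is the single
    nearest-neighbour term used in the rest of the text.
    """
    distances = sorted(dist(point, r) for r in reference)
    # Assumption: the power is applied to each distance before summing.
    return sum(d ** power for d in distances[:k])


def mean_knn_measure(queries, reference, k=1, power=1.0, dist=euclidean):
    """Average k-nearest-neighbour term over all query points.

    Queries drawn from the model with the training set as reference give
    a specificity-like value; swapping the two sets gives a
    generalisation-like value.
    """
    terms = [knn_distance_sum(q, reference, k, power, dist) for q in queries]
    return sum(terms) / len(terms)
```

Sorting every distance list is the simplest approach; for large sets, `heapq.nsmallest` would avoid the full sort.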
Roy Schestowitz
2007-03-11