
Method

The first of the proposed methods assesses registration as the spatial overlap of corresponding regions in the registered images, defined using Tanimoto's formulation. The correspondence is defined by binary labels of distinct image regions (in this case brain tissue classes), produced by manual mark-up of the original images (ground-truth labels). When the labels are registered in tandem with their corresponding images, they become fuzzy. A correctly registered image set will exhibit high relative overlap between corresponding brain structures in different images and low overlap between non-corresponding structures. A generalised overlap measure [1] is used to compute a single figure of merit for the overall overlap of all labels over all subjects.

$\displaystyle \mbox{Overlap} = \frac{ \displaystyle\sum_{\mbox{\small pairs},k}\: \sum_{\mbox{\small labels},l}\alpha_{l} \sum_{\mbox{\small voxels},i} MIN(A_{kli},B_{kli}) }{ \displaystyle\sum_{\mbox{\small pairs},k}\: \sum_{\mbox{\small labels},l}\alpha_{l} \sum_{\mbox{\small voxels},i} MAX(A_{kli},B_{kli}) }$ (1)

Figure 1: Left: The model evaluation framework. Each image in the training set is compared against every image generated by the model; Right: Training set and model in hyperspace.

where $ i$ indexes voxels in the registered images, $ l$ indexes the labels and $ k$ indexes the image pairs under consideration. $ A_{kli}$ and $ B_{kli}$ represent voxel label values in a pair of registered images and lie in the range [0, 1]. The $ MIN()$ and $ MAX()$ operators are the standard results for the intersection and union of fuzzy sets. This generalised overlap measures the consistency with which each set of labels partitions the image volume. The parameter $ \alpha_{l}$ controls the relative weighting of different labels. With $ \alpha_{l}=1$, label contributions are implicitly volume-weighted with respect to one another. We have also considered the cases where $ \alpha_{l}$ is the inverse label volume (which weights all labels equally), the inverse label volume squared (which gives labels of smaller volume higher weighting), and a measure of label complexity (which we define, somewhat arbitrarily, as the mean absolute voxel intensity gradient within the label).
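The generalised overlap of Eq. 1 can be sketched as follows for a single image pair; the sum over pairs extends this by accumulating numerator and denominator across all pairs. This is a minimal illustration, not the authors' implementation, and the function name and array layout are assumptions.

```python
import numpy as np

def generalised_overlap(A, B, alpha=None):
    """Generalised Tanimoto overlap between two fuzzy label images.

    A, B : arrays of shape (n_labels, n_voxels), voxel label values in [0, 1].
    alpha: per-label weights (n_labels,); defaults to 1 for every label,
           which implicitly volume-weights the labels.  Other choices from
           the text: inverse label volume, inverse volume squared, or a
           label-complexity measure.
    """
    if alpha is None:
        alpha = np.ones(A.shape[0])
    inter = np.minimum(A, B).sum(axis=1)   # fuzzy intersection, per label
    union = np.maximum(A, B).sum(axis=1)   # fuzzy union, per label
    return (alpha * inter).sum() / (alpha * union).sum()

# Example: two crisp two-label partitions of a tiny 4-voxel image
A = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
B = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])
print(generalised_overlap(A, B))  # → 0.6
```

Perfect correspondence gives an overlap of 1; increasing misregistration drives the figure towards 0.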

Figure 2: The effect of varying the first model parameter of a brain appearance model by $ \pm 2.5$ standard deviations.

The second method assesses registration via the quality of a generative statistical appearance model constructed from the registered images. The idea is that a correct registration produces a true dense correspondence between the images, resulting in a better statistical appearance model of the images.

Registration is then evaluated through the specificity and generalisation ability [17] of the model: the ability of the model to i) generate realistic examples of the modelled entity and ii) represent both seen and unseen examples of the modelled class. In practice, these are evaluated by using the generative properties of the model to produce a large number of synthetic examples (in this case brain images), which are then compared to real examples in the original set using a pre-defined image distance measure. The minimum distance from each synthetic example to the original set gives model specificity, and the minimum distance from each original example to the synthetic set gives generalisation. Image distance is measured as a mean shuffle distance, or the minimum Euclidean distance between a pixel in one image and a corresponding neighbourhood of pixels in the other, averaged over all pixels.
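A minimal sketch of the shuffle distance and the two derived measures, assuming scalar-intensity 2-D images (where the per-pixel Euclidean distance reduces to an absolute difference); the function names and the neighbourhood radius are illustrative, not taken from the paper.

```python
import numpy as np

def mean_shuffle_distance(img1, img2, radius=1):
    """Mean shuffle distance: for each pixel of img1, the minimum absolute
    intensity difference to any pixel of img2 within a (2*radius+1)^2
    neighbourhood, averaged over the image."""
    h, w = img1.shape
    pad = np.pad(img2, radius, mode='edge')     # replicate border pixels
    best = np.full((h, w), np.inf)
    for dy in range(2 * radius + 1):            # slide img2 over each offset
        for dx in range(2 * radius + 1):
            shifted = pad[dy:dy + h, dx:dx + w]
            best = np.minimum(best, np.abs(img1 - shifted))
    return best.mean()

def specificity_generalisation(real, synthetic, radius=1):
    """Specificity: mean over synthetic images of the distance to the nearest
    real image.  Generalisation: mean over real images of the distance to
    the nearest synthetic image."""
    d = np.array([[mean_shuffle_distance(s, r, radius) for r in real]
                  for s in synthetic])
    return d.min(axis=1).mean(), d.min(axis=0).mean()

# Toy example: one synthetic image matches one of two real images exactly
real = [np.zeros((4, 4)), np.ones((4, 4))]
synthetic = [np.zeros((4, 4))]
spec, gen = specificity_generalisation(real, synthetic)
print(float(spec), float(gen))  # → 0.0 0.5
```

Note that lower values are better for both measures: a specific model generates only images close to real ones, and a general model leaves no real image far from a synthetic one.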

Figure 3: Left: The calculation of a shuffle difference image; Right: An example of the shuffle difference applied to two MR brain slices

Figure 4: Behaviour of proposed metrics with increasing registration perturbation: a) Generalisation, b) Specificity and c) Tanimoto overlap

To test the validity of the proposed methods, the brain images were annotated with 6 tissue classes, including grey matter, white matter and CSF, which provided the ground truth for image correspondence. Initially, the images were brought into alignment using an NRR algorithm based on MDL optimisation. A test set of different registrations was then created by applying random perturbations to each image in the registered set using diffeomorphic clamped-plate splines. By choosing a different perturbation seed for each image and gradually increasing the magnitude of the perturbations, a series of image sets of progressively worse spatial correspondence, and thus registration quality, was obtained. Measuring registration quality at each step allows the proposed assessment measures to be validated: a valid measure should degrade steadily as the perturbation grows.
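The perturbation protocol can be sketched as below. For simplicity the smooth random warp here is built from a few random sinusoids with nearest-neighbour resampling; it is a stand-in for the diffeomorphic clamped-plate splines used in the actual experiments, and all names are illustrative.

```python
import numpy as np

def smooth_random_warp(image, magnitude, seed=0):
    """Warp a 2-D image with a smooth, low-frequency random displacement
    field.  A simple stand-in for diffeomorphic clamped-plate splines:
    magnitude scales the perturbation, seed selects the random field."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    dy = np.zeros((h, w))
    dx = np.zeros((h, w))
    for _ in range(3):  # superpose a few smooth random modes
        fy, fx = rng.uniform(0.5, 2.0, size=2)
        py, px = rng.uniform(0.0, 2.0 * np.pi, size=2)
        dy += np.sin(2.0 * np.pi * fy * yy / h + py)
        dx += np.sin(2.0 * np.pi * fx * xx / w + px)
    ys = np.clip(np.rint(yy + magnitude * dy), 0, h - 1).astype(int)
    xs = np.clip(np.rint(xx + magnitude * dx), 0, w - 1).astype(int)
    return image[ys, xs]  # nearest-neighbour resampling

# Progressively larger perturbations -> progressively worse "registration";
# each image in the set would use a different seed, as in the text.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
for magnitude in (0.0, 2.0, 4.0, 8.0):
    warped = smooth_random_warp(image, magnitude, seed=1)
    # ...evaluate overlap / specificity / generalisation on the warped set
```

Plotting each assessment measure against the perturbation magnitude, as in Figure 4, then reveals whether the measure responds monotonically to degrading correspondence.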


Roy Schestowitz 2005-11-17