Next: Experimental Validation Up: Assessing the Accuracy of Previous: Introduction


The first of the proposed methods for assessing registration quality uses a generalisation of Tanimoto's spatial overlap measure [1]. We start with a manual mark-up of each image, providing an anatomical/tissue label for each voxel, and measure the overlap of corresponding labels following registration. Each label is initially represented as a binary image. After warping and interpolation into a common reference frame, based on the results of non-rigid registration (NRR), we obtain a set of fuzzy label images. These are combined in a generalised overlap score [5]:

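$\displaystyle \mathcal{O} = \frac{ \sum\limits_{\mbox{\small pairs},k}\: \sum\limits_{\mbox{\small labels},l}\alpha_{l} \sum\limits_{\mbox{\small voxels},i} MIN(A_{kli},B_{kli}) }{ \sum\limits_{\mbox{\small pairs},k}\: \sum\limits_{\mbox{\small labels},l}\alpha_{l} \sum\limits_{\mbox{\small voxels},i} MAX(A_{kli},B_{kli}) }$ (1)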

where $i$ indexes voxels in the registered images, $l$ indexes the label and $k$ indexes image pairs. $A_{kli}$ and $B_{kli}$ represent voxel label values in a pair of registered images and are in the range [0, 1]. The $MIN()$ and $MAX()$ operators are the standard definitions of intersection and union for fuzzy sets. The generalised overlap measures the consistency with which each set of labels partitions the image volume. The parameter $\alpha_{l}$ controls the relative weighting of different labels. With $\alpha_{l}=1$, label contributions are implicitly volume weighted with respect to one another. We have also considered three other cases: $\alpha_{l}$ set to the inverse label volume (which makes the relative weighting of different labels equal), $\alpha_{l}$ set to the inverse label volume squared (which gives labels of smaller volume higher weighting), and $\alpha_{l}$ set to a measure of label complexity (which we define arbitrarily as the mean absolute voxel intensity gradient in the label).
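As a minimal sketch of Equation (1), the overlap can be computed with NumPy as follows. The array layout (pairs, labels, voxels), the function names, and the way label volume is estimated in `inverse_volume_weights` are assumptions made for illustration; they are not taken from the original implementation.

```python
import numpy as np


def generalised_overlap(A, B, alpha=None):
    """Generalised overlap of fuzzy label images, following Equation (1).

    A, B  : arrays of shape (pairs, labels, voxels), values in [0, 1]
            (assumed layout, chosen for this sketch).
    alpha : per-label weights of shape (labels,); defaults to all ones,
            which gives the implicitly volume-weighted case.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.shape != B.shape:
        raise ValueError("A and B must have the same shape")
    if alpha is None:
        alpha = np.ones(A.shape[1])
    alpha = np.asarray(alpha, dtype=float)

    # Fuzzy intersection (MIN) and union (MAX), summed over voxels.
    intersection = np.minimum(A, B).sum(axis=2)  # shape (pairs, labels)
    union = np.maximum(A, B).sum(axis=2)         # shape (pairs, labels)

    # alpha broadcasts over the label axis; then sum over pairs and labels.
    return float((alpha * intersection).sum() / (alpha * union).sum())


def inverse_volume_weights(A, B, power=1):
    """Hypothetical helper: alpha_l = 1 / volume_l ** power.

    Label volume is taken here as the mean of the fuzzy volumes in A and B,
    summed over all pairs. This definition is an assumption; labels with
    zero volume would need separate handling.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    volume = 0.5 * (A.sum(axis=(0, 2)) + B.sum(axis=(0, 2)))
    return 1.0 / volume ** power
```

Passing `alpha=inverse_volume_weights(A, B)` corresponds to the equal-weighting case, and `power=2` to the case that favours labels of smaller volume. The complexity-based weighting is not sketched here, since it depends on the image intensities as well as the labels.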

Figure 1: Training set and model in hyperspace

The second method assesses registration in terms of the quality of a generative statistical appearance model constructed from the registered images. For all the experiments reported here, this was an active appearance model (AAM) [3]. The idea is that a correct registration produces an anatomically meaningful dense correspondence between the images, resulting in a better appearance model of the anatomy. We define model quality using two measures: generalisation and specificity. Both are measures of overlap between the distribution of original images and a distribution of images sampled from the model, as illustrated in Figure 1. If we use the generative property of the model to synthesise a large set of images, $\{I_{\alpha}:\alpha=1,\ldots,m\}$, we can define Generalisation $G$ as:

$\displaystyle G=\frac{1}{n} \sum\limits_{i=1}^{n}\min_{\alpha}\vert I_{i}-I_{\alpha}\vert,$ (2)

where $ \vert\cdot\vert$ is a measure of distance between images, $ I_{i}$ is the $ i^{th}$ training image, and $ \min_{\alpha}$ is the minimum over $ \alpha$ (the set of synthetic images). That is, Generalisation is the average distance from each training image to its nearest neighbour in the synthetic image set. A good model exhibits a low value of $ G$, indicating that the model can generate images that cover the full range of appearances present in the original image set. Similarly, we can define Specificity $ S$ as:

$\displaystyle S=\frac{1}{m} \sum\limits_{\alpha=1}^{m}\min_{i}\vert I_{i}-I_{\alpha}\vert.$ (3)

That is, Specificity is the average distance of each synthetic image from its nearest neighbour in the original image set. A good model exhibits a low value of $ S$, indicating that the model only generates synthetic images that are similar to those in the original image set. The uncertainty in estimating $ G$ and $ S$ can also be computed.
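Equations (2) and (3) can be sketched as nearest-neighbour searches over a matrix of pairwise image distances. In the sketch below, the function names are hypothetical and the default distance is a plain mean absolute difference, used only as a stand-in; any other image distance can be passed in through the `distance` argument.

```python
import numpy as np


def mean_abs_distance(a, b):
    """Stand-in image distance: mean absolute difference (an assumption)."""
    return float(np.mean(np.abs(np.asarray(a, float) - np.asarray(b, float))))


def pairwise_distances(training, synthetic, distance=mean_abs_distance):
    """Matrix D with D[i, a] = distance(training[i], synthetic[a])."""
    return np.array([[distance(t, s) for s in synthetic] for t in training])


def generalisation(training, synthetic, distance=mean_abs_distance):
    """Equation (2): mean distance from each training image to its
    nearest synthetic image."""
    D = pairwise_distances(training, synthetic, distance)
    return float(D.min(axis=1).mean())


def specificity(training, synthetic, distance=mean_abs_distance):
    """Equation (3): mean distance from each synthetic image to its
    nearest training image."""
    D = pairwise_distances(training, synthetic, distance)
    return float(D.min(axis=0).mean())
```

Both measures share the same distance matrix, so in practice it would be computed once and reduced along each axis. The cost grows as $n \times m$ distance evaluations, which is worth bearing in mind when the synthetic set is large.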

In our experiments we have defined $\vert\cdot\vert$ as the shuffle distance between two images, as illustrated in Figure 2. Shuffle distance is the mean of the minimum absolute difference between each pixel/voxel in one image and the pixels/voxels in a shuffle neighbourhood of radius $r$ around the corresponding pixel/voxel in a second image. When $r \leq 1$, this is equivalent to the mean absolute difference between corresponding pixels/voxels, but for larger values of $r$ the distance increases more smoothly as the misalignment of structures in the two images increases. The effect on the pixel-by-pixel contribution to shuffle distance as $r$ is increased is illustrated in Figure 3.
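A minimal sketch of the shuffle distance described above is given below. Two details are assumptions rather than part of the definition: the neighbourhood is taken to contain the centre pixel plus all offsets whose Euclidean length is strictly less than $r$ (so that any $r \leq 1$ reduces to the mean absolute difference), and neighbourhoods are truncated at the image border, which is implemented here by edge replication.

```python
import numpy as np


def shuffle_distance(image_a, image_b, r=1.5):
    """Mean, over pixels/voxels p of image_a, of the minimum absolute
    difference between image_a[p] and image_b[q] for q in a neighbourhood
    of radius r around p. Works for 2D or 3D arrays of equal shape."""
    a = np.asarray(image_a, dtype=float)
    b = np.asarray(image_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")

    # Integer offsets inside the neighbourhood (assumed: length < r,
    # with the centre always included).
    reach = int(np.floor(r))
    axes = [np.arange(-reach, reach + 1)] * a.ndim
    grids = np.meshgrid(*axes, indexing="ij")
    offsets = np.stack([g.ravel() for g in grids], axis=1)
    lengths = np.sqrt((offsets ** 2).sum(axis=1))
    offsets = offsets[(lengths < r) | (lengths == 0)]

    # Edge replication only repeats values that already lie inside the
    # truncated neighbourhood, so it is equivalent to clipping at the border.
    padded = np.pad(b, reach, mode="edge")

    best = np.full(a.shape, np.inf)
    for offset in offsets:
        window = tuple(
            slice(reach + int(o), reach + int(o) + n)
            for o, n in zip(offset, a.shape)
        )
        # Keep the smallest absolute difference seen so far at each pixel.
        best = np.minimum(best, np.abs(a - padded[window]))
    return float(best.mean())
```

Note that the measure is not symmetric in its two arguments, since the neighbourhood search is performed in the second image only. For example, a single bright pixel displaced by one position contributes to the distance when $r \leq 1$ but contributes nothing at $r = 1.5$, where the displaced pixel falls inside the neighbourhood.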

Figure 2: The calculation of a shuffle difference image

Figure 3: Shuffle distance evaluation. Left: one image. Right: another image. Centre, from left to right: images showing contributions to shuffle distance, for $r = 0$ (abs. diff.), $1.5$, $2.1$ and $3.7$ respectively.

Figure 4: From Left: Specificity, Generalisation & Tanimoto overlap as a function of registration perturbation.

Roy Schestowitz 2006-02-08