Method

The first of the proposed methods for assessing registration quality uses a generalisation of Tanimoto's spatial overlap measure [1]. We start with a manual mark-up of each image, providing an anatomical/tissue label for each voxel, and measure the overlap of corresponding labels following registration. Each label is represented using a binary image, but after warping and interpolation into a common reference frame, based on the results of NRR, we obtain a set of fuzzy label images. These are combined in a generalised overlap score [5]:

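$\displaystyle \mathcal{O} = \frac{ \sum\limits_{\mbox{\small pairs},k}\: \sum\limits_{\mbox{\small labels},l}\alpha_{l} \sum\limits_{\mbox{\small voxels},i} MIN(A_{kli},B_{kli}) }{ \sum\limits_{\mbox{\small pairs},k}\: \sum\limits_{\mbox{\small labels},l}\alpha_{l} \sum\limits_{\mbox{\small voxels},i} MAX(A_{kli},B_{kli}) }$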

(1)

where $i$ indexes voxels in the registered images, $l$ indexes the label and $k$ indexes image pairs. $A_{kli}$ and $B_{kli}$ represent voxel label values in a pair of registered images and are in the range [0, 1]. The $MIN$ and $MAX$ operators are the standard results for the intersection and union of fuzzy sets. The generalised overlap measures the consistency with which each set of labels partitions the image volume. The parameter $\alpha_{l}$ sets the relative weighting of the different labels. With $\alpha_{l}=1$, label contributions are implicitly volume weighted with respect to one another. We have also considered three other choices: $\alpha_{l}$ equal to the inverse label volume (which weights all labels equally), $\alpha_{l}$ equal to the inverse label volume squared (which gives labels of smaller volume a higher weighting) and $\alpha_{l}$ equal to a measure of label complexity (which we define arbitrarily as the mean absolute voxel intensity gradient in the label).
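As a rough illustration of Equation (1), the following Python sketch accumulates the weighted fuzzy intersection and union over pairs, labels and voxels. The function names, the flattened-list representation of a label image and the way label volume is estimated (mean fuzzy volume over all images) are assumptions made for this example; they are not taken from the original work.

```python
from typing import List, Sequence, Tuple

# One fuzzy label image, flattened to a list of voxel values in [0, 1].
LabelImage = Sequence[float]
# One registered image: a sequence of label images, indexed by label l.
LabelSet = Sequence[LabelImage]
# One registered image pair (A, B), indexed by pair k.
Pair = Tuple[LabelSet, LabelSet]


def generalised_overlap(pairs: Sequence[Pair], alpha: Sequence[float]) -> float:
    """Weighted fuzzy overlap summed over pairs k, labels l and voxels i."""
    numerator = 0.0
    denominator = 0.0
    for labels_a, labels_b in pairs:                          # pairs, k
        for weight, a, b in zip(alpha, labels_a, labels_b):   # labels, l
            for a_i, b_i in zip(a, b):                        # voxels, i
                numerator += weight * min(a_i, b_i)           # fuzzy intersection
                denominator += weight * max(a_i, b_i)         # fuzzy union
    return numerator / denominator if denominator > 0 else 0.0


def inverse_volume_weights(pairs: Sequence[Pair], power: int = 1) -> List[float]:
    """alpha_l = 1 / V_l**power, where V_l is the mean fuzzy volume of label l.

    Assumption: the text does not say how label volume is measured, so the
    mean over all images in all pairs is used here.
    """
    n_labels = len(pairs[0][0])
    totals = [0.0] * n_labels
    for labels_a, labels_b in pairs:
        for l in range(n_labels):
            totals[l] += sum(labels_a[l]) + sum(labels_b[l])
    n_images = 2 * len(pairs)
    return [(n_images / t) ** power if t > 0 else 0.0 for t in totals]


# Toy data: one pair, two labels, four voxels per label image.
image_a = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
image_b = [[1.0, 0.5, 0.0, 0.0], [0.0, 0.5, 1.0, 1.0]]
toy_pairs = [(image_a, image_b)]

volume_weighted = generalised_overlap(toy_pairs, alpha=[1.0, 1.0])
equal_weighted = generalised_overlap(toy_pairs, inverse_volume_weights(toy_pairs))
```

With `alpha=[1.0, 1.0]` the larger label dominates the score; passing `power=2` to `inverse_volume_weights` would correspond to the inverse-volume-squared weighting described above.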

**Figure 1:** Training set and model in hyperspace

The second method assesses registration in terms of the quality of a generative statistical appearance model constructed from the registered images; for all the experiments reported here, this was an active appearance model (AAM) [3]. The idea is that a correct registration produces an anatomically meaningful dense correspondence between the images, resulting in a better appearance model of the anatomy. We define model quality using two measures: generalisation and specificity. Both are measures of overlap between the distribution of original images and a distribution of images sampled from the model, as illustrated in Figure 1. If we use the generative property of the model to synthesise a large set of images, $\{I_{\alpha}:\alpha=1,\ldots,m\}$, we can define Generalisation as:

$\displaystyle G=\frac{1}{n} \sum\limits_{i=1}^{n}\min_{\alpha}\vert I_{i}-I_{\alpha}\vert,$

(2)

where $\vert\cdot\vert$ is a measure of distance between images, $I_{i}$ is the $i^{th}$ of the $n$ training images, and $\min_{\alpha}$ is the minimum over $\alpha$, which indexes the set of synthetic images. That is, Generalisation is the average distance from each training image to its nearest neighbour in the synthetic image set. A good model exhibits a low value of $G$, indicating that the model can generate images that cover the full range of appearances present in the original image set. Similarly, we can define Specificity as:

$\displaystyle S=\frac{1}{m} \sum\limits_{\alpha=1}^{m}\min_{i}\vert I_{i}-I_{\alpha}\vert.$

(3)

That is, Specificity is the average distance from each synthetic image to its nearest neighbour in the original image set. A good model exhibits a low value of $S$, indicating that the model only generates synthetic images that are similar to those in the original image set. The uncertainty in estimating $G$ and $S$ can also be computed.
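Equations (2) and (3) are both nearest-neighbour averages and can be sketched in a few lines of Python. The function names and the flattened-list image representation are assumptions made for this illustration, and the mean absolute difference below is only a stand-in for whichever image distance $\vert\cdot\vert$ is chosen.

```python
from typing import Callable, Sequence

Image = Sequence[float]
Distance = Callable[[Image, Image], float]


def generalisation(training: Sequence[Image], synthetic: Sequence[Image],
                   distance: Distance) -> float:
    """G: mean distance from each training image to its nearest synthetic image."""
    return sum(
        min(distance(t, s) for s in synthetic) for t in training
    ) / len(training)


def specificity(training: Sequence[Image], synthetic: Sequence[Image],
                distance: Distance) -> float:
    """S: mean distance from each synthetic image to its nearest training image."""
    return sum(
        min(distance(t, s) for t in training) for s in synthetic
    ) / len(synthetic)


def mean_abs_diff(a: Image, b: Image) -> float:
    """Placeholder distance: mean absolute difference of corresponding pixels."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Toy data: two training images and three images "sampled from the model".
training_set = [[0.0, 0.0], [10.0, 10.0]]
synthetic_set = [[1.0, 1.0], [9.0, 9.0], [5.0, 5.0]]

G = generalisation(training_set, synthetic_set, mean_abs_diff)
S = specificity(training_set, synthetic_set, mean_abs_diff)
```

In this toy case every training image has a synthetic image close to it, so $G$ is small, while the synthetic image `[5.0, 5.0]` lies far from both training images and raises $S$. The distance is always called as `distance(training_image, synthetic_image)`, which matters if the chosen distance is not symmetric.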

In our experiments we have defined $\vert\cdot\vert$ as the shuffle distance between two images, as illustrated in Figure 2. Shuffle distance is the mean of the minimum absolute difference between each pixel/voxel in one image and the pixels/voxels in a shuffle neighbourhood of radius $r$ around the corresponding pixel/voxel in a second image. When $r \leq 1$, this is equivalent to the mean absolute difference between corresponding pixels/voxels, but for larger values of $r$ the distance increases more smoothly as the misalignment of structures in the two images increases. The effect on the pixel-by-pixel contribution to shuffle distance as $r$ is increased is illustrated in Figure 3.
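A minimal 2D sketch of the shuffle distance follows. Two details are assumptions made here rather than facts from the text: the neighbourhood is taken as the pixels strictly inside a disc of radius $r$ (plus the centre pixel), which is one choice that reproduces the stated behaviour for $r \leq 1$, and neighbourhoods are clipped at the image border. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Image2D = Sequence[Sequence[float]]


def neighbourhood_offsets(r: float) -> List[Tuple[int, int]]:
    """Offsets (dy, dx) strictly inside a disc of radius r, plus the centre.

    Assumption: the strict inequality makes r <= 1 reduce to the centre pixel.
    """
    reach = int(r)
    offsets = [(0, 0)]
    for dy in range(-reach, reach + 1):
        for dx in range(-reach, reach + 1):
            if (dy, dx) != (0, 0) and dy * dy + dx * dx < r * r:
                offsets.append((dy, dx))
    return offsets


def shuffle_difference_image(a: Image2D, b: Image2D, r: float) -> List[List[float]]:
    """Per-pixel minimum absolute difference between a[y][x] and b near (y, x)."""
    height, width = len(a), len(a[0])
    offsets = neighbourhood_offsets(r)
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Neighbours outside the image are skipped (border clipping);
            # the centre pixel is always in bounds, so min() is never empty.
            row.append(min(
                abs(a[y][x] - b[y + dy][x + dx])
                for dy, dx in offsets
                if 0 <= y + dy < height and 0 <= x + dx < width
            ))
        result.append(row)
    return result


def shuffle_distance(a: Image2D, b: Image2D, r: float) -> float:
    """Mean of the shuffle difference image."""
    diff = shuffle_difference_image(a, b, r)
    return sum(sum(row) for row in diff) / (len(diff) * len(diff[0]))


# Toy data: a single bright pixel, shifted one pixel to the right in image_b.
image_a = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
image_b = [[0, 0, 0], [0, 0, 9], [0, 0, 0]]

plain = shuffle_distance(image_a, image_b, 0)       # mean absolute difference
shuffled = shuffle_distance(image_a, image_b, 1.5)  # tolerant of the 1-pixel shift
```

Note that the measure as defined is not symmetric in its two arguments: the neighbourhood search is performed in the second image only.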

**Figure 2:** The calculation of a shuffle difference image

**Figure 3:** Shuffle distance evaluation: **Left:** one image, **Right:** another image, **Centre, from left to right:** images showing contributions to shuffle distance, for $r = 0$ (abs. diff.), $1.5$, $2.1$ & respectively.

**Figure 4:** **From Left:** Specificity, Generalisation & Tanimoto overlap as a function of registration perturbation.


Roy Schestowitz 2006-02-08