This shows you the differences between two versions of the page.
— |
mias-irc-2005-rev-4 [2014/05/31 17:36] (current) admin created |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | Assessing the Accuracy of Non-Rigid Registration | ||
+ | |||
+ | Non-rigid registration (NRR) of both pairs and groups of images has in | ||
+ | recent years increasingly been used as a basis for medical image analysis. | ||
+ | The problem is highly under-constrained and a host of algorithms that have | ||
+ | become available will, given a set of images to be registered, in general | ||
+ | produce different results. We present two methods for assessing the | ||
+ | performance of non-rigid registration algorithms, compare them on a | ||
+ | registration of a set of 38 MR brain images and show them to provide a | ||
+ | robust evaluation of registration success. | ||
+ | |||
+ | The first of the proposed methods assesses registration as the spatial | ||
+ | overlap, defined using Tanimoto's formulation [Ref], of corresponding | ||
+ | regions in the registered images. The correspondence is defined by labels of | ||
+ | distinct image regions (in this case brain tissue classes), produced by | ||
+ | manual mark-up of the original images (ground truth labels). A correctly | ||
+ | registered image set will exhibit high relative overlap between | ||
+ | corresponding brain structures in different images, whereas a poorly | ||
+ | registered set will exhibit low overlap. | ||
+ | |||
+ | |||
+ | The second method assesses registration as the quality of a generative | ||
+ | statistical appearance model constructed from the registered images. The | ||
+ | idea is that a correct registration produces a true dense correspondence | ||
+ | between the images, resulting in a better statistical appearance model of | ||
+ | the images. Registration is then evaluated through the specificity and | ||
+ | generalisation ability of the model, that is, the ability of the model to | ||
+ | i) generate realistic examples of the modelled entity and ii) represent | ||
+ | well both seen and unseen examples of the modelled class. In practice these | ||
+ | are evaluated by using the generative properties of the model to produce a | ||
+ | large number of synthetic examples (in this case brain images) that are | ||
+ | then compared to real examples in the original set using a pre-defined | ||
+ | image distance measure. The minimum distances of synthetic examples to | ||
+ | examples in the original set, and vice versa, give model specificity and | ||
+ | generalisation respectively. Image distance is measured as a mean shuffle | ||
+ | distance, that is, the mean over pixels of the minimum Euclidean distance | ||
+ | between a pixel in one image and a corresponding neighbourhood of pixels | ||
+ | in the other. | ||
+ | |||
+ | To test the validity of the proposed methods, the brain images were | ||
+ | annotated with six tissue classes, including grey matter, white matter and | ||
+ | CSF, which provided the ground truth for image correspondence. Initially, | ||
+ | the images were brought into alignment using an NRR algorithm based on MDL | ||
+ | optimisation [Ref us IPMI say]. A test set of different registrations was | ||
+ | then created by applying a random perturbation to each image in the | ||
+ | registered set using diffeomorphic clamped-plate splines. By choosing a | ||
+ | different perturbation seed for each image and gradually increasing the | ||
+ | magnitude of the perturbations, a series of image sets of progressively | ||
+ | worse spatial correspondence, and thus registration quality, was obtained. | ||
+ | By measuring the quality of the registration at each step, the proposed | ||
+ | registration assessment measures can be validated. | ||
+ | |||
+ | Overall, the above approach was applied 10 times using 10 different | ||
+ | perturbation seeds to ensure that both methods are consistent and the | ||
+ | results unbiased. Results of the proposed measures for increasing | ||
+ | registration perturbation are shown in Figure 1. Note that Generalisation | ||
+ | and Specificity, plotted for different shuffle neighbourhood radii, are in | ||
+ | error form, i.e. they increase with decreasing performance. All metrics | ||
+ | are generally well-behaved and show a monotonic decrease in registration | ||
+ | performance with increasing perturbation. These results directly validate | ||
+ | the model-based metrics, which are shown to be in agreement with the ground | ||
+ | truth embodied in the region-overlap-based measure. | ||
+ | |||
+ | <Graphics file: ./Graphics/1.eps> | ||
+ | |||
+ | Figure 1: Behaviour of the proposed metrics with increasing registration | ||
+ | perturbation: a) Generalisation, b) Specificity and c) Tanimoto overlap | ||
+ | |||
+ | Finally, in order to obtain a quantitative comparison of the proposed | ||
+ | methods, we explore the sensitivity of the proposed metrics: the slighter | ||
+ | the difference that can be detected reliably, the more sensitive the | ||
+ | method. Sensitivity is in this case defined as the rate of change in the | ||
+ | measure for a given perturbation range, normalised by the average | ||
+ | uncertainty in the measurement over that range: | ||
+ | |||
+ | where X is... (TODO). Sensitivity is evaluated for all three of the proposed | ||
+ | metrics and shown in Figure 2, with error bars based on both an | ||
+ | inter-instantiation error and a measure-specific error. The Specificity | ||
+ | measure is the most sensitive for any radius of the shuffle distance, | ||
+ | followed by the overlap metric and Generalisation, with shuffle radii of | ||
+ | 1.5 and 2.1 (equivalent to 3x3 and 5x5 neighbourhoods) giving optimal | ||
+ | sensitivity. | ||
+ | |||
+ | Figure 2: Sensitivity of the proposed metrics | ||
+ | |||
+ | The results shown in this abstract indicate that registration performance | ||
+ | can be evaluated reliably both when ground truth information is available | ||
+ | and when it is not. In particular, the methods based on generative | ||
+ | statistical model evaluation are shown to be in agreement with the ground | ||
+ | truth expressed through the true image region overlap metric based on the | ||
+ | Tanimoto formulation. The proposed metrics are also shown to have | ||
+ | sufficient sensitivity to detect very subtle changes in registration | ||
+ | performance, on the level of perturbations measured in fractions of a | ||
+ | pixel. | ||
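
To make the measures described above concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes 2D NumPy arrays, the simple binary-set form of the Tanimoto overlap for a single label (the abstract's exact formulation is not given), a circular shuffle neighbourhood defined by a radius, and averaging of the per-example minimum distances to obtain Specificity and Generalisation. All function and parameter names are hypothetical.

```python
import numpy as np


def tanimoto_overlap(labels_a, labels_b, label):
    """Binary Tanimoto overlap |A and B| / |A or B| for one label value."""
    a = np.asarray(labels_a) == label
    b = np.asarray(labels_b) == label
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # assumption: a label absent from both images counts as full overlap
    return np.logical_and(a, b).sum() / union


def mean_shuffle_distance(img_a, img_b, radius=1.5):
    """Mean over pixels of img_a of the smallest absolute intensity difference
    to any pixel of img_b within `radius` of the same location."""
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    r = int(np.floor(radius))
    h, w = a.shape
    # Edge padding only repeats pixels that already lie inside the clipped neighbourhood.
    padded = np.pad(b, r, mode="edge")
    best = np.full(a.shape, np.inf)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy * dy + dx * dx > radius * radius:
                continue  # offset falls outside the circular neighbourhood
            shifted = padded[r + dy:r + dy + h, r + dx:r + dx + w]
            best = np.minimum(best, np.abs(a - shifted))
    return best.mean()


def specificity_and_generalisation(real, synthetic, radius=1.5):
    """Both values are in error form: larger means worse.

    Specificity: mean distance from each synthetic image to its nearest real image.
    Generalisation: mean distance from each real image to its nearest synthetic image.
    (Averaging over examples is an assumption of this sketch.)
    """
    spec = np.mean([min(mean_shuffle_distance(s, t, radius) for t in real)
                    for s in synthetic])
    gen = np.mean([min(mean_shuffle_distance(t, s, radius) for s in synthetic)
                   for t in real])
    return spec, gen
```

With `radius=0` the shuffle distance reduces to the mean absolute pixel difference; larger radii tolerate small misalignments, which is why the measures are reported for several neighbourhood sizes.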