!!!Assessing the Accuracy of Non-Rigid Registration With and Without Ground Truth
  
R. S. Schestowitz^{1}, W. R. Crum^{2}, V. S. Petrovic^{1}, C. J. Twining^{1}, T. F. Cootes^{1} and C. J. Taylor^{1}

^{1}Imaging Science and Biomedical Engineering, University of Manchester, Stopford Building, Oxford Road, Manchester M13 9PT, United Kingdom

^{2}Centre for Medical Image Computing, Department of Computer Science, University College London, Gower Street, London WC1E 6BT, United Kingdom
 +
 +
 +
 +
 +
A diverse collection of methods exists for the problem of non-rigid registration, whereby a set of images is to be aligned. We perceive a deficiency, however, in the ways such registrations are validated or even evaluated. Here we present two methods for evaluating non-rigid registration. One requires ground-truth solutions to be provided a priori; the other does not. We present results confirming that both methods are valid, and then calculate their sensitivities. We find that the method which requires ground-truth solutions is less sensitive than the method which needs nothing but the raw images and the corresponding deformation fields.
 +
The aim of registration is to transform images until corresponding structures across them overlap. Registration is an optimisation problem in which the degree of overlap, as measured by some similarity metric, is to be maximised; overlap is established by transforming the images. The transformation model and the similarity measure together form what we call the "objective function", which fully describes the approach a registration algorithm takes.
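To make the idea concrete, the following is a minimal sketch of such an objective function under simplifying assumptions: a pair-wise setting, a translation-only transformation model, and a mean-squared-difference similarity measure. The names and design choices are illustrative and are not taken from the method described in this paper.

<code python>
import numpy as np
from scipy.ndimage import shift


def objective(params, moving, fixed):
    """Dissimilarity (lower is better) of the transformed moving image vs. the fixed image."""
    dy, dx = params
    warped = shift(moving, (dy, dx), order=1, mode="nearest")  # the transformation model
    return float(np.mean((warped - fixed) ** 2))               # the similarity measure


# Registration then reduces to optimising this function with a generic optimiser, e.g.
# scipy.optimize.minimize(objective, x0=[0.0, 0.0], args=(moving, fixed), method="Powell").
</code>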
 +
There are further factors that distinguish one registration approach from another. Most notably, there is a divide over whether images should be handled in pairs (pair-wise) or as a whole group registered simultaneously (group-wise). There therefore needs to be an unbiased method for assessing the performance of registration algorithms. Such a method must first be validated by careful experimentation that incorporates the notion of correct solutions.
 +
The first of the two methods relies on the existence of ground-truth data such as the boundaries of image structures or the locations of distinguishable points. Having registered an image set, the method measures the overlap between the annotated structures, which indicates how good the registration was.
 +
Our second method assesses registration without ground truth of any form. The approach involves automatic construction of an appearance model from the registered data, followed by evaluation of the quality of that model using images synthesised from it. The quality of a registration is tightly related to the quality of the resulting model, and the two tasks, model construction and image registration, are essentially the same: both involve the identification of corresponding points, known as landmarks in the context of model-building. Expressed differently, a registration produces a dense set of correspondences, and an appearance model requires nothing but the images and those correspondences in order to be built.
 +
To test the validity of both methods, we assembled a set of 38 2-D MR images of the brain. Each image was carefully annotated to identify different anatomical compartments, which can be regarded as simplified labels that faithfully describe brain structure. Our first method of assessment uses the Tanimoto overlap measure to calculate the degree to which corresponding labels overlap across the registered image set. In that respect, it exploits expert-identified ground truth to reason about registration quality.
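As an illustration of the ground-truth-based measure, the sketch below computes the Tanimoto (Jaccard) overlap between corresponding labels, averaged over all label classes and all pairs of registered label maps; the integer label-map representation and the function names are assumptions made for the example.

<code python>
import numpy as np
from itertools import combinations


def tanimoto(mask_a, mask_b):
    """Tanimoto overlap of two binary masks: |A intersect B| / |A union B|."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return intersection / union if union else 1.0


def mean_label_overlap(label_maps, labels):
    """Average per-label Tanimoto overlap over every pair of registered label maps."""
    scores = [tanimoto(la == lab, lb == lab)
              for la, lb in combinations(label_maps, 2)
              for lab in labels]
    return float(np.mean(scores))
</code>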
 +
The second method takes an entirely different approach. It takes the output of a registration algorithm, in which correspondences have been established, and builds an appearance model from the images and their correspondences. From that model, many synthetic brain images are generated. Vectorising these images allows us to embed them in a high-dimensional space, where we can compare the cloud formed by the synthetic images with the cloud formed by the original image set -- the set from which the model was built. Computing the overlap between these clouds gives insight into the quality of the registration; simply put, it is a model-fit evaluation paradigm. The better the registration, the greater the overlap between the clouds.
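A rough sketch of this step is given below, with the appearance model simplified to a linear (PCA) model of the vectorised, registered images and synthetic examples drawn by Gaussian sampling of its parameters; the actual appearance model and sampling scheme used in the paper may differ.

<code python>
import numpy as np


def build_model(images, n_modes=10):
    """Fit a linear model to registered, vectorised images of shape (n_images, n_pixels)."""
    mean = images.mean(axis=0)
    _, s, vt = np.linalg.svd(images - mean, full_matrices=False)
    variances = (s ** 2) / (len(images) - 1)        # variance captured by each mode
    return mean, vt[:n_modes], variances[:n_modes]


def synthesise(mean, modes, variances, n_samples, rng=None):
    """Generate synthetic images: sample model parameters, then project back to image space."""
    rng = np.random.default_rng(0) if rng is None else rng
    b = rng.standard_normal((n_samples, len(variances))) * np.sqrt(variances)
    return mean + b @ modes
</code>

The two clouds are then the rows of the original image matrix and the rows returned by synthesise(), and their overlap is quantified by the measures described next.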
 +
To compute the overlap between two clouds of data, we have devised measures that we refer to as Specificity and Generalisability. The former indicates how well the model fits the data from which it was built, whereas the latter indicates how well that data fits the derived model; it is a reciprocal relationship that 'locks' the data to its model and vice versa. We calculate Specificity and Generalisability by measuring distances in the high-dimensional image space. Since we seek a measure that is tolerant to slight differences, we use the shuffle distance, and compare it against the Euclidean distance.
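The sketch below gives one way to realise these measures, assuming the synthetic and original images are available as 2-D arrays: the shuffle distance compares each pixel with the best-matching pixel in a small neighbourhood of the other image (the wrap-around at borders is an artefact of the simplified implementation), and Specificity and Generalisability are nearest-neighbour distances between the two clouds. The neighbourhood radius and function names are assumptions; a radius of zero reduces the shuffle distance to a plain point-wise comparison, playing the role of the Euclidean baseline.

<code python>
import numpy as np


def shuffle_distance(img_a, img_b, radius=1):
    """Mean over pixels of the smallest |a - b| within a (2*radius + 1)^2 neighbourhood of b."""
    best = np.full(img_a.shape, np.inf)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(np.roll(img_b, dy, axis=0), dx, axis=1)
            best = np.minimum(best, np.abs(img_a - shifted))
    return float(best.mean())


def specificity(synthetic, originals, dist=shuffle_distance):
    """Mean distance from each synthetic image to its nearest original image."""
    return float(np.mean([min(dist(s, o) for o in originals) for s in synthetic]))


def generalisability(originals, synthetic, dist=shuffle_distance):
    """Mean distance from each original image to its nearest synthetic image."""
    return float(np.mean([min(dist(o, s) for s in synthetic) for o in originals]))
</code>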
 +
Our assessment framework, with which we test both methods, operates on non-rigid registration, where image transformations involve many degrees of freedom. To generate data over which our hypotheses can be tested systematically, we perturb the brain data using clamped-plate splines. The correspondences among the original brain images are taken to be perfect, so perturbation can only ever degrade them. We then wish to show that, as the degree of perturbation increases, the measures produced by our assessment methods worsen correspondingly.
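The sketch below illustrates the perturbation step; the paper uses clamped-plate-spline warps, but here a Gaussian-smoothed random displacement field stands in for them, with the magnitude argument playing the role of the perturbation level.

<code python>
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


def perturb(image, magnitude, smoothness=8.0, rng=None):
    """Warp an image with a smooth random displacement field (a stand-in for spline warps)."""
    rng = np.random.default_rng() if rng is None else rng
    dy = gaussian_filter(rng.standard_normal(image.shape), smoothness)
    dx = gaussian_filter(rng.standard_normal(image.shape), smoothness)
    # Scale each field so that the mean absolute displacement per axis equals `magnitude`.
    dy *= magnitude / (np.abs(dy).mean() + 1e-12)
    dx *= magnitude / (np.abs(dx).mean() + 1e-12)
    yy, xx = np.meshgrid(np.arange(image.shape[0]), np.arange(image.shape[1]), indexing="ij")
    return map_coordinates(image, [yy + dy, xx + dx], order=1, mode="nearest")
</code>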
 +
In an extensive batch of experiments we perturbed the dataset at progressively increasing levels, producing well-understood misregistrations of the data. We repeated these experiments 10 times to demonstrate that both approaches to assessment are consistent and that the results are unbiased. Plotting the overlap measure against the extent of perturbation, we see an approximately linear decrease in overlap (Figure X). This means that, as the correct correspondence is eroded, the overlap-based measure detects the degradation, and its response is well-behaved, thus meaningful and reliable.
 +
<Graphics file: ./Graphics/1.eps>
<Graphics file: ./Graphics/2.eps>

Figures X & Y: assessment measures plotted against the level of perturbation (captions TODO).
 +
We then repeat the assessment, this time using the method which does not require ground truth. We observe very similar behaviour (Figure Y), which is evidence that this method too is a powerful and reliable means of assessing the degree of misregistration, or conversely the quality of registration.
 +
As a last step, we compare the two methods, identifying sensitivity as the most important factor. Sensitivity reflects our ability to confidently distinguish a good registration from a worse one: the slighter the difference that can be correctly detected, the more sensitive the method. To calculate sensitivity, we compute the amount of change in terms of mean pixel deformation from the correct solution, and relate it to the change in the assessor's value, be it overlap, Specificity, or Generalisability. We also stress the need to take account of the error bars, since there is both an inter-experiment error and a measure-specific error; the two must be composed carefully. The derivation of sensitivity can be expressed as follows:
 +
placeholder

where X is something... (TODO)
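The exact formula is left as a placeholder above. Purely as an illustration of the description in the text, the sketch below reads sensitivity as the change in the assessment measure per unit change in mean pixel deformation, normalised by the composed error of the measure; this is an assumption, not the definition used in the paper.

<code python>
import numpy as np


def sensitivity(deformations, measure_values, measure_errors):
    """Illustrative sensitivity: slope of measure vs. mean pixel deformation over composed error.

    deformations    -- mean pixel deformation at each perturbation level
    measure_values  -- mean assessor value (overlap, Specificity, ...) at each level
    measure_errors  -- per-level errors (inter-experiment and measure-specific, already composed)
    """
    slope = np.polyfit(deformations, measure_values, 1)[0]   # change in measure per unit deformation
    combined_error = np.sqrt(np.mean(np.square(measure_errors)))
    return float(abs(slope) / combined_error)
</code>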
 +
<Graphics file: ./Graphics/3.eps>

Figure Z: sensitivity of the two assessment methods against shuffle-distance neighbourhood size (caption TODO).
 +
Figure Z suggests that, for essentially any choice of shuffle-distance neighbourhood, the method which does not require ground truth is more sensitive than the method which depends on it. Looking closely at the trends of the curves, we observe that they approximately overlap, which implies that the two methods are very closely correlated.
 +
In summary, we have presented two valid methods for assessing non-rigid registration. The methods are correlated in practice, but the principles they build upon, and their prerequisites, are quite different. Registration can be evaluated with or without ground-truth annotation, and the measures behave consistently across distinct data, are well-behaved, and are sensitive. Both methods have been successfully applied to the assessment of non-rigid registration algorithms, and both led to the expected conclusions; that aspect of the work, however, is beyond the scope of this paper.