Figure compares the performance of the three registration algorithms outlined in Section . All the measures tested in the previous section were computed, but we show results for only the most sensitive model-based method. Figures (a) and (c) show Specificity calculated using a shuffle radius of 2.1, for different numbers of modes used to build the generative model. Figure (b) shows Generalised Overlap using different weightings.

The results shown in Figure (a) suggest that the MDL Groupwise approach gives the best registration result for the MGH Dataset, followed by Pairwise and Congealing in order of decreasing performance, irrespective of the number of modes. Inspection of the error bars shows that these differences are statistically significant.

The results for Generalised Overlap, shown in Figure (b), are more complicated: the different NRR algorithms are ranked differently under different weightings, though inspection of the error bars shows that many of the differences are not significant. Overall, the same general pattern emerges as for Specificity, with the Groupwise method generally best (statistically significantly so in two cases), but with no significant difference between Pairwise and Congealing in most cases. The results for inverse volume weighting are inconsistent with those obtained using the other weighting schemes, but generally lack significance. Volume weighting gives the best separation between the different variants, and places the three methods in the same order as Specificity. Overall, this supports the interpretation that Specificity gives results that are generally equivalent to those obtained using Generalised Overlap, but with higher sensitivity.

Finally, the Specificity results shown in Figure (c) for the Dementia Dataset place the three methods in the same order.
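
To make the role of the weighting schemes concrete, the following is a minimal sketch of a label-weighted overlap score. It is not the implementation used to produce the figures. It assumes a Tanimoto-style formulation (weighted intersection over weighted union, pooled over labels), flat lists of integer labels with 0 as background, and two illustrative weightings whose names and definitions are assumptions here: "volume", where every voxel counts equally so large structures dominate, and "inverse_volume", where each label is scaled by the reciprocal of its mean volume. The function name `generalised_overlap` is hypothetical.

```python
from collections import Counter


def generalised_overlap(labels_a, labels_b, weighting="volume"):
    """Tanimoto-style overlap pooled over labels (illustrative sketch).

    labels_a, labels_b: equal-length sequences of integer labels,
    one entry per voxel; label 0 is assumed to be background.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label images must have the same number of voxels")

    vol_a = Counter(labels_a)  # voxels per label in image A
    vol_b = Counter(labels_b)  # voxels per label in image B
    inter = Counter(a for a, b in zip(labels_a, labels_b) if a == b)

    numerator = 0.0
    denominator = 0.0
    for label in set(vol_a) | set(vol_b):
        if label == 0:
            continue  # skip background (an assumption of this sketch)
        union = vol_a[label] + vol_b[label] - inter[label]
        if weighting == "volume":
            alpha = 1.0  # each voxel counts equally
        elif weighting == "inverse_volume":
            alpha = 2.0 / (vol_a[label] + vol_b[label])  # 1 / mean volume
        else:
            raise ValueError("unknown weighting: " + weighting)
        numerator += alpha * inter[label]
        denominator += alpha * union

    return numerator / denominator if denominator else 0.0


# Toy example: label 1 is large and well aligned, label 2 is small and less so.
image_a = [1, 1, 1, 1, 2, 0]
image_b = [1, 1, 1, 0, 2, 2]
score_volume = generalised_overlap(image_a, image_b, "volume")
score_inverse = generalised_overlap(image_a, image_b, "inverse_volume")
```

In this toy case the inverse volume score is lower than the volume-weighted score, because the small, poorly aligned label contributes more once each label is normalised by its size. This is one way in which different weightings can rank the same registrations differently.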