The figure shows the results of applying the sensitivity analysis to the validation study. These results demonstrate that Specificity is more sensitive (that is, able to detect smaller misregistrations) than the overlap-based approach, which is in turn more sensitive than Generalisation. The error bars indicate that these differences are statistically significant. Maximum sensitivity is achieved with a shuffle radius of 100#100 or 101#101. The most sensitive generalised overlap measure is obtained using label-complexity weighting.