Residuals

On their own, simple surface-to-surface metrics appear to be weak classifiers. They can still help with GMDS similarity values that fall near the margins (i.e. close to the threshold of ambiguity), where decisions can be made more reliable by bringing in additional data. Current work refines methods for detecting and correcting GMDS/stress scores that are low despite inherent differences that might be non-isometric. In essence, this combines geodesic metrics on the surface with Euclidean ones, ruling out what would otherwise be false positives.

**Figure:** Examples of shape pair residuals and the corresponding ROC curve

**Figure:** Residual difference and the problem of localised high signal (which makes this a weak similarity measure)

**Figure:** ROC curve obtained by using the residuals of just a particular image region (nose and eyes)

The previous results demonstrated the main weakness of purely Euclidean measures that use the residual: even a small misalignment can dominate the difference. The challenge since then has been to identify a measure based on Euclidean distances that is robust to this type of variation and can complement the purely geodesic measure (notably GMDS). In this first batch of experiments, a volumetric-type Euclidean distance is measured, namely the gap between the two surfaces when placed on top of each other. To demonstrate how large the variation is, even between images of the same individual, a figure was produced (see Figure $[*]$), showing areas of very high contrast, e.g. at the sides of faces. The ROC curve in Figure $[*]$ shows the problem. By aligning around the nose and then considering just the nose area, somewhat better results based on Euclidean properties may be obtained, although they are still rather poor (as shown in Figure $[*]$ and Figure $[*]$). Another Euclidean measure worth exploring might be the distances between particular points of interest, e.g. eye corners and nose tip. The goal is to eliminate cases where two images are identified as belonging to the same person based on geodesic properties alone, even though other criteria show that this is not always reliable.
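The effect of localised high signal on a whole-image residual, and the benefit of restricting the measure to a window around the nose, can be illustrated with a minimal sketch. It assumes the two surfaces are available as aligned depth images on the same pixel grid; the function name `region_residual`, the square window and its parameters are hypothetical and not taken from the experiments described here.

```python
import numpy as np

def region_residual(depth_a, depth_b, centre, half_size):
    """Residual between two aligned depth images inside a square window.

    Hypothetical helper: `centre` is a (row, col) landmark such as the nose
    tip, `half_size` is the window half-width in pixels. The window is
    clipped to the image bounds.
    Returns (mean absolute difference, sum of squared differences).
    """
    rows, cols = depth_a.shape
    r, c = centre
    r0, r1 = max(r - half_size, 0), min(r + half_size + 1, rows)
    c0, c1 = max(c - half_size, 0), min(c + half_size + 1, cols)
    diff = depth_a[r0:r1, c0:c1].astype(float) - depth_b[r0:r1, c0:c1].astype(float)
    return float(np.mean(np.abs(diff))), float(np.sum(diff ** 2))

# Toy data: identical surfaces except for one large discrepancy near the
# image border (standing in for the high-contrast sides of the face).
a = np.zeros((10, 10))
b = a.copy()
b[0, 0] = 100.0

whole_face = region_residual(a, b, centre=(5, 5), half_size=5)
nose_only = region_residual(a, b, centre=(5, 5), half_size=2)
```

In this toy case the single border pixel accounts for the entire whole-image residual, while the window around the centre is unaffected. Real data will of course be less clear-cut.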

To make the measure more robust to movement around the nose tip, the surfaces are shifted by a controlled amount in X and Y in search of an optimal match $[*]$. A couple of good matches are shown in $[*]$.
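One way to realise such a shift search is an exhaustive scan over small integer offsets, keeping the offset with the lowest mean squared difference over the overlapping region. The sketch below is an illustration under assumptions: the function name `best_shift`, the integer-pixel offsets, the search radius and the normalisation by overlap size are all hypothetical choices and may differ from the procedure actually used.

```python
import numpy as np

def best_shift(reference, moving, max_shift=3):
    """Exhaustive search for the integer (dy, dx) shift of `moving` that best
    matches `reference`, scored by mean squared difference over the overlap.

    Hypothetical helper; assumes both images have the same shape and that
    max_shift is smaller than either image dimension.
    Returns (best_score, dy, dx).
    """
    h, w = reference.shape
    best = (np.inf, 0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Compare reference[y + dy, x + dx] with moving[y, x] on the overlap.
            ref = reference[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)]
            mov = moving[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
            # The mean (not the sum) keeps scores comparable across overlap sizes.
            score = float(np.mean((ref - mov) ** 2))
            if score < best[0]:
                best = (score, dy, dx)
    return best

# Toy check: displace a random "surface" by a known offset and recover it.
rng = np.random.default_rng(0)
surface = rng.random((20, 20))
displaced = np.roll(surface, shift=(-2, 1), axis=(0, 1))
score, dy, dx = best_shift(surface, displaced, max_shift=3)
```

Normalising by the overlap area matters here: a plain sum of squared differences would favour large shifts simply because fewer pixels overlap.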

**Figure:** Examples of pixel differences for pairs of the same people

**Figure:** ROC curve corresponding to pixel differences for the whole middle section of the face

**Figure:** ROC curve corresponding to pixel differences for the nose area alone

**Figure:** ROC curve corresponding to sum of squared differences for the nose area alone

**Figure:** Example of 2 pairs from which the difference image is produced (shown at the top)

For recognition based on surface sum-of-squared-differences, the best recognition rate achieved so far is around 80%, which gives it far less discriminative power than GMDS (as expected). Fusing the two, e.g. by using the weaker one as a mere regulariser, requires careful thought because one can detract from the usefulness of the other. One idea tested earlier is to invoke a more complex classifier only when classification is on the margin, i.e. when GMDS is unable to comfortably discern real pairs from false ones. For the small test set used so far this can yield perfect recognition, but further testing is required to show that the result generalises.

**Figure:** Top images show the sum-of-squared-differences of the first 3 true pairs, with the mere difference shown at the bottom

**Figure:** Examples of the first 12 false pairs (sum-of-squared-differences)

**Figure:** ROC curve generated by a sum-of-squared-differences-based similarity measure

By applying a similarity test that falls back on Euclidean measures when GMDS is unable to make a clear distinction (score between 3 and 4), the algorithm is now able to classify all image pairs (72 images in total) correctly. Increasing the number of pairs might expose new issues; should any arise, a workaround can be designed. Claiming 100% recognition on the basis of just 72 images is not meaningful, so I will increase the number of images.
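The fallback rule can be sketched as a small decision function. Only the ambiguous band (GMDS score between 3 and 4) comes from the text; everything else is an assumption made for illustration: that a lower GMDS score means greater similarity, that the Euclidean measure is a dissimilarity such as a sum of squared differences, and the names `classify_pair` and `euclidean_threshold`.

```python
def classify_pair(gmds_score, euclidean_score, euclidean_threshold,
                  gmds_low=3.0, gmds_high=4.0):
    """Return True if the pair is judged to be the same person.

    Assumptions (not confirmed by the source): lower scores mean greater
    similarity for both measures, and `euclidean_threshold` is a
    hypothetical cut-off that would have to be tuned on data.
    """
    if gmds_score < gmds_low:
        return True   # GMDS is confident: same person
    if gmds_score > gmds_high:
        return False  # GMDS is confident: different people
    # Ambiguous band: defer to the Euclidean (residual-based) measure.
    return euclidean_score <= euclidean_threshold

# The Euclidean score only matters inside the ambiguous band.
clear_match = classify_pair(2.5, euclidean_score=999.0, euclidean_threshold=10.0)
clear_reject = classify_pair(4.5, euclidean_score=0.0, euclidean_threshold=10.0)
margin_match = classify_pair(3.5, euclidean_score=5.0, euclidean_threshold=10.0)
margin_reject = classify_pair(3.5, euclidean_score=20.0, euclidean_threshold=10.0)
```

Keeping the weaker measure out of the decision whenever GMDS is confident limits how much it can degrade the stronger one, which is the concern raised above about fusing the two.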