Code was customised and integrated into the main framework with the aim of embedding it in a dimensionality reduction algorithm of a different type, alongside signals of a nature other than geometric (and geometry-invariant).
If done improperly, or if applied to faces of different people (as the figures below show), the resultant correspondence is demonstrably poor. The data dealt with in this case is illustrated in Figure . Figure shows this with , and Figure shows the same for . Conversely, as seen in Figure , even with the found correspondence is considerably better when handling images acquired of the same person.
[Figure: positive pairs/matches]
Once the fatal exceptions in the pipeline are resolved, it should be trivial to utilise the generalised MDS, which greatly simplifies experiments performed with MDS (still part of the program, at least as an option to be explored or compared against later).
Further debugging has produced a rather reliable algorithm that assembles GMDS-related metrics (not strictly a metric per se) from a large group of images, with or without smoothing and some other parameters that help make the process more robust (e.g. in case of misalignment). While it is possible to derive a similarity measure from raw values without a training process (involving a model), for localised information to bear meaning there ought to be a template, or a more high-level abstraction/model, that deforms itself to targets or specifies a quality of match. The order of points, however, needs to be consistent with the anatomy and consistent across examples; otherwise no consistent markup can be worked on and the discriminant is accordingly weak. Examples of matching between dissimilar faces from different people can be seen in figures , , , and .
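As a rough illustration of the geodesic machinery involved, the sketch below approximates pairwise geodesic distances on a depth-map patch using Dijkstra shortest paths over an 8-connected grid graph. This is only a stand-in for the fast-marching computation used in practice; the function name and the toy flat patch are ours, not the project's code.

```python
# Sketch: pairwise geodesic distances on a depth-map grid, approximated by
# shortest paths on an 8-connected graph (a crude stand-in for fast marching).
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import dijkstra

def grid_geodesics(depth, step=1.0):
    """Approximate geodesic distance matrix for an H x W depth map."""
    h, w = depth.shape
    g = lil_matrix((h * w, h * w))
    for y in range(h):
        for x in range(w):
            i = y * w + x
            for dy, dx in [(0, 1), (1, 0), (1, 1), (1, -1)]:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    j = ny * w + nx
                    planar = step * np.hypot(dy, dx)   # in-plane step length
                    dz = depth[ny, nx] - depth[y, x]   # height change
                    g[i, j] = g[j, i] = np.hypot(planar, dz)
    return dijkstra(g.tocsr(), directed=False)

D = grid_geodesics(np.zeros((3, 3)))   # flat patch: geodesic == Euclidean path
```

On a flat patch the result reduces to shortest polyline distances, which is a useful sanity check before running on real depth maps.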
Using GMDS, the recognition performance reached at this stage is around 90% (see Figure ), but there are many improvements left to be made, either in pre-processing or in the suiting of GMDS to the task at hand. The main barrier was the removal of some bugs relating to triangulation, as summarised very briefly in Figure , which does not delve into the details, as they are of little interest.
What GMDS does right now is basic and is not yet incorporated with (G)PCA, which would require consistent ordering of points. This is just a set of baseline results to serve as a sanity check.
Regarding (G)MDS versus (G)PCA, it would be reasonable to say that the right mix should probably be some hybrid, where some sort of GMDS is used for alignment (as we do right now) and then PCA for efficient recognition. We are not yet sure where the line between the two should be drawn, but a workable division clearly exists. Figure shows the results from a still-buggy algorithm.
We changed the sampling density from 10x10 to 5x5 grids. Preliminary results on 30 images are as follows:
'Predictivity' of negative test (probability that the subjects are indeed different when a non-match is declared): 92.9%
95% confidence interval: 79.4% - 100.0%
Negative Likelihood Ratio: 0.1
Accuracy or Potency: 90.0%
Mis-classification Rate: 10.0%
Error odds ratio: 2.1538
Identification odds ratio: 91.0000
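For reference, the standard definitions behind these figures can be written down explicitly. The confusion-matrix counts below are a hypothetical reconstruction (an assumption on our part), chosen so that the usual formulas reproduce the reported values for 30 positive and 30 negative pairs.

```python
# Hypothetical confusion matrix consistent with the reported metrics;
# the counts themselves are an assumption, not taken from the experiment log.
TP, FN = 28, 2   # same-person pairs: accepted / missed
TN, FP = 26, 4   # different-person pairs: rejected / falsely accepted

sensitivity = TP / (TP + FN)
specificity = TN / (TN + FP)
npv = TN / (TN + FN)                        # 'predictivity' of negative test
nlr = (1 - sensitivity) / specificity       # negative likelihood ratio
accuracy = (TP + TN) / (TP + FN + TN + FP)  # 'accuracy or potency'
error_odds = (FP * TP) / (FN * TN)          # one common 'error odds' form
dor = (TP * TN) / (FP * FN)                 # diagnostic (identification) odds
```

With these counts the formulas yield NPV 92.9%, accuracy 90.0%, an identification odds ratio of 91 and an error odds ratio of about 2.1538, matching the list above.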
As work continues on refinement, it may be possible to find new ways of further improving the sampling, e.g. by selecting particular features.
By disabling ICP we can possibly justify the use of GMDS as its replacement, essentially by taking a template image and performing GMDS on it with respect to each image of the current pair. However, ICP should give us a good initialisation for the GMDS phase.
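A minimal sketch of the kind of ICP that could supply such an initialisation (nearest-neighbour correspondences plus a Kabsch rigid fit) is given below; this is an illustrative toy, not the project's actual ICP code, and the function name is ours.

```python
# Minimal ICP sketch: alternate nearest-neighbour matching with a Kabsch
# (SVD-based) rigid fit; returns the accumulated rotation R and translation t.
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=20):
    """Rigidly align src (N x 3) onto dst (M x 3)."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(dst)
    for _ in range(iters):
        moved = src @ R.T + t
        nn = dst[tree.query(moved)[1]]             # closest-point matches
        mu_m, mu_n = moved.mean(0), nn.mean(0)
        # Kabsch: optimal rotation between the centred point sets
        U, _, Vt = np.linalg.svd((moved - mu_m).T @ (nn - mu_n))
        dR = Vt.T @ U.T
        if np.linalg.det(dR) < 0:                  # guard against reflections
            Vt[-1] *= -1
            dR = Vt.T @ U.T
        R, t = dR @ R, dR @ (t - mu_m) + mu_n      # compose incremental update
    return R, t
```

The output pose would then seed GMDS instead of starting from an arbitrary alignment.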
By shrinking the data sampling rate further, the recognition performance improves to the point where the ROC curve reaches 95%.
Following some further low-level refinements, considerably less attention is paid to minor details around shady areas formerly occupied by voids/holes (a bug in a MATLAB toolbox was also found, though not reported, after it had wasted hours). This was the result of tedious debugging and tweaking by observation.
This leads to very good detection rates; however, nose detection is still short of perfect, and provided this can be overcome ~99% of the time, matching can exceed a 95% detection rate. The FRVT/FRGC documents on the Web provide a more formal set of steps to follow, but until the pre-processing stages can be coupled to form a robust enough process, there is no point in adding PCA variants to the pipeline and then performing benchmarks. The pieces are already in place, but it is the failure to accurately and consistently carve out faces (despite hair occlusion) that merits increased attention and effort. In the latest small test involving 30 correct pairs (same person) and 30 incorrect pairs, the only misdetections were due to arbitrary face parts being incorrectly assumed to be the nose. The reasons vary, and solutions have been found and implemented many times before, encouraging reuse now rather than a reinvention of the wheel. See figures and .
Following some preliminary overnight experiments, it is possible to show the practicality of a PCA-GMDS hybrid framework, wherein the values on which dimensionality reduction is invoked are the geodesic distances between salient points. The idea is that, by studying the variation of distances between analogous facial landmarks - almost as though there are strings between every pair - one can learn which ones are expected to vary not across people but within them (intra-person/intrinsic), in which case these variations are very much expected and predictable. The model, which is built only from correct pairs (8 pairs in an initial toy example, 76 in the coming tests), is supposed to penalise variation in areas of the face that do not exhibit much variation in the training phase. Results are shown in figures and .
The subsequent steps delved into ways of improving the data and its preparation for classification, for an accurate determination of match/no-match status. While in principle the method works quite reliably, a lot of room remains for improvement, both in the ordering of points and in the quality of the pre-processing, as most of the false positives and false negatives result from the latter. Additionally, removal, or conversely proper sampling, of points around the cheeks should be considered.
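The hybrid idea can be sketched in a few lines: stack the pairwise (geodesic) inter-landmark distances into a feature vector per scan, learn the principal modes of intra-person variation from matching pairs only, and score a probe by the variation left outside those modes. The helper names and the synthetic setup are ours, for illustration only.

```python
# Sketch of the PCA-GMDS hybrid: PCA over inter-landmark distance vectors,
# trained on correct (same-person) pairs, scoring by out-of-model residual.
import numpy as np

def distance_vector(D):
    """Upper-triangular part of an L x L distance matrix as a vector."""
    return D[np.triu_indices(D.shape[0], k=1)]

def fit_modes(training_vectors, k=3):
    """Mean and top-k PCA modes of intra-person distance variation."""
    X = np.asarray(training_vectors)
    mu = X.mean(0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def match_score(vec, mu, modes):
    """Residual after removing the trained variation; low means match."""
    d = vec - mu
    return np.linalg.norm(d - modes.T @ (modes @ d))
```

Variation along a trained mode is 'forgiven', while variation in directions never seen among correct pairs is penalised, which is exactly the behaviour described above.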
The charts in Figure show the distribution of mode weights based on the building of two models, one of 10 people (around 80 pairs) and one of 76 people (around 400 pairs).
We then prepared a short report for a decision to be made regarding how long we give this face recognition project, which could otherwise be morphed into measuring distances on a surface where corresponding points are less easily identified, e.g. anatomical parts inside the body where there is no readily identifiable landmark such as the nose, mouth, or eyes, let alone any photometric data to take advantage of. The strength of GMDS is that it autonomously finds points that are otherwise difficult for humans to mark up.
How the current results compare to the scores reported in FRGC, FRVT, etc. is still an important question, as is whether we can combine my code with Bar's for improved performance based on prior work. We can work effectively from a distance because there are fewer distractions. In general, the bottleneck is the pace of work (about 2 hours per day), but the intervals allow for more results to be processed and delivered in between. Since a lot of the work is done on computational servers anyway, the main advantage of locality is access to informed people. The weakness of working in long stretches is that the time taken for results to arrive must be dedicated to observation or further coding, which would still depend on the observation of results that have not yet arrived.
There have been no known attempts to apply GMDS methodology to diagnosis based on deformable atlases (training from patients with atrophies compared to normals). Half a decade ago, Davies, Cootes, and Taylor used reparameterisation on the sphere (Cauchy kernels) in order to classify the 3-D shape (surface, not volumetric) of the hippocampus, with the aim of diagnosing disease characteristics of this interesting structure (with known correlation to illnesses), based upon fully automatic training from datasets we may have access to. The work done by Aflalo et al. is reminiscent of the above, at least from an analytical angle.
The first to use conformal maps for computational anatomy was probably Eric Schwartz in the 80s. The more recent examples that immediately crop up come from the MICCAI 2008 Workshop on the Computational Anatomy and Physiology of the Hippocampus (http://picsl.upenn.edu/caph08/). Xie et al. [37], for instance, use shape analysis for Alzheimer's Disease detection.
A. Elad used MDS to map surfaces to spheres. It was around 2002 as far as I recall, but it was definitely not conformal. The mapping to the sphere in Davies' case (his work is still ongoing, but he too spends only about 50 hours per week on research) is one that warps correspondences onto a sphere (or circle, at least in 2-D) and then applies particular functions to space out the correspondences and make reasonable candidates over which to optimise a group's shape concurrently []. The overall goal is to automatically identify and choose points that represent shapes. My own work extended these ideas to full intensity (texture), seeking points that take both grey-level and spatial values into account at the same time (using a combined shape and appearance model, or AAM). I published papers on the subject over half a decade ago.
If it is true, as claimed by several people we spoke to, that face recognition is best handled by carving out a few features that never vary in their relative geometry, then GMDS seems a little unnatural, as the only absolute points on which to measure distances are easy to identify either by hand or by template (colour can help too). The continuous mapping that depends not on interpolation but on surface characteristics, such as curvature or on-surface distances, may be inadequate (an overkill) unless only a few fiducial points whose location can be determined accurately are used. This point is worth getting across when GMDS is criticised for its utility in face analysis, where simpler algorithms can outdo it.
One would completely agree with the observation about GMDS if faces were indeed rigid. They are not. This is especially valid if you take the face as a whole and just crop out the mouth. Still, cropping only the upper 'mushroom' part and considering close-to-neutral expressions, ICP alone could be enough. One would guess that GMDS could enhance it by a small notch, but this may be wrong.
ICP appears to be essential for improved initialisation of GMDS. It is important to be clear about whether we wish to model/sample entire faces with GMDS or not. The common facial expressions can lead to degradation in the results, but then again, with PCA these ought to be weighted accordingly, e.g. with the expectation of large variation (an already-seen variation, owing to the training set) in particular regions, whereas other regions remain stable, i.e. distances within those regions hardly vary, or alternatively vary only along particular dimensions (in a hyperspace whose dimension is determined by the number of points, not in 3-D). I will prepare an experiment which broadens the scope to entire faces. It oughtn't yield good results (on a comparable scale), but at least from an academic/scholarly perspective it ought to validate the inclusion, and contrariwise exclusion, of particular parts, e.g. those that accommodate moustaches and caused detection problems in previously-run large-scale experiments. Likewise, a Euclidean versus geodesic benchmark (Gaussian fitting, for instance) can be produced to provide validation, similarly to the preparatory work from the 2006 BBK paper in IEEE TPAMI. If it can be proven - empirically - that geodesic distances always trump Euclidean equivalents, then at least in the case of 3-D it can be argued that all those leading algorithms (claiming 99.9% accuracy) can be further improved with FMM. Bar Shalem's work partly applied some of the same principles but fell short performance-wise.
It is therefore unclear which paths should and should not be explored. By applying GMDS with just 5 points (classically the eye corners and the nose) we might be able to attain good performance, but also merely replicate previous attempts by Bar Shalem, thus studying too little. This is why, upon the inquiry about code fusion, I remained a tad reluctant. To what extent, for example, were the algorithms tested and then refined? Was the newer version of FRGC tested as well? Since we have access to code from BBK papers on face recognition (2005), which route would be better to explore? How many parts are merely reimplemented? Anastasia has argued that GMDS, as a black box, has not really changed since 2009, so the other building blocks are probably the only candidates for swapping.
Our job is to prove or disprove feasibility. The starting point should be state-of-the-art ROC curves. This is hopefully a reachable goal, but if the state of the art is now an error of 1 in a thousand or thereabouts, then it seems like a monumental task.
ICP could be interpreted as a Gromov-Hausdorff distance where the inter-point distance is Euclidean and points are allowed to move in 3D. It would be interesting if coordinate-wise descent could work as well as ICP (one may doubt it, though with multi-grid it could actually work). So, GMDS could in fact be used like ICP. Therein lies a possible micro-study which compares the R and T matrices that our (currently) 4 ICP methods output, perhaps rationalising the use of GMDS for alignment. Alternatively, it ought to be possible to compare recognition results with and without ICP as a peripheral/separate part from GMDS.
Regarding the comment about existing GMDS implementation and its age, it is likely that Carmi Grushko introduced some changes to the GMDS, and in fact he is currently working on further refinements (of the geodesic distance computation).
A different measure to try is the diffusion distance, rather than Euclidean or geodesic. One could also consider diffusion on the surface, diffusion inside the surface, as well as geodesics in the interior of the face, etc. One of these distances should provide the best discriminative power, and we must check which.
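For completeness, the diffusion distance admits a compact spectral sketch: on a graph standing in for the mesh, d_t(i,j)^2 = sum_k exp(-2 t lambda_k) (phi_k(i) - phi_k(j))^2 over the Laplacian eigenpairs (lambda_k, phi_k). The toy graph and function name below are ours.

```python
# Diffusion distances from the spectrum of the combinatorial graph Laplacian.
import numpy as np

def diffusion_distances(W, t=1.0):
    """Pairwise diffusion distances for a symmetric adjacency matrix W."""
    L = np.diag(W.sum(1)) - W                  # combinatorial Laplacian
    lam, phi = np.linalg.eigh(L)               # eigenpairs (lambda_k, phi_k)
    scale = np.exp(-2.0 * t * lam)             # heat-kernel damping
    diff = phi[:, None, :] - phi[None, :, :]   # phi_k(i) - phi_k(j)
    return np.sqrt((scale * diff ** 2).sum(-1))
```

Larger t averages over longer diffusion paths, making the distance progressively less sensitive to local noise, which is exactly the property that might help around noisy regions.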
The current experiment deals with the performance reached by adding and removing parts of the face, using binary masks that make very basic sense. In all cases, depth values from X and Y (averaged over each grid) are used to scale the binary mask, such that consistent cropping is assured regardless of distance from the camera's aperture. This is one of about six crucial areas that need further improvement.
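The depth-driven scaling could look like the sketch below, which rescales the binary mask by a pinhole-style factor of reference depth over mean depth; the reference depth value and the helper name are illustrative assumptions, not the project's settings.

```python
# Sketch: depth-consistent mask scaling. A closer face projects larger on
# the sensor, so the mask is enlarged by ref_depth / mean_depth (and would
# then be cropped or padded back to the frame size by the caller).
import numpy as np
from scipy.ndimage import zoom

def scale_mask(mask, mean_depth, ref_depth=600.0):
    """Enlarge the mask for closer faces, shrink it for farther ones."""
    factor = ref_depth / mean_depth
    return zoom(mask.astype(float), factor, order=0) > 0.5
```

Nearest-neighbour interpolation (order=0) keeps the mask strictly binary after resampling.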
It is agreeable that 1/1000 is a challenging goal, but one may strongly feel we could get there, and then just play with the building blocks to check which metric gives the best results. The hunch is that geodesics should play a leading role there, either as dense or as sparse matching of surfaces.
How would geodesics deal with eye sockets? The problem is that, with the eyes filled in, the signal is too noisy, and without any filling there is a difference in distances which depends on how open the eye is. Euclidean distances do not suffer from this apparent drawback. One solution devised so far is almost excessive smoothing, whereby just the very basic geometry is preserved and much of the rest vanishes from the signal. The fine details are unlikely to be present across different acquisition sites/times.
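The smoothing workaround can be sketched as a heavy Gaussian blur of the depth map, washing out fine structure (eyelid opening, filled voids) while the coarse geometry survives; the sigma value here is an illustrative guess, not a tuned setting.

```python
# Sketch: 'almost excessive' smoothing of a depth map before geodesic
# computation. The toy noisy depth map and sigma are illustrative only.
import numpy as np
from scipy.ndimage import gaussian_filter

depth = np.random.default_rng(0).normal(size=(64, 64))  # toy noisy depth map
smooth = gaussian_filter(depth, sigma=6.0)              # heavy blur
```

A large sigma drastically suppresses high-frequency content, which is the intended effect around the eye region.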
Roy Schestowitz 2012-01-08