The problem is multi-faceted: it requires reinventing the wheel for surface refinement (a topic with a substantial literature of its own) before one can even begin to segment the eyes, nose, eyebrows (if any), and so on. Any detection error anywhere in the set pollutes the shape residual vectors. A typical failure is mis-location of the nose tip, especially in the expressions dataset, where long hair interferes (a problem largely overcome for the FRGC dataset). When approaching this problem, contact
was made with various people who may have already written code that
addresses similar problems, at the very least to avoid poorly reproducing the same type of code. The cropping phase is already
reliable enough, and pose correction should not be an issue. In fact, by deliberately not correcting or compensating for rotation, we might also model the movement of the head, although once expression is added this would make the modes rather fuzzy and the sources
of variation less isolated. For n images we would obtain at most n - 1 modes, hopefully ones that capture the principal expression changes rather than a mixture of structural and positional changes. Instinct suggests
that the smarter route to follow is to focus more on model-building
and not delve too deeply into segmentation, which can in the meantime be assisted by manual work, e.g. annotating 100 images with ginput().
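As a sketch of the mode-count point above: stacking the shape residual vectors and running PCA yields at most n - 1 modes with non-zero variance for n images, because subtracting the mean removes one degree of freedom. The original work used MATLAB; the hypothetical function below uses Python/NumPy purely for illustration.

```python
import numpy as np

def shape_modes(shapes):
    """PCA on a stack of shape vectors (n_samples x n_coords).

    Returns the mean shape, the eigen-modes and per-mode variances.
    At most n_samples - 1 modes carry non-zero variance, since
    centring on the mean removes one degree of freedom.
    """
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)
    centred = X - mean
    # SVD of the centred data gives the principal modes directly.
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    variances = s**2 / len(X)
    return mean, Vt, variances

# Toy example: 5 "scans", each a flattened set of 4 3-D points.
rng = np.random.default_rng(0)
scans = rng.normal(size=(5, 12))
mean, modes, var = shape_modes(scans)
print(np.sum(var > 1e-10))  # at most 5 - 1 = 4 modes carry variance
```

With generic data the centred matrix has rank exactly n - 1, so the last singular value is numerically zero and only four modes survive.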
Model building is currently more important than grappling with segmentation, so modelling is what we should pursue. However, without proper separation between signal and cruft (see, for example, the relevant figure)
it is hard to guarantee good results. In face recognition, recognition by parts was very popular at one point; by manipulating the coefficients within the GMDS, or by mapping to some average shape, one could properly align the scan (normalise it, in a way), and from there the path would be much easier than with other forms of normalisation.
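The idea of aligning a scan by mapping it onto an average shape can be sketched with an orthogonal Procrustes fit. This is only a rigid-alignment sketch with hypothetical names, not the GMDS machinery itself, and it assumes point correspondences are already known.

```python
import numpy as np

def align_to_mean(scan, mean_shape):
    """Rigidly align a point set to a mean shape (orthogonal Procrustes).

    scan, mean_shape: (n_points, 3) arrays in corresponding order.
    Returns the rotated and translated copy of `scan`.
    """
    scan = np.asarray(scan, dtype=float)
    mean_shape = np.asarray(mean_shape, dtype=float)
    # Centre both point sets.
    sc = scan - scan.mean(axis=0)
    mc = mean_shape - mean_shape.mean(axis=0)
    # Optimal rotation from the SVD of the cross-covariance matrix.
    U, _, Vt = np.linalg.svd(sc.T @ mc)
    R = U @ Vt
    # Guard against reflections: force det(R) = +1.
    if np.linalg.det(R) < 0:
        U[:, -1] *= -1
        R = U @ Vt
    return sc @ R + mean_shape.mean(axis=0)

# Toy check: a rotated copy of the mean snaps back onto it.
rng = np.random.default_rng(1)
mean_shape = rng.normal(size=(6, 3))
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
scan = mean_shape @ Rz.T
aligned = align_to_mean(scan, mean_shape)
print(np.allclose(aligned, mean_shape, atol=1e-8))  # True
```

Once every scan sits in the frame of the average shape, the residual vectors discussed above measure shape alone rather than pose plus shape.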
Roy Schestowitz 2012-01-08