
Systematic Experiments

Comparing a PCA approach to a GMDS approach was the original goal of our work, primarily by statistical means, e.g. 3-D facial expression interpretation through statistical analysis. With the goal of validating and comparing face recognition methods, we can embark on the following path of exploration. The data needs to cover different individuals, and the datasets must be large enough to enable model-building tasks. As such, the data specified in Experiment 3 of FRGC 2.0 should be used for both training and testing. It needs to be classified manually, however, as groups that previously did this have not shared such metadata. It would be useful to select hundreds of scans that represent expressions such as a smile and place them in respective loader files, each alongside data for an accompanying neutral (no expression) image of the same subject. It ought to be possible to set aside 200 such pairs, all coming from different people. Identification in such a set ought to be quite challenging without texture (which is in principle available in separate PPM files).
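The pairing step above can be sketched as follows. This is a minimal illustration, assuming the manual classification is stored as a CSV of (subject, expression label, scan path) rows; the file layout, labels, and function names are assumptions, not part of any existing pipeline.

```python
import csv
from collections import defaultdict

def build_pairs(index_csv, max_pairs=200):
    """Pair one 'smile' scan with one 'neutral' scan per subject.

    index_csv: hypothetical CSV with rows of subject_id, label, path.
    Returns at most max_pairs (subject_id, smile_path, neutral_path)
    tuples, one pair per distinct subject.
    """
    scans = defaultdict(dict)              # subject_id -> {label: path}
    with open(index_csv) as f:
        for row in csv.reader(f):
            if not row:
                continue                   # skip blank lines
            subject_id, label, path = row
            scans[subject_id].setdefault(label, path)  # keep first scan per label
    pairs = []
    for subject_id, by_label in scans.items():
        if "smile" in by_label and "neutral" in by_label:
            pairs.append((subject_id, by_label["smile"], by_label["neutral"]))
        if len(pairs) == max_pairs:
            break
    return pairs
```

Keeping one pair per subject enforces the "all coming from different people" constraint directly in the loader.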

The experiments can split the set of 200 pairs into smaller groups, allowing repetition that takes statistics into account and can yield error bars. Dividing into 5 groups of 40 pairs is one possibility, even though a set of 40 individuals is becoming a tad small. In order to train a model of expressions, it ought to be possible to just use the full set.
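The 5-way split can be sketched as a single seeded shuffle cut into equal folds, so each fold of 40 subjects yields one performance estimate and the spread across folds gives the error bars. The function name and seeding scheme here are illustrative.

```python
import random

def split_into_folds(pairs, n_folds=5, seed=0):
    """Shuffle the pairs once (reproducibly) and cut into equal folds."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)     # fixed seed for repeatability
    fold_size = len(pairs) // n_folds
    return [pairs[i * fold_size:(i + 1) * fold_size] for i in range(n_folds)]
```

Varying the seed gives a different partition, which is one way to repeat the whole experiment several times for tighter error bars.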

When approaching this problem, the goal would be to match a person with an expression to the same person without the expression (or vice versa), attaining some gauge of expression-resistant recognition. The gallery is the set of all faces in the set. Similarity measures pitted against one another here can include the 4 ICP methods we have, plus variants of these and different selections of parameters. Different measures resulting from ICP, and the region being compared (e.g. whole face versus nose, versus forehead and nose), are another area of exploration. There ought to be a separation between the idea of cropping for alignment alone and cropping (or binary masks) for the sake of computing differences.
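The gallery search itself reduces to scoring a probe against every gallery face with a pluggable dissimilarity. The sketch below assumes the ICP alignment has already been applied; the placeholder measure (mean point-to-point distance over pre-aligned N x 3 point arrays) stands in for whichever ICP-derived measure is being tested.

```python
import numpy as np

def best_match(probe, gallery, dissimilarity):
    """Return (index, score) of the gallery entry closest to the probe."""
    scores = [dissimilarity(probe, g) for g in gallery]
    return int(np.argmin(scores)), min(scores)

def mean_surface_distance(a, b):
    """Placeholder measure: mean Euclidean distance between corresponding
    points of two pre-aligned scans (a, b are N x 3 arrays)."""
    return float(np.mean(np.linalg.norm(a - b, axis=1)))
```

Swapping `mean_surface_distance` for another callable is how the 4 ICP methods and their variants would be compared within the same loop.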

What we may find is that, by cropping out some parts of the face, recognition will improve considerably. But in order to take into account the deformable parts that change due to expression, something like an expression model becomes necessary. There is then room for comparison between expression-invariant model-based recognition and recognition which is based purely on alignment. The type of alignment too, e.g. the implementation of ICP, can be compared in this way.
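The separation between cropping for alignment and masking for comparison can be made explicit in code: a binary mask restricts which pixels enter the dissimilarity, independently of any crop used earlier for alignment. The rectangular regions below are a crude stand-in for real nose/forehead segmentation; all names are illustrative.

```python
import numpy as np

def masked_difference(range_a, range_b, mask):
    """Mean absolute depth difference over the masked region only."""
    m = mask.astype(bool)
    return float(np.mean(np.abs(range_a[m] - range_b[m])))

def nose_forehead_mask(shape, nose_rows, forehead_rows):
    """Hypothetical mask covering two row bands of a range image,
    standing in for proper nose and forehead segmentation."""
    mask = np.zeros(shape, dtype=bool)
    mask[nose_rows[0]:nose_rows[1], :] = True
    mask[forehead_rows[0]:forehead_rows[1], :] = True
    return mask
```

Restricting the measure to rigid regions such as nose and forehead is one concrete way to test whether cropping out deformable parts improves recognition.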

To summarise this more formally, we take N=200 pairs of 480x640 3-D images acquired from N different subjects under various lighting, pose, and scale conditions, then register them using the 4 ICP methods in turn (potentially with variants, time permitting), using a fixed nose-finding method. As the first experiment we may wish to apply this alignment to a set of cropped faces, ensuring that they all lie in the same frame of reference. A model is built from the residuals of all 200 pairs, in order to encompass the difference incurred by an expression of choice, e.g. smile or frown. In the next stage, 5 sets of M=N/5 images are set as a gallery G, and a probe p is compared against all images in G, attempting to find the best match based on several criteria such as model determinant or sum of differences. To measure the determinant difference, it is possible to take the new residual (between p and an image in G) and concatenate it to the set of observations that build the model; this is how it is implemented at the moment. Subsequent experiments can extend this to compare other aspects of recognition using the same framework/pipeline. Measuring performance should be easy if the correct matches are recorded for a random permutation of the set and then paired, for some threshold (or best match), based on the gallery.
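The determinant criterion described above can be sketched as follows: stack the training residuals as rows, append the new probe-gallery residual, and compare the log-determinant of the (regularised) sample covariance before and after. A residual consistent with the expression model should inflate the determinant less than an inconsistent one. This is only one plausible reading of the criterion; the Gram-matrix form, regularisation, and function names are all assumptions.

```python
import numpy as np

def logdet_cov(residuals, reg=1e-6):
    """Log-determinant of the regularised covariance of row-wise residuals.

    Uses the Gram matrix X X^T / (n-1), which is cheap when the residual
    dimensionality far exceeds the number of samples.
    """
    X = np.asarray(residuals, dtype=float)
    X = X - X.mean(axis=0)                  # centre the observations
    n = X.shape[0]
    gram = X @ X.T / max(n - 1, 1)
    sign, logdet = np.linalg.slogdet(gram + reg * np.eye(n))
    return float(logdet)

def determinant_increase(model_residuals, new_residual):
    """Change in log-determinant when one residual joins the model."""
    before = logdet_cov(model_residuals)
    after = logdet_cov(np.vstack([model_residuals, new_residual]))
    return after - before
```

Under this reading, the gallery image minimising `determinant_increase` would be reported as the match.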

"The trust of the innocent is the liar's most useful tool."
- Stephen King.

Roy Schestowitz 2012-01-08