From Al-Osaimi et al., IJCV 2008
Summary: My attempt to reproduce some of the results of F. Al-Osaimi et al. and furthermore improve them using other methods and different datasets (with a 3-D scanner at our disposal)
This post provides some background about my next (current) research project, which deals with non-medical applications. The previous project dealt with cardiac imaging and I’ve packaged that code and published it along with other data that may be useful.
The group of A. Mian has done some fantastic work recently on 3-D face recognition and I shall attempt to reproduce some results with a NIST database. In their paper “An Expression Deformation Approach to Non-rigid 3D Face Recognition,” F. Al-Osaimi, M. Bennamoun, and A. Mian explain some good results from experiments that apply PCA to face images (the paper was published online in September 2008 by a leading computer vision journal, but access is restricted, so there is no link… unless one uses this copy).
Since I have extensive experience with NRR, PCA, and statistical models in general, this project suits me better than some previous ones. I have done limited work on analysis applied to sets of face images that are only rigidly or affinely registered.
The paper from the group in question is 22 pages long in the raw form and about 15 in IJCV. The abstract describes an idea and quantifies some results using known benchmarks and the “FRGC v2.0 dataset”. Then, the method is alluded to vaguely and not formalised until later. The phrasing could be improved somewhat to avoid repetition, e.g. in the following paragraph containing parts like: “2D face recognition has been extensively researched in the last two decades. However, unlike 3D face recognition its accuracy is adversely affected by many factors such as illumination and scale variations. In addition, 2D images undergoes affine transformations during acquisition. Moreover, handling pose variations in 3D scans is more feasible than 2D images. It is believed that 3D face recognition has the potential for more accuracy than 2D face recognition (Bowyer et al. 2006). On the other hand, the acquisition of 2D images is less intrusive than 3D acquisition. However, 3D acquisition technologies are continuously becoming cheaper and less intrusive (The International Conference on 3D Digital Imaging and Modeling, 1997–2007).”
“Most of the approaches in the literature are rigid,” says the text on page 2, just before the overview which states: “The main contribution of this paper is a non-rigid 3D face recognition approach. This approach robustly models the expression patterns of the human face and applies the model to morph out facial expressions from a 3D scan of a probe face before matching. Robust expression modeling and subsequent morphing gives our approach a better ability in differentiating between expression deformations and interpersonal disparities. Consequently, more interpersonal disparities are preserved for the matching stage leading to better recognition performance.”
The background section is followed by some classification of existing work, concluding with: “Our approach also falls into this category i.e. non-rigid 3D face recognition.”
Section 1.1 presents a very good summary of related work and Section 1.2 a clear overview of the method and the ideas behind it, accompanied by a helpful diagram at the bottom of page 3 (Figure 1). The strategy is to take pairs of images of the same individual, normalise them, and then apply PCA to obtain a low-dimensional description of expression variation.
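To make that strategy concrete for my own implementation, here is a minimal sketch in Python with NumPy. It is not the authors' code: I am assuming that each scan has already been registered and flattened into a vector of equal length, and that the expression deformation of a pair can be approximated by a simple difference between the two scans. The function names (`expression_pca`, `remove_expression`) and the choice of SVD for the PCA step are my own.

```python
import numpy as np

def expression_pca(neutral, expressive, n_components):
    """Fit a PCA model to per-person expression residues.

    neutral, expressive: arrays of shape (n_pairs, n_points); row i of each
    array belongs to the same individual (assumed already registered).
    """
    # Assumption: the deformation is approximated by a plain difference.
    residues = expressive - neutral
    mean = residues.mean(axis=0)
    centered = residues - mean
    # SVD of the centred data; the rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    variances = (s[:n_components] ** 2) / (len(residues) - 1)
    return mean, components, variances

def remove_expression(probe, reference, mean, components):
    """Subtract the part of (probe - reference) explained by the model."""
    residue = probe - reference
    coeffs = components @ (residue - mean)          # project onto the subspace
    expression_part = mean + components.T @ coeffs  # reconstruct the deformation
    return probe - expression_part
```

Whatever is left after `remove_expression` is, under these assumptions, the part of the difference that the expression model cannot explain, which is what one would hope corresponds to differences between people rather than between expressions.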
Section 2 on page 4 starts by describing pre-processing steps that are essential yet specific to the limitations of the FRGC v2.0 dataset. Page 5 starts presenting some visual examples of the approach, with some equations relating to PCA (along with more visual examples) on pages 6 and 7.
Section 3 moves on to experiments that go beyond using the models in synthesis mode. The same dataset is being used (with about 5,000 3-D faces), but more data gets added to it. To quote, “The dataset is composed of two partitions: the training partition (943 scans) and the evaluation partition (4007 scans). [..] The FRGC dataset was augmented by 3006 scans that were acquired using a Minolta vivid scanner in our laboratory.”
Parameters and set sizes (those which are included) get tested in very large-scale experiments that yield ROC curves. These curves help show how to set the different parameters and enable one to measure advantages of one algorithm over another. Page 13 has some comparisons to other methods from the literature, with numbers summarised in a chart.
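For my own experiments I will need to produce curves of this kind, so here is a rough sketch of how an ROC curve can be computed from two sets of match scores. It assumes similarity scores where higher means a better match, with "genuine" scores coming from same-person comparisons and "impostor" scores from different-person comparisons. The function names and the helper that reads off the verification rate at a given false accept rate are my own, not something taken from the paper.

```python
import numpy as np

def roc_curve(genuine, impostor):
    """Return (far, tar, thresholds) for similarity scores.

    far: fraction of impostor scores accepted at each threshold.
    tar: fraction of genuine scores accepted at each threshold.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Every observed score is a candidate threshold, strictest first.
    thresholds = np.unique(np.concatenate([genuine, impostor]))[::-1]
    tar = np.array([(genuine >= t).mean() for t in thresholds])
    far = np.array([(impostor >= t).mean() for t in thresholds])
    return far, tar, thresholds

def tar_at_far(far, tar, target_far):
    """Best true accept rate among thresholds whose FAR is within the target."""
    ok = far <= target_far
    return float(tar[ok].max()) if ok.any() else 0.0
```

Comparing two algorithms then amounts to plotting `tar` against `far` for each, or comparing `tar_at_far` at a fixed false accept rate.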
This is truly inspiring work and I shall spend the next few weeks learning from it as well as implementing something similar.