
Preparation and Preprocessing

Bash scripts were written to locate all of the relevant files in the FRGC dataset. About 5,000 3-D faces are contained inside. The package comes with information and programs of interest to those working with its 70 GB of data and the accompanying metadata. Given the Face Recognition Vendor Test (FRVT) of 2006 (http://www.frvt.org/FRVT2006/) and the overview of the FRGC (http://www.frvt.org/FRGC/), it should be possible to know what is available and where; the latter is an official FRGC Web site. The large package comes with associated applications and scripts written in Java, C++, Perl, etc.

An existing MATLAB/GNU Octave implementation identifies all 3-D images in the dataset. These are scattered across many different paths, depending on the time and place of capture. Scripts were adapted to decompress the files and cycle through them, e.g. to pre-process them in series.
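The file-discovery step can be sketched as follows. This is a Python sketch of the idea rather than the MATLAB/Octave code itself, and the file extensions are an assumption about how the range data is named in the release:

```python
import os

def find_range_files(root, extensions=(".abs", ".abs.gz")):
    """Walk the dataset tree and collect the 3-D range-image files.

    The extensions are an assumption; the FRGC range data may be stored
    under other names depending on the release.  Sorting keeps the
    traversal order reproducible across runs.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(extensions):
                matches.append(os.path.join(dirpath, name))
    return matches
```

Each path returned can then be decompressed and pre-processed in turn, which mirrors the serial cycling described above.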

Figure: Image realignment: translation applied to the given (cropped) face so as to position it with the nose tip at the front and at the centre

Code was written to perform the pre-processing steps as specified in the corresponding paper (which describes, very densely, an entire Ph.D. thesis from Australia).

The code was made modular, with dozens of options controlling what is done and how it is done (through a settings file containing all the parameters), as well as how the data is presented to the user at runtime.
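A settings file of this kind can be read with very little machinery. The format below (one "name = value" pair per line, with "#" comments) is a hypothetical stand-in for the actual file, and the sketch is Python rather than the original MATLAB/Octave:

```python
def load_settings(path, defaults=None):
    """Read a simple 'key = value' settings file into a dict.

    Hypothetical format: one pair per line, '#' starts a comment.
    Numeric values are converted; everything else stays a string.
    """
    settings = dict(defaults or {})
    with open(path) as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()
            if not line:
                continue
            key, _, value = line.partition("=")
            value = value.strip()
            try:
                value = float(value) if "." in value else int(value)
            except ValueError:
                pass  # keep non-numeric values as strings
            settings[key.strip()] = value
    return settings
```

Passing defaults keeps the pipeline runnable even when a newer option is absent from an older settings file.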

  1. Each image is taken in turn and the program performs some analysis on it, including a histogram (no manual selection, as that would be laborious for thousands of data instances) and visualisation in 2-D and 3-D.
  2. The image is analysed to separate the person from the background, and all data points associated with the background are removed.
  3. The remaining sets of points are made more uniform by filling holes (using local statistics), removing spikes, and smoothing the surface using one of several possible methods. We identified better ways of eliminating holes and spikes (more generally, noise) on the face surface and then tested the results on a larger sample of 3-D images (the majority are handled well).
  4. The tip of the nose is found and the image is normalised by translating it so that the nose tip lies at (0,0,0): its Z coordinate (depth) is set to zero and the XY plane is shifted to centre it. In order to normalise, so to speak, what remains visible before ICP is invoked to align the data, three methods were implemented to select only a region which can be consistent across data sets. The ears and hair, for example, are not wanted for statistical analysis, so they can be removed by discarding all points associated with them. See Figure [*] for a visual example in 2-D (although rough, there is a similar screengrab in Figure [*]).
  5. We measured the density in X and Y in order to normalise distances, such as the distance from the nose to the chin. More options were added to the control file/wrapper (nearly 20 at the time of writing), and an additional function now deals with cropping the face using one of three methods. The best method is capable of isolating the face surface irrespective of the size of the head, which is centred and brought into alignment at the front. There are then options which define how the face gets cropped so as to keep only the rigid surface, such as the forehead, nose, and eye area (assuming no blinking and no eyebrow-raising expressions). The algorithm sets up everything necessary, to the extent possible, for ICP to deal nicely with alignment to a common frame of reference.
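The core of steps 2, 4, and 5 can be sketched on a depth map. This is a simplified Python/NumPy illustration, not the thesis's exact procedure: the background threshold, the convention that larger depth values are closer to the sensor, the "nose tip = closest point" heuristic, and the fixed-radius crop are all assumptions made for the sketch (hole filling, smoothing, and ICP itself are omitted):

```python
import numpy as np

def preprocess_range_image(z, background_depth, crop_radius):
    """Sketch of background removal, nose-tip centring, and cropping
    on a depth map z (rows x cols), where larger values are closer.
    """
    z = z.astype(float).copy()
    # Step 2: discard background points (anything beyond a depth threshold).
    z[z < background_depth] = np.nan
    # Step 4: take the nose tip as the closest remaining point and
    # translate so that it sits at (0, 0, 0).
    tip = np.unravel_index(np.nanargmax(z), z.shape)
    z = z - z[tip]                      # depth of the tip becomes zero
    rows = np.arange(z.shape[0])[:, None] - tip[0]
    cols = np.arange(z.shape[1])[None, :] - tip[1]
    # Step 5: keep a region consistent across captures by discarding
    # everything outside a fixed radius of the nose tip (ears, hair).
    dist = np.sqrt(rows ** 2 + cols ** 2)
    z[dist > crop_radius] = np.nan
    return z, tip
```

With the face centred at the origin and reduced to a comparable region, ICP then only has to resolve the residual rotation and translation against the common frame of reference.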

Roy Schestowitz 2012-01-08