
PCA-based Models

Model construction with the aid of PCA is far from new, and it draws on several other strands of work [17]. In earlier work on face modelling, Cootes et al. computed the mean shape and a restricted set of PCA axes that together provide a concise description of the training set of shapes. This was used to build a 2-D model of shape, and the model could also be used to generate new shapes. Let such a new shape be $\mathbf{x}$, generated from a set of shape parameters $\mathbf{b}_{s}$:


\begin{displaymath}
\mathbf{x}=\mathbf{\overline{x}}+\mathbf{P}_{s}\mathbf{b}_{s}.
\end{displaymath} (1)

The matrix $\mathbf{P}_{s}$ contains the eigenvectors of the covariance matrix of the training data.
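As a sketch, the fit behind Equation 1 can be realised in a few lines of NumPy. The training data below are synthetic and all variable names are illustrative; the eigenvectors are obtained via an SVD of the centred data, which is equivalent to diagonalising the covariance matrix.

```python
import numpy as np

# Synthetic training set: 100 shapes, each 20 landmark points flattened to a 40-vector.
rng = np.random.default_rng(0)
shapes = rng.normal(size=(100, 40))

# Mean shape, and eigenvectors of the covariance matrix via SVD of the centred data.
x_bar = shapes.mean(axis=0)
U, S, Vt = np.linalg.svd(shapes - x_bar, full_matrices=False)
P_s = Vt.T                          # columns: eigenvectors (principal axes)
eigvals = S**2 / (len(shapes) - 1)  # corresponding eigenvalues (mode variances)

# Equation 1: generate a new shape x = x_bar + P_s b_s.
b_s = np.zeros(P_s.shape[1])
b_s[0] = 2.0 * np.sqrt(eigvals[0])  # move two standard deviations along the first mode
x_new = x_bar + P_s @ b_s
```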

The generated shapes can be constrained to resemble those in the training set by requiring the shape parameters, $\mathbf{b}_{s}$, to be similar to those extracted from the training shapes. Typically, the distribution of training-set shape parameters is modelled by a multivariate Gaussian pdf, and new shapes are generated by sampling from this pdf. To enhance this type of model, appearance models were later developed by Edwards et al. []. Their essential contribution is that they incorporate textural information rather than shape alone: such models encapsulate not just the variation in the shape of objects in images, but also the variation in the appearance of the object itself. Here, texture consists of grey-level pixel intensities. Full colour can be incorporated as well []; colour can be regarded simply as an extension of the single grey-scale band, divided into bands using the most common separation: red, green, and blue components.
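One common and simple form of this constraint, clamping each component of $\mathbf{b}_{s}$ to within a few standard deviations of its mode, can be sketched as follows. The hard limit shown is an alternative to sampling the Gaussian pdf directly, and the numerical values are made up for illustration:

```python
import numpy as np

def clamp_params(b, eigvals, k=3.0):
    """Limit each shape parameter to +/- k standard deviations of its mode,
    keeping generated shapes similar to those seen in training."""
    limits = k * np.sqrt(eigvals)
    return np.clip(b, -limits, limits)

eigvals = np.array([4.0, 1.0, 0.25])   # mode variances (made-up values)
b = np.array([10.0, -0.5, 2.0])
print(clamp_params(b, eigvals))        # -> [ 6.  -0.5  1.5]
```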

A shape model provides only limited information about the appearance of an object in an image: it describes the outline of the object, with the implicit understanding that this outline corresponds to strong edges in the image.

Appearance models describe not only the shape of an object, but the image intensities within the outline of the object as well. In the following subsections, three steps are discussed in turn: modelling shape, modelling intensity, and combined models.

The first step is building shape models, which represent each shape using a finite number of modes. Shape models built from the outlines of objects in images allow those images to be brought into alignment: we can warp the shapes within an image to match the mean shape and, by interpolation, extend this warp to the entire interior of the object. Corresponding parts of the shapes in those images are then easy to identify and use, e.g. in order to sample intensities. To model texture, differences in shape are removed by morphing each training image to the mean shape. A shape-free texture patch can then be estimated from the image by sampling on a regular grid and forming a vector $\mathbf{g}$. Statistical analysis proceeds as for shapes and results in the following linear expression for texture

\begin{displaymath}
\mathbf{g}=\overline{\mathbf{g}}+\mathbf{P}_{g}\mathbf{b}_{g}.
\end{displaymath} (2)

$\mathbf{g}$ is the intensity vector, $\mathbf{P}_{g}$ contains the eigenvectors of the covariance matrix of the training data, and $\mathbf{b}_{g}$ controls the intensity. An example appearance model is shown in Figure [*]. The process is essentially the same as dimensionality reduction in the case of shape. The models in Equation [*] and Equation [*] have a linear form, so they are quite compact. This is a highly desirable property which makes the models flexible and manageable.
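The texture model of Equation 2 is fitted in exactly the same way as the shape model. The sketch below, on synthetic patches with illustrative names, also shows the truncation of modes that makes the linear model compact:

```python
import numpy as np

rng = np.random.default_rng(1)
textures = rng.normal(size=(50, 200))   # 50 shape-free patches, 200 pixels each

g_bar = textures.mean(axis=0)
U, S, Vt = np.linalg.svd(textures - g_bar, full_matrices=False)
eigvals = S**2 / (len(textures) - 1)

# Keep only enough modes to explain 95% of the variance: the truncation
# that makes the linear model of Equation 2 compact.
frac = np.cumsum(eigvals) / eigvals.sum()
t = int(np.searchsorted(frac, 0.95)) + 1
P_g = Vt[:t].T

# Equation 2 in both directions: project a patch to b_g, then reconstruct.
b_g = P_g.T @ (textures[0] - g_bar)
g_approx = g_bar + P_g @ b_g
```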

However, at the moment, the two components of the model, namely the shape $\mathbf{x}$ and the shape-free texture $\mathbf{g}$, are independent. In real images, shape and texture are not necessarily independent. One simple example to think of is an image of an individual's face. When the person changes expression, the shape of the face changes. But the texture (i.e. positions of highlights and shadows) obviously changes too, in a way that is correlated with the shape change. Hence it is desirable to merge the shape and texture models, so as to obtain a new model that is aware of both types of variation. This combined model can then also incorporate any correlations between shape and texture.

The parameters $\mathbf{b}_{s}$ and $\mathbf{b}_{g}$ are aggregated to form a single column vector


\begin{displaymath}
\left\{ \begin{array}{c}
\mathbf{b}_{s}\\
\mathbf{b}_{g}\end{array}\right\} .
\end{displaymath} (3)

The new vector is a simple concatenation of the two. However, since the values of intensity and shape can differ greatly in magnitude, weighting is needed. Weighting establishes a balance under which both shape and intensity retain a sufficiently noticeable effect on the model they jointly build. A weighting matrix, by convention named $\mathbf{W}_{s}$, resolves this problem. With weighting in place, aggregation takes the form


\begin{displaymath}
\left\{ \begin{array}{c}
\mathbf{W}_{s}\mathbf{b}_{s}\\
\mathbf{b}_{g}\end{array}\right\}
\end{displaymath} (4)

where $\mathbf{W}_{s}$ is set to minimise inconsistencies due to scale. By applying another PCA step to the aggregated data, the following combined model is obtained


\begin{displaymath}
\begin{array}{cc}
\mathbf{x}_{i}=\bar{\mathbf{x}}+\mathbf{Q}_{s}\mathbf{c}_{i}, & \mathbf{g}_{i}=\bar{\mathbf{g}}+\mathbf{Q}_{g}\mathbf{c}_{i}\end{array}.
\end{displaymath} (5)

The appearance (shape and texture) is now controlled purely by the new set of parameters, $\mathbf{c}$; there is no need to choose values for two `families' of distinct parameters. The combined model reaps the benefits of the dimensionality reduction performed on shape as well as texture: it encompasses all the variation learned and the correlation between these two distinct components. Since PCA was applied, the number $n$ of parameters in $\mathbf{c}_{i}$ is expected to be smaller than the number of parameters in $\mathbf{b}_{s}$ and $\mathbf{b}_{g}$ put together.
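The weighting and the second PCA step can be sketched as follows, on synthetic parameter vectors with illustrative names. The sketch assumes the common choice of a scalar weighting $\mathbf{W}_{s}=r\mathbf{I}$, with $r^{2}$ the ratio of total texture variance to total shape variance:

```python
import numpy as np

rng = np.random.default_rng(2)
B_s = rng.normal(size=(100, 5))                 # per-example shape parameters b_s
B_g = rng.normal(scale=10.0, size=(100, 30))    # texture parameters b_g, larger scale

# Scalar weighting W_s = r * I, with r^2 the ratio of total texture variance
# to total shape variance, so both halves contribute comparably to the PCA.
r = np.sqrt(B_g.var(axis=0).sum() / B_s.var(axis=0).sum())

# Aggregate (W_s b_s ; b_g) per example and apply a second PCA (Equations 4-5).
B = np.hstack([r * B_s, B_g])
B_mean = B.mean(axis=0)
U, S, Vt = np.linalg.svd(B - B_mean, full_matrices=False)
Q = Vt.T                    # combined modes
C = (B - B_mean) @ Q        # combined parameters c_i, one row per example

# With no modes truncated, each c_i recovers its example's (mean-centred)
# parameters exactly; in an actual appearance model the b vectors already
# have zero mean by construction.
b_s0 = (Q[:5] @ C[0]) / r   # undo the weighting on the shape block
b_g0 = Q[5:] @ C[0]
```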

Figure: The effect of varying the first (top row), second, and third parameter of a brain appearance model by $\pm2.5$ standard deviations

The work we shall deal with will not require texture, or even a model as complex as the above. An understanding of this prior work is nonetheless necessary for appreciating possible future extensions.

``Developments in medical technology have long been confined to procedural or pharmaceutical advances, while neglecting a most basic and essential component of medicine: patient information management.''
- John Doolittle.

Roy Schestowitz 2012-01-08