Motivation

In image analysis, segmentation is a key step: it identifies the different regions of an image so that its structure can be understood. There are two broad approaches to the segmentation problem. The first is the bottom-up approach, which groups pertinent parts of an image; the grouping then makes it possible to tell the different regions apart and to understand the relationships between them. The second is the top-down approach, which relies on a model representing prior knowledge (or assumptions) about the image being interpreted. The model organises the image evidence to provide an `explanation' of the image, from which statistics can be obtained directly.

There are advantages and disadvantages to each approach. The bottom-up approach makes few assumptions, but it can be difficult to achieve reliable results for complex images. The top-down approach exploits prior knowledge, which makes reliable segmentation possible, but it cannot deal with situations where the model does not apply.

This chapter examines the second approach to image segmentation. It describes models of shape and appearance, particularly the approach of Cootes and Edwards [,,,], who introduced models that capture variation in both shape and texture (in the graphics sense). These models have been used extensively in medical image analysis, in applications ranging from brain morphometry to cardiac time-series analysis [,,]. They have also seen extensive use in non-medical applications such as face recognition.

The approach is an example of `interpretation by synthesis': it uses a `generative' model capable of synthesising examples of the class of images to be interpreted. Interpretation of a given image proceeds by matching a synthesised image of an object to an actual example of that object within the image being interpreted.

As an example, Figure [*] illustrates this approach. The model $\mathbf{M}$ is positioned in the image close to its target, in this case the spine. The model encodes knowledge of the ways in which it is allowed to deform. Its parameters are changed repeatedly until the model (ideally) overlaps the target in the image. By then reading off the fitted parameters, knowledge can be gained about the image itself. This method has proven useful in various areas, including industrial inspection, motion analysis [], face recognition [,], and medical image understanding [].
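The fitting loop just described can be caricatured in a few lines of Python. This is only an illustrative sketch: the `generative model' here is a hypothetical one-dimensional blob with a position and a width, not a real shape or appearance model, and naive hill climbing stands in for whatever optimisation scheme an actual system would use. All function names (`synthesise', `cost', `fit') are invented for this example.

```python
import math
import random

def synthesise(params):
    # Toy generative model: a smooth 1-D intensity profile with a
    # bright blob centred at `pos` with spread `width` (a hypothetical
    # stand-in for a full shape/appearance model).
    pos, width = params
    return [math.exp(-((i - pos) / width) ** 2) for i in range(20)]

def cost(params, target):
    # Goodness of fit: sum of squared differences between the
    # synthesised image and the target image.
    return sum((s - t) ** 2 for s, t in zip(synthesise(params), target))

def fit(target, params, iters=200, seed=0):
    # Crude hill climbing over the model parameters: propose a small
    # perturbation and keep it only if the synthesised image matches
    # the target better. Real systems use far more efficient searches.
    rng = random.Random(seed)
    best = cost(params, target)
    for _ in range(iters):
        trial = [p + rng.choice([-1, 0, 1]) for p in params]
        if trial[1] < 1:          # keep the blob width positive
            continue
        c = cost(trial, target)
        if c < best:
            params, best = trial, c
    return params, best

target = synthesise([12, 3])            # the "object" hidden in the image
params, residual = fit(target, [5, 1])  # the model starts far from it
```

The point of the sketch is the structure of the loop, synthesise, compare, update, rather than the particulars: extracting the final parameter values is what yields the interpretation of the image.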

Figure: target spine image $\mathbf{T}$, overlaid by a high-level representation (the model $\mathbf{M}$), which searches for an improved fit in the target image by transforming itself.
\includegraphics[scale=0.7]{Graphics/model}

The remainder of this chapter introduces statistical shape models, explains how they can be extended to full appearance models, and then describes their use in image interpretation.

Roy Schestowitz 2010-04-05