Target Matching

Next: Existing Extensions Up: Model Fitting Previous: Learning the Correlations Contents Index

The final stage, arguably the most fascinating one, uses the model described above together with the correlations learned and recorded for that model. A search can be carried out which is driven by the calculated difference between the model and a given target image. In practical terms, the fit of the existing model is gradually improved until the model approximately covers the target^2.32. This is done purely by changing the values of the model parameters. Having passed through many incorrect states, the model then holds, in the form of its parameter values, some information about the target image, and this information can be analysed further. One parameter in a model of faces, for example, could describe the vertical angle of a given face. This is also where the power of a statistical model lies: it can describe something compound in a very compact form.

The search for a model match relies on error (or, conversely, similarity) measures which are recalculated after each attempted parameterisation of the model. Once a change has been applied to the parameters, a new estimate of the difference is obtained. Each such change in parameter values is guided primarily by the matrices described on page $[*]$. These express the correlation between the modes of variation (the similarity transformations as well as the modes of appearance change) and the intensity values which describe the difference (or match discrepancy).
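
One common way of writing such a parameter update is given here as a sketch, not necessarily the exact formulation used in this work. It assumes a single matrix $\mathbf{R}$ (a name introduced here for illustration) learned off-line by regressing known parameter displacements on the difference vectors they produce. With $\mathbf{c}$ collecting the model parameters, $\delta\mathbf{g}$ denoting the vector of intensity differences between the synthesised model and the target, and $k$ an assumed step-scaling coefficient, the predicted correction is:

$$
\delta\mathbf{g} = \mathbf{g}_{\mathrm{model}}(\mathbf{c}) - \mathbf{g}_{\mathrm{target}}, \qquad
\delta\mathbf{c} = \mathbf{R}\,\delta\mathbf{g}, \qquad
\mathbf{c}' = \mathbf{c} - k\,\delta\mathbf{c}
$$

The sign of the update depends on the convention under which $\mathbf{R}$ was learned. Here it is assumed to predict the displacement of the parameters away from their best-fitting values, so that displacement is subtracted.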

The model, as shown in Figure $[*]$ (or earlier on in Figure cap:A-target-image-2), is initially placed somewhere inside the image frame, reasonably close to its target. If the model is placed too far from its intended target, there is a danger that it will fail to converge to the target correctly. It will most likely get stuck in a local minimum (the global minimum being out of reach, as Section $[*]$ explains), and the consequences can be severe in critical applications such as medical imaging (or, more drastically, computer-guided or computer-aided surgery). Good initialisation is essential for two reasons: large displacements are rarely learned off-line, and the difference between the target and the model is largely meaningless unless there is at least some partial overlap or commonality.
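
The need for partial overlap can be illustrated with a one-dimensional toy problem. The sketch below is an assumption made purely for illustration and is not taken from this work: the "model" and the "target" are both Gaussian bumps, and the only parameter is the position of the model. When the two bumps overlap, the squared difference changes noticeably as the model moves; when they do not, it is essentially flat and offers no guidance to a search.

```python
import numpy as np

# Hypothetical 1-D "image" axis; all names here are illustrative only.
x = np.arange(200, dtype=float)

def bump(centre, width=5.0):
    """A Gaussian bump standing in for a synthesised model or a target."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

target = bump(100.0)

def error(centre):
    """Sum of squared intensity differences between model and target."""
    r = bump(centre) - target
    return float(r @ r)

def slope(centre, h=0.5):
    """Central finite-difference estimate of d(error)/d(centre)."""
    return (error(centre + h) - error(centre - h)) / (2 * h)

near = slope(95.0)   # model overlaps the target: clear downhill direction
far = slope(30.0)    # no overlap: the error surface is flat
print(near, far)
```

With the model started at position 95 the slope is clearly negative, pointing towards the target at 100. Started at position 30, the slope is numerically indistinguishable from zero, so the difference measure carries no information about where the target lies.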

The algorithm used to perform the search, which runs quite rapidly, has the following general form:

1. Place the appearance model M somewhere in the image, preferably at the centre, where the target of interest (denoted by I) is likely to lie^2.33.
2. For the appearance model in its current state and the static target, perform the following:
   - Calculate the difference between the model and the target. This can be done by synthesising M and calculating M - I.
   - Using the correlations learned off-line^2.34, set new values for the parameters $\mathbf{c}_{i}$ of M.
   - Compute the new difference measure between the model and the target (as previously).
     - Save the new state of the appearance model if the difference has been lowered, i.e. the model has moved closer to the target.
     - If not, re-adjust the parameter change, for example by applying a scaling coefficient to it. This often achieves good results, although it is a heuristic technique.
3. Repeat the previous step until convergence is reached or improvements are no longer observed.
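
The steps above can be sketched in Python as follows. This is a minimal illustration under strong assumptions, not the implementation used in this work. The "appearance model" is a toy linear texture model with no shape or pose warp, the update matrix `R` is learned by ordinary least squares from random perturbations, and all names (`synthesise`, `learn_update_matrix`, `search`) and the list of trial scaling coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an appearance model (an assumption for illustration):
# a synthesised texture is mean + modes @ c.
n_pixels, n_params = 200, 4
mean = rng.normal(size=n_pixels)
modes = np.linalg.qr(rng.normal(size=(n_pixels, n_params)))[0]  # orthonormal columns

def synthesise(c):
    return mean + modes @ c

def learn_update_matrix(n_samples=300, scale=0.5):
    """Off-line stage: regress known parameter displacements on residuals."""
    dC = rng.normal(scale=scale, size=(n_samples, n_params))
    c0 = rng.normal(size=(n_samples, n_params))
    dG = np.array([synthesise(c + dc) - synthesise(c) for c, dc in zip(c0, dC)])
    # Least-squares solution of dG @ X ~= dC; R = X.T maps a residual
    # to the predicted parameter displacement.
    X, *_ = np.linalg.lstsq(dG, dC, rcond=None)
    return X.T

def search(target, R, c_init, max_iter=30, tol=1e-10):
    c = c_init.copy()
    residual = synthesise(c) - target          # M - I
    err = residual @ residual
    for _ in range(max_iter):
        dc = R @ residual                      # predicted displacement
        improved = False
        for k in (1.0, 0.5, 0.25, 1.5):        # heuristic step scalings
            c_try = c - k * dc
            r_try = synthesise(c_try) - target
            e_try = r_try @ r_try
            if e_try < err:                    # keep the state only if better
                c, residual, err = c_try, r_try, e_try
                improved = True
                break
        if not improved or err < tol:          # stalled or converged
            break
    return c, err

R = learn_update_matrix()
c_true = np.array([1.0, -2.0, 0.5, 3.0])
target = synthesise(c_true)
c_fit, err = search(target, R, c_init=np.zeros(n_params))
print(np.abs(c_fit - c_true).max(), err)
```

Because the toy model is exactly linear, the learned matrix predicts the displacement almost perfectly and the search settles very quickly. A real appearance model is only approximately linear around the optimum, which is why several iterations and the scaled retries are needed in practice.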

More advanced methods and algorithms are in use at present, but the simple form above is easier to follow.

**Figure:** Model and target fitting.

The technique of matching an appearance model to a target image is best depicted by a staged simulation, a video clip or a long sequence of images resembling the one in Figure $[*]$. Somewhat remarkably, only a few dozen iterations are required to obtain a good match. This of course depends on the algorithm, the scale of the problem and its inherent complexity.

As a simple example, compare the fitting of a perfectly round ball with that of a human hand. Assuming good contrast between the ball and the background, there should be few false indications of a good fit. Inspection of the difference image is then almost trivial for a human observer in this case, whereas fingers are deceptive in the case of hands. By the same argument, the process of learning the correlations should often be custom-built. It should at least take the complexity of the problem into account, because sensitivity to change and the ability to match (much like the ability to recover) vary greatly in practice.


2004-08-02