Progress Report
February 22nd, 2005
Overview
- This progress report focuses on technical aspects
- More visual results than before
- Discussion and analysis are the most important part
Recapitulation of Main Ideas
- Demonstrate a model evaluation method that works
- Reveal the behaviour of evaluation as sigma (of noise) changes
- Find out how different distance measures affect evaluation
- Devise the algorithm/s to evaluate models
- Compare Tim's models (derived automatically from registration)
- Assess the 'goodness' of particular registration (automatic landmarking) algorithms
A Peek at Important Points
- Models were evaluated blind, without knowledge of which one is which
- Evaluation was unbiased
- No attempt was made to tune the algorithm to fit anticipated results
- Conclusion I: Elimination of redundant points improves the model
- Conclusion II: Group-wise registration outperforms pair-wise
Comparison of Distance Measures
- Distance measures affect ability to differentiate models
- Many choices can be made
- In the past few days:
- Measures were judged visually at the start
- Measures were compared later using a real example
Distance Measures

Distance Measures
- Squared intensity differences
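As a minimal sketch of this measure (not the exact code used in the experiments), the squared-intensity-difference distance between two equally sized images can be written as:

```python
import numpy as np

def squared_intensity_distance(a, b):
    """Mean of squared per-pixel intensity differences between two
    equally sized images. Illustrative sketch only; the report's
    implementation may use a sum rather than a mean."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.mean((a - b) ** 2)
```

Identical images give a distance of zero; any intensity difference increases it quadratically.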

Distance Measures
- Shuffle distance, window size 3x3 (radius 1)
- Intensity differences

Distance Measures
- Shuffle distance, window size 3x3 (radius 1)
- Squared intensity differences

Distance Measures
- Shuffle distance, window size 5x5 (radius 2)
- Intensity differences

Distance Measures
- Shuffle distance, window size 5x5 (radius 2)
- Squared intensity differences
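The shuffle-distance variants above can be sketched as follows: for each pixel of one image, take the minimum (optionally squared) intensity difference to any pixel of the other image within a small window, then average. This is a generic formulation under the stated window sizes, not necessarily the exact code used for the report:

```python
import numpy as np

def shuffle_distance(a, b, radius=1, squared=False):
    """Shuffle distance sketch: for each pixel of `a`, the minimum
    (optionally squared) intensity difference to pixels of `b` within
    a (2*radius+1)^2 window, averaged over all pixels.
    radius=1 gives a 3x3 window, radius=2 a 5x5 window."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    h, w = a.shape
    # Pad b so the window stays inside the array at image borders.
    pad = np.pad(b, radius, mode='edge')
    best = np.full((h, w), np.inf)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = pad[dy:dy + h, dx:dx + w]
            diff = np.abs(a - shifted)
            if squared:
                diff = diff ** 2
            best = np.minimum(best, diff)
    return best.mean()
```

Unlike the plain pixel-wise differences, this measure is tolerant of small spatial misalignments up to the window radius.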

Example Distance Matrix
- 500 syntheses, 104 images in training set
- Shuffle distance, window size 7x7
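The matrix itself is just all pairwise distances between synthesised and training images. A toy sketch, with the distance function left as a parameter (in the report it would be the 7x7 shuffle distance over 500 syntheses and 104 training images):

```python
import numpy as np

def distance_matrix(syntheses, training, dist):
    """Fill an (n_syntheses x n_training) matrix of pairwise distances
    using an arbitrary image-to-image distance function `dist`."""
    D = np.empty((len(syntheses), len(training)))
    for i, s in enumerate(syntheses):
        for j, t in enumerate(training):
            D[i, j] = dist(s, t)
    return D

# Toy example with scalar "images" and absolute difference.
D = distance_matrix([0.0, 1.0], [0.0, 2.0], lambda x, y: abs(x - y))
```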

Testing the Model Evaluation Method
- Take a hand-annotated dataset
- Add an increasing amount of noise to the annotations
- Expect the specificity measure to increase (worsen) as a function of noise
- Likewise for generalisability
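The test procedure can be sketched as below. The definitions of the perturbation and of specificity (mean distance from each model-generated sample to its nearest training example, so larger = worse) are assumptions for illustration; the report's exact measures may differ in normalisation:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(annotation, sigma):
    """Add Gaussian noise of standard deviation `sigma` to a set of
    hand-placed landmark coordinates (assumed perturbation scheme)."""
    annotation = np.asarray(annotation, dtype=float)
    return annotation + rng.normal(0.0, sigma, size=annotation.shape)

def specificity(samples, training, dist):
    """Mean distance from each model-generated sample to its nearest
    training example; expected to grow as annotation noise grows."""
    return np.mean([min(dist(s, t) for t in training) for s in samples])
```

Generalisability would be computed analogously, with the roles of samples and training examples exchanged.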
Results
- Behaviour as expected/hoped for
- Noise was increased nearly to the point where folding sets in
- Excessive folding degrades the model's behaviour

Results
- The specificity measure increases as the model degrades

Results
- Same with generalisability
- Relatively smooth curve

Results
- The matrix of distances for sigma_perturbation = 0..7
- Brighter colours indicate greater distance, i.e. difference

Practical Tests
- Data for the models was passed on by Tim without any description
- Many large experiments performed
- Results do not necessarily correspond to the same algorithm version, since the code was progressively improved
- Nonetheless, results remain consistent
Description of Three Experiments (Tim)
- Experiment 1: Pairwise registration with a regular grid of 16 x 16 points
- Experiment 2: Pairwise registration with a grid of 16 x 16 points, but
removing those in low variance regions
- Experiment 3: Groupwise registration using grid as in Experiment 2
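The point-pruning step of Experiment 2 can be sketched as follows. The criterion (per-pixel intensity variance across the training set) and the threshold are assumptions; the actual selection rule is not given in the slides:

```python
import numpy as np

def prune_grid_points(images, grid_points, threshold):
    """Keep only control points that fall in image regions whose
    intensity variance across the training set exceeds `threshold`.
    Sketch of the low-variance-removal idea from Experiment 2."""
    stack = np.stack([np.asarray(im, dtype=float) for im in images])
    variance = stack.var(axis=0)  # per-pixel variance over the set
    return [(r, c) for (r, c) in grid_points if variance[r, c] > threshold]
```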
Practical Tests: Visual

Practical Tests: Visual

Summary
- The work can be broken down into 3 stages:
- Reasoning about the model evaluation framework
- Showing that it works in principle
- Using it to evaluate Tim's results
Present and Future
- Another four 'anonymous' models are being evaluated at present
- IPMI: evaluation section still to be completed?