Progress Report
January 17th, 2006
Overview
- Tasks which were agreed on
- ISBI submission
- Summary of E-mail exchange - figures and explanations (chronological)
- Miscellaneous points of progress
- Steps ahead
- Note: these slides are poorly organised, but very detailed
Tasks/Agreed Upon Previously
- 10 instantiations for synthetic -> synthetic calculations
- Different subsets to be used in the above instantiations to learn about the standard deviation
- Training -> synthetic direction in entropy calculations to be fixed (reversed)
- Brain shape - too wide, so investigate the oddities
- ISBI camera-ready - Final ISBI paper by February 1st - circulate, submit, register
- Paper submission draft (TMI) - text has seen no change since the last meeting
- Error propagation in entropy calculations - document composed and sent
ISBI Paper
- Accepted for presentation
- Type of presentation (oral/poster) for the paper will be determined in the upcoming weeks when they prepare the final program
- Will be informed of the decision by the end of this month
- The deadline for final paper submission is February 1st
- Reviewers' comments already taken into account
- Changes made particularly where the reviewers understood the paper and made concrete suggestions
ISBI: Draft for Camera-Ready Version
- Camera-ready candidate sent
- Applied the requested changes to the version we had originally submitted
- Need to recover some space and reduce the paper to just 4 pages; space was lost due to an enlarged figure
- The current version is something to encourage comments, suggestions and corrections
ISBI Paper - Modifications
- Modified the paper to comply with the suggestion made by the reviewer/s
- Made the changes highlighted in the printed copy after we had submitted (grammar, margins)
- Acknowledgements were added, as per Bill's request/suggestion
- This revised version also replaces the single figure which illustrates the shuffle distance calculation
Document on Error Estimation in Entropy
- This is a document which needed to be produced at some stage
- The explanations are rather fuzzy, but it is not a formal deliverable
- Explanation is included about the general concepts, not just the errors
Monday Meeting Talk
- Already e-mailed and spoken to Karl about an available slot
- Sent him another reminder message
- He just needed reminding, so hopefully the rota will shortly be updated
- A slot has been promised, but the rota is currently heavily occupied and needs re-shuffling
Specificity in a Mockup Case
- Plot which illustrates the effect of separating 2 synthetic distributions
- Deals with simplified dispersion of the distributions and its effect on Specificity
- The parameters chosen:
number_of_examples=100
numbers_of_dimensions=100
number_of_repetitions=10
- 10 steps (sample points), ranging from an offset (shift) of 0.1 to 1 inclusive
- Euclidean distance only at this stage
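The setup above can be sketched in Python (the original code appears to be MATLAB; the function and variable names here, and specificity as a mean nearest-neighbour distance, are assumptions for illustration):

```python
import numpy as np

# Two Gaussian clouds, one shifted diagonally by offset a; specificity
# taken here as the mean nearest-neighbour Euclidean distance from the
# synthetic cloud to the training cloud (an assumed formulation).
rng = np.random.default_rng(0)

number_of_examples = 100
numbers_of_dimensions = 100
offsets = np.linspace(0.1, 1.0, 10)   # 10 sample points, 0.1 to 1 inclusive

def specificity(training, synthetic):
    # pairwise Euclidean distances: one row per synthetic example
    d = np.linalg.norm(synthetic[:, None, :] - training[None, :, :], axis=2)
    return d.min(axis=1).mean()       # mean nearest-neighbour distance

spec_curve = []
for a in offsets:
    training = rng.standard_normal((number_of_examples, numbers_of_dimensions))
    synthetic = rng.standard_normal((number_of_examples, numbers_of_dimensions)) + a
    spec_curve.append(specificity(training, synthetic))
```

As the offset grows, the clouds separate and the specificity value rises, which is the effect the plots illustrate.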
Specificity in a Mockup Case - Ctd.
- Need to ensure that overlap between the distributions is decreased in a better way
- Also need to make the entropy function more generalisable so that it handles this synthetic set
- The direction between the training and synthetic sets was reversed at this earlier stage
- Previously we calculated training-to-synthetic distances
- From then on we calculated the reverse, synthetic-to-training
- All the matrix data was already there, but the reversal invalidated some plots we had been looking at
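The reason no new distances were needed is that both directions use the same distance matrix, just minimised along the other axis. A minimal sketch (names and the exact measure are assumptions):

```python
import numpy as np

# One training-by-synthetic distance matrix serves both directions.
rng = np.random.default_rng(1)
training = rng.standard_normal((50, 10))
synthetic = rng.standard_normal((1000, 10))

# D[i, j] = Euclidean distance from training example i to synthetic example j
D = np.linalg.norm(training[:, None, :] - synthetic[None, :, :], axis=2)

# training -> synthetic: each training example to its nearest synthetic one
training_to_synthetic = D.min(axis=1).mean()
# synthetic -> training: each synthetic example to its nearest training one
synthetic_to_training = D.min(axis=0).mean()
```

Reversing the direction only changes the axis of the minimisation, which is why the existing matrices could be reused while some derived plots became invalid.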
Specificity - Distribution Moves Diagonally
Specificity - Distribution Moves at Random
Changing the size of the Variance - Effect on Specificity
Changing the size of the Variance - Effect on Entropy
Entropy Versus Distribution Overlap
- Separation of the distributions is now handled by moving one of them along a random direction (a unit vector) by some distance 'a'
- Re-plotted Specificity and implemented the necessary entropy 'hooks', which make it possible to plot entropy for different values of alpha
- The values of alpha in this plot correspond to the greyscale values of the line colours, which lie in the range [0,1]
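The random-direction displacement can be sketched as follows (a minimal Python illustration; variable names are assumptions):

```python
import numpy as np

# Draw a random unit vector and move one cloud along it by distance a.
# Normalising guarantees the displacement has length exactly a.
rng = np.random.default_rng(2)
d = 100
v = rng.standard_normal(d)
unit = v / np.linalg.norm(v)          # random direction on the unit sphere

a = 0.7                               # separation distance
cloud = rng.standard_normal((100, d))
shifted = cloud + a * unit            # every example moved by the same vector
```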
Entropy Variation - Different Alpha Values
Specificity with Error Bars
- The offsets are drawn from a normal distribution
- Distribution with mean zero and unit variance
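One way the error bars could be formed (this construction is an assumption; the toy measure below stands in for the real specificity computation):

```python
import numpy as np

# Repeat the experiment with a fresh offset drawn from N(0, 1) each time,
# and take the standard deviation across repetitions as the error bar.
rng = np.random.default_rng(3)
number_of_repetitions = 10

measure = []
for _ in range(number_of_repetitions):
    offset = rng.standard_normal()                      # mean 0, unit variance
    cloud = rng.standard_normal((100, 100)) + offset
    measure.append(np.linalg.norm(cloud.mean(axis=0)))  # toy stand-in measure
mean_val = np.mean(measure)
error_bar = np.std(measure, ddof=1)                     # plotted as the error bar
```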
Entropy Curves With Errors
- These errors (due to cross-instantiation differences) are so small that they arouse suspicion
- Apparently, these were calculated the right way at this early stage
- Later results indicate that the errors are probably correct
Entropy and Specificity Compared
- A couple of curves which suggest that entropy offers a better discriminant
- The error bars, however, reflect only inter-instantiation variance
- They do not account for the error in the measures themselves
Comparing Specificity and Entropy
From 100 Dimensions to 10,000 Dimensions
- For comparison, shown are plots derived from data with 100 and 10,000 dimensions
10,000 Dimensions Example - Specificity
10,000 Dimensions Example - Entropy
Distribution Size, Displacement
- The figures show the effect of moving the clouds apart by a large amount, where the increase becomes linear
- Also shown is the effect of increasing the variance of one of the distributions by up to 60%
- Reading the graphs vertically (at a fixed value on the X axis) shows that the degradation is not linear
- As seen before, the gap between these points increases (easiest to see in the Specificity plots)
- Specificity is measured over a matrix with Euclidean distances
- Alpha for graph entropy is set to 0.3, quite arbitrarily
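The report does not define the alpha-parameterised entropy. If it is a Rényi alpha-entropy (an assumption on our part), a discrete version reads H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha), recovering Shannon entropy as alpha -> 1:

```python
import numpy as np

# Renyi alpha-entropy of a discrete distribution (assumed definition;
# the report's graph entropy may differ in how p is obtained).
def renyi_entropy(p, alpha):
    p = np.asarray(p, dtype=float)
    p = p / p.sum()                       # normalise to a distribution
    if abs(alpha - 1.0) < 1e-12:
        p = p[p > 0]
        return float(-(p * np.log(p)).sum())          # Shannon limit
    return float(np.log((p ** alpha).sum()) / (1.0 - alpha))
```

For a uniform distribution the value is log(n) for every alpha, so differences between alpha values only appear for non-uniform distributions.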
Training set, Synthetic Set and Synthetic Set Subsample
- Set the size of the sets to:
- Training set: 50 images
- Synthetic set: 1000 images
- Synthetic set subsample: 50 images
- The plots of entropy indicate that the choice of alpha is rather important
- Also included are plots from an experiment where the synthetic set is of size 50
- This makes the sizes more balanced, which in turn makes the distances comparable and the curves intersect
Small-Scale Example - Specificity
Large-Scale Example - Specificity
Small-Scale Example - Total Entropy
Large-Scale Example - Total Entropy
3 Entropy Plots Combined
- Let us look more closely at the calculation of the entropy
- A written explanation (legend) is included in the first of the two images
- They show H(G(A->A)), H(G(A->B)) and H(G(A->B)) - H(G(A->A)), each using a different colour
- The smaller example also uses greyscale to indicate the value of alpha
The Effect of Varying Alpha
Alpha Variation in Larger Experiments
Sensitivity Implemented
- At this stage, very coarse sensitivity plots were produced, which do not allow for valid comparisons
- Some time that day, all the machines were taken over by someone else using their full power
- Others rarely 'nice' their jobs, so peers can only get a tiny fraction of the CPU capacity
Sensitivity of the Specificity - Small Sample
Sensitivity of the Entropy - Small Sample
Sensitivity Plots - Observation
- This seems too strange to be correct
- All the sensitivity plots are almost identical
- This behaviour repeated itself in subsequent experiments that were larger
Implementation - Makeover
- It was worthwhile making the code more encapsulated
- Added the following arguments (with example value assignments):
n_steps=10;
deformation_extent_a_min=0.1;
deformation_extent_a_step=1/n_steps;
deformation_extent_a_max=1;
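The sweep those arguments imply can be sketched in Python (the original appears to be MATLAB; only the stepping logic is shown):

```python
import numpy as np

# Step the deformation extent a from a_min to a_max in n_steps increments.
n_steps = 10
deformation_extent_a_min = 0.1
deformation_extent_a_step = 1 / n_steps
deformation_extent_a_max = 1

# half-step padding on the upper bound guards against floating-point
# round-off excluding the final value
a_values = np.arange(deformation_extent_a_min,
                     deformation_extent_a_max + deformation_extent_a_step / 2,
                     deformation_extent_a_step)
# a_values = [0.1, 0.2, ..., 1.0]
```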
Implementation - Subsequent Experiments
- Produced smooth curves that show how well-behaved the measures are, at least when synthetic data is handled
- One can control other parameters as well, e.g.:
number_of_examples_static=50; % AKA training set
number_of_examples_deformed=1000; % synthetic set
numbers_of_dimensions=100;
number_of_repetitions=10;
- Figure outputs which have been produced so far are all reproducible
- Some tweaking is still needed among the options
Further Experiments
- Work on producing all the distance matrices that will be necessary (as many as we can)
- Produced plots to show sensitivity based on a large experiment where:
- We have 10 instantiations
- The training (static) set is of size 50
- The synthetic (dynamic) set is of size 1000
- The number of dimensions is 1000
Notes on Incorrect Plots
- Taken a careful look at some past documents, the code, and the results
- Also re-ran a few smaller experiments
- After a while of debugging, it turned out that a pair of brackets was missing
- These were needed for the subtraction to take precedence over the division
- Took a while to re-produce sensitivity plots for large experiments
- Shown in the subsequent slides is something coarser as well as finer
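The bracket bug in miniature (the actual expression in the code is not shown in the report; this is a generic illustration of the precedence issue):

```python
# Without brackets, division binds tighter than subtraction,
# so a - b / c means a - (b / c), not the intended (a - b) / c.
a, b, c = 10.0, 4.0, 2.0
wrong = a - b / c        # a - (b / c) = 10 - 2 = 8.0
right = (a - b) / c      # (10 - 4) / 2 = 3.0
```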
Coarse Sensitivity Curve - Specificity
Coarse Sensitivity Curves - Entropy
Corrected Plots
- Re-plotted the sensitivity of entropy, which is now based on larger experiments
T size=50;
S_i size=50;
S_0 size=1000;
- 100 dimensions;
- 10 repetitions
- Shown are curves of entropy with alpha=0.5 and alpha=0.9
Entropy Sensitivity (Large Sample)