Outline/Thoughts About Operation

The remaining tasks are hard to name precisely, because they depend heavily on progress and on the results that this progress brings about. The plan going forward is to code the missing pieces of our conceptual framework to an adequate level, bring all the pre-processed datasets into a common frame of reference using the ICP2 implementation, then feed the image data into PCA and start building models for feasibility tests. After that, the data may need to be classified by criteria such as neutral versus non-neutral in order to achieve some sort of separability. For the time being this division is supplied in advance, which simplifies matters.
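The PCA stage of this plan can be sketched as follows. This is a minimal illustration only, not the project's actual code: it assumes NumPy is available, that each image has already been aligned (the ICP step is not shown) and flattened into one row of a matrix, and the function names `fit_pca`, `project` and `reconstruct` are hypothetical.

```python
import numpy as np

def fit_pca(samples, n_components):
    """Fit PCA to row-wise samples (one flattened, already aligned image per row)."""
    X = np.asarray(samples, dtype=float)
    mean = X.mean(axis=0)
    # SVD of the centred data; the rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]
    # Sample variance along each retained direction.
    variances = (s[:n_components] ** 2) / max(len(X) - 1, 1)
    return mean, components, variances

def project(samples, mean, components):
    """Express samples as coefficients in the PCA basis."""
    return (np.asarray(samples, dtype=float) - mean) @ components.T

def reconstruct(coeffs, mean, components):
    """Map PCA coefficients back to the original sample space."""
    return coeffs @ components + mean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.normal(size=(20, 12))  # synthetic stand-in for 20 flattened images
    mean, components, variances = fit_pca(images, n_components=3)
    coeffs = project(images, mean, components)
    print(coeffs.shape)
```

With labels supplied in advance (neutral versus non-neutral), the coefficients returned by `project` would be the natural place to look for separability between the two groups.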

Once the pre-processing step works reasonably well on many arbitrary images, it would be worthwhile to run the algorithm on thousands of images and save the results to disk, so that later experiments on them run more quickly. Experiments then become a two-phase process, with data preparation separated from modeling: the modeling phase only needs to open the stored images.
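The two-phase arrangement described above might look roughly like this. It is a sketch under stated assumptions: `preprocess` is a placeholder for the real pre-processing, the function names and the one-pickle-file-per-image layout are hypothetical, and only the Python standard library is used.

```python
import pickle
from pathlib import Path

def preprocess(raw):
    """Placeholder for the real pre-processing: here it only removes the mean."""
    mean = sum(raw) / len(raw)
    return [v - mean for v in raw]

def prepare(raw_items, cache_dir):
    """Phase 1: pre-process each item once and save the result to disk."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    for name, raw in raw_items.items():
        target = cache_dir / f"{name}.pkl"
        if target.exists():  # already prepared in an earlier run
            continue
        with target.open("wb") as fh:
            pickle.dump(preprocess(raw), fh)

def load_prepared(cache_dir):
    """Phase 2: load every cached item for modeling, without re-running phase 1."""
    prepared = {}
    for path in sorted(Path(cache_dir).glob("*.pkl")):
        with path.open("rb") as fh:
            prepared[path.stem] = pickle.load(fh)
    return prepared
```

Because `prepare` skips items that are already cached, an interrupted run over thousands of images can be resumed without repeating finished work.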

With an existing implementation of model-building, dealing with large sets should be possible, although it can consume a lot of computing resources. It is desirable to plan very carefully which sets of experiments are wanted here. Should we reproduce the experiments from Mian's group, or branch out to explore different aspects of the problem, perhaps building a hybrid of algorithms by fusing in some homebrew code that makes use of what previously worked well, and thereby yield unique work with a more local 'flavour' rather than a reproduction of what is presented in IJCV? Novelty is required for papers, but a plan needs to be outlined along with a hypothesis. By now, some of these questions have been answered and will be explored further later. These paragraphs document early discussions.

In terms of timeframe, things have progressed reasonably well so far, despite major resource limitations: overly occupied server cores, no fixed computer in the lab, and no local access to MATLAB at home (only GNU Octave). After all sorts of account-related issues that consumed a lot of time, and with all the data now in place, things should progress more smoothly from now on. The literature is also well understood, and images with known properties are in place (all ~100 gigabytes of them). This is beyond the scope of this document, though. Some areas already addressed or still being addressed are:

automatic classification as neutral or non-neutral
efficient testing of spike and hole handling (see Figure $[*]$)
converting to centimeters and making the code robust to scale changes by adapting it to the given measurements
replicating the results of prior work, if possible (may require spending much time on work that has already been done)
acquiring sufficient computing resources (Amazon, Google, local servers, clusters, etc.) for the larger experiments to come
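
For the spike and hole handling item in the list above, one simple approach is a neighbourhood median. The sketch below is an assumption, not the method used in this work: it treats a range image as a list of rows, marks holes with `None`, and replaces any hole or outlying value with the median of its valid 3x3 neighbours. The names `neighbours`, `clean_range_image` and `spike_threshold` are hypothetical.

```python
from statistics import median

def neighbours(grid, r, c):
    """Valid (non-hole) values in the 3x3 neighbourhood of (r, c), centre excluded."""
    vals = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                if grid[rr][cc] is not None:
                    vals.append(grid[rr][cc])
    return vals

def clean_range_image(grid, spike_threshold):
    """Replace spikes and fill holes using the median of valid neighbours."""
    out = [row[:] for row in grid]  # work on a copy; medians come from the input
    for r, row in enumerate(grid):
        for c, z in enumerate(row):
            near = neighbours(grid, r, c)
            if not near:
                continue  # nothing to estimate from; leave the cell unchanged
            m = median(near)
            if z is None or abs(z - m) > spike_threshold:
                out[r][c] = m
    return out
```

The threshold is expressed in the same units as the depth values, so it would need to be chosen after any conversion to centimeters.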

Roy Schestowitz 2012-01-08