Data

All experiments use a Caltech-produced database of images showing the rear of cars. There may be other options out there, but not many. Training is done on those images, and testing is done on unseen videos or images that are similar to the training set but do not overlap with it. I have begun tracing some cars which were not included in the training set, and many examples are shown in video form (auxiliary material); the approach works reasonably well despite training on just 15-50 positives and 1000 negatives (commonly called background). Processing runs at around 5 frames per second when it is done properly. This could be sped up easily, especially because the current code is wasteful (debugging bits are the main culprits).

Figure: Examples of background images (negatives)

Figure: Examples of positives

Figure: Examples of positives with multiple scales (many rears of many cars)

To get more positives I will need to either do manual work or write a good script. The latter is not trivial: it requires a scanner/parser to pick up and collate the pertinent tokens, followed by some arithmetic, because the formats differ semantically, not just structurally (PASCAL being inherently different from OpenCV input is an impediment). The figures above show example images from the data set used for the following experiments.
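The kind of script mentioned above might look roughly like the following sketch. It is not the code used in these experiments. It assumes the PASCAL annotations are plain-text files containing an "Image filename" line and "Bounding box ... (Xmin, Ymin) - (Xmax, Ymax)" lines, and that the OpenCV side expects one line per image of the form "path count x y width height". The function name, the regular expressions and the sample annotation are all hypothetical, and whether the corner coordinates are inclusive (which would add 1 to width and height) is left as an assumption.

```python
import re

# Assumed layout of a PASCAL-style text annotation (an assumption, not
# confirmed by the text above): a quoted image filename and one
# corner-based bounding box line per object.
FILENAME_RE = re.compile(r'Image filename\s*:\s*"([^"]+)"')
BBOX_RE = re.compile(
    r'\(Xmin, Ymin\) - \(Xmax, Ymax\)\s*:\s*'
    r'\((\d+),\s*(\d+)\)\s*-\s*\((\d+),\s*(\d+)\)'
)


def pascal_to_opencv_line(annotation_text):
    """Turn one annotation into a 'path count x y w h ...' line."""
    name_match = FILENAME_RE.search(annotation_text)
    if name_match is None:
        raise ValueError("no image filename found in annotation")

    boxes = []
    for match in BBOX_RE.finditer(annotation_text):
        xmin, ymin, xmax, ymax = (int(v) for v in match.groups())
        # The semantic difference: corners become origin plus size.
        # Exclusive max coordinates are assumed here.
        boxes.append((xmin, ymin, xmax - xmin, ymax - ymin))

    fields = [name_match.group(1), str(len(boxes))]
    for box in boxes:
        fields.extend(str(v) for v in box)
    return " ".join(fields)


# Hypothetical annotation, made up for illustration only.
SAMPLE = '''Image filename : "cars/image_0001.png"
Bounding box for object 1 "PAScarRear" (Xmin, Ymin) - (Xmax, Ymax) : (121, 156) - (441, 328)
'''

if __name__ == "__main__":
    print(pascal_to_opencv_line(SAMPLE))
```

Running the function over every annotation file and writing the resulting lines to a single text file would give a positives list in the assumed OpenCV layout; images without any bounding box would come out with a count of 0 and should probably be skipped.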

Roy Schestowitz 2012-07-02