Before turning to cascade classifiers I tried sliding-window template matching, in which a match score is computed in one of 6 possible ways (all provided by and implemented in OpenCV). I dropped this idea quickly because of performance problems that surfaced when debugging and profiling the code. The profiling showed template matching to be by far the most expensive stage: in the run below it takes about 6 seconds per frame (10 seconds for the last frame), more than an order of magnitude longer than all the other stages combined [4]. Here is an example sequence of operations as echoed to System.out:
05-28 23:42:41.558: I/System.out(10194): Fetching template from the Internet
05-28 23:42:41.628: I/System.out(10194): Resizing template
05-28 23:42:41.638: I/System.out(10194): Height: 20
05-28 23:42:41.638: I/System.out(10194): Width: 20
05-28 23:42:56.058: I/System.out(10194): Starting Canny edge detection
05-28 23:42:56.248: I/System.out(10194): Starting Gaussian Blur
05-28 23:42:56.338: I/System.out(10194): Encoding conversions
05-28 23:42:56.338: I/System.out(10194): Matching template
05-28 23:43:02.598: I/System.out(10194): Frame completed
05-28 23:43:02.728: I/System.out(10194): Starting Canny edge detection
05-28 23:43:02.888: I/System.out(10194): Starting Gaussian Blur
05-28 23:43:02.968: I/System.out(10194): Encoding conversions
05-28 23:43:02.978: I/System.out(10194): Matching template
05-28 23:43:09.108: I/System.out(10194): Frame completed
05-28 23:43:09.228: I/System.out(10194): Starting Canny edge detection
05-28 23:43:09.408: I/System.out(10194): Starting Gaussian Blur
05-28 23:43:09.498: I/System.out(10194): Encoding conversions
05-28 23:43:09.498: I/System.out(10194): Matching template
05-28 23:43:15.718: I/System.out(10194): Frame completed
05-28 23:43:15.818: I/System.out(10194): Starting Canny edge detection
05-28 23:43:15.988: I/System.out(10194): Starting Gaussian Blur
05-28 23:43:16.078: I/System.out(10194): Encoding conversions
05-28 23:43:16.078: I/System.out(10194): Matching template
05-28 23:43:22.018: I/System.out(10194): Frame completed
05-28 23:43:22.198: I/System.out(10194): Starting Canny edge detection
05-28 23:43:22.368: I/System.out(10194): Starting Gaussian Blur
05-28 23:43:22.448: I/System.out(10194): Encoding conversions
05-28 23:43:22.458: I/System.out(10194): Matching template
05-28 23:43:29.078: I/System.out(10194): Frame completed
05-28 23:43:29.458: I/System.out(10194): Starting Canny edge detection
05-28 23:43:30.028: I/System.out(10194): Starting Gaussian Blur
05-28 23:43:30.248: I/System.out(10194): Encoding conversions
05-28 23:43:30.258: I/System.out(10194): Matching template
05-28 23:43:40.278: I/System.out(10194): Frame completed
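To illustrate why the matching stage dominates the log above, here is a minimal pure-Java sketch of sliding-window matching with a squared-difference score (the idea behind OpenCV's TM_SQDIFF method). It is not OpenCV's implementation, which may be optimised quite differently; the class name, method name and toy data are hypothetical. A naive search over a W x H frame with a w x h template costs about (W - w + 1)(H - h + 1) * w * h pixel comparisons, so with the 20 x 20 template from the log and an assumed 640 x 480 frame (the frame size is not stated in the log) that is on the order of 10^8 operations per frame.

```java
public class SlidingWindowMatch {

    /**
     * Slides templ over every position of image and returns
     * {row, col, score} for the lowest sum of squared differences.
     * Both arrays hold grey levels, indexed as [row][col].
     */
    static long[] bestMatch(int[][] image, int[][] templ) {
        int imageRows = image.length, imageCols = image[0].length;
        int templRows = templ.length, templCols = templ[0].length;
        long bestScore = Long.MAX_VALUE;
        int bestRow = -1, bestCol = -1;

        // Outer two loops: every window position where the template fits.
        for (int r = 0; r + templRows <= imageRows; r++) {
            for (int c = 0; c + templCols <= imageCols; c++) {
                long score = 0;
                // Inner two loops: compare the window with the template.
                for (int i = 0; i < templRows; i++) {
                    for (int j = 0; j < templCols; j++) {
                        long d = image[r + i][c + j] - templ[i][j];
                        score += d * d;
                    }
                }
                if (score < bestScore) {
                    bestScore = score;
                    bestRow = r;
                    bestCol = c;
                }
            }
        }
        return new long[] {bestRow, bestCol, bestScore};
    }

    public static void main(String[] args) {
        // Toy data: a 6 x 6 black image with the template planted at row 3, col 2.
        int[][] templ = {{9, 1}, {1, 9}};
        int[][] image = new int[6][6];
        image[3][2] = 9; image[3][3] = 1;
        image[4][2] = 1; image[4][3] = 9;

        long[] best = bestMatch(image, templ);
        System.out.println("row=" + best[0] + " col=" + best[1] + " score=" + best[2]);
        // prints "row=3 col=2 score=0"
    }
}
```

The four nested loops are the point of the sketch: every extra pixel in the frame multiplies the work by the full template area, which is why shrinking the frame or the search region is the obvious first step when thinning down such a pipeline.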
For real-time applications the pipeline has to be thinned down carefully. I am using only one CPU core, which makes the setup reasonably similar to ARM-based smartphone hardware (more on the hardware in Section 4).
To visualise the profiling information above I wrote a shell script, which I have made freely available [5]. It turns this text into a graph, rendered with gnuplot and other command-line GNU tools. The bumps show just how considerable the toll of template matching really was at the time (see the accompanying figure).
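The shell script itself is not reproduced here. As a rough illustration of its first step, the following Java sketch extracts per-stage durations from log lines of the form shown above and prints them as two columns that a plotting tool such as gnuplot could read. It assumes that a stage lasts until the next logged line and that the log does not cross midnight; the class and method names are hypothetical and the approach is not necessarily the one taken by the script.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StageTimings {

    // Matches e.g. "05-28 23:42:56.338: I/System.out(10194): Matching template"
    // group(1) = time of day, group(2) = message text.
    private static final Pattern LINE = Pattern.compile(
        "^\\d{2}-\\d{2} (\\d{2}:\\d{2}:\\d{2}\\.\\d{3}): I/System\\.out\\(\\d+\\): (.*)$");

    /**
     * Returns, in milliseconds, how long each occurrence of the given stage
     * lasted, measured from its own log line to the next log line.
     * Assumption: timestamps never wrap past midnight.
     */
    static List<Long> stageMillis(List<String> lines, String stage) {
        List<Long> durations = new ArrayList<>();
        LocalTime previousTime = null;
        String previousMessage = null;
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (!m.matches()) {
                continue; // skip anything that is not a System.out log line
            }
            LocalTime time = LocalTime.parse(m.group(1));
            if (previousTime != null && stage.equals(previousMessage)) {
                durations.add(Duration.between(previousTime, time).toMillis());
            }
            previousTime = time;
            previousMessage = m.group(2);
        }
        return durations;
    }

    public static void main(String[] args) {
        // Lines copied from the log above.
        List<String> log = Arrays.asList(
            "05-28 23:42:56.338: I/System.out(10194): Matching template",
            "05-28 23:43:02.598: I/System.out(10194): Frame completed",
            "05-28 23:43:02.728: I/System.out(10194): Starting Canny edge detection",
            "05-28 23:43:02.888: I/System.out(10194): Starting Gaussian Blur",
            "05-28 23:43:02.968: I/System.out(10194): Encoding conversions",
            "05-28 23:43:02.978: I/System.out(10194): Matching template",
            "05-28 23:43:09.108: I/System.out(10194): Frame completed");

        List<Long> ms = stageMillis(log, "Matching template");
        // One "frame-index duration-ms" pair per line.
        for (int i = 0; i < ms.size(); i++) {
            System.out.println((i + 1) + " " + ms.get(i));
        }
        // prints "1 6260" then "2 6130"
    }
}
```

Calling stageMillis with another stage name, for example "Starting Gaussian Blur", gives the corresponding series for that stage, which makes the gap between template matching and everything else easy to plot side by side.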
Roy Schestowitz 2012-07-02