Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition

Joao F. Henriques, Joao Carreira, Rui Caseiro, Jorge Batista; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013, pp. 2760-2767

Abstract

Competitive sliding window detectors require vast training sets. Since a pool of natural images provides a nearly endless supply of negative samples, in the form of patches at different scales and locations, training with all the available data is considered impractical. A staple of current approaches is hard negative mining, a method of selecting relevant samples, which is nevertheless expensive. Given that samples at slightly different locations have overlapping support, there seems to be an enormous amount of duplicated work. It is natural, then, to ask whether these redundancies can be eliminated. In this paper, we show that the Gram matrix describing such data is block-circulant. We derive a transformation based on the Fourier transform that block-diagonalizes the Gram matrix, at once eliminating redundancies and partitioning the learning problem. This decomposition is valid for any dense features and several learning algorithms, and takes full advantage of modern parallel architectures. Surprisingly, it allows training with all the potential samples in sets of thousands of images. By considering the full set, we generate in a single shot the optimal solution, which is usually obtained only after several rounds of hard negative mining. We report speed gains on Caltech Pedestrians and INRIA Pedestrians of over an order of magnitude, allowing training on a desktop computer in a couple of minutes.
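To make the abstract's central claim concrete, here is a minimal NumPy sketch of the decomposition in a deliberately simplified setting: 1-D, single-channel samples and dual ridge regression, rather than the paper's 2-D multi-channel image features and full detector pipeline. All names (n, s, lam) and the toy labels are our own assumptions for illustration, not the authors' code. The Gram matrix of all cyclic shifts of a set of base samples is block-circulant, so the DFT splits one large learning problem into an independent small problem per Fourier frequency, and both routes produce the same solution.

import numpy as np

rng = np.random.default_rng(0)
n, s, lam = 4, 8, 0.1                       # base samples, shifts per sample, ridge weight (our choices)

X0 = rng.standard_normal((n, s))            # n base samples of length s

# Full training set: every cyclic shift of every base sample -> n*s rows.
X = np.vstack([np.roll(x, u) for x in X0 for u in range(s)])
y = rng.standard_normal(n * s)              # arbitrary toy labels

# Direct dual ridge regression over all n*s samples: one O((n*s)^3) solve.
G = X @ X.T                                 # Gram matrix; its s x s blocks are circulant
alpha = np.linalg.solve(G + lam * np.eye(n * s), y)

# Fourier route: the DFT block-diagonalizes G, leaving s independent
# n x n problems, one per frequency, each solvable in parallel.
Xh = np.fft.fft(X0, axis=1)                 # f-th column = f-th Fourier coefficient
Yh = np.fft.fft(y.reshape(n, s), axis=1)    # labels, transformed the same way
Ah = np.empty((n, s), dtype=complex)
for f in range(s):
    z = Xh[:, f]                            # one coefficient per base sample
    Gf = np.conj(z)[:, None] * z[None, :]   # n x n Gram block for frequency f
    Ah[:, f] = np.linalg.solve(Gf + lam * np.eye(n), Yh[:, f])
alpha_fft = np.fft.ifft(Ah, axis=1).real.ravel()

assert np.allclose(alpha, alpha_fft)        # identical solution, obtained piecewise

The direct solve costs O((ns)^3), while the partitioned route costs s solves of size n plus a handful of FFTs, and the s sub-problems are independent. This is the mechanism behind the reported order-of-magnitude speedups and the suitability for modern parallel architectures.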

Related Material

[pdf]
[bibtex]
@InProceedings{Henriques_2013_ICCV,
    author    = {Henriques, Joao F. and Carreira, Joao and Caseiro, Rui and Batista, Jorge},
    title     = {Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition},
    booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
    month     = {December},
    year      = {2013},
    pages     = {2760-2767}
}