Harvesting Discriminative Meta Objects With Deep CNN Features for Scene Classification

Ruobing Wu, Baoyuan Wang, Wenping Wang, Yizhou Yu; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1287-1295

Abstract


Recent work on scene classification still makes use of generic CNN features in a rudimentary manner. In this paper, we present a novel pipeline built upon deep CNN features to harvest discriminative visual objects and parts for scene classification. We first use a region proposal technique to generate a set of high-quality patches potentially containing objects, and apply a pre-trained CNN to extract generic deep features from these patches. Then we perform both unsupervised and weakly supervised learning to screen these patches and discover discriminative ones representing category-specific objects and parts. We further apply discriminative clustering enhanced with local CNN fine-tuning to aggregate similar objects and parts into groups, called meta objects. A scene image representation is constructed by pooling the feature response maps of all the learned meta objects at multiple spatial scales. We confirm that the scene image representation obtained with this pipeline delivers state-of-the-art performance on two popular scene benchmark datasets, MIT Indoor 67 [22] and SUN397 [31].
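
The final step of the pipeline, pooling meta-object response maps at multiple spatial scales, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the pooling operator or the grid sizes, so max pooling and the pyramid levels (1x1 and 2x2) are assumptions, and the function name `spatial_pyramid_pool` is hypothetical. Each response map is taken to be a 2D grid of scores for one meta object.

```python
from typing import List, Sequence

ResponseMap = List[List[float]]  # one 2D score grid per meta object (assumed layout)


def spatial_pyramid_pool(
    response_maps: Sequence[ResponseMap],
    levels: Sequence[int] = (1, 2),  # assumed grid sizes, not taken from the paper
) -> List[float]:
    """Max-pool every response map over an n x n grid for each level n.

    The output is ordered by level, then grid cell (row-major), then map,
    so its length is len(response_maps) * sum(n * n for n in levels).
    """
    feature: List[float] = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                for m in response_maps:
                    h, w = len(m), len(m[0])
                    if n > h or n > w:
                        raise ValueError("grid is finer than the response map")
                    # Integer cell boundaries; cells are non-empty because n <= h, w.
                    r0, r1 = i * h // n, (i + 1) * h // n
                    c0, c1 = j * w // n, (j + 1) * w // n
                    # Max pooling is an assumption; the abstract only says "pooling".
                    feature.append(
                        max(m[r][c] for r in range(r0, r1) for c in range(c0, c1))
                    )
    return feature
```

With K meta objects and levels (1, 2), the resulting image representation has 5K dimensions, which could then be passed to any standard classifier.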

Related Material


[bibtex]
@InProceedings{Wu_2015_ICCV,
author = {Wu, Ruobing and Wang, Baoyuan and Wang, Wenping and Yu, Yizhou},
title = {Harvesting Discriminative Meta Objects With Deep CNN Features for Scene Classification},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015},
pages = {1287-1295}
}