Look and Think Twice: Capturing Top-Down Visual Attention With Feedback Convolutional Neural Networks

Chunshui Cao, Xianming Liu, Yi Yang, Yinan Yu, Jiang Wang, Zilei Wang, Yongzhen Huang, Liang Wang, Chang Huang, Wei Xu, Deva Ramanan, Thomas S. Huang; The IEEE International Conference on Computer Vision (ICCV), 2015, pp. 2956-2964

Abstract


While feedforward deep convolutional neural networks (CNNs) have been a great success in computer vision, it is important to remember that the human visual cortex generally contains more feedback connections than forward connections. In this paper, we briefly introduce the background of feedback in the human visual cortex, which motivates us to develop a computational feedback mechanism in deep neural networks. The proposed networks perform inference from image features in a bottom-up manner, as traditional convolutional networks do; during feedback loops, they set up high-level semantic labels as the "goal" to infer the activation status of hidden-layer neurons. The feedback networks help us better visualize and understand how deep neural networks work, and capture visual attention on expected objects, even in images with cluttered backgrounds and multiple objects.
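The bottom-up pass followed by a label-driven feedback pass can be illustrated with a toy example. The sketch below is not the method from the paper: it assumes a single hidden ReLU layer with a linear readout, and it uses a simple heuristic (keep a hidden unit only if it has a positive weight toward the target class) as a stand-in for inferring hidden-neuron activation status. All names (`forward`, `feedback_gates`, `input_attention`) and the weights are hypothetical.

```python
# Toy sketch (assumed setup, not the paper's implementation):
# one hidden ReLU layer, binary gates on hidden units, linear class scores.

def relu(values):
    return [max(0.0, v) for v in values]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(x, W1, W2, gates):
    """Bottom-up pass; gates[j] in {0, 1} switches hidden unit j on or off."""
    hidden = [g * h for g, h in zip(gates, relu(matvec(W1, x)))]
    return hidden, matvec(W2, hidden)

def feedback_gates(W2, target):
    """Top-down pass: keep hidden units whose weight toward the target
    class score is positive, and switch the others off."""
    return [1 if w > 0 else 0 for w in W2[target]]

def input_attention(x, W1, W2, gates, target):
    """Gradient of the gated target score with respect to each input."""
    pre = matvec(W1, x)
    return [
        sum(
            W2[target][j] * gates[j] * (1.0 if pre[j] > 0 else 0.0) * W1[j][i]
            for j in range(len(W1))
        )
        for i in range(len(x))
    ]

# Hypothetical weights: 3 inputs, 3 hidden units, 2 classes.
W1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
W2 = [[1.0, -1.0, 0.5], [-1.0, 1.0, 0.0]]
x = [2.0, 1.0, 3.0]

_, scores_before = forward(x, W1, W2, [1, 1, 1])   # plain feedforward pass
gates_class0 = feedback_gates(W2, target=0)        # "goal" = class 0
_, scores_after = forward(x, W1, W2, gates_class0)
attention_class0 = input_attention(x, W1, W2, gates_class0, target=0)
attention_class1 = input_attention(x, W1, W2, feedback_gates(W2, 1), target=1)
```

In this toy setting, choosing a different target class selects a different set of hidden units and therefore a different attention pattern over the same input, which is the intuition behind using a semantic label as the goal of the feedback pass.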

Related Material


[bibtex]
@InProceedings{Cao_2015_ICCV,
author = {Cao, Chunshui and Liu, Xianming and Yang, Yi and Yu, Yinan and Wang, Jiang and Wang, Zilei and Huang, Yongzhen and Wang, Liang and Huang, Chang and Xu, Wei and Ramanan, Deva and Huang, Thomas S.},
title = {Look and Think Twice: Capturing Top-Down Visual Attention With Feedback Convolutional Neural Networks},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}
}