Deeply Learned Attributes for Crowded Scene Understanding

Jing Shao, Kai Kang, Chen Change Loy, Xiaogang Wang; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 4657-4666

Abstract


Crowded scene understanding is a fundamental problem in computer vision. In this study, we develop a multi-task deep model that jointly learns and combines appearance and motion features for crowd understanding. We propose crowd motion channels as the input to the deep model; their design is inspired by generic properties of crowd systems. To demonstrate our deep model, we construct a new large-scale WWW Crowd dataset with 10,000 videos from 8,257 crowded scenes, and define a set of 94 attributes on WWW. We further conduct a user study on WWW and compare human performance with that of the proposed deep models. Extensive experiments show that our deep models achieve significant improvements in cross-scene attribute recognition over strong baselines built on crowd-related features, and that the deeply learned features yield superior performance in multi-task learning.
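
The abstract describes a multi-task network that fuses appearance and motion inputs to predict the 94 WWW attributes. The snippet below is a minimal PyTorch sketch of that general setup; the branch depths, channel counts, number of motion channels, and concatenation-based fusion are illustrative assumptions and do not reproduce the architecture or motion channels proposed in the paper.

# Minimal sketch: two-branch, multi-label crowd attribute prediction.
# Layer sizes, motion-channel count, and fusion scheme are assumptions,
# not the model described in the paper.
import torch
import torch.nn as nn

NUM_ATTRIBUTES = 94  # size of the WWW attribute set

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions followed by 2x2 max pooling.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class CrowdAttributeNet(nn.Module):
    # Joint appearance + motion model predicting attribute logits.
    def __init__(self, motion_channels=3):
        super().__init__()
        # Appearance branch: RGB frame.
        self.appearance = nn.Sequential(
            conv_block(3, 32), conv_block(32, 64), conv_block(64, 128)
        )
        # Motion branch: stacked crowd motion channels (hypothetical count).
        self.motion = nn.Sequential(
            conv_block(motion_channels, 32), conv_block(32, 64), conv_block(64, 128)
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Concatenated features feed a multi-label attribute classifier.
        self.classifier = nn.Linear(128 * 2, NUM_ATTRIBUTES)

    def forward(self, frame, motion):
        a = self.pool(self.appearance(frame)).flatten(1)
        m = self.pool(self.motion(motion)).flatten(1)
        return self.classifier(torch.cat([a, m], dim=1))  # attribute logits

# Multi-label training with a per-attribute sigmoid cross-entropy loss.
model = CrowdAttributeNet()
frame = torch.randn(4, 3, 128, 128)    # batch of RGB frames
motion = torch.randn(4, 3, 128, 128)   # batch of motion-channel maps
labels = torch.randint(0, 2, (4, NUM_ATTRIBUTES)).float()
loss = nn.BCEWithLogitsLoss()(model(frame, motion), labels)

In this sketch a single frame and a single stack of motion maps stand in for the video inputs used in practice; attribute recognition is treated as 94 independent binary decisions, which is the standard formulation for multi-label attribute prediction.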

Related Material


[bibtex]
@InProceedings{Shao_2015_CVPR,
author = {Shao, Jing and Kang, Kai and Change Loy, Chen and Wang, Xiaogang},
title = {Deeply Learned Attributes for Crowded Scene Understanding},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2015}
}