Multi-source Multi-scale Counting in Extremely Dense Crowd Images

Haroon Idrees, Imran Saleemi, Cody Seibert, Mubarak Shah; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 2547-2554


We propose to leverage multiple sources of information to estimate the number of individuals present in an extremely dense crowd visible in a single image. Due to problems including perspective, occlusion, clutter, and few pixels per person, counting by human detection in such images is almost impossible. Instead, our approach relies on multiple sources, such as low-confidence head detections, repetition of texture elements (using SIFT), and frequency-domain analysis, to estimate the count in an image region, along with a confidence associated with observing individuals there. We then enforce a global consistency constraint on the counts using a Markov Random Field. This accounts for disparities in counts across local neighborhoods and across scales. We tested our approach on a new dataset of fifty crowd images containing 64K annotated humans, with head counts ranging from 94 to 4543. This is in stark contrast to the datasets used for existing methods, which contain no more than tens of individuals. We experimentally demonstrate the efficacy and reliability of the proposed approach by quantifying the counting performance.
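To make the idea of a consistency constraint over neighboring counts concrete, here is a minimal sketch. It is not the authors' formulation: the abstract does not specify the energy function or the inference procedure, so this substitutes a simple quadratic (Gaussian) smoothness energy on a 4-connected grid of patches, minimized by coordinate-wise updates. The function name `smooth_counts`, the weight `lam`, and the per-patch confidence grid `conf` are all hypothetical and chosen for illustration only.

```python
def smooth_counts(counts, conf, lam=1.0, iters=200):
    """Illustrative stand-in for a consistency constraint on patch counts.

    Minimizes, by Gauss-Seidel coordinate updates,
        sum_i conf[i] * (x[i] - counts[i])**2
        + lam * sum_{neighbors i~j} (x[i] - x[j])**2
    over a 4-connected grid. Assumes lam > 0 and a grid with at least
    two patches, so every denominator below is positive.
    """
    rows, cols = len(counts), len(counts[0])
    x = [list(row) for row in counts]  # start from the raw per-patch counts
    for _ in range(iters):
        for r in range(rows):
            for c in range(cols):
                # Current estimates of the in-bounds 4-connected neighbors.
                nbrs = [
                    x[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]
                # Closed-form minimizer of the energy in x[r][c] alone:
                # a confidence-weighted average of the observed count
                # and the neighboring estimates.
                x[r][c] = (conf[r][c] * counts[r][c] + lam * sum(nbrs)) / (
                    conf[r][c] + lam * len(nbrs)
                )
    return x


# Hypothetical 3x3 grid: the center patch reports an implausible count
# with very low confidence, so it is pulled toward its neighbors.
raw = [[10.0, 10.0, 10.0], [10.0, 100.0, 10.0], [10.0, 10.0, 10.0]]
confidence = [[1.0, 1.0, 1.0], [1.0, 0.01, 1.0], [1.0, 1.0, 1.0]]
smoothed = smooth_counts(raw, confidence)
image_count = sum(sum(row) for row in smoothed)
```

In this toy setup, a patch whose estimate disagrees with its neighbors and carries little confidence is drawn toward the local consensus, while confident patches stay close to their observed counts. The image-level count is then the sum over patches.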

Related Material

author = {Idrees, Haroon and Saleemi, Imran and Seibert, Cody and Shah, Mubarak},
title = {Multi-source Multi-scale Counting in Extremely Dense Crowd Images},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2013}