Event Retrieval in Large Video Collections with Circulant Temporal Encoding

Jerome Revaud, Matthijs Douze, Cordelia Schmid, Herve Jegou; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 2459-2466

Abstract


This paper presents an approach for large-scale event retrieval. Given a video clip of a specific event, e.g., the wedding of Prince William and Kate Middleton, the goal is to retrieve other videos representing the same event from a dataset of over 100k videos. Our approach encodes the frame descriptors of a video to jointly represent their appearance and temporal order. It exploits the properties of circulant matrices to compare the videos in the frequency domain. This offers a significant gain in complexity and accurately localizes the matching parts of videos. Furthermore, we extend product quantization to complex vectors in order to compress our descriptors, and to compare them in the compressed domain. Our method outperforms the state of the art both in search quality and query time on two large-scale video benchmarks for copy detection, TRECVID and CCWEB. Finally, we introduce a challenging dataset for event retrieval, EVVE, and report the performance on this dataset.
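The frequency-domain comparison mentioned in the abstract rests on a standard identity: multiplying by a circulant matrix is a circular correlation, which the discrete Fourier transform turns into an element-wise product. The sketch below illustrates only that identity and is not the paper's encoding. The function name `circular_correlation_scores`, the equal-length `(T, D)` inputs, and the use of NumPy are assumptions made for illustration; the regularization and compression steps the abstract refers to are not shown.

```python
import numpy as np


def circular_correlation_scores(query, database):
    """Score every circular temporal shift between two descriptor sequences.

    Hypothetical helper for illustration only.
    query, database: real arrays of shape (T, D), one D-dim descriptor per frame.
    Returns a length-T array whose entry s equals
    sum_t <query[t], database[(t + s) % T]>.
    """
    # Transform each descriptor dimension along the time axis.
    q_freq = np.fft.fft(query, axis=0)
    b_freq = np.fft.fft(database, axis=0)
    # Correlation theorem: conj(Q) * B per frequency, summed over dimensions.
    spectrum = np.sum(np.conj(q_freq) * b_freq, axis=1)
    # Back to the time domain; inputs are real, so drop the tiny imaginary part.
    return np.fft.ifft(spectrum).real


def best_shift(query, database):
    """Return the circular shift with the highest score."""
    return int(np.argmax(circular_correlation_scores(query, database)))
```

Computing all T shifts this way costs O(D T log T) rather than the O(D T^2) of comparing every shift directly, and the position of the highest score gives the temporal offset of the match. Sequences of different lengths would need zero-padding to a common length first, which this sketch leaves out.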

Related Material


[bibtex]
@InProceedings{Revaud_2013_CVPR,
author = {Revaud, Jerome and Douze, Matthijs and Schmid, Cordelia and Jegou, Herve},
title = {Event Retrieval in Large Video Collections with Circulant Temporal Encoding},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2013}
}