Lp-Norm IDF for Large Scale Image Search

Liang Zheng, Shengjin Wang, Ziqiong Liu, Qi Tian; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 1626-1633


Inverse Document Frequency (IDF) is widely used in Bag-of-Words based image search. The basic idea is to assign less weight to terms with high frequency, and vice versa. However, the estimation of visual word frequency is coarse and heuristic. Therefore, the effectiveness of the conventional IDF routine is marginal, and far from optimal. To tackle this problem, this paper introduces a novel IDF expression based on the Lp-norm pooling technique. Carefully designed, the proposed IDF takes into account the term frequency, the document frequency, the complexity of images, and the codebook information. Optimizing the IDF function for the best balance between TF and pIDF weights yields the so-called Lp-norm IDF (pIDF). We show that the conventional IDF is a special case of our generalized version, and that two novel IDFs, i.e. the average IDF and the max IDF, can also be derived from our formula. Further, by accounting for the term frequency in each image, the proposed Lp-norm IDF helps to alleviate the visual word burstiness phenomenon. Our method is evaluated through extensive experiments on three benchmark datasets (Oxford 5K, Paris 6K and Flickr 1M). We report a performance improvement of up to 27.1% over the baseline approach. Moreover, since the Lp-norm IDF is computed offline, it adds no extra computation or memory cost to the system.
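To make the idea of Lp-norm pooling concrete, the following is a minimal sketch of one plausible reading of the abstract, not the paper's exact formula. It assumes the weight of a visual word is log(N / ||tf||_p), where tf holds the word's term frequency in each image that contains it and N is the database size. The function name `lp_norm_idf` and its parameters are hypothetical, and the image-complexity and codebook terms mentioned in the abstract are deliberately left out. Under this assumption, p = 0 recovers the conventional IDF, p = 1 pools the total term frequency, and p = infinity pools the maximum.

```python
import math


def lp_norm_idf(tf_per_image, num_images, p):
    """Illustrative IDF-style weight for one visual word: log(N / ||tf||_p).

    tf_per_image: term frequency of the word in each image (zeros are ignored).
    num_images:   total number of images N in the database.
    p:            norm order; 0 counts the images containing the word,
                  math.inf takes the largest term frequency.
    This is an assumed simplification, not the paper's full pIDF.
    """
    tf = [t for t in tf_per_image if t > 0]
    if not tf:
        raise ValueError("the word does not occur in any image")
    if p == 0:
        # L0 "norm": number of images containing the word (document frequency).
        norm = float(len(tf))
    elif math.isinf(p):
        # L-infinity norm: the largest term frequency.
        norm = float(max(tf))
    else:
        norm = sum(t ** p for t in tf) ** (1.0 / p)
    return math.log(num_images / norm)


# Two words with the same document frequency (3 images out of 1000).
calm_word = [1, 1, 1]    # appears once per image
bursty_word = [5, 5, 5]  # appears five times per image

idf_calm_p0 = lp_norm_idf(calm_word, 1000, 0)
idf_bursty_p0 = lp_norm_idf(bursty_word, 1000, 0)
idf_calm_p1 = lp_norm_idf(calm_word, 1000, 1)
idf_bursty_p1 = lp_norm_idf(bursty_word, 1000, 1)
```

With p = 0 the two words receive the same weight, because only the document frequency counts. With p = 1 the bursty word receives a lower weight, which illustrates how pooling over term frequencies can dampen burstiness.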

Related Material

author = {Zheng, Liang and Wang, Shengjin and Liu, Ziqiong and Tian, Qi},
title = {Lp-Norm IDF for Large Scale Image Search},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2013}