Cartesian K-Means

Mohammad Norouzi, David J. Fleet; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 3017-3024

Abstract


A fundamental limitation of quantization techniques like the k-means clustering algorithm is the storage and runtime cost associated with the large numbers of clusters required to keep quantization errors small and model fidelity high. We develop new models with a compositional parameterization of cluster centers, so representational capacity increases super-linearly in the number of parameters. This allows one to effectively quantize data using billions or trillions of centers. We formulate two such models, Orthogonal k-means and Cartesian k-means. They are closely related to one another, to k-means, to methods for binary hash function optimization like ITQ [5], and to Product Quantization for vector quantization [7]. The models are tested on large-scale ANN retrieval tasks (1M GIST, 1B SIFT features), and on codebook learning for object recognition (CIFAR-10).
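To make the compositional parameterization concrete, the sketch below illustrates the underlying idea in the spirit of Product Quantization [7]: split each vector into m subvectors, quantize each subvector against its own small codebook of h sub-centers, and represent a point by the tuple of m sub-center indices. This is an illustrative assumption, not the authors' implementation; the function names and the NumPy/scikit-learn usage are ours, and Cartesian k-means additionally learns a rotation of the input space before splitting, which this sketch omits.

import numpy as np
from sklearn.cluster import KMeans

def fit_product_codebooks(X, m=4, h=256, seed=0):
    """Fit m independent sub-codebooks of h centers each.

    Quantizing each of the m subspaces separately yields h**m
    effective centers while storing only m codebooks of h
    sub-centers -- capacity grows super-linearly in parameters.
    """
    n, d = X.shape
    assert d % m == 0, "dimension must split evenly into m subspaces"
    sub = d // m
    codebooks = []
    for j in range(m):
        km = KMeans(n_clusters=h, n_init=4, random_state=seed + j)
        km.fit(X[:, j * sub:(j + 1) * sub])
        codebooks.append(km.cluster_centers_)
    return codebooks

def encode(X, codebooks):
    """Assign each subvector to its nearest sub-center; the code for a
    point is a tuple of m small integers rather than one huge index."""
    n, d = X.shape
    m = len(codebooks)
    sub = d // m
    codes = np.empty((n, m), dtype=np.int32)
    for j, C in enumerate(codebooks):
        Xj = X[:, j * sub:(j + 1) * sub]
        # squared distance from every subvector to every sub-center
        d2 = ((Xj[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        codes[:, j] = d2.argmin(1)
    return codes

For example, with m = 8 subspaces of h = 256 sub-centers each, the effective codebook has 256^8 (about 1.8 x 10^19) centers while only 8 x 256 sub-centers are ever stored or searched, which is how quantization with billions or trillions of centers becomes tractable.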

Related Material


[bibtex]
@InProceedings{Norouzi_2013_CVPR,
author = {Norouzi, Mohammad and Fleet, David J.},
title = {Cartesian K-Means},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2013},
pages = {3017-3024}
}