Data structures for fast Bregman k-NN queries
Abstract
k-nearest neighbor (k-NN) search is a crucial tool in many challenging applications
of computer vision, ranging from image and information retrieval, to classification and data mining.
E.g., k-NN can be used as the basis for non-parametric kernel density classification.
Although many efforts have been made to speed up k-NN retrieval, this computational task
still remains unsatisfactorily expensive and critical in applications.
Namely, we have extended two well-known data structures for fast indexing and k-nearest
neighbor retrieval to the information-theoretic class of Bregman divergences
(symmetrized or not): metric ball tree and vantage point tree.
On the one hand, we propose an effective Bregman ball tree (BB-tree)
construction algorithm that adapts locally its “arity” degree to the inherent geometric characteristics
of the datasets. As attested by extensive experiments, this allows one to meet more often pruning
conditions whenever traversing the tree for exact or approximate NN queries without
yet penalizing the construction time. Then, since symmetric measures are usually preferred
for applications in content-based information retrieval and categorization, we furthermore extend
the BB-tree to the case of symmetrized Bregman divergences, such as the Jensen-Shannon divergence
(i.e., the symmetrical Kullback-Leibler divergence). Exact and approximate 1-NN search experiments
using high-dimensional data arising from real-world images (SIFT signatures, color histograms)
illustrate that our method improves over the state of the art significantly (up to an order of magnitude).
On the other hand, we generalize the vantage point tree (vp-tree) to the same class
of distorsion measures. This data structure was originally introduced for information retrieval
in metric spaces, and has recently shown very good performances for the retrieval of image
patches with respect to the L2 metric.
Related publications
| Frank Nielsen, Paolo Piro, Michel Barlaud. "Bregman Vantage Point Trees for Efficient Nearest Neighbor Queries". In ICME '09, New York City, US, July 2009. |
|
| Frank Nielsen, Paolo Piro, Michel Barlaud. "Tailored Bregman Ball Trees for Effective Nearest Neighbors". In EuroCG '09, Bruxelles, Belgium, March 2009. |
|
Code
BBtree++C source code of our improved Bregman Ball Tree.Our implementation is based on code provided by L. Cayton and was tested under Linux-32 platform. |
|
BVPtreeC source code of our implementation of Bregman Vantage Point Tree.Tested for compilation under Linux-32 platform. |
|
