Adaptive hash retrieval with kernel based similarity

File version
Accepted Manuscript (AM)
Author(s)
Bai, Xiao
Yan, Cheng
Yang, Haichuan
Bai, Lu
Zhou, Jun
Hancock, Edwin Robert
Year published
2018
Abstract
Indexing methods have been widely used for fast data retrieval on large scale datasets. When the data are represented by high dimensional vectors, hashing is often used as an efficient solution for approximate similarity search. When a retrieval task does not involve supervised training data, most hashing methods aim at preserving data similarity defined by a distance metric on the feature vectors. Hash codes generated by these approaches normally maintain the Hamming distance of the data in accordance with the similarity function, but ignore the local details of the distribution of data. This objective is not suitable for k-nearest neighbor search since the similarity to the nearest neighbors can vary significantly for different data samples. In this paper, we present a novel adaptive similarity measure which is consistent with k-nearest neighbor search, and prove that it leads to a valid kernel if the original similarity function is a kernel function. Next we propose a method which calculates hash codes using the kernel function. With a low-rank approximation, our hashing framework is more effective than existing methods that preserve similarity over an arbitrary kernel. The proposed similarity function, hashing framework, and their combination demonstrate significant improvement when compared with several alternative state-of-the-art methods.
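For readers unfamiliar with the setting the abstract describes, the following is a minimal sketch of unsupervised hashing with Hamming-distance retrieval. It is not the method proposed in the paper: it uses random-hyperplane (sign random projection) hashing, a standard data-independent baseline, purely to illustrate how binary codes and Hamming ranking fit together. All function names and parameters (`make_hyperplanes`, `hash_code`, `hamming`, `hamming_knn`, `n_bits`, `seed`) are hypothetical and chosen for this illustration only.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplane normals in dim dimensions.

    Assumption: a data-independent baseline, not the paper's kernel-based scheme.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vec, hyperplanes):
    """Map a real-valued vector to an integer whose bits are the signs of its projections."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Hamming distance between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")


def hamming_knn(query_code, codes, k):
    """Return the indices of the k stored codes closest to query_code in Hamming distance."""
    order = sorted(range(len(codes)), key=lambda i: hamming(query_code, codes[i]))
    return order[:k]
```

In this baseline every sample is hashed with the same fixed projections, so the codes reflect a global similarity (the angle between vectors) and take no account of how dense the data are around each sample. That is the limitation the abstract points to when it says existing approaches ignore the local details of the data distribution.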
Journal Title
Pattern Recognition
Copyright Statement
© 2017 Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Licence (http://creativecommons.org/licenses/by-nc-nd/4.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited.
Note
This publication has been entered into Griffith Research Online as an Advanced Online Version.
Subject
Artificial intelligence