Practical protocol for Yao’s millionaires problem enables secure multi-party computation of metrics and efficient privacy-preserving k-NN for large data sets
Author(s)
Amirbekyan, Artak
Estivill-Castro, Vladimir
Year published
2009
Metadata
Show full item recordAbstract
Finding the nearest k objects to a query object is a fundamental operation for many data mining algorithms. With the recent interest in privacy, it is not surprising that there is strong interest in k-NN queries to enable clustering, classification and outlier-detection tasks. However, previous approaches to privacy-preserving k-NN have been costly and can only be realistically applied to small data sets. In this paper, we provide efficient solutions for k-NN queries for vertically partitioned data. We provide the first solution for the L(inf) (or Chessboard) metric as well as detailed privacy-preserving computation of all ...
View more >Finding the nearest k objects to a query object is a fundamental operation for many data mining algorithms. With the recent interest in privacy, it is not surprising that there is strong interest in k-NN queries to enable clustering, classification and outlier-detection tasks. However, previous approaches to privacy-preserving k-NN have been costly and can only be realistically applied to small data sets. In this paper, we provide efficient solutions for k-NN queries for vertically partitioned data. We provide the first solution for the L(inf) (or Chessboard) metric as well as detailed privacy-preserving computation of all other Minkowski metrics. We enable privacy-preserving L(inf) by providing a practical approach to the Yao's millionaire problem with more than two parties. This is based on a pragmatic and implementable solution to Yao's millionaire problem with shares. We also provide privacy-preserving algorithms for combinations of local metrics into a global metric that handles the large dimensionality and diversity of attributes common in vertically partitioned data. To manage very large data sets, we provide a privacy-preserving SASH (a very successful data structure for associative queries in high dimensions). Besides providing a theoretical analysis, we illustrate the efficiency of our approach with an empirical evaluation.
View less >
View more >Finding the nearest k objects to a query object is a fundamental operation for many data mining algorithms. With the recent interest in privacy, it is not surprising that there is strong interest in k-NN queries to enable clustering, classification and outlier-detection tasks. However, previous approaches to privacy-preserving k-NN have been costly and can only be realistically applied to small data sets. In this paper, we provide efficient solutions for k-NN queries for vertically partitioned data. We provide the first solution for the L(inf) (or Chessboard) metric as well as detailed privacy-preserving computation of all other Minkowski metrics. We enable privacy-preserving L(inf) by providing a practical approach to the Yao's millionaire problem with more than two parties. This is based on a pragmatic and implementable solution to Yao's millionaire problem with shares. We also provide privacy-preserving algorithms for combinations of local metrics into a global metric that handles the large dimensionality and diversity of attributes common in vertically partitioned data. To manage very large data sets, we provide a privacy-preserving SASH (a very successful data structure for associative queries in high dimensions). Besides providing a theoretical analysis, we illustrate the efficiency of our approach with an empirical evaluation.
View less >
Journal Title
Knowledge and Information Systems
Volume
21
Issue
3
Subject
Distributed computing and systems software not elsewhere classified
Information systems