Practical protocol for Yao’s millionaires problem enables secure multi-party computation of metrics and efficient privacy-preserving k-NN for large data sets
Finding the k objects nearest to a query object is a fundamental operation for many data mining algorithms. With the recent interest in privacy, it is not surprising that there is strong interest in k-NN queries to enable clustering, classification and outlier-detection tasks. However, previous approaches to privacy-preserving k-NN have been costly and can realistically be applied only to small data sets. In this paper, we provide efficient solutions for k-NN queries over vertically partitioned data. We provide the first solution for the L∞ (or Chessboard) metric as well as detailed privacy-preserving computation of all other Minkowski metrics. We enable privacy-preserving L∞ by providing a practical approach to Yao's millionaires' problem with more than two parties. This is based on a pragmatic and implementable solution to Yao's millionaires' problem with shares. We also provide privacy-preserving algorithms for combining local metrics into a global metric that handles the large dimensionality and diversity of attributes common in vertically partitioned data. To manage very large data sets, we provide a privacy-preserving SASH (a very successful data structure for associative queries in high dimensions). Besides providing a theoretical analysis, we illustrate the efficiency of our approach with an empirical evaluation.
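To make the role of the metrics concrete, the sketch below shows how a Minkowski distance decomposes over vertically partitioned data: each party computes a term from its own attributes, the L_p terms are summed, and the L∞ terms are combined by taking a maximum. It is a minimal illustration and not the paper's protocol. The function names, the modulus PRIME, the integer-valued attributes and the textbook additive-sharing step are all assumptions introduced here. The maximum is taken in the clear, which is exactly the step a multi-party millionaires' comparison would need to replace.

```python
import random

PRIME = 2**61 - 1  # illustrative modulus for additive shares (assumption)

def local_minkowski_term(record_part, query_part, p):
    """Sum of |a - b|^p over the attributes one party holds."""
    return sum(abs(a - b) ** p for a, b in zip(record_part, query_part))

def local_chessboard_term(record_part, query_part):
    """Largest |a - b| over the attributes one party holds."""
    return max(abs(a - b) for a, b in zip(record_part, query_part))

def additive_shares(value, n_parties, rng):
    """Split value into n_parties random shares that sum to value mod PRIME."""
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def shared_sum(local_terms, rng):
    """Add the local terms via additive shares (textbook sketch, not the paper's method)."""
    n = len(local_terms)
    # Row i holds the shares party i would send out, one per party.
    share_matrix = [additive_shares(t, n, rng) for t in local_terms]
    # Party j adds up the shares it received (column j).
    partial = [sum(row[j] for row in share_matrix) % PRIME for j in range(n)]
    return sum(partial) % PRIME

def minkowski_distance(record_parts, query_parts, p, rng):
    """L_p distance assembled from per-party terms."""
    terms = [local_minkowski_term(r, q, p) for r, q in zip(record_parts, query_parts)]
    return shared_sum(terms, rng) ** (1.0 / p)

def chessboard_distance(record_parts, query_parts):
    """L-infinity distance; the max is computed in the clear for illustration only."""
    return max(local_chessboard_term(r, q) for r, q in zip(record_parts, query_parts))

if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical parties, each holding some columns of one record and of the query.
    record_parts = [[1, 5], [2], [7, 3]]
    query_parts = [[4, 1], [2], [6, 8]]
    print(minkowski_distance(record_parts, query_parts, 1, rng))
    print(minkowski_distance(record_parts, query_parts, 2, rng))
    print(chessboard_distance(record_parts, query_parts))
```

The sum-based metrics only need addition across parties, which is why sharing suffices in this sketch. The Chessboard metric needs a comparison across parties, and that is where a millionaires'-style protocol enters.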
Knowledge and Information Systems
Analysis of Algorithms and Complexity
Pattern Recognition and Data Mining
Distributed Computing not elsewhere classified