Privacy-Preserving Gradient Descent for Distributed Genome-Wide Analysis

File version

Accepted Manuscript (AM)

Author(s)
Zhang, Y
Bai, G
Li, X
Curtis, C
Chen, C
Ko, RKL
Date
2021
Location

Darmstadt, Germany

Abstract

Genome-wide analysis, which provides insights into complex diseases, plays an important role in biomedical data analytics. It usually involves large-scale human genomic data, and thus may disclose sensitive information about individuals. While existing studies address data exfiltration by external malicious actors, this work focuses on the emerging identity tracing attack, which occurs when a dishonest insider attempts to re-identify DNA samples they have obtained. We propose a framework named υFRAG to facilitate privacy-preserving data sharing and computation in genome-wide analysis. υFRAG mitigates privacy risks by using vertical fragmentation to disrupt the genetic architecture on which the adversary relies for re-identification. The fragmentation significantly reduces the overall amount of information the adversary can obtain. Notably, it does not sacrifice the capability of genome-wide analysis: we prove that it preserves the correctness of gradient descent, the most popular optimization approach for training machine learning models. We also evaluate the efficiency of υFRAG through experiments on a large-scale, real-world dataset. Our experiments demonstrate that υFRAG outperforms not only secure multiparty computation (MPC) and homomorphic encryption (HE) protocols, with a speedup of more than 221x for training neural networks, but also noise-based differential privacy (DP) solutions and traditional non-private algorithms in most settings.
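
The claim that vertical fragmentation preserves the correctness of gradient descent can be illustrated with a small sketch. The code below is not the υFRAG protocol, whose details are not given in this record; it is a toy example assuming a linear model with squared-error loss, in which each hypothetical fragment holder sees only its own feature columns. The function names, the fragment layout, and the synthetic data are all assumptions made for illustration. The sketch only checks that the gradient assembled from column fragments matches the gradient computed centrally.

```python
import random

def full_gradient(X, y, w):
    """Gradient of 0.5 * mean squared error for a linear model, computed centrally."""
    n = len(X)
    residual = [sum(xi * wi for xi, wi in zip(row, w)) - yi
                for row, yi in zip(X, y)]
    return [sum(residual[i] * X[i][j] for i in range(n)) / n
            for j in range(len(w))]

def fragmented_gradient(X, y, w, fragments):
    """Same gradient, assembled from vertical (column-wise) fragments.

    `fragments` is a list of column-index lists (a hypothetical layout);
    each holder only ever reads the columns assigned to it.
    """
    n = len(X)
    # Each holder computes a partial prediction from its own columns only.
    partial = [[sum(X[i][j] * w[j] for j in cols) for i in range(n)]
               for cols in fragments]
    # Only the summed partial predictions are combined into a residual.
    residual = [sum(p[i] for p in partial) - y[i] for i in range(n)]
    grad = [0.0] * len(w)
    # Each holder turns the shared residual into gradients for its own columns.
    for cols in fragments:
        for j in cols:
            grad[j] = sum(residual[i] * X[i][j] for i in range(n)) / n
    return grad

# Synthetic data standing in for genomic features (assumption, not real data).
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(6)] for _ in range(20)]
y = [random.gauss(0, 1) for _ in range(20)]
w = [random.gauss(0, 1) for _ in range(6)]
fragments = [[0, 3], [1, 4, 5], [2]]  # three holders, disjoint columns

g_full = full_gradient(X, y, w)
g_frag = fragmented_gradient(X, y, w, fragments)
max_diff = max(abs(a - b) for a, b in zip(g_full, g_frag))
print("largest gradient difference:", max_diff)
```

The two gradients agree up to floating-point rounding because the model's prediction is a sum over columns, so it can be split across fragments and recombined. How υFRAG handles non-linear models such as neural networks, and what the holders are allowed to exchange, is described in the paper itself and is not modelled here.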

Conference Title

Lecture Notes in Computer Science

Volume

12973

Rights Statement

© Springer Nature Switzerland AG 2021. This is the author-manuscript version of this paper. Reproduced in accordance with the copyright policy of the publisher. The original publication is available at www.springerlink.com

Subject

Genomics

Clinical sciences

Citation

Zhang, Y; Bai, G; Li, X; Curtis, C; Chen, C; Ko, RKL, Privacy-Preserving Gradient Descent for Distributed Genome-Wide Analysis, Lecture Notes in Computer Science, 2021, 12973, pp. 395-416