Multiethnic polygenic risk scores improve risk prediction in diverse populations
File version
Accepted Manuscript (AM)
Author(s)
Loh, Po-Ru
Price, Alkes L
Pinidiyapathirage, Janani
Griffith University Author(s)
Abstract
Methods for genetic risk prediction have been widely investigated in recent years. However, most available training data involve European samples, and it is currently unclear how to accurately predict disease risk in other populations. Previous studies have used either training data from European samples in large sample size or training data from the target population in small sample size, but not both. Here, we introduce a multiethnic polygenic risk score that combines training data from European samples and training data from the target population. We applied this approach to predict type 2 diabetes (T2D) in a Latino cohort using both publicly available European summary statistics in large sample size (Neff = 40k) and Latino training data in small sample size (Neff = 8k). We attained a >70% relative improvement in prediction accuracy (from R² = 0.027 to 0.047) compared to methods that use only one source of training data, consistent with large relative improvements in simulations. We observed a systematically lower load of T2D risk alleles in Latino individuals with more European ancestry, which could be explained by polygenic selection in ancestral European and/or Native American populations. We also applied this approach to predict T2D in a South Asian UK Biobank cohort using European (Neff = 40k) and South Asian (Neff = 16k) training data and attained a >70% relative improvement in prediction accuracy; an application to predict height in an African UK Biobank cohort using European (N = 113k) and African (N = 2k) training data attained a 30% relative improvement. Our work reduces the gap in polygenic risk prediction accuracy between European and non-European target populations.
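The idea of combining two training sources can be illustrated with a minimal sketch. The abstract does not state the combination rule, so the code below assumes one simple option: a weighted sum of a European-trained score and a target-population-trained score, with the two mixing weights fit by least squares. All function names, the simulated data, and the noise levels are hypothetical and chosen only for illustration. The final line reproduces the relative-improvement arithmetic for the R² values quoted above.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    # Population covariance of two equal-length sequences.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def r_squared(pred, y):
    # Squared Pearson correlation between a score and the phenotype.
    return cov(pred, y) ** 2 / (cov(pred, pred) * cov(y, y))

def fit_mixing_weights(prs_a, prs_b, y):
    # Solve the 2x2 normal equations for y ~ w_a * prs_a + w_b * prs_b
    # (the intercept is absorbed by centering inside cov()).
    saa, sbb, sab = cov(prs_a, prs_a), cov(prs_b, prs_b), cov(prs_a, prs_b)
    say, sby = cov(prs_a, y), cov(prs_b, y)
    det = saa * sbb - sab ** 2
    w_a = (say * sbb - sab * sby) / det
    w_b = (saa * sby - sab * say) / det
    return w_a, w_b

def combine(prs_a, prs_b, w_a, w_b):
    # Weighted sum of the two single-source scores.
    return [w_a * a + w_b * b for a, b in zip(prs_a, prs_b)]

def relative_improvement(r2_base, r2_new):
    return (r2_new - r2_base) / r2_base

# Hypothetical simulation: both scores are noisy views of one genetic signal.
random.seed(0)
n = 1000
g = [random.gauss(0, 1) for _ in range(n)]
prs_eur = [x + random.gauss(0, 1) for x in g]   # large training set: less noise
prs_tgt = [x + random.gauss(0, 2) for x in g]   # small training set: more noise
pheno = [0.3 * x + random.gauss(0, 1) for x in g]

# Fit weights on one half, evaluate on the other half.
half = n // 2
a_fit, b_fit, y_fit = prs_eur[:half], prs_tgt[:half], pheno[:half]
a_new, b_new, y_new = prs_eur[half:], prs_tgt[half:], pheno[half:]

w_a, w_b = fit_mixing_weights(a_fit, b_fit, y_fit)
print("held-out R2, European-only score:", r_squared(a_new, y_new))
print("held-out R2, target-only score:  ", r_squared(b_new, y_new))
print("held-out R2, combined score:     ",
      r_squared(combine(a_new, b_new, w_a, w_b), y_new))

# Arithmetic behind the ">70%" figure quoted in the abstract.
print("relative improvement:", relative_improvement(0.027, 0.047))
```

The weights here are fit on one half of the simulated samples and evaluated on the other half, since estimating and evaluating weights on the same individuals would overstate accuracy.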
Journal Title
Genetic Epidemiology
Volume
41
Issue
8
Rights Statement
This is the peer reviewed version of the following article: Márquez-Luna, C, Loh, P-R, South Asian Type 2 Diabetes (SAT2D) Consortium, The SIGMA Type 2 Diabetes Consortium, Price, AL. Multi-ethnic polygenic risk scores improve risk prediction in diverse populations. Genet Epidemiol. 2017; 41: 811–823, which has been published in final form at https://doi.org/10.1002/gepi.22083. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions. This article may not be enhanced, enriched or otherwise transformed into a derivative work, without express permission from Wiley or by statutory rights under applicable legislation. Copyright notices must not be removed, obscured or modified. The article must be linked to Wiley’s version of record on Wiley Online Library and any embedding, framing or otherwise making available the article or pages thereof by third parties from platforms, services and websites other than Wiley Online Library must be prohibited.
Subject
Genetics
Health services and systems
Public health
Science & Technology
Life Sciences & Biomedicine
Genetics & Heredity
Mathematical & Computational Biology
genome-wide association study
Citation
Marquez-Luna, C; Loh, P-R; Price, AL; Pinidiyapathirage, Janani, Multiethnic polygenic risk scores improve risk prediction in diverse populations, Genetic Epidemiology 2017, 41 (8), pp. 811-823