TY - JOUR
T1 - Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort
AU - Privé, Florian
AU - Aschard, Hugues
AU - Carmi, Shai
AU - Folkersen, Lasse
AU - Hoggart, Clive
AU - O'Reilly, Paul F.
AU - Vilhjálmsson, Bjarni J.
N1 - Publisher Copyright: © 2021 American Society of Human Genetics
PY - 2022/1/6
Y1 - 2022/1/6
N2 - The low portability of polygenic scores (PGSs) across global populations is a major concern that must be addressed before PGSs can be used for everyone in the clinic. Indeed, prediction accuracy has been shown to decay as a function of the genetic distance between the training and test cohorts. However, such cohorts differ not only in their genetic distance but also in their geographical distance and their data collection and assaying, conflating multiple factors. In this study, we examine the extent to which PGSs are transferable between ancestries by deriving polygenic scores for 245 curated traits from the UK Biobank data and applying them in nine ancestry groups from the same cohort. By restricting both training and testing to the UK Biobank data, we reduce the risk of environmental and genotyping confounding from using different cohorts. We define the nine ancestry groups at a sub-continental level, based on a simple, robust, and effective method that we introduce here. We then apply two different predictive methods to derive polygenic scores for all 245 phenotypes and show a systematic and dramatic reduction in portability of PGSs trained using Northwestern European individuals and applied to nine ancestry groups. These analyses demonstrate that prediction already drops off within European ancestries and reduces globally in proportion to genetic distance. Altogether, our study provides unique and robust insights into the PGS portability problem.
KW - ancestry
KW - polygenic scores
KW - portability
UR - http://www.scopus.com/inward/record.url?scp=85122002030&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2021.11.008
DO - 10.1016/j.ajhg.2021.11.008
M3 - Article
C2 - 34995502
AN - SCOPUS:85122002030
SN - 0002-9297
VL - 109
SP - 12
EP - 23
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 1
ER -