Abstract
We evaluated whether predicted continuous disease representations could enhance genetic discovery beyond case-control genome-wide association study (GWAS) phenotypes across eight complex diseases in up to 485,448 UK Biobank participants. Predicted phenotypes had high genetic correlations with case-control phenotypes (median rg = 0.66) but identified more independent associations (median 306 versus 125). While some predicted phenotype associations were spurious, multi-trait analysis of GWAS-boosted case-control phenotypes identified a median of 46 additional variants per disease, of which a median of 73% replicated in FinnGen, 37% reached genome-wide significance in a UK Biobank/FinnGen meta-analysis, and 45% had supporting evidence. Predicted phenotypes also identified 14 genes targeted by phase I–IV drugs not identified by case-control phenotypes, and combined polygenic risk scores (PRSs) using both phenotypes improved prediction performance, with a median 37% increase in Nagelkerke's R². Predicted phenotypes represent composite biomarkers complementing case-control approaches in genetic discovery, drug target prioritization, and risk prediction, though efficacy varies across diseases.
| Original language | English |
|---|---|
| Article number | 101115 |
| Journal | Cell Reports Methods |
| Volume | 5 |
| Issue number | 8 |
| State | Published - 18 Aug 2025 |
Keywords
- CP: computational biology
- CP: genetics
- electronic health records
- genome-wide association study
- machine learning
Fingerprint
Dive into the research topics of 'Genetic analyses of eight complex diseases using predicted continuous representations of disease'. Together they form a unique fingerprint.