Phenotypic signatures in clinical data enable systematic identification of patients for genetic testing

Theodore J. Morley, Lide Han, Victor M. Castro, Jonathan Morra, Roy H. Perlis, Nancy J. Cox, Lisa Bastarache, Douglas M. Ruderfer

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

Around 5% of the population is affected by a rare genetic disease, yet most endure years of uncertainty before receiving a genetic test. A common feature of genetic diseases is the presence of multiple rare phenotypes that often span organ systems. Here, we use diagnostic billing information from longitudinal clinical data in the electronic health records (EHRs) of 2,286 patients who received a chromosomal microarray test, and 9,144 matched controls, to build a model to predict who should receive a genetic test. The model achieved high prediction accuracies in a held-out test sample (area under the receiver operating characteristic curve (AUROC), 0.97; area under the precision–recall curve (AUPRC), 0.92), in an independent hospital system (AUROC, 0.95; AUPRC, 0.62), and in an independent set of 172,265 patients in which cases were broadly defined as having an interaction with a genetics provider (AUROC, 0.9; AUPRC, 0.63). Patients carrying a putative pathogenic copy number variant were also accurately identified by the model. Compared with current approaches for genetic test determination, our model could identify more patients for testing while also increasing the proportion of those tested who have a genetic disease. We demonstrate that phenotypic patterns representative of a wide range of genetic diseases can be captured from EHRs to systematize decision-making for genetic testing, with the potential to speed up diagnosis, improve care and reduce costs.
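
As a reading aid for the two metrics quoted in the abstract, here is a minimal sketch of how AUROC and AUPRC can be computed from labels and model scores. It is not the authors' code: the function names (`auroc`, `average_precision`) and the toy data are invented for illustration, and AUPRC is approximated by the step-wise average precision. One point the sketch makes concrete is that a random classifier scores an AUROC of about 0.5 regardless of class balance, while its expected AUPRC equals the case prevalence. With 2,286 cases and 9,144 matched controls (a 1:4 ratio), that baseline is 0.2.

```python
def auroc(labels, scores):
    """Probability that a random case outscores a random control (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    hits = 0
    area = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            # Precision at this rank, accumulated once per case recovered.
            area += hits / rank
    return area / total_pos


# Toy data, invented for illustration: 2 cases and 8 controls (a 1:4 ratio).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.7, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01]

# Expected AUPRC of a random classifier equals the fraction of cases.
prevalence = sum(labels) / len(labels)

print(f"AUROC = {auroc(labels, scores):.4f}")
print(f"AUPRC (average precision) = {average_precision(labels, scores):.4f}")
print(f"Random-classifier AUPRC baseline = {prevalence:.2f}")
```

Because the AUPRC baseline tracks prevalence, AUPRC values from cohorts with different case fractions are not directly comparable, whereas AUROC values are less sensitive to that difference.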

Original language: English
Pages (from-to): 1097-1104
Number of pages: 8
Journal: Nature Medicine
Volume: 27
Issue number: 6
State: Published - Jun 2021
Externally published: Yes
