Phenotype Similarity Regression for Identifying the Genetic Determinants of Rare Diseases

Daniel Greene, Sylvia Richardson, Ernest Turro

Research output: Contribution to journal > Article > peer-review

27 Scopus citations


Rare genetic disorders, which can now be studied systematically with affordable genome sequencing, are often caused by high-penetrance rare variants. Such disorders are often heterogeneous and characterized by abnormalities spanning multiple organ systems ascertained with variable clinical precision. Existing methods for identifying genes with variants responsible for rare diseases summarize phenotypes with unstructured binary or quantitative variables. The Human Phenotype Ontology (HPO) allows composite phenotypes to be represented systematically but association methods accounting for the ontological relationship between HPO terms do not exist. We present a Bayesian method to model the association between an HPO-coded patient phenotype and genotype. Our method estimates the probability of an association together with an HPO-coded phenotype characteristic of the disease. We thus formalize a clinical approach to phenotyping that is lacking in standard regression techniques for rare disease research. We demonstrate the power of our method by uncovering a number of true associations in a large collection of genome-sequenced and HPO-coded cases with rare diseases.

Original language: English
Pages (from-to): 490-499
Number of pages: 10
Journal: American Journal of Human Genetics
Issue number: 3
State: Published - 3 Mar 2016
Externally published: Yes

