Missense variant pathogenicity predictors generalize well across a range of function-specific prediction challenges

Vikas Pejaver, Sean D. Mooney, Predrag Radivojac

Research output: Contribution to journal (Article, peer-reviewed)

38 Scopus citations

Abstract

The steady advances in machine learning and accumulation of biomedical data have contributed to the development of numerous computational models that assess the impact of missense variants. Different methods, however, operationalize impact differently. Two common tasks in this context are the prediction of the pathogenicity of variants and the prediction of their effects on a protein's function. These are related but distinct problems, and it is unclear whether methods developed for one are optimized for the other. The Critical Assessment of Genome Interpretation (CAGI) experiment provides a means to address this question empirically. To this end, we participated in various protein-specific challenges in CAGI with two objectives in mind. First, to compare the performance of methods in the MutPred family with the state-of-the-art. Second and more importantly, to investigate the applicability of general-purpose pathogenicity predictors to the classification of specific function-altering variants without additional training or calibration. We find that our pathogenicity predictors performed competitively with other methods, outputting score distributions in agreement with experimental outcomes. Overall, we conclude that binary classifiers learned from disease-causing mutations are capable of modeling important aspects of the underlying biology and the alteration of protein function resulting from mutations.

Original language: English
Pages (from-to): 1092-1108
Number of pages: 17
Journal: Human Mutation
Volume: 38
Issue number: 9
State: Published - Sep 2017
Externally published: Yes

Keywords

  • CAGI, functional effect prediction, generalization, machine learning, MutPred, MutPred2, pathogenicity prediction, severity
