Building a Hybrid Physical-Statistical Classifier for Predicting the Effect of Variants Related to Protein-Drug Interactions

Bo Wang, Chengfei Yan, Shaoke Lou, Prashant Emani, Bian Li, Min Xu, Xiangmeng Kong, William Meyerson, Yucheng T. Yang, Donghoon Lee, Mark Gerstein

Research output: Contribution to journal > Article > peer-review

5 Scopus citations

Abstract

A key issue in drug design is how population variation affects drug efficacy by altering binding affinity (BA) in different individuals, an essential consideration for government regulators. Ideally, we would like to evaluate the BA perturbations of millions of single-nucleotide variants (SNVs). However, only hundreds of protein-drug complexes with SNVs have experimentally characterized BAs, constituting too small a gold standard for straightforward statistical model training. Thus, we take a hybrid approach: using physically based calculations to bootstrap the parameterization of a full model. In particular, we do 3D structure-based docking on ∼10,000 SNVs modifying known protein-drug complexes to construct a pseudo gold standard. Then we use this augmented set of BAs to train a statistical model combining structure, ligand and sequence features and illustrate how it can be applied to millions of SNVs. Finally, we show that our model has good cross-validated performance (97% AUROC) and can also be validated by orthogonal ligand-binding data.

Genetic variation may affect drug efficacy by altering its binding affinity to the protein target. GenoDock, developed by Wang et al., is a statistical model to predict the impacts of SNVs on protein-drug interactions by combining genomic, structural and physicochemical features.
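The cross-validated performance above is reported as AUROC, which can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a minimal Python sketch of that metric only, not the authors' code or pipeline. The function name `auroc`, the label encoding (1 for a variant that perturbs binding, 0 otherwise), and the toy labels and scores are all illustrative assumptions and do not come from the paper.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs in which the
    positive example has the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data (hypothetical, not from the paper):
# label 1 = variant assumed to perturb binding, label 0 = assumed neutral.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

In a cross-validated setting, a function like this would be applied to the held-out fold's scores in each round and the results averaged; the pairwise form is quadratic in the number of examples, so a rank-based implementation would be preferable for large variant sets.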

Original language: English
Pages (from-to): 1469-1481.e3
Journal: Structure
Volume: 27
Issue number: 9
State: Published - 3 Sep 2019
Externally published: Yes

Keywords

  • drug resistance
  • machine learning
  • nsSNV
  • protein-drug interactions
