Sequence conservation in the prediction of catalytic sites

Yongchao Dou, Xingbo Geng, Hongyun Gao, Jialiang Yang, Xiaoqi Zheng, Jun Wang

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


Predicting the catalytic sites of a given enzyme is an important open problem in bioinformatics. Many machine learning-based methods have recently been developed, with the advantage that they can account for many sequence-based or structural features. We found that although these methods incorporate many kinds of features, protein sequence conservation supplies most of the information they use and should continue to play an important role in the future. We therefore tested the ability of several conservation features to predict catalytic sites using a Support Vector Machine classifier. Our results suggest that the position-specific scoring matrix performs better than the other features, and that incorporating conservation information from sequentially adjacent sites is more effective than incorporating it from structurally adjacent ones. Moreover, although conservation information is effective in predicting catalytic sites, optimizing the combination of conservation features with other features remains a difficult problem.
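As an illustration of what "conservation information of sequentially adjacent sites" can mean in practice, the sketch below builds one feature vector per residue by concatenating the position-specific scoring matrix rows of that residue and its sequence neighbours. The function name `window_features`, the `half_width` parameter, the zero padding at the termini, and the toy matrix are all assumptions made for this example; the abstract does not state the window size or padding scheme actually used. Vectors of this form could then be passed to a Support Vector Machine classifier.

```python
from typing import List


def window_features(pssm: List[List[float]], half_width: int) -> List[List[float]]:
    """Concatenate the PSSM rows of each residue and its sequence neighbours.

    Hypothetical helper: positions that fall past either terminus are
    padded with zero rows (an assumption, not the paper's stated scheme).
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    n_cols = len(pssm[0]) if pssm else 0
    zero_row = [0.0] * n_cols
    features: List[List[float]] = []
    for i in range(len(pssm)):
        row: List[float] = []
        # Walk the window from i - half_width to i + half_width inclusive.
        for j in range(i - half_width, i + half_width + 1):
            row.extend(pssm[j] if 0 <= j < len(pssm) else zero_row)
        features.append(row)
    return features


# Toy PSSM: 4 residues x 3 columns (a real PSSM has 20 columns per residue).
toy_pssm = [
    [1.0, 0.0, -1.0],
    [2.0, 1.0, 0.0],
    [0.0, 3.0, 1.0],
    [-1.0, 0.0, 2.0],
]
features = window_features(toy_pssm, half_width=1)
```

Each residue here yields a vector of length 3 x (2 x 1 + 1) = 9. A structural-neighbour variant would select rows by spatial distance instead of sequence offset.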

Original language: English
Pages (from-to): 229-239
Number of pages: 11
Journal: Protein Journal
Issue number: 4
State: Published - Apr 2011
Externally published: Yes


  • Catalytic site prediction
  • Neighboring sites
  • Sequence conservation
  • Support vector machine

