Improvement of statistical potentials and threading score functions using information maximization

Armando D. Solis, S. Rackovsky

Research output: Contribution to journal > Article > peer-review

26 Scopus citations

Abstract

We show that statistical potentials and threading score functions, derived from finite data sets, are informatic functions, and that their performance depends on the manner in which data are classified and compressed. The choice of sequence and structural parameters affects estimates of the conditional probabilities P(C|S), the quantification of the effect of sequence S on conformation C, and determines the amount of information extracted from the data set, as measured by information gain. The mathematical link between information gain and mean conformational energy, established in this work using the local backbone potential as a model, demonstrates that manipulation of descriptive parameters also alters the "energy" values assigned to native conformations and to decoy structures in the test pool, and consequently, the performance of such statistical potential functions in fold recognition exercises. We show that sequence and structural partitions that maximize information gain also minimize the mean energy of the ensemble of native conformations. Moreover, we establish an informatic basis for the placement of the native score within an energy spectrum given by the decoy pool in a threading exercise. We discover that, among all informatic quantities, information gain is the best predictor of threading success, even better than the standard Z-score. Consequently, the choices of sequence and structural descriptors, extent of compression, and levels of discretization that maximize information gain must also produce the best potential functions. Strategies to optimize these parameters with respect to information extraction are therefore relevant to building better statistical potentials. Last, we demonstrate that the backbone torsion potential, defined by the trimer sequence, can be an effective tool in greatly reducing the set of possible conformations from a vast decoy pool.
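
The link between information gain and mean native energy described in the abstract can be illustrated with a minimal sketch. It assumes the common log-odds form of a statistical potential, E(C, S) = -ln[P(C given S) / P(C)], which may differ in detail from the formulation used in the paper. Under that assumption, the mean energy over the observed (native) sequence-conformation pairs equals the negative of the information gain (the mutual information between S and C), so a partition that raises information gain lowers the mean native energy. The toy counts, the class labels, and the function names below are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def information_gain(pairs):
    """Mutual information I(S;C) in nats, estimated from observed
    (sequence_class, conformation_class) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    seq = Counter(s for s, _ in pairs)
    conf = Counter(c for _, c in pairs)
    return sum(
        (k / n) * math.log(k * n / (seq[s] * conf[c]))
        for (s, c), k in joint.items()
    )


def mean_native_energy(pairs):
    """Mean of E = -ln[P(C given S) / P(C)] over the observed pairs.
    The log-odds form is an assumption, not taken from the paper."""
    n = len(pairs)
    joint = Counter(pairs)
    seq = Counter(s for s, _ in pairs)
    conf = Counter(c for _, c in pairs)

    def energy(s, c):
        p_c_given_s = joint[(s, c)] / seq[s]
        p_c = conf[c] / n
        return -math.log(p_c_given_s / p_c)

    return sum(energy(s, c) for s, c in pairs) / n


# Hypothetical toy data: two sequence classes, two conformation classes.
pairs = (
    [("A", "helix")] * 40 + [("A", "strand")] * 10
    + [("G", "helix")] * 10 + [("G", "strand")] * 40
)

# A coarser partition that merges both sequence classes into one.
coarse = [("X", c) for _, c in pairs]

for name, data in (("fine", pairs), ("coarse", coarse)):
    print(name, information_gain(data), mean_native_energy(data))
```

In this toy case the merged partition carries no information about conformation, so its information gain and mean native energy are both zero, while the finer partition has positive information gain and a correspondingly negative mean native energy.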

Original language: English
Pages (from-to): 892-908
Number of pages: 17
Journal: Proteins: Structure, Function and Bioinformatics
Volume: 62
Issue number: 4
DOIs
State: Published - 1 Mar 2006

Keywords

  • Information gain
  • Information theory
  • Local potential
  • Protein structure
  • Statistical potentials
  • Threading
