Protein sequence randomness and sequence/structure correlations

R. S. Rahman, S. Rackovsky

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

We investigated protein sequence/structure correlation by constructing a space of protein sequences, based on methods developed previously for constructing a space of protein structures. The space is constructed by using a representation of the amino acids as vectors of 10 property factors that encode almost all of their physical properties. Each sequence is represented by a distribution of overlapping sequence fragments. A distance between any two sequences can be calculated. By attaching a weight to each factor, intersequence distances can be varied. We optimize the correlation between corresponding distances in the sequence and structure spaces. The optimal correlation between the sequence and structure spaces is significantly better than that which results from correlating randomly generated sequences, having the overall composition of the database, with the structure space. However, sets of randomly generated sequences, each of which approximates the composition of the real sequence it replaces, produce correlations with the structure space that are as good as that observed for the actual protein sequences. A connection is proposed with previous studies of the protein folding code. It is shown that the most important property factors for the correlation of the sequence and structure spaces are related to helix/bend preference, side chain bulk, and beta-structure preference.
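
The pipeline described in the abstract (property-factor vectors, overlapping fragments, weighted intersequence distances, correlation with structure distances, composition-matched random controls) can be illustrated with a minimal sketch. This is not the authors' implementation. The factor values below are random placeholders rather than the published property factors, and the fragment length, the use of a mean fragment profile to summarize the fragment distribution, the weighted Euclidean distance, and all function names are assumptions made only for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_FACTORS = 10  # the abstract specifies 10 property factors per amino acid

# ASSUMPTION: random placeholder values, NOT the published property factors.
_rng = random.Random(0)
FACTORS = {aa: [_rng.gauss(0.0, 1.0) for _ in range(N_FACTORS)]
           for aa in AMINO_ACIDS}


def fragment_profile(seq, frag_len=4):
    """Mean property-factor vector over all overlapping fragments of seq.

    ASSUMPTION: the fragment distribution is summarized by its mean, and
    frag_len=4 is an arbitrary illustrative choice.
    """
    n_frags = len(seq) - frag_len + 1
    if n_frags < 1:
        raise ValueError("sequence is shorter than the fragment length")
    profile = [0.0] * (frag_len * N_FACTORS)
    for start in range(n_frags):
        for pos in range(frag_len):
            for k, value in enumerate(FACTORS[seq[start + pos]]):
                profile[pos * N_FACTORS + k] += value / n_frags
    return profile


def weighted_distance(p, q, weights):
    """Euclidean distance with one non-negative weight per property factor."""
    total = 0.0
    for i, (a, b) in enumerate(zip(p, q)):
        total += weights[i % N_FACTORS] * (a - b) ** 2
    return math.sqrt(total)


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of distances."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def shuffled_like(seq, rng):
    """Random sequence with exactly the composition of seq.

    The abstract's control only approximates the composition; an exact
    shuffle is used here as a simplification.
    """
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical toy sequences and structure distances, for illustration only.
    seqs = ["ACDEFGHIKLMNPQ", "ACDEYGHIKLWNPQ", "VVVLLLIIIAAAGG"]
    structure_dist = [0.2, 1.5, 1.4]  # pairs (0,1), (0,2), (1,2)
    weights = [1.0] * N_FACTORS       # the quantities one would optimize
    profiles = [fragment_profile(s) for s in seqs]
    sequence_dist = [weighted_distance(profiles[i], profiles[j], weights)
                     for i in range(3) for j in range(i + 1, 3)]
    print(pearson(sequence_dist, structure_dist))
```

In this sketch, varying `weights` changes the intersequence distances, so the correlation returned by `pearson` could be maximized over the weights. Replacing each sequence with `shuffled_like(seq, rng)` and recomputing the correlation gives a simplified version of the composition-matched random control.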

Original language: English
Pages (from-to): 1531-1539
Number of pages: 9
Journal: Biophysical Journal
Volume: 68
Issue number: 4
State: Published - 1995
Externally published: Yes
