Fundamental amino acid mass distributions and entropy costs in proteomes

Jean Lehmann, Albert Libchaber, Benjamin D. Greenbaum

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

We examine whether the frequency of amino acids across an organism's proteome is primarily determined by optimization to function or by other factors, such as the structure of the genetic code. Considering all available proteins together, we first point out that the frequency of an amino acid in a proteome negatively correlates with its mass, suggesting that the genome preserves a fundamental distribution ruled by simple energetics. Given the universality of such distributions, one can use the outliers, cysteine and leucine, to identify amino acids that deviate from this simple rule for functional purposes, and then examine those functions. We quantify the strength of such selection as the entropic cost that outliers pay to defy the mass-frequency relation. Codon degeneracy of an amino acid partially explains the correlation between mass and frequency: light amino acids are typically encoded by highly degenerate codon families, with the exception of arginine. While degeneracy may be a factor in hard-wiring the relationship between mass and frequency in proteomes, it does not provide a complete explanation. By examining extremophiles, we show that this law weakens with temperature, likely because of protein stability considerations; the environment is therefore an essential factor.
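The mass-frequency relation described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names, the choice of a Pearson coefficient, and the toy input sequence are assumptions made here for illustration, and the masses are standard average molecular masses of the free amino acids in daltons.

```python
from collections import Counter
from math import sqrt

# Average molecular masses of the free amino acids, in daltons
# (standard reference values, keyed by one-letter code).
MASS = {
    "G": 75.07, "A": 89.09, "S": 105.09, "P": 115.13, "V": 117.15,
    "T": 119.12, "C": 121.16, "L": 131.17, "I": 131.17, "N": 132.12,
    "D": 133.10, "Q": 146.15, "K": 146.19, "E": 147.13, "M": 149.21,
    "H": 155.16, "F": 165.19, "R": 174.20, "Y": 181.19, "W": 204.23,
}


def amino_acid_frequencies(sequences):
    """Pool all sequences and return the relative frequency of each of the
    20 standard amino acids. Other symbols (X, *, gaps) are ignored."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in MASS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found in the input")
    return {aa: counts[aa] / total for aa in MASS}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def mass_frequency_correlation(sequences):
    """Correlate amino acid mass with pooled frequency (hypothetical helper)."""
    freqs = amino_acid_frequencies(sequences)
    residues = sorted(MASS)
    return pearson([MASS[aa] for aa in residues],
                   [freqs[aa] for aa in residues])


if __name__ == "__main__":
    # Toy input, not real proteome data: light residues are over-represented.
    toy_proteome = ["GGGGAAASSPVTCLINDQKEMHFRYW"]
    print(mass_frequency_correlation(toy_proteome))
```

On a real proteome, the input would be the full set of protein sequences for an organism. A negative coefficient would be consistent with the trend the abstract reports, while residues far from the trend line would be candidates for the outlier analysis.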

Original language: English
Pages (from-to): 119-124
Number of pages: 6
Journal: Journal of Theoretical Biology
Volume: 410
State: Published - 7 Dec 2016

Keywords

  • Amino acid statistics
  • Extremophiles
  • Genetic code origin
  • Genome evolution
