Classification epitopes in groups based on their protein family

Edgar Ernesto Gonzalez Kozlova, Benjamin Thomas Viart, Ricardo Andrez Machado de Avila, Liza Figueredo Felicori, Carlos Chavez-Olortegui

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


Background: The humoral immune response is based on the interaction between antibodies and antigens for the clearance of pathogens and foreign molecules. This interaction occurs at specific positions known as antigenic determinants or B-cell epitopes. The experimental identification of epitopes is costly and time-consuming. In silico methods that help discover new epitopes are therefore an appealing alternative, given the importance of biomedical applications such as vaccine design, disease diagnostics, antivenoms and immunotherapeutics. However, prediction performance is not optimal, with accuracy of around 70%. Further research could increase our understanding of the biochemical and structural properties that characterize a B-cell epitope.

Results: We investigated whether linear epitopes from the same protein family share common properties. This hypothesis led us to analyze physico-chemical (PCP) and predicted secondary structure (PSS) features of a curated dataset of epitope sequences available in the literature and belonging to two different groups of antigens (metalloproteinases and neurotoxins). Using data mining techniques, we discovered statistically significant parameters that allow us to distinguish neurotoxin epitopes from metalloproteinase epitopes, and both from random sequences. After five-fold cross-validation, we found that PCP-based models obtained area under the curve (AUC) and accuracy values above 0.9 for regression, decision tree and support vector machine.

Conclusions: We demonstrated that an antigen's family can be inferred from properties within a single group of linear epitopes (metalloproteinases or neurotoxins). We also identified the characteristics that represent these two epitope groups, including their similarities to and differences from random peptides and their respective amino acid sequences. These findings open new perspectives for improving epitope prediction by considering the specific protein family of the antigen. We expect that they will help to improve current computational mapping methods based on physico-chemical properties, given their potential application in epitope discovery.
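As a rough illustration of the kind of evaluation summarized in the Results, the Python sketch below computes two simple physico-chemical descriptors for a peptide, builds five-fold train/test splits, and computes AUC from classifier scores. It is not the authors' pipeline: the descriptors (mean Kyte-Doolittle hydropathy and a crude net charge), the function names, and the example peptide are all assumptions chosen for illustration, and the abstract does not list the actual PCP features used.

```python
from statistics import mean

# Kyte-Doolittle hydropathy scale. Using this scale is an assumption;
# the abstract does not say which physico-chemical properties were used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pcp_features(peptide):
    """Return (mean hydropathy, crude net charge) for a peptide string."""
    seq = peptide.upper()
    hydropathy = mean(KD[aa] for aa in seq)
    # Crude charge: +1 per Lys/Arg, -1 per Asp/Glu (His and termini ignored).
    charge = sum(seq.count(aa) for aa in "KR") - sum(seq.count(aa) for aa in "DE")
    return hydropathy, charge


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, test
        start = stop


def auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical peptide, not taken from the paper's dataset.
    print(pcp_features("KKDE"))
    for train, test in k_fold_indices(10, k=5):
        print(len(train), test)
```

In practice the sequences would be shuffled (and ideally stratified by class) before splitting, a classifier would be fitted on each training fold, and the per-fold AUC values would be averaged.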

Original language: English
Article number: S7
Journal: BMC Bioinformatics
Issue number: 19
State: Published - 16 Dec 2015
Externally published: Yes


  • Data mining
  • B cell epitopes
  • Epitope prediction
  • Metalloproteinases
  • Neurotoxins
  • Protein family


