TY - JOUR
T1 - Gene Cluster Profile Vectors
T2 - A method to infer functionally related gene sets by grouping proximity-based gene clusters
AU - Pejaver, Vikas R.
AU - Kim, Sun
N1 - Funding Information:
This work was partially supported by a US National Science Foundation grant (NSF MCB-0731950) and the Microbial Systems Node in the METACyt Initiative from the Lilly Foundation. This article has been published as part of BMC Genomics Volume 12 Supplement 2, 2011: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2010. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2164/12?issue=S2.
PY - 2011/7/27
Y1 - 2011/7/27
AB - Background: Proximity-based methods and co-evolution-based phylogenetic profiles methods have been successfully used for the identification of functionally related genes. Proximity-based methods are effective for physically clustered genes, while the phylogenetic profiles method is effective for co-occurring gene sets. However, both methods predict many false positives and false negatives. In this paper, we propose the Gene Cluster Profile Vector (GCPV) method, which combines these two methods by using phylogenetic profiles of whole gene clusters. The GCPV method is currently the only genome-comparison-based method that allows for the characterization of relationships between gene clusters based on profiles of individual genes in clusters. Results: The GCPV method groups together reasonably related operons in E. coli about 60% of the time. The method is not sensitive to the choice of reference genome set, and it outperforms the conventional phylogenetic profiles method. Finally, we show that the method works well for predicted gene clusters from C. crescentus and can serve as an important tool not only for understanding gene function, but also for elucidating mechanisms of general biological processes. Conclusions: The GCPV method has been shown to be an effective and robust approach to the prediction of functionally related gene sets from proximity-based gene clusters or operons.
UR - https://www.scopus.com/pages/publications/79960866606
U2 - 10.1186/1471-2164-12-S2-S2
DO - 10.1186/1471-2164-12-S2-S2
M3 - Article
C2 - 21989079
AN - SCOPUS:79960866606
SN - 1471-2164
VL - 12
JO - BMC Genomics
JF - BMC Genomics
IS - SUPPL.2
M1 - S2
ER -