TY - JOUR
T1 - Reconstruction of novel transcription factor regulons through inference of their binding sites
AU - Elmas, Abdulkadir
AU - Wang, Xiaodong
AU - Samoilov, Michael S.
N1 - Publisher Copyright:
© 2015 Elmas et al.
PY - 2015/9/21
Y1 - 2015/9/21
N2 - Background: In most sequenced organisms the number of known regulatory genes (e.g., transcription factors (TFs)) vastly exceeds the number of experimentally-verified regulons that could be associated with them. At present, identification of TF regulons is mostly done through comparative genomics approaches. Such methods could miss organism-specific regulatory interactions and often require expensive and time-consuming experimental techniques to generate the underlying data. Results: In this work, we present an efficient algorithm that aims to identify a given transcription factor's regulon through inference of its unknown binding sites, based on the discovery of its binding motif. The proposed approach relies on computational methods that utilize gene expression data sets and knockout fitness data sets which are available or may be straightforwardly obtained for many organisms. We computationally constructed the profiles of putative regulons for the TFs LexA, PurR and Fur in E. coli K12 and identified their binding motifs. Comparisons with an experimentally-verified database showed high recovery rates of the known regulon members, and indicated good predictions for the newly found genes with high biological significance. The proposed approach is also applicable to novel organisms for predicting unknown regulons of the transcriptional regulators. Results for the hypothetical protein Dde0289 in D. alaskensis include the discovery of a Fis-type TF binding motif. Conclusions: The proposed motif-based regulon inference approach can discover the organism-specific regulatory interactions on a single genome, which may be missed by current comparative genomics techniques due to their limitations.
AB - Background: In most sequenced organisms the number of known regulatory genes (e.g., transcription factors (TFs)) vastly exceeds the number of experimentally-verified regulons that could be associated with them. At present, identification of TF regulons is mostly done through comparative genomics approaches. Such methods could miss organism-specific regulatory interactions and often require expensive and time-consuming experimental techniques to generate the underlying data. Results: In this work, we present an efficient algorithm that aims to identify a given transcription factor's regulon through inference of its unknown binding sites, based on the discovery of its binding motif. The proposed approach relies on computational methods that utilize gene expression data sets and knockout fitness data sets which are available or may be straightforwardly obtained for many organisms. We computationally constructed the profiles of putative regulons for the TFs LexA, PurR and Fur in E. coli K12 and identified their binding motifs. Comparisons with an experimentally-verified database showed high recovery rates of the known regulon members, and indicated good predictions for the newly found genes with high biological significance. The proposed approach is also applicable to novel organisms for predicting unknown regulons of the transcriptional regulators. Results for the hypothetical protein Dde0289 in D. alaskensis include the discovery of a Fis-type TF binding motif. Conclusions: The proposed motif-based regulon inference approach can discover the organism-specific regulatory interactions on a single genome, which may be missed by current comparative genomics techniques due to their limitations.
KW - Motif discovery
KW - Regulon identification
KW - Sequential Monte Carlo filtering
KW - Transcription factor
UR - http://www.scopus.com/inward/record.url?scp=84941886302&partnerID=8YFLogxK
U2 - 10.1186/s12859-015-0685-y
DO - 10.1186/s12859-015-0685-y
M3 - Article
C2 - 26388177
AN - SCOPUS:84941886302
SN - 1471-2105
VL - 16
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
M1 - 299
ER -