TY - GEN
T1 - Reconstruction of novel transcription factor regulons through inference of their binding sites
AU - Elmas, Abdulkadir
AU - Wang, Xiaodong
AU - Samoilov, Michael S.
PY - 2013
Y1 - 2013
N2 - In most sequenced organisms the number of known regulatory genes (e.g., transcription factors (TFs)) vastly exceeds the number of experimentally verified regulons which could be associated with them. At present, identification of TF regulons is mostly done through comparative genomics approaches. The nature of such methods causes them to frequently miss organism-specific regulatory interactions and often requires expensive and time-consuming experimental techniques to generate the underlying data. One approach to computationally addressing these problems is through discovery of transcription factor binding sites (TFBSs) and inference of corresponding regulons based on the location of such motifs across the genome. In this work, we present an efficient algorithm that aims to identify a given transcription factor's regulon through inference of its unknown binding sites, based on the discovery of its binding motif, which is also unknown and is estimated within the algorithm framework. The proposed approach relies on computational methods that utilize microarray gene expression data sets and fitness data sets which are available or may be straightforwardly obtained for many organisms. We computationally constructed the profiles of putative regulons for the TFs LexA and PurR in E. coli K12 and identified their binding motifs. Comparisons with an experimentally verified database showed high recovery rates of the known regulon members (93% recovery for the LexA regulon), and indicated good predictions for the newly found regulon genes with high biological significance. The results also show that the algorithm can predict genome-wide transcription factor binding motifs, which display high homology to their known consensus binding sites. The proposed approach is also applicable to novel organisms for predicting unknown regulons of the transcriptional regulators.
AB - In most sequenced organisms the number of known regulatory genes (e.g., transcription factors (TFs)) vastly exceeds the number of experimentally verified regulons which could be associated with them. At present, identification of TF regulons is mostly done through comparative genomics approaches. The nature of such methods causes them to frequently miss organism-specific regulatory interactions and often requires expensive and time-consuming experimental techniques to generate the underlying data. One approach to computationally addressing these problems is through discovery of transcription factor binding sites (TFBSs) and inference of corresponding regulons based on the location of such motifs across the genome. In this work, we present an efficient algorithm that aims to identify a given transcription factor's regulon through inference of its unknown binding sites, based on the discovery of its binding motif, which is also unknown and is estimated within the algorithm framework. The proposed approach relies on computational methods that utilize microarray gene expression data sets and fitness data sets which are available or may be straightforwardly obtained for many organisms. We computationally constructed the profiles of putative regulons for the TFs LexA and PurR in E. coli K12 and identified their binding motifs. Comparisons with an experimentally verified database showed high recovery rates of the known regulon members (93% recovery for the LexA regulon), and indicated good predictions for the newly found regulon genes with high biological significance. The results also show that the algorithm can predict genome-wide transcription factor binding motifs, which display high homology to their known consensus binding sites. The proposed approach is also applicable to novel organisms for predicting unknown regulons of the transcriptional regulators.
UR - http://www.scopus.com/inward/record.url?scp=84901275736&partnerID=8YFLogxK
U2 - 10.1109/ACSSC.2013.6810525
DO - 10.1109/ACSSC.2013.6810525
M3 - Conference contribution
AN - SCOPUS:84901275736
SN - 9781479923908
T3 - Conference Record - Asilomar Conference on Signals, Systems and Computers
SP - 1400
EP - 1404
BT - Conference Record of the 47th Asilomar Conference on Signals, Systems and Computers
PB - IEEE Computer Society
T2 - 2013 47th Asilomar Conference on Signals, Systems and Computers
Y2 - 3 November 2013 through 6 November 2013
ER -