TY - JOUR
T1 - Determination and inference of eukaryotic transcription factor sequence specificity
AU - Weirauch, Matthew T.
AU - Yang, Ally
AU - Albu, Mihai
AU - Cote, Atina G.
AU - Montenegro-Montero, Alejandro
AU - Drewe, Philipp
AU - Najafabadi, Hamed S.
AU - Lambert, Samuel A.
AU - Mann, Ishminder
AU - Cook, Kate
AU - Zheng, Hong
AU - Goity, Alejandra
AU - van Bakel, Harm
AU - Lozano, Jean-Claude
AU - Galli, Mary
AU - Lewsey, Mathew G.
AU - Huang, Eryong
AU - Mukherjee, Tuhin
AU - Chen, Xiaoting
AU - Reece-Hoyes, John S.
AU - Govindarajan, Sridhar
AU - Shaulsky, Gad
AU - Walhout, Albertha J.M.
AU - Bouget, François-Yves
AU - Rätsch, Gunnar
AU - Larrondo, Luis F.
AU - Ecker, Joseph R.
AU - Hughes, Timothy R.
PY - 2014
Y1 - 2014
AB - Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ∼1% of eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ∼34% of the ∼170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in chromatin immunoprecipitation sequencing (ChIP-seq) peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif "library" can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.
UR - http://www.scopus.com/inward/record.url?scp=84907413210&partnerID=8YFLogxK
U2 - 10.1016/j.cell.2014.08.009
DO - 10.1016/j.cell.2014.08.009
M3 - Article
C2 - 25215497
AN - SCOPUS:84907413210
SN - 0092-8674
VL - 158
SP - 1431
EP - 1443
JO - Cell
JF - Cell
IS - 6
ER -