TY - GEN
T1 - Graph-based approaches for motif discovery
AU - Zaslavsky, Elena
PY - 2013
Y1 - 2013
N2 - Sequence motif finding is a very important and long-studied problem in computational molecular biology. While various motif representations and discovery methods exist, a recent development of graph-based algorithms has allowed practical concerns, such as positional correlations within motifs, to be taken into account. This survey provides an overview of the multi-partite graph formulation of motif finding, and focuses on algorithmic aspects of various motif discovery methodologies. Motif finding has been recast as a number of different graph substructure identification problems. First we review a formulation as a maximum-weight clique finding problem, and examine two different integer linear programs to model it. The motif finding algorithms use graph pruning techniques and a cutting planes approach in conjunction with linear programming relaxations. Secondly, we discuss a formulation of motif discovery as that of maximum density subgraph finding, and review a maximum flow based algorithm in an appropriately augmented flow network. Finally, we mention the 'subtle' motifs formulation, and define its corresponding graph problem of maximal clique identification. We discuss two different approaches to tackle this problem, one based on winnowing spurious edges and the other on divide-and-conquer sub-clique finding.
AB - Sequence motif finding is a very important and long-studied problem in computational molecular biology. While various motif representations and discovery methods exist, a recent development of graph-based algorithms has allowed practical concerns, such as positional correlations within motifs, to be taken into account. This survey provides an overview of the multi-partite graph formulation of motif finding, and focuses on algorithmic aspects of various motif discovery methodologies. Motif finding has been recast as a number of different graph substructure identification problems. First we review a formulation as a maximum-weight clique finding problem, and examine two different integer linear programs to model it. The motif finding algorithms use graph pruning techniques and a cutting planes approach in conjunction with linear programming relaxations. Secondly, we discuss a formulation of motif discovery as that of maximum density subgraph finding, and review a maximum flow based algorithm in an appropriately augmented flow network. Finally, we mention the 'subtle' motifs formulation, and define its corresponding graph problem of maximal clique identification. We discuss two different approaches to tackle this problem, one based on winnowing spurious edges and the other on divide-and-conquer sub-clique finding.
UR - http://www.scopus.com/inward/record.url?scp=84891285694&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84891285694
SN - 9812771654
SN - 9789812771650
T3 - Clustering Challenges in Biological Networks
SP - 83
EP - 99
BT - Clustering Challenges in Biological Networks
T2 - DIMACS Workshop on Clustering Problems in Biological Networks 2009
Y2 - 9 May 2006 through 11 May 2006
ER -