TY - GEN

T1 - A bit-parallel, general integer-scoring sequence alignment algorithm

AU - Benson, Gary

AU - Hernandez, Yozen

AU - Loving, Joshua

PY - 2013

Y1 - 2013

N2 - Mapping of next-generation sequencing data and other pro-cessor-intensive sequence comparison applications have motivated a continued search for high efficiency sequence alignment algorithms. In one approach, which exploits the inherent parallelism in computer logic calculations, individual cells in an alignment scoring matrix are represented as bits in a computer word and the calculation of scores is emulated by a series of bit operations comprised of AND, OR, XOR, complement, shift, and addition. Bit-parallelism has been successfully applied to the Longest Common Subsequence (LCS) and edit-distance problems, producing solutions which are significantly faster than standard implementations. But, the intensive mental effort required to produce these solutions, which are closely tied to special properties of the problems, has limited efforts to extend bit-parallelism to more general scoring schemes. In this paper, we give the first bit-parallel solution for general, integer-scoring global alignment. Integer-scoring schemes, which are widely used, assign integer weights for match, mismatch, and insertion/deletion or indel. Our method depends on structural properties of the relationship between adjacent scores in the scoring matrix. We utilize these properties to construct a class of efficient algorithms, each designed for a particular set of weights, and we introduce a standard for characterizing the efficiency in terms of the average number of bit-operations per cell of the original scoring matrix.

AB - Mapping of next-generation sequencing data and other pro-cessor-intensive sequence comparison applications have motivated a continued search for high efficiency sequence alignment algorithms. In one approach, which exploits the inherent parallelism in computer logic calculations, individual cells in an alignment scoring matrix are represented as bits in a computer word and the calculation of scores is emulated by a series of bit operations comprised of AND, OR, XOR, complement, shift, and addition. Bit-parallelism has been successfully applied to the Longest Common Subsequence (LCS) and edit-distance problems, producing solutions which are significantly faster than standard implementations. But, the intensive mental effort required to produce these solutions, which are closely tied to special properties of the problems, has limited efforts to extend bit-parallelism to more general scoring schemes. In this paper, we give the first bit-parallel solution for general, integer-scoring global alignment. Integer-scoring schemes, which are widely used, assign integer weights for match, mismatch, and insertion/deletion or indel. Our method depends on structural properties of the relationship between adjacent scores in the scoring matrix. We utilize these properties to construct a class of efficient algorithms, each designed for a particular set of weights, and we introduce a standard for characterizing the efficiency in terms of the average number of bit-operations per cell of the original scoring matrix.

KW - bit-parallelism

KW - global sequence alignment

KW - integer weights

UR - http://www.scopus.com/inward/record.url?scp=84884298304&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-38905-4_7

DO - 10.1007/978-3-642-38905-4_7

M3 - Conference contribution

AN - SCOPUS:84884298304

SN - 9783642389047

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 50

EP - 61

BT - Combinatorial Pattern Matching - 24th Annual Symposium, CPM 2013, Proceedings

T2 - 24th Annual Symposium on Combinatorial Pattern Matching, CPM 2013

Y2 - 17 June 2013 through 19 June 2013

ER -