TY - GEN
T1 - The Einstein Genome Gateway using WASP - A high throughput multi-layered life sciences portal for XSEDE
AU - Golden, Aaron
AU - McLellan, Andrew S.
AU - Dubin, Robert A.
AU - Jing, Qiang
AU - Ó Broin, Pilib
AU - Moskowitz, David
AU - Zhang, Zhengdong
AU - Suzuki, Masako
AU - Hargitai, Joseph
AU - Calder, R. Brent
AU - Greally, John M.
PY - 2012
Y1 - 2012
N2 - Massively-parallel sequencing (MPS) technologies and their diverse applications in genomics and epigenomics research have yielded enormous new insights into the physiology and pathophysiology of the human genome. The biggest hurdle remains the magnitude and diversity of the datasets generated, compromising our ability to manage, organize, process and ultimately analyse data. The Wiki-based Automated Sequence Processor (WASP), developed at the Albert Einstein College of Medicine (hereafter Einstein), uniquely manages to tightly couple the sequencing platform, the sequencing assay, sample metadata and the automated workflows deployed on a heterogeneous high performance computing cluster infrastructure that yield sequenced, quality-controlled and 'mapped' sequence data, all within a single operating environment accessible through a web-based GUI. WASP at Einstein processes 4-6 TB of data per week and has processed ∼1 PB of data overall since its production cycle commenced. It has revolutionized how users interact with these new genomic technologies: they remain blissfully unaware of the data storage, management and, most importantly, processing services they request. The abstraction of such computational complexity for the user in effect makes WASP an ideal middleware solution, and an appropriate basis for the development of a grid-enabled resource - the Einstein Genome Gateway - as part of the Extreme Science and Engineering Discovery Environment (XSEDE) program. In this paper we discuss the existing WASP system, its proposed middleware role, and its planned interaction with XSEDE to form the Einstein Genome Gateway.
AB - Massively-parallel sequencing (MPS) technologies and their diverse applications in genomics and epigenomics research have yielded enormous new insights into the physiology and pathophysiology of the human genome. The biggest hurdle remains the magnitude and diversity of the datasets generated, compromising our ability to manage, organize, process and ultimately analyse data. The Wiki-based Automated Sequence Processor (WASP), developed at the Albert Einstein College of Medicine (hereafter Einstein), uniquely manages to tightly couple the sequencing platform, the sequencing assay, sample metadata and the automated workflows deployed on a heterogeneous high performance computing cluster infrastructure that yield sequenced, quality-controlled and 'mapped' sequence data, all within a single operating environment accessible through a web-based GUI. WASP at Einstein processes 4-6 TB of data per week and has processed ∼1 PB of data overall since its production cycle commenced. It has revolutionized how users interact with these new genomic technologies: they remain blissfully unaware of the data storage, management and, most importantly, processing services they request. The abstraction of such computational complexity for the user in effect makes WASP an ideal middleware solution, and an appropriate basis for the development of a grid-enabled resource - the Einstein Genome Gateway - as part of the Extreme Science and Engineering Discovery Environment (XSEDE) program. In this paper we discuss the existing WASP system, its proposed middleware role, and its planned interaction with XSEDE to form the Einstein Genome Gateway.
KW - Genomics
KW - Grid Computing
KW - Integrative Analysis
KW - Life Science Gateways
KW - Massively Parallel Sequencing
KW - XSEDE
UR - http://www.scopus.com/inward/record.url?scp=84866750381&partnerID=8YFLogxK
U2 - 10.3233/978-1-61499-054-3-182
DO - 10.3233/978-1-61499-054-3-182
M3 - Conference contribution
C2 - 22942009
AN - SCOPUS:84866750381
SN - 9781614990536
T3 - Studies in Health Technology and Informatics
SP - 182
EP - 191
BT - HealthGrid Applications and Technologies Meet Science Gateways for Life Sciences
PB - IOS Press
T2 - 10th HealthGrid Conference and the 4th International Workshop on Science Gateways for Life Sciences, IWSG-Life 2012
Y2 - 21 May 2012 through 25 May 2012
ER -