TY - JOUR
T1 - Clinical PathoScope
T2 - Rapid alignment and filtration for accurate pathogen identification in clinical samples using unassembled sequencing data
AU - Byrd, Allyson L.
AU - Perez-Rogers, Joseph F.
AU - Manimaran, Solaiappan
AU - Castro-Nallar, Eduardo
AU - Toma, Ian
AU - McCaffrey, Tim
AU - Siegel, Marc
AU - Benson, Gary
AU - Crandall, Keith A.
AU - Johnson, William E.
N1 - Publisher Copyright: © 2014 Byrd et al.; licensee BioMed Central Ltd.
PY - 2014/8/4
Y1 - 2014/8/4
N2 - Background: The use of sequencing technologies to investigate the microbiome of a sample can positively impact patient healthcare by providing therapeutic targets for personalized disease treatment. However, these samples contain genomic sequences from various sources that complicate the identification of pathogens. Results: Here we present Clinical PathoScope, a pipeline to rapidly and accurately remove host contamination, isolate microbial reads, and identify potential disease-causing pathogens. We have accomplished three essential tasks in the development of Clinical PathoScope. First, we developed an optimized framework for pathogen identification using a computational subtraction methodology in concordance with read trimming and ambiguous read reassignment. Second, we have demonstrated the ability of our approach to identify multiple pathogens in a single clinical sample, accurately identify pathogens at the subspecies level, and determine the nearest phylogenetic neighbor of novel or highly mutated pathogens using real clinical sequencing data. Finally, we have shown that Clinical PathoScope outperforms previously published pathogen identification methods with regard to computational speed, sensitivity, and specificity. Conclusions: Clinical PathoScope is the only pathogen identification method currently available that can identify multiple pathogens from mixed samples and distinguish between very closely related species and strains in samples with very few reads per pathogen. Furthermore, Clinical PathoScope does not rely on genome assembly and thus can more rapidly complete the analysis of a clinical sample when compared with current assembly-based methods. Clinical PathoScope is freely available at: http://sourceforge.net/projects/pathoscope/.
UR - http://www.scopus.com/inward/record.url?scp=84905723235&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-15-262
DO - 10.1186/1471-2105-15-262
M3 - Article
C2 - 25091138
AN - SCOPUS:84905723235
SN - 1471-2105
VL - 15
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
M1 - 262
ER -