TY - JOUR
T1 - SMMB: a stochastic Markov blanket framework strategy for epistasis detection in GWAS
AU - Niel, Clément
AU - Sinoquet, Christine
AU - Dina, Christian
AU - Rocheleau, Ghislain
N1 - Publisher Copyright: © The Author(s) 2018. Published by Oxford University Press. All rights reserved.
PY - 2018/8/15
Y1 - 2018/8/15
N2 - Motivation: Large-scale genome-wide association studies (GWAS) are tools of choice for discovering associations between genotypes and phenotypes. To date, many studies rely on univariate statistical tests for association between the phenotype and each assayed single nucleotide polymorphism (SNP). However, interactions between SNPs, known as epistasis, must be considered when tackling the complexity of the underlying biological mechanisms. Epistasis analysis at large scale entails a prohibitive computational burden when more than two interacting SNPs are to be detected. In this paper, we introduce a stochastic causal graph-based method, SMMB, to analyze epistatic patterns in GWAS data. Results: We present the Stochastic Multiple Markov Blanket algorithm (SMMB), which combines an ensemble stochastic strategy inspired by random forests with Bayesian Markov blanket-based methods. We compared SMMB with three other recent algorithms using both simulated and real datasets. Our method outperforms the other compared methods in a majority of simulated cases of 2-way and 3-way epistasis patterns (especially in scenarios where the minor allele frequencies of causal SNPs are low). On large real datasets, our approach performs similarly to two other compared methods in terms of power, and runs faster. Availability and implementation: A parallel version is available at https://ls2n.fr/listelogicielsequipe/DUKe/128/.
AB - Motivation: Large-scale genome-wide association studies (GWAS) are tools of choice for discovering associations between genotypes and phenotypes. To date, many studies rely on univariate statistical tests for association between the phenotype and each assayed single nucleotide polymorphism (SNP). However, interactions between SNPs, known as epistasis, must be considered when tackling the complexity of the underlying biological mechanisms. Epistasis analysis at large scale entails a prohibitive computational burden when more than two interacting SNPs are to be detected. In this paper, we introduce a stochastic causal graph-based method, SMMB, to analyze epistatic patterns in GWAS data. Results: We present the Stochastic Multiple Markov Blanket algorithm (SMMB), which combines an ensemble stochastic strategy inspired by random forests with Bayesian Markov blanket-based methods. We compared SMMB with three other recent algorithms using both simulated and real datasets. Our method outperforms the other compared methods in a majority of simulated cases of 2-way and 3-way epistasis patterns (especially in scenarios where the minor allele frequencies of causal SNPs are low). On large real datasets, our approach performs similarly to two other compared methods in terms of power, and runs faster. Availability and implementation: A parallel version is available at https://ls2n.fr/listelogicielsequipe/DUKe/128/.
UR - http://www.scopus.com/inward/record.url?scp=85052629151&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bty154
DO - 10.1093/bioinformatics/bty154
M3 - Article
C2 - 29547902
AN - SCOPUS:85052629151
SN - 1367-4803
VL - 34
SP - 2773
EP - 2780
JO - Bioinformatics
JF - Bioinformatics
IS - 16
ER -