SimPEL: Simulation-based power estimation for sequencing studies of low-prevalence conditions

Lauren Mak, Minghao Li, Chen Cao, Paul Gordon, Maja Tarailo-Graovac, Chad Bousman, Pei Wang, Quan Long

Research output: Contribution to journal › Article › peer-review

Abstract

Power estimations are important for optimizing genotype-phenotype association study designs. However, existing frameworks are designed for common disorders, and thus ill-suited for the inherent challenges of studies for low-prevalence conditions such as rare diseases and infrequent adverse drug reactions. These challenges include small sample sizes and the need to leverage genetic annotation resources in association analyses for the purpose of ranking potential causal genes. We present SimPEL, a simulation-based program providing power estimations for the design of low-prevalence condition studies. SimPEL integrates the usage of gene annotation resources for association analyses. Customizable parameters, including the penetrance of the putative causal allele and the employed pathogenic scoring system, allow SimPEL to realistically model a large range of study designs. To demonstrate the effects of various parameters on power, we estimated the power of several simulated designs using SimPEL and captured power trends in agreement with observations from current literature on low-frequency condition studies. SimPEL, as a tool, provides researchers studying low-frequency conditions with an intuitive and highly flexible avenue for statistical power estimation. The platform-independent “batteries included” executable and default input files are available at https://github.com/precisionomics/SimPEL.

Original language: English
Pages (from-to): 480-487
Number of pages: 8
Journal: Genetic Epidemiology
Volume: 42
Issue number: 5
State: Published - Jul 2018

Keywords

  • adverse drug reactions
  • association analyses
  • genetic variant annotation
  • genome-wide sequencing
  • power estimation
  • rare disease
