Empirical estimation of genome-wide significance thresholds based on the 1000 Genomes Project data set

  • Masahiro Kanai
  • Toshihiro Tanaka
  • Yukinori Okada

Research output: Contribution to journal › Article › peer-review

81 Scopus citations

Abstract

To assess the statistical significance of associations between variants and traits, genome-wide association studies (GWAS) should employ an appropriate threshold that accounts for the massive burden of multiple testing in the study. Although most studies in the current literature commonly set a genome-wide significance threshold at the level of P = 5.0 × 10⁻⁸, the adequacy of this value for respective populations has not been fully investigated. To empirically estimate thresholds for different ancestral populations, we conducted GWAS simulations using the 1000 Genomes Phase 3 data set for Africans (AFR), Europeans (EUR), Admixed Americans (AMR), East Asians (EAS) and South Asians (SAS). The estimated empirical genome-wide significance thresholds were P_sig = 3.24 × 10⁻⁸ (AFR), 9.26 × 10⁻⁸ (EUR), 1.83 × 10⁻⁷ (AMR), 1.61 × 10⁻⁷ (EAS) and 9.46 × 10⁻⁸ (SAS). We additionally conducted trans-ethnic meta-analyses across all populations (ALL) and all populations except for AFR (ΔAFR), which yielded P_sig = 3.25 × 10⁻⁸ (ALL) and 4.20 × 10⁻⁸ (ΔAFR). Our results indicate that the current threshold (P = 5.0 × 10⁻⁸) is overly stringent for all ancestral populations except for Africans; however, we should employ a more stringent threshold when conducting a meta-analysis, regardless of the presence of African samples.
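
As a reading aid, the thresholds reported above can be compared with the conventional value by converting each one into the number of independent tests it would correspond to under a Bonferroni correction. This is a minimal sketch only: the family-wise error rate of 0.05 is an assumption (the abstract does not state it), the helper names `implied_tests` and `compare` are hypothetical, and the calculation is a back-of-the-envelope illustration, not the simulation procedure used in the study.

```python
# Thresholds exactly as reported in the abstract.
CONVENTIONAL = 5.0e-8
P_SIG = {
    "AFR": 3.24e-8,
    "EUR": 9.26e-8,
    "AMR": 1.83e-7,
    "EAS": 1.61e-7,
    "SAS": 9.46e-8,
    "ALL": 3.25e-8,
    "ΔAFR": 4.20e-8,
}

# Assumption (not stated in the abstract): family-wise error rate of 0.05.
ALPHA = 0.05


def implied_tests(p_sig, alpha=ALPHA):
    """Number of independent tests for which a Bonferroni
    correction at `alpha` would give the threshold `p_sig`."""
    return alpha / p_sig


def compare(p_sig, conventional=CONVENTIONAL):
    """Say whether the conventional threshold is stricter or more
    lenient than the empirical threshold `p_sig`."""
    if p_sig > conventional:
        return "conventional is stricter"
    return "conventional is more lenient"


if __name__ == "__main__":
    for label, p in P_SIG.items():
        print(f"{label}: P_sig={p:.2e}, "
              f"implied tests ~{implied_tests(p):,.0f}, {compare(p)}")
```

Under these assumptions, a population whose empirical threshold lies above 5.0 × 10⁻⁸ implies fewer than one million effective independent tests, which is the sense in which the conventional threshold is overly stringent for it.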

Original language: English
Pages (from-to): 861-866
Number of pages: 6
Journal: Journal of Human Genetics
Volume: 61
Issue number: 10
State: Published - 1 Oct 2016
Externally published: Yes
