GWAS and enrichment analyses of non-alcoholic fatty liver disease identify new trait-associated genes and pathways across eMERGE Network

Bahram Namjou, Todd Lingren, Yongbo Huang, Sreeja Parameswaran, Beth L. Cobb, Ian B. Stanaway, John J. Connolly, Frank D. Mentch, Barbara Benoit, Xinnan Niu, Wei Qi Wei, Robert J. Carroll, Jennifer A. Pacheco, Isaac T.W. Harley, Senad Divanovic, David S. Carrell, Eric B. Larson, David J. Carey, Shefali Verma, Marylyn D. Ritchie, Ali G. Gharavi, Shawn Murphy, Marc S. Williams, David R. Crosslin, Gail P. Jarvik, Iftikhar J. Kullo, Hakon Hakonarson, Rongling Li, Stavra A. Xanthakos, John B. Harley

Research output: Contribution to journal › Article › peer-review

102 Scopus citations


Background: Non-alcoholic fatty liver disease (NAFLD) is a common chronic liver illness with a genetically heterogeneous background that can be accompanied by considerable morbidity and attendant health care costs. The pathogenesis and progression of NAFLD are complex, with many unanswered questions. We conducted genome-wide association studies (GWASs) using both adult and pediatric participants from the Electronic Medical Records and Genomics (eMERGE) Network to identify novel genetic contributors to this condition.

Methods: First, a natural language processing (NLP) algorithm was developed, tested, and deployed at each site to identify 1106 NAFLD cases and 8571 controls, along with histological data from liver tissue in 235 available participants. These include 1242 pediatric participants (396 cases, 846 controls). The algorithm included billing codes, text queries, laboratory values, and medication records. Next, GWASs were performed on NAFLD cases and controls, and case-only analyses were performed using histologic scores and liver function tests, adjusting for age, sex, site, ancestry, principal components (PCs), and body mass index (BMI).

Results: Consistent with previous results, a robust association was detected for the PNPLA3 gene cluster in participants with European ancestry. At the PNPLA3-SAMM50 region, three SNPs, rs738409, rs738408, and rs3747207, showed the strongest association (best SNP rs738409, p = 1.70 × 10⁻²⁰). This effect was consistent in both pediatric (p = 9.92 × 10⁻⁶) and adult (p = 9.73 × 10⁻¹⁵) cohorts. Additionally, this variant was associated with disease severity and NAFLD Activity Score (NAS) (p = 3.94 × 10⁻⁸, beta = 0.85). PheWAS analysis links this locus to a spectrum of liver diseases beyond NAFLD, with a novel negative correlation with gout (p = 1.09 × 10⁻⁴). We also identified novel loci for NAFLD disease severity, including one novel locus for NAS near IL17RA (rs5748926, p = 3.80 × 10⁻⁸) and another near ZFP90-CDH1 for fibrosis (rs698718, p = 2.74 × 10⁻¹¹). Post-GWAS and gene-based analyses identified more than 300 genes that were used for functional and pathway enrichment analyses.

Conclusions: In summary, this study demonstrates clear confirmation of a previously described NAFLD risk locus and several novel associations. Collaborative studies including an ethnically diverse population with well-characterized liver histologic features of NAFLD are needed to further validate the novel findings.
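As a reading aid, the p-values reported in the abstract can be compared against the conventional genome-wide significance threshold of 5 × 10⁻⁸. The following is a minimal sketch, not the authors' analysis: the threshold is an assumption (the abstract does not state which cutoff was used), and the dictionary labels and the `passes_threshold` helper are hypothetical names introduced only for illustration.

```python
# Conventional genome-wide significance threshold.
# Assumption: the abstract does not state the cutoff the authors applied.
GENOME_WIDE_THRESHOLD = 5e-8

# P-values copied from the abstract; the labels are informal shorthand.
reported = {
    "rs738409 NAFLD (best SNP)": 1.70e-20,
    "rs738409 NAFLD (pediatric cohort)": 9.92e-6,
    "rs738409 NAFLD (adult cohort)": 9.73e-15,
    "rs738409 NAS": 3.94e-8,
    "PNPLA3 locus gout (PheWAS)": 1.09e-4,
    "rs5748926 NAS (near IL17RA)": 3.80e-8,
    "rs698718 fibrosis (near ZFP90-CDH1)": 2.74e-11,
}


def passes_threshold(p_value, threshold=GENOME_WIDE_THRESHOLD):
    """Return True when a p-value is below the chosen threshold."""
    return p_value < threshold


# Labels of the associations that fall below the assumed threshold.
significant = [label for label, p in reported.items() if passes_threshold(p)]

# Print every reported association, smallest p-value first.
for label, p in sorted(reported.items(), key=lambda item: item[1]):
    status = "below" if passes_threshold(p) else "above"
    print(f"{label}: p = {p:.2e} ({status} 5e-8)")
```

Under this assumed cutoff, the pediatric-only result and the PheWAS gout result do not reach genome-wide significance on their own. A PheWAS result would normally be judged against a phenome-wide multiple-testing threshold, which the abstract does not give.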

Original language: English
Article number: 135
Journal: BMC Medicine
Issue number: 1
State: Published - 17 Jul 2019
Externally published: Yes


  • Fatty liver
  • GWAS
  • Genetic polymorphism
  • PheWAS
  • Polygenic risk score

