Non-alcoholic fatty liver disease (NAFLD) is a complex, heterogeneous disease that affects more than 20% of the population worldwide. Some subtypes of NAFLD have been clinically identified using hypothesis-driven methods. In this study, we used data mining techniques to search for subtypes in an unbiased fashion. Using electronic signatures of the disease, we identified a cohort of 13,290 patients with NAFLD from a hospital database. We gathered clinical data from multiple sources and applied unsupervised clustering to identify five subtypes among this cohort. Descriptive statistics and survival analysis showed that the subtypes were clinically distinct and were associated with different rates of death, cirrhosis, hepatocellular carcinoma, chronic kidney disease, cardiovascular disease, and myocardial infarction. Novel disease subtypes identified in this manner could be used to risk-stratify patients and guide management.

Original language: English
Pages (from-to): 91-102
Number of pages: 12
Journal: Pacific Symposium on Biocomputing
Issue number: 2020
State: Published - 2020
Event: 25th Pacific Symposium on Biocomputing, PSB 2020 - Big Island, United States
Duration: 3 Jan 2020 to 7 Jan 2020


  • Clustering
  • Subtypes definition
  • Survival analysis


Title: Automated phenotyping of patients with non-alcoholic fatty liver disease reveals clinically relevant disease subtypes
