Abstract
Non-alcoholic fatty liver disease (NAFLD) is a complex, heterogeneous disease that affects more than 20% of the population worldwide. Some subtypes of NAFLD have been clinically identified using hypothesis-driven methods. In this study, we used data mining techniques to search for subtypes in an unbiased fashion. Using electronic signatures of the disease, we identified a cohort of 13,290 patients with NAFLD from a hospital database. We gathered clinical data from multiple sources and applied unsupervised clustering to identify five subtypes within this cohort. Descriptive statistics and survival analysis showed that the subtypes were clinically distinct and were associated with different rates of death, cirrhosis, hepatocellular carcinoma, chronic kidney disease, cardiovascular disease, and myocardial infarction. Novel disease subtypes identified in this manner could be used to risk-stratify patients and guide management.
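The abstract does not name the clustering algorithm or the survival estimator that was used. As a rough illustration only, the two steps it describes (grouping patients into five clusters, then comparing outcomes per cluster) might be sketched as follows. Everything here is an assumption made for the example: k-means stands in for the unspecified clustering method, a Kaplan-Meier estimate stands in for the survival analysis, and the two features, follow-up times, and death indicators are synthetic toy values, not study data.

```python
import random
from collections import defaultdict

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time."""
    at_risk = len(durations)
    by_time = defaultdict(lambda: [0, 0])  # time -> [events, censored]
    for t, e in zip(durations, events):
        by_time[t][0 if e else 1] += 1
    surv, curve = 1.0, []
    for t in sorted(by_time):
        d, c = by_time[t]
        if d:
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= d + c  # events and censored patients both leave the risk set
    return curve

# Synthetic toy cohort: two hypothetical standardized features per patient.
rng = random.Random(42)
patients = [(rng.gauss(mu, 0.3), rng.gauss(mu, 0.3))
            for mu in (0, 3, 6, 9, 12) for _ in range(20)]
labels = kmeans(patients, k=5)

# Synthetic follow-up (years) and death indicators, for illustration only.
followup = [rng.expovariate(0.1 * (1 + lab)) for lab in labels]
died = [rng.random() < 0.7 for _ in labels]

for c in sorted(set(labels)):
    idx = [i for i, lab in enumerate(labels) if lab == c]
    curve = kaplan_meier([followup[i] for i in idx], [died[i] for i in idx])
    final = round(curve[-1][1], 3) if curve else 1.0
    print(f"cluster {c}: n={len(idx)}, survival at last event={final}")
```

In practice, a mature library would normally be preferred over hand-written routines, and the number of clusters would be chosen by a validation criterion rather than fixed in advance.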
Original language | English |
---|---|
Pages (from-to) | 91-102 |
Number of pages | 12 |
Journal | Pacific Symposium on Biocomputing |
Volume | 25 |
Issue number | 2020 |
State | Published - 2020 |
Event | 25th Pacific Symposium on Biocomputing, PSB 2020 - Big Island, United States. Duration: 3 Jan 2020 → 7 Jan 2020 |
Keywords
- Clustering
- NAFLD
- Subtypes definition
- Survival analysis