The goal of this study was to build a machine learning model for early prostate cancer prediction based on healthcare utilization patterns. We examined changes in the frequency and pattern of healthcare utilization among 2916 prostate cancer patients during the 3 years prior to their diagnoses, and explored several supervised machine learning techniques for predicting a possible prostate cancer diagnosis. Analysis of patients' medical activities between 1 and 2 years prior to diagnosis using an XGBoost model gave the best predictive performance, with a high F1 score (0.9) and an AUC of 0.73. These pilot results indicate that applying machine learning to healthcare utilization patterns may support earlier identification of prostate cancer.
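For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an F1 score and an AUC can be computed from binary labels and model scores. It is not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the pairwise-ranking definition rather than any particular library routine.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 1 = later diagnosed, 0 = not diagnosed.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2]              # assumed model output probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

toy_f1 = f1_score(y_true, y_pred)
toy_auc = roc_auc(y_true, scores)
print(f"F1 = {toy_f1:.3f}, AUC = {toy_auc:.3f}")
```

The F1 score depends on the chosen decision threshold, whereas the AUC summarizes ranking quality across all thresholds, so the two figures measure different things and need not move together.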

Original language: English
Title of host publication: Informatics and Technology in Clinical Care and Public Health
Editors: John Mantas, Arie Hasman, Mowafa S. Househ, Parisis Gallos, Emmanouil Zoulias, Joseph Liasko
Publisher: IOS Press BV
Number of pages: 4
ISBN (Electronic): 9781643682501
State: Published - 2022

Publication series

Name: Studies in Health Technology and Informatics
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365


  • Big Data Analytics
  • Machine Learning
  • Prostate Cancer


Title: Machine Learning Approaches for Early Prostate Cancer Prediction Based on Healthcare Utilization Patterns
