Correlates of cancer prevalence across census tracts in the United States: A Bayesian machine learning approach

Li Niu, Liangyuan Hu, Yan Li, Bian Liu

Research output: Contribution to journal › Article › peer-review

Abstract

Preventive measures, health behaviors, environmental exposures, and sociodemographic characteristics affect individual-level cancer risks. It is unclear how they influence neighborhood-level cancer risks. We developed a large-scale neighborhood health dataset for 72,337 census tracts in the United States by combining data from three publicly available sources. We used Bayesian additive regression trees to identify the most important predictors of tract-level cancer prevalence among adults (age ≥18 years), and examined their impact on cancer prevalence using partial dependence plots. The five most important census tract-level correlates of cancer prevalence were the proportions of the population who were aged 65 years and older, had a routine checkup, and were non-Hispanic White; the proportion of houses built before 1960; and the proportion of the population living below the poverty line. The identified predictors of neighborhood-level cancer prevalence may help public health practitioners and policymakers prioritize the improvement of environmental and neighborhood factors in reducing the cancer burden.
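
As an illustration of the partial dependence idea mentioned in the abstract, the following is a minimal sketch and not the authors' code. It assumes only that a fitted model exposes a prediction function; the function `toy_predict`, the synthetic matrix `X`, and the helper `partial_dependence` are hypothetical stand-ins introduced here, and no Bayesian additive regression trees model is fitted. For each grid value, the chosen covariate is set to that value in every row and the predictions are averaged.

```python
import numpy as np


def partial_dependence(predict, X, feature, grid):
    """Average prediction over the data with one feature fixed at each grid value."""
    X = np.asarray(X, dtype=float)
    averages = []
    for value in grid:
        X_fixed = X.copy()
        X_fixed[:, feature] = value  # overwrite the feature in every row
        averages.append(predict(X_fixed).mean())
    return np.array(averages)


# Synthetic data (NOT from the study): 500 rows, 3 hypothetical covariates in [0, 1].
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 3))


def toy_predict(X):
    """Hypothetical stand-in for a fitted model's predictions (not BART)."""
    return 2.0 * X[:, 0] + X[:, 1] ** 2


grid = np.linspace(0.0, 1.0, 5)
pd_curve = partial_dependence(toy_predict, X, feature=0, grid=grid)
```

Because the stand-in predictor is additive in the first covariate, the resulting curve rises linearly along the grid; with a flexible model such as a tree ensemble, the same procedure could reveal nonlinear shapes.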

Original language: English
Article number: 100522
Journal: Spatial and Spatio-temporal Epidemiology
Volume: 42
State: Published - Aug 2022

Keywords

  • Cancer prevalence
  • Machine learning
  • Neighborhood
  • Variable selection
