TY - JOUR
T1 - Correlates of cancer prevalence across census tracts in the United States
T2 - A Bayesian machine learning approach
AU - Niu, Li
AU - Hu, Liangyuan
AU - Li, Yan
AU - Liu, Bian
N1 - Funding Information:
This project was partly supported by two grants from the National Cancer Institute of the National Institutes of Health under Award Numbers R21CA235153 and R21CA245855, an award ME2017C3 9041 from the Patient-Centered Outcomes Research Institute, a grant from the National Heart, Lung, and Blood Institute of the National Institutes of Health under Award Number R01HL141427, and a grant from the National Institute on Minority Health and Health Disparities of the National Institutes of Health under Award Number R01MD013886.
Publisher Copyright:
© 2022
PY - 2022/8
Y1 - 2022/8
AB - Preventive measures, health behaviors, environmental exposures, and sociodemographic characteristics affect individual-level cancer risks. It is unclear how they influence neighborhood-level cancer risks. We developed a large-scale neighborhood health dataset for 72,337 census tracts in the United States by combining data from three publicly available sources. We used Bayesian additive regression trees to identify the most important predictors of tract-level cancer prevalence among adults (age ≥18 years), and examined their impact on cancer prevalence using partial dependence plots. The five most important census tract-level correlates of cancer prevalence were the proportions of the population who were aged 65 years and older, who had a routine checkup, and who were non-Hispanic White; the proportion of houses built before 1960; and the proportion of the population living below the poverty line. The identified predictors of neighborhood-level cancer prevalence may help public health practitioners and policymakers prioritize the improvement of environmental and neighborhood factors in reducing the cancer burden.
KW - Cancer prevalence
KW - Machine learning
KW - Neighborhood
KW - Variable selection
UR - http://www.scopus.com/inward/record.url?scp=85131403403&partnerID=8YFLogxK
U2 - 10.1016/j.sste.2022.100522
DO - 10.1016/j.sste.2022.100522
M3 - Article
AN - SCOPUS:85131403403
SN - 1877-5845
VL - 42
JO - Spatial and Spatio-temporal Epidemiology
JF - Spatial and Spatio-temporal Epidemiology
M1 - 100522
ER -