TY - JOUR
T1 - Group regularization for zero-inflated negative binomial regression models with an application to health care demand in Germany
AU - Chatterjee, Saptarshi
AU - Chowdhury, Shrabanti
AU - Mallick, Himel
AU - Banerjee, Prithish
AU - Garai, Broti
N1 - Publisher Copyright: © 2018 John Wiley & Sons, Ltd.
PY - 2018/9/10
Y1 - 2018/9/10
N2 - In many biomedical applications, covariates are naturally grouped, with variables in the same group being systematically related or statistically correlated. Under such settings, variable selection must be conducted at both group and individual variable levels. Motivated by the widespread availability of zero-inflated count outcomes and grouped covariates in many practical applications, we consider group regularization for zero-inflated negative binomial regression models. Using a least squares approximation of the mixture likelihood and a variety of group-wise penalties on the coefficients, we propose a unified algorithm (Gooogle: Group Regularization for Zero-inflated Count Regression Models) to efficiently compute the entire regularization path of the estimators. We investigate the finite sample performance of these methods through extensive simulation experiments and the analysis of a German health care demand dataset. Finally, we derive theoretical properties of these methods under reasonable assumptions, which further provides deeper insight into the asymptotic behavior of these approaches. The open source software implementation of this method is publicly available at: https://github.com/himelmallick/Gooogle.
KW - bi-level variable selection
KW - group LASSO
KW - group bridge
KW - group regularization
KW - health care demand
KW - zero-inflated negative binomial
UR - http://www.scopus.com/inward/record.url?scp=85051183129&partnerID=8YFLogxK
U2 - 10.1002/sim.7804
DO - 10.1002/sim.7804
M3 - Article
C2 - 29900575
AN - SCOPUS:85051183129
SN - 0277-6715
VL - 37
SP - 3012
EP - 3026
JO - Statistics in Medicine
JF - Statistics in Medicine
IS - 20
ER -