Estimating Local Costs Associated with Clostridium difficile Infection Using Machine Learning and Electronic Medical Records

Theodore R. Pak, Kieran I. Chacko, Timothy O'Donnell, Shirish S. Huprikar, Harm Van Bakel, Andrew Kasarskis, Erick R. Scott

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


BACKGROUND Reported per-patient costs of Clostridium difficile infection (CDI) vary by 2 orders of magnitude among different hospitals, implying that infection control officers need precise, local analyses to guide rational decision making between interventions.

OBJECTIVE We sought to comprehensively estimate changes in length of stay (LOS) attributable to CDI at a single urban tertiary-care facility using only data automatically extractable from the electronic medical record (EMR).

METHODS We performed a retrospective cohort study of 171,938 visits spanning a 7-year period. In total, 23,968 variables were extracted from EMR data recorded within 24 hours of admission to train elastic-net regularized logistic regression models for propensity score matching. To address time-dependent bias (reverse causation), we separately stratified comparisons by time of infection, and we fit multistate models.

RESULTS The estimated difference in median LOS for propensity-matched cohorts varied from 3.1 days (95% CI, 2.2-3.9) to 10.1 days (95% CI, 7.3-12.2) depending on the case definition; however, the estimate was observed to depend on time to infection. Stratification by time to first positive toxin assay, excluding probable community-acquired infections, showed a minimum excess LOS of 3.1 days (95% CI, 1.7-4.4). Under the same case definition, the multistate model averaged an excess LOS of 3.3 days (95% CI, 2.6-4.0).

CONCLUSIONS In this study, 2 independent time-to-infection-adjusted methods converged on similar excess LOS estimates. Changes in LOS can be extrapolated to marginal dollar costs by multiplying by the average cost of an inpatient day. Infection control officers can leverage automatically extractable EMR data to estimate costs of CDI at their own institutions.

Infect Control Hosp Epidemiol. 2017;38:1478-1486.
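The extrapolation from excess LOS to marginal dollar cost described in the conclusions can be sketched as a simple multiplication. The snippet below is a minimal illustration, not the authors' code: the `Estimate` class, the `excess_cost` helper, and the per-day cost figure are all hypothetical, and only the 3.3-day estimate with its 95% CI comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """A point estimate with a lower and upper confidence bound."""
    point: float
    lower: float
    upper: float


def excess_cost(excess_los_days: Estimate, cost_per_day: float) -> Estimate:
    """Scale an excess-LOS estimate (days) by an average inpatient-day cost.

    Hypothetical helper: multiplies the point estimate and both interval
    bounds by the same per-day cost.
    """
    if cost_per_day < 0:
        raise ValueError("cost_per_day must be non-negative")
    return Estimate(
        point=excess_los_days.point * cost_per_day,
        lower=excess_los_days.lower * cost_per_day,
        upper=excess_los_days.upper * cost_per_day,
    )


# Excess LOS from the multistate model, as reported in the abstract.
multistate_los = Estimate(point=3.3, lower=2.6, upper=4.0)

# ASSUMPTION: placeholder value, not from the study. Substitute the
# institution's own average cost of an inpatient day.
ASSUMED_COST_PER_DAY = 2000.0

cost = excess_cost(multistate_los, ASSUMED_COST_PER_DAY)
print(f"Excess cost per case: {cost.point:,.0f} "
      f"(range {cost.lower:,.0f} to {cost.upper:,.0f})")
```

The scaled interval reflects only the uncertainty in the LOS estimate. It treats the per-day cost as a fixed number, so any uncertainty in that figure would need to be handled separately.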

Original language: English
Pages (from-to): 1478-1486
Number of pages: 9
Journal: Infection Control and Hospital Epidemiology
Issue number: 12
State: Published - 1 Dec 2017

