TY - JOUR
T1 - Using interpretability approaches to update “black-box” clinical prediction models: an external validation study in nephrology
AU - da Cruz, Harry Freitas
AU - Pfahringer, Boris
AU - Martensen, Tom
AU - Schneider, Frederic
AU - Meyer, Alexander
AU - Böttinger, Erwin
AU - Schapranow, Matthieu P.
N1 - Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2021/1
Y1 - 2021/1
AB - Despite advances in machine learning-based clinical prediction models, only a few such models are actually deployed in clinical contexts, due, among other reasons, to a lack of validation studies. In this paper, we present and discuss the validation results of a machine learning model for the prediction of acute kidney injury in cardiac surgery patients, initially developed on the MIMIC-III dataset, when applied to an external cohort of an American research hospital. To help account for the observed performance differences, we utilized interpretability methods based on feature importance, which allowed experts to scrutinize model behavior at both the global and local level, making it possible to gain further insight into why the model did not behave as expected on the validation cohort. The knowledge gleaned at derivation time can potentially assist model updates during validation, leading to simpler and more generalizable models. We argue that practitioners should consider interpretability methods as an additional tool to help explain performance differences and inform model updates in validation studies.
KW - Clinical predictive modeling
KW - Interpretability methods
KW - Nephrology
KW - Validation
UR - https://www.scopus.com/pages/publications/85098459117
U2 - 10.1016/j.artmed.2020.101982
DO - 10.1016/j.artmed.2020.101982
M3 - Article
C2 - 33461682
AN - SCOPUS:85098459117
SN - 0933-3657
VL - 111
JO - Artificial Intelligence in Medicine
JF - Artificial Intelligence in Medicine
M1 - 101982
ER -