TY - JOUR
T1 - Prognosis in pathology
T2 - Are we “prognosticating” or only establishing correlations between independent variables and survival? A study with various analytics cautions about the overinterpretation of statistical results
AU - Marchevsky, Alberto M.
AU - Diniz, Marcio A.
AU - Manzoor, Daniel
AU - Walts, Ann E.
N1 - Publisher Copyright: © 2020 Elsevier Inc.
PY - 2020/6
Y1 - 2020/6
AB - Survival data from 225 patients with resected pulmonary typical carcinoids were analyzed with Kaplan-Meier statistics (K-M) and “deep learning” methods to illustrate the difference between establishing “correlations” and “prognostications”. Cases were stratified into G1 and G2 classes using a Ki-67 cut-point of ≤5%. Overall survival, number of patients at risk, and 95% confidence intervals (CI) were estimated for the two classes. Seven neural network models (NN) were developed with GMDH Shell 3.8.2 and Statgraphics Centurion 18.1 software, using variable prior probabilities and different numbers of training vs testing cases. The NNs used age, sex, pTNM, G1, and G2 as input neurons and “alive” and “dead” as output neurons. Areas under the curve (AUC) and other performance measures were evaluated for all models. Log-rank test showed a significant difference in overall survival between G1 and G2 (p < 0.001). However, 95% CI estimates showed considerable variability in survival at different time intervals. Including the number of patients at risk at different time intervals showed that most G2 patients had been censored by 100 weeks. The NN models provided variable “prognostications”, with AUC ranging from 0.5 to 1 and variability in the sensitivity, specificity, and other performance measures. The results illustrate the limitations of survival statistics and NNs in predicting the prognosis of individual patients. The need for pathologists not to overinterpret the finding of significant correlations as “prognostic” or “predictive” for individual patients is discussed.
KW - Accuracy
KW - Evidence-Based Pathology
KW - Kaplan-Meier
KW - Neural Network
KW - Prognosis
UR - http://www.scopus.com/inward/record.url?scp=85083860817&partnerID=8YFLogxK
U2 - 10.1016/j.anndiagpath.2020.151525
DO - 10.1016/j.anndiagpath.2020.151525
M3 - Article
C2 - 32353712
AN - SCOPUS:85083860817
SN - 1092-9134
VL - 46
JO - Annals of Diagnostic Pathology
JF - Annals of Diagnostic Pathology
M1 - 151525
ER -