TY - JOUR
T1 - Prediction of colorectal cancer microsatellite instability and tumor mutational burden from histopathological images using multiple instance learning
AU - Wang, Wenyan
AU - Shi, Wei
AU - Nie, Chuanqi
AU - Xing, Weipeng
AU - Yang, Hailong
AU - Li, Feng
AU - Liu, Jinyang
AU - Tian, Geng
AU - Wang, Bing
AU - Yang, Jialiang
N1 - Publisher Copyright: © 2025
PY - 2025/6
Y1 - 2025/6
AB - Recent advancements in deep learning have enabled the prediction of microsatellite instability (MSI) and tumor mutational burden (TMB) status of colorectal cancer (CRC) patients using whole slide histopathological images (WSIs). However, current methods suffer from poor prediction accuracy and lack interpretability, which hinders their clinical application. To address this, we propose a new cascaded two-stage multiple instance learning (MIL) method called CasNet for predicting MSI and TMB. CasNet employs a supervised ResNet model to extract informative image features from patches within the WSI. It then evaluates the importance of each patch using gradient-weighted class activation mapping (Grad-CAM) and an attention mechanism. On the CRC dataset from The Cancer Genome Atlas (TCGA), CasNet achieved an area under the curve (AUC) of 0.909 for predicting MSI status and a mean AUC of 0.8818 in 5-fold cross-validation for TMB prediction, outperforming seven other state-of-the-art methods. Furthermore, we demonstrate the robustness of CasNet by achieving AUC scores of 0.88 and 0.84 for MSI and TMB predictions, respectively, using only 40% of the samples for training. To enhance the interpretability of CasNet, a segmentation method based on Hover-Net is used to analyze the differences in cell content between MSI and microsatellite-stable (MSS) groups. Overall, CasNet is an accurate and interpretable method for predicting MSI and TMB, making it a promising tool for predicting biomarkers even with limited training data.
KW - Microsatellite instability (MSI)
KW - Multiple instance learning method
KW - colorectal cancer patients (CRCs)
KW - pathological whole slide images (WSI)
KW - tumor mutation burden (TMB)
UR - https://www.scopus.com/pages/publications/85216218063
U2 - 10.1016/j.bspc.2025.107608
DO - 10.1016/j.bspc.2025.107608
M3 - Article
AN - SCOPUS:85216218063
SN - 1746-8094
VL - 104
JO - Biomedical Signal Processing and Control
JF - Biomedical Signal Processing and Control
M1 - 107608
ER -