TY - JOUR
T1 - Machine learning approaches to analyze histological images of tissues from radical prostatectomies
AU - Gertych, Arkadiusz
AU - Ing, Nathan
AU - Ma, Zhaoxuan
AU - Fuchs, Thomas J.
AU - Salman, Sadri
AU - Mohanty, Sambit
AU - Bhele, Sanica
AU - Velásquez-Vacca, Adriana
AU - Amin, Mahul B.
AU - Knudsen, Beatrice S.
N1 - Publisher Copyright:
© 2015 Elsevier Ltd.
PY - 2015/12/1
Y1 - 2015/12/1
N2 - Computerized evaluation of histological preparations of prostate tissues involves identification of tissue components such as stroma (ST), benign/normal epithelium (BN) and prostate cancer (PCa). Image classification approaches have been developed to identify and classify glandular regions in digital images of prostate tissues; however their success has been limited by difficulties in cellular segmentation and tissue heterogeneity. We hypothesized that utilizing image pixels to generate intensity histograms of hematoxylin (H) and eosin (E) stains deconvoluted from H&E images numerically captures the architectural difference between glands and stroma. In addition, we postulated that joint histograms of local binary patterns and local variance (LBPxVAR) can be used as sensitive textural features to differentiate benign/normal tissue from cancer. Here we utilized a machine learning approach comprising of a support vector machine (SVM) followed by a random forest (RF) classifier to digitally stratify prostate tissue into ST, BN and PCa areas. Two pathologists manually annotated 210 images of low- and high-grade tumors from slides that were selected from 20 radical prostatectomies and digitized at high-resolution. The 210 images were split into the training (n=19) and test (n=191) sets. Local intensity histograms of H and E were used to train a SVM classifier to separate ST from epithelium (BN+PCa). The performance of SVM prediction was evaluated by measuring the accuracy of delineating epithelial areas. The Jaccard J=59.5±14.6 and RandRi=62.0±7.5 indices reported a significantly better prediction when compared to a reference method (Chen et al., Clinical Proteomics 2013, 10:18) based on the averaged values from the test set. To distinguish BN from PCa we trained a RF classifier with LBPxVAR and local intensity histograms and obtained separate performance values for BN and PCa: JBN=35.2±24.9, OBN=49.6±32, JPCa=49.5±18.5, OPCa=72.7±14.8 andRi=60.6±7.6 in the test set. Our pixel-based classification does not rely on the detection of lumens, which is prone to errors and has limitations in high-grade cancers and has the potential to aid in clinical studies in which the quantification of tumor content is necessary to prognosticate the course of the disease. The image data set with ground truth annotation is available for public use to stimulate further research in this area.
AB - Computerized evaluation of histological preparations of prostate tissues involves identification of tissue components such as stroma (ST), benign/normal epithelium (BN) and prostate cancer (PCa). Image classification approaches have been developed to identify and classify glandular regions in digital images of prostate tissues; however their success has been limited by difficulties in cellular segmentation and tissue heterogeneity. We hypothesized that utilizing image pixels to generate intensity histograms of hematoxylin (H) and eosin (E) stains deconvoluted from H&E images numerically captures the architectural difference between glands and stroma. In addition, we postulated that joint histograms of local binary patterns and local variance (LBPxVAR) can be used as sensitive textural features to differentiate benign/normal tissue from cancer. Here we utilized a machine learning approach comprising of a support vector machine (SVM) followed by a random forest (RF) classifier to digitally stratify prostate tissue into ST, BN and PCa areas. Two pathologists manually annotated 210 images of low- and high-grade tumors from slides that were selected from 20 radical prostatectomies and digitized at high-resolution. The 210 images were split into the training (n=19) and test (n=191) sets. Local intensity histograms of H and E were used to train a SVM classifier to separate ST from epithelium (BN+PCa). The performance of SVM prediction was evaluated by measuring the accuracy of delineating epithelial areas. The Jaccard J=59.5±14.6 and RandRi=62.0±7.5 indices reported a significantly better prediction when compared to a reference method (Chen et al., Clinical Proteomics 2013, 10:18) based on the averaged values from the test set. To distinguish BN from PCa we trained a RF classifier with LBPxVAR and local intensity histograms and obtained separate performance values for BN and PCa: JBN=35.2±24.9, OBN=49.6±32, JPCa=49.5±18.5, OPCa=72.7±14.8 andRi=60.6±7.6 in the test set. Our pixel-based classification does not rely on the detection of lumens, which is prone to errors and has limitations in high-grade cancers and has the potential to aid in clinical studies in which the quantification of tumor content is necessary to prognosticate the course of the disease. The image data set with ground truth annotation is available for public use to stimulate further research in this area.
KW - Image analysis
KW - Machine learning
KW - Prostate cancer
KW - Tissue classification
KW - Tissue quantification
UR - http://www.scopus.com/inward/record.url?scp=84983119355&partnerID=8YFLogxK
U2 - 10.1016/j.compmedimag.2015.08.002
DO - 10.1016/j.compmedimag.2015.08.002
M3 - Article
C2 - 26362074
AN - SCOPUS:84983119355
SN - 0895-6111
VL - 46
SP - 197
EP - 208
JO - Computerized Medical Imaging and Graphics
JF - Computerized Medical Imaging and Graphics
ER -