TY - JOUR
T1 - Deep Learning-Enabled Automated Quality Control for Liver MR Elastography
T2 - Initial Results
AU - Nieves-Vazquez, Heriberto A.
AU - Ozkaya, Efe
AU - Meinhold, Waiman
AU - Geahchan, Amine
AU - Bane, Octavia
AU - Ueda, Jun
AU - Taouli, Bachir
N1 - Publisher Copyright: © 2024 The Author(s). Journal of Magnetic Resonance Imaging published by Wiley Periodicals LLC on behalf of International Society for Magnetic Resonance in Medicine.
PY - 2024
Y1 - 2024
AB - Background: Several factors can impair image quality and reliability of liver magnetic resonance elastography (MRE), such as inadequate driver positioning, insufficient wave propagation and patient-related factors. Purpose: To report initial results on automatic classification of liver MRE image quality using various deep learning (DL) architectures. Study Type: Retrospective, single center, IRB-approved human study. Population: Ninety patients (male = 51, mean age 52.8 ± 14.1 years). Field Strengths/Sequences: 1.5 T and 3 T MRI, 2D GRE, and 2D SE-EPI. Assessment: The curated dataset was comprised of 914 slices obtained from 149 MRE exams in 90 patients. Two independent observers examined the confidence map overlaid elastograms (CMOEs) for liver stiffness measurement and assigned a quality score (non-diagnostic vs. diagnostic) for each slice. Several DL architectures (ResNet18, ResNet34, ResNet50, SqueezeNet, and MobileNetV2) for binary quality classification of individual CMOE slice inputs were evaluated, using an 8-fold stratified cross-validation (800 slices) and a test dataset (114 slices). A majority vote ensemble combining the models' predictions of the highest-performing architecture was evaluated. Statistical Test: The inter-observer agreement and the agreement between DL models and one observer were assessed using Cohen's unweighted Kappa coefficient. Accuracy, precision, and recall of the cross-validation and the ensemble were calculated for the test dataset. Results: The average accuracy across the eight models trained using each architecture ranged from 0.692 to 0.851 for the test dataset. The ensemble of the best performing architecture (SqueezeNet) yielded an accuracy of 0.921. The inter-observer agreement was excellent (Kappa 0.896 [95% CI 0.845–0.947]). The agreement between observer 1 and the predictions of each SqueezeNet model was slight to almost perfect (Kappa range: 0.197–0.831) and almost perfect for the ensemble (Kappa: 0.833). Conclusion: Our initial study demonstrates an automated DL-based approach for classifying liver 2D MRE diagnostic quality with an average accuracy of 0.851 (range 0.675–0.921) across the SqueezeNet models. Evidence Level: 4. Technical Efficacy: Stage 1.
KW - deep learning
KW - image quality control
KW - liver stiffness
KW - magnetic resonance elastography
UR - http://www.scopus.com/inward/record.url?scp=85196546862&partnerID=8YFLogxK
U2 - 10.1002/jmri.29490
DO - 10.1002/jmri.29490
M3 - Article
AN - SCOPUS:85196546862
SN - 1053-1807
JO - Journal of Magnetic Resonance Imaging
JF - Journal of Magnetic Resonance Imaging
ER -