TY - GEN
T1 - VIS-MAE: An Efficient Self-supervised Learning Approach on Medical Image Segmentation and Classification
T2 - 15th International Workshop on Machine Learning in Medical Imaging, MLMI 2024, held in conjunction with the 27th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2024
AU - Liu, Zelong
AU - Tieu, Andrew
AU - Patel, Nikhil
AU - Soultanidis, George
AU - Deyer, Louisa
AU - Wang, Ying
AU - Huver, Sean
AU - Zhou, Alexander
AU - Mei, Yunhao
AU - Fayad, Zahi A.
AU - Deyer, Timothy
AU - Mei, Xueyan
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Artificial Intelligence (AI) has the potential to revolutionize diagnosis and segmentation in medical imaging. However, development and clinical implementation face multiple challenges, including limited data availability, lack of generalizability, and the need to incorporate multi-modal data effectively. A foundation model, a large-scale pre-trained AI model, offers a versatile base that can be adapted to a variety of specific tasks and contexts. Here, we present the VIsualization and Segmentation Masked AutoEncoder (VIS-MAE), novel model weights designed specifically for medical imaging. VIS-MAE is pre-trained with self-supervised learning on a dataset of 2.5 million unlabeled images from multiple modalities (CT, MR, PET, X-ray, and ultrasound), and is then adapted to classification and segmentation tasks using explicit labels. VIS-MAE outperforms several benchmark models in both in-domain and out-of-domain applications. It also offers improved label efficiency: it achieves performance comparable to models initialized with other pre-trained weights while using only 50% or 80% of the labeled training data. VIS-MAE represents a significant advancement in medical imaging AI, offering a generalizable and robust solution that improves segmentation and classification performance while reducing the data annotation workload. The source code of this work is available at https://github.com/lzl199704/VIS-MAE.
AB - Artificial Intelligence (AI) has the potential to revolutionize diagnosis and segmentation in medical imaging. However, development and clinical implementation face multiple challenges, including limited data availability, lack of generalizability, and the need to incorporate multi-modal data effectively. A foundation model, a large-scale pre-trained AI model, offers a versatile base that can be adapted to a variety of specific tasks and contexts. Here, we present the VIsualization and Segmentation Masked AutoEncoder (VIS-MAE), novel model weights designed specifically for medical imaging. VIS-MAE is pre-trained with self-supervised learning on a dataset of 2.5 million unlabeled images from multiple modalities (CT, MR, PET, X-ray, and ultrasound), and is then adapted to classification and segmentation tasks using explicit labels. VIS-MAE outperforms several benchmark models in both in-domain and out-of-domain applications. It also offers improved label efficiency: it achieves performance comparable to models initialized with other pre-trained weights while using only 50% or 80% of the labeled training data. VIS-MAE represents a significant advancement in medical imaging AI, offering a generalizable and robust solution that improves segmentation and classification performance while reducing the data annotation workload. The source code of this work is available at https://github.com/lzl199704/VIS-MAE.
KW - Label efficiency
KW - Masked Autoencoder
KW - Medical Image Segmentation and Classification
KW - Self-supervised Learning
UR - http://www.scopus.com/inward/record.url?scp=85208427347&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-73290-4_10
DO - 10.1007/978-3-031-73290-4_10
M3 - Conference contribution
AN - SCOPUS:85208427347
SN - 9783031732928
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 95
EP - 107
BT - Machine Learning in Medical Imaging - 15th International Workshop, MLMI 2024, Held in Conjunction with MICCAI 2024, Proceedings
A2 - Xu, Xuanang
A2 - Cui, Zhiming
A2 - Sun, Kaicong
A2 - Rekik, Islem
A2 - Ouyang, Xi
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 6 October 2024 through 6 October 2024
ER -