TY - JOUR
T1 - Predicting Collision Cross-Section Values for Small Molecules through Chemical Class-Based Multimodal Graph Attention Network
AU - Wang, Cheng
AU - Yuan, Chuang
AU - Wang, Yahui
AU - Shi, Yuying
AU - Zhang, Tao
AU - Patti, Gary J.
N1 - Publisher Copyright: © 2024 American Chemical Society.
PY - 2024/8/26
Y1 - 2024/8/26
AB - Libraries of collision cross-section (CCS) values have the potential to facilitate compound identification in metabolomics. Although computational methods provide an opportunity to increase library size rapidly, accurate prediction of CCS values remains challenging due to the structural diversity of small molecules. Here, we developed a machine learning (ML) model that integrates graph attention networks and multimodal molecular representations to predict CCS values on the basis of chemical class. Our approach, referred to as MGAT-CCS, had superior performance in comparison to other ML models in CCS prediction. MGAT-CCS achieved a median relative error of 0.47%/1.14% (positive/negative mode) and 1.40%/1.63% (positive/negative mode) for lipids and metabolites, respectively. When MGAT-CCS was applied to real-world metabolomics data, it reduced the number of false metabolite candidates by roughly 25% across multiple sample types ranging from plasma and urine to cells. To facilitate its application, we developed a user-friendly stand-alone web server for MGAT-CCS that is freely available at https://mgat-ccs-web.onrender.com. This work represents a step forward in predicting CCS values and can potentially facilitate the identification of small molecules when using ion mobility spectrometry coupled with mass spectrometry.
UR - http://www.scopus.com/inward/record.url?scp=85197550481&partnerID=8YFLogxK
U2 - 10.1021/acs.jcim.3c01934
DO - 10.1021/acs.jcim.3c01934
M3 - Article
AN - SCOPUS:85197550481
SN - 1549-9596
VL - 64
SP - 6305
EP - 6315
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 16
ER -