IDSL_MINT: a deep learning framework to predict molecular fingerprints from mass spectra

Sadjad Fakouri Baygi, Dinesh Kumar Barupal

Research output: Contribution to journal, Article, peer-review

1 Scopus citation

Abstract

The majority of tandem mass spectrometry (MS/MS) spectra in untargeted metabolomics and exposomics studies lack any annotation. Our deep learning framework, Integrated Data Science Laboratory for Metabolomics and Exposomics-Mass INTerpreter (IDSL_MINT), can translate MS/MS spectra into molecular fingerprint descriptors. IDSL_MINT allows users to apply the transformer model, the architecture behind large language models, to mass spectrometry data. Models are trained on user-provided reference MS/MS libraries using any customizable molecular fingerprint descriptor. IDSL_MINT was benchmarked using the LipidMaps database and improved the annotation rate of a test study for MS/MS spectra that were not originally annotated using existing mass spectral libraries. IDSL_MINT may improve the overall annotation rates in untargeted metabolomics and exposomics studies. The IDSL_MINT framework and tutorials are available in the GitHub repository at https://github.com/idslme/IDSL_MINT.

Scientific contribution statement. Structural annotation of MS/MS spectra from untargeted metabolomics and exposomics datasets is a major bottleneck in gaining new biological insights. Machine learning models that convert spectra into molecular fingerprints can help in the annotation process. Here, we present IDSL_MINT, a new, easy-to-use and customizable deep learning framework to train and utilize new models that predict molecular fingerprints from spectra for compound annotation workflows.
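As a rough illustration of how a predicted fingerprint might feed into an annotation workflow, the sketch below ranks candidate compounds by Tanimoto similarity between a predicted set of "on" fingerprint bits and each candidate's bits. This is an assumption made for illustration only: the abstract does not state which matching strategy IDSL_MINT uses, and the function names, bit indices, and candidate names here are hypothetical and not part of the IDSL_MINT API.

```python
# Illustrative sketch only; not the IDSL_MINT implementation or API.
# Fingerprints are represented as sets of "on" bit indices (an assumption).

def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    # Two empty fingerprints share nothing; define their similarity as 0.0.
    return len(a & b) / union if union else 0.0


def rank_candidates(predicted_bits, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, tanimoto(predicted_bits, bits))
        for name, bits in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical predicted fingerprint and hypothetical candidate library.
predicted = {1, 4, 7, 9}
candidates = {
    "candidate_A": {1, 4, 7, 9, 12},
    "candidate_B": {2, 4},
    "candidate_C": {3, 5},
}

ranking = rank_candidates(predicted, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setup the candidate sharing the most bits with the prediction ranks first; a real workflow would compute candidate fingerprints with a cheminformatics toolkit and would likely apply additional filters such as precursor mass.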

Original language: English
Article number: 8
Journal: Journal of Cheminformatics
Volume: 16
Issue number: 1
State: Published - Dec 2024

Keywords

  • Deep learning
  • LipidMaps
  • Lipidomics
  • Mass spectrometry
  • Metabolomics
  • Molecular fingerprint descriptor
  • PyTorch
  • Transformer
