A community effort to identify and correct mislabeled samples in proteogenomic studies

Seungyeul Yoo, Zhiao Shi, Bo Wen, Soon Jye Kho, Renke Pan, Hanying Feng, Hong Chen, Anders Carlsson, Patrik Edén, Weiping Ma, Michael Raymer, Ezekiel J. Maier, Zivana Tezak, Elaine Johanson, Denise Hinton, Henry Rodriguez, Jun Zhu, Emily Boja, Pei Wang, Bing Zhang

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Sample mislabeling or misannotation has been a long-standing problem in scientific research, particularly prevalent in large-scale, multi-omic studies due to the complexity of multi-omic workflows. There exists an urgent need for implementing quality controls to automatically screen for and correct sample mislabels or misannotations in multi-omic studies. Here, we describe a crowdsourced precisionFDA NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge, which provides a framework for systematic benchmarking and evaluation of mislabel identification and correction methods for integrative proteogenomic studies. The challenge received a large number of submissions from domestic and international data scientists, with highly variable performance observed across the submitted methods. Post-challenge collaboration between the top-performing teams and the challenge organizers has created an open-source software tool, COSMO, with demonstrated high accuracy and robustness in mislabeling identification and correction in simulated and real multi-omic datasets.

Original language: English
Article number: 100245
Journal: Patterns
Volume: 2
Issue number: 5
State: Published - 14 May 2021

Keywords

  • CPTAC
  • DSML 3: Development/Pre-production: Data science output has been rolled out/validated across multiple domains/problems
  • crowdsourcing challenge
  • mislabeling
  • multi-omics
  • proteomics
