Prediction of disease-associated functional variants in noncoding regions through a comprehensive analysis by integrating datasets and features

  • Yu Lu
  • Yiming Wu
  • Yuan Liu
  • Yizhou Li
  • Runyu Jing
  • Menglong Li

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

One of the greatest challenges in human genetics is deciphering the link between functional variants in noncoding sequences and the pathophysiology of complex diseases. To address this issue, many methods have been developed to distinguish functional single-nucleotide variants (SNVs) from neutral SNVs in noncoding regions. In this study, we integrated well-established features and commonly used datasets, merged them into large-scale datasets, and built a random forest model on them; the model yielded promising performance and outperformed some cutting-edge approaches. Our analyses of feature importance and data coverage also offer clues for future research on improving the prediction of functional noncoding SNVs.

Original language: English
Pages (from-to): 667-684
Number of pages: 18
Journal: Human Mutation
Volume: 42
Issue number: 6
DOIs
State: Published - Jun 2021

Keywords

  • complex diseases
  • feature importance
  • functional variants
  • noncoding regions
  • random forest
