CMF-Impute: An accurate imputation tool for single-cell RNA-seq data

Junlin Xu, Lijun Cai, Bo Liao, Wen Zhu, Jia Liang Yang

Research output: Contribution to journal › Article › peer-review

83 Scopus citations

Abstract

Motivation: Single-cell RNA-sequencing (scRNA-seq) technology provides a powerful tool for investigating cell heterogeneity and cell subpopulations by allowing the quantification of gene expression at the single-cell level. However, scRNA-seq data analysis remains challenging because of various sources of technical noise, such as dropout events (i.e. excessive zero counts in the expression matrix).

Results: By taking into consideration the associations among cells and among genes, we propose a novel collaborative matrix factorization-based method called CMF-Impute to impute the dropout entries of a given scRNA-seq expression matrix. We test CMF-Impute and compare it with five other state-of-the-art methods on six popular real scRNA-seq datasets of various sizes and three simulated datasets. For simulated datasets, CMF-Impute outperforms the other methods in recovering dropout values closest to the original expression values, as evaluated by both the sum of squared error and the Pearson correlation coefficient. For real datasets, CMF-Impute achieves the most accurate cell classification results regardless of the choice of clustering method (SC3, or t-SNE followed by K-means), as evaluated by both the adjusted Rand index and normalized mutual information. Finally, we demonstrate that CMF-Impute is powerful in reconstructing cell-to-cell and gene-to-gene correlations, and in inferring cell lineage trajectories.
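The simulated-data evaluation described in the abstract (sum of squared error and Pearson correlation on dropout entries) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the gamma-distributed toy matrix, the 30% dropout rate, and the `gene_mean_impute` placeholder are hypothetical and are not the CMF-Impute algorithm or the authors' simulation setup.

```python
import numpy as np

def sse(original, imputed, mask):
    """Sum of squared error, restricted to the masked (dropout) entries."""
    diff = original[mask] - imputed[mask]
    return float(np.sum(diff ** 2))

def pearson(original, imputed, mask):
    """Pearson correlation between original and imputed values at dropout entries."""
    return float(np.corrcoef(original[mask], imputed[mask])[0, 1])

def gene_mean_impute(observed):
    """Hypothetical placeholder imputer (NOT CMF-Impute): replace each zero
    with the mean of the non-zero entries of the same gene (row)."""
    imputed = observed.astype(float).copy()
    for g in range(observed.shape[0]):
        row = observed[g]
        nonzero = row[row > 0]
        if nonzero.size:
            imputed[g, row == 0] = nonzero.mean()
    return imputed

# Toy "ground truth" matrix (genes x cells), strictly positive so that
# every zero in the observed matrix is a simulated dropout.
rng = np.random.default_rng(0)
truth = rng.gamma(shape=2.0, scale=2.0, size=(50, 40)) + 0.1

# Simulate dropouts by zeroing roughly 30% of the entries at random.
dropout_mask = rng.random(truth.shape) < 0.3
observed = np.where(dropout_mask, 0.0, truth)

imputed = gene_mean_impute(observed)

sse_zero = sse(truth, observed, dropout_mask)     # baseline: leave zeros as-is
sse_imputed = sse(truth, imputed, dropout_mask)   # after placeholder imputation
r = pearson(truth, imputed, dropout_mask)

print(f"SSE without imputation: {sse_zero:.1f}")
print(f"SSE after imputation:   {sse_imputed:.1f}")
print(f"Pearson r on dropouts:  {r:.3f}")
```

Both metrics are computed only on the entries that were masked, so they measure how well an imputer recovers the hidden values rather than how well it preserves the values it was given. Any other imputation method could be substituted for the placeholder and scored the same way.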

Original language: English
Pages (from-to): 3139-3147
Number of pages: 9
Journal: Bioinformatics
Volume: 36
Issue number: 10
State: Published - 1 May 2020
Externally published: Yes
