TY - JOUR
T1 - A streamlined scRNA-Seq data analysis framework based on improved sparse subspace clustering
AU - Zhuang, Jujuan
AU - Cui, Lingyu
AU - Qu, Tianqi
AU - Ren, Changjing
AU - Xu, Junlin
AU - Li, Tianbao
AU - Tian, Geng
AU - Yang, Jialiang
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2021
Y1 - 2021
N2 - One advantage of single-cell RNA sequencing is its ability to reveal cell heterogeneity through cell clustering. However, cell clustering based on single-cell RNA sequencing is challenging because of high transcript amplification noise, sparsity, and outlier cell populations. In this study, we propose a novel sparse subspace clustering method, Structured Sparse Subspace Clustering and Completion, for single-cell RNA sequencing analysis. The method assumes that related cells lie in the same subspace, so the relationships among cells can be described within a subspace instead of between cell pairs. The proposed optimization model is solved by the Linearized Alternating Direction Method of Multipliers, in which data completion and spectral clustering are combined as a whole by mutual constraint. Notably, a random walk is used in the model to make the coefficient matrix more diagonal during the iterative optimization, and the effect is significant. We apply our model to 6 public single-cell datasets and a simulated dataset, with cell numbers ranging from 56 to over 3000, and compare it with 5 state-of-the-art clustering methods. Our model outperforms the other clustering methods in clustering accuracy, as evaluated by Adjusted Rand Index, Normalized Mutual Information, Homogeneity, and Completeness, especially when compared with the other improved sparse subspace clustering methods.
AB - One advantage of single-cell RNA sequencing is its ability to reveal cell heterogeneity through cell clustering. However, cell clustering based on single-cell RNA sequencing is challenging because of high transcript amplification noise, sparsity, and outlier cell populations. In this study, we propose a novel sparse subspace clustering method, Structured Sparse Subspace Clustering and Completion, for single-cell RNA sequencing analysis. The method assumes that related cells lie in the same subspace, so the relationships among cells can be described within a subspace instead of between cell pairs. The proposed optimization model is solved by the Linearized Alternating Direction Method of Multipliers, in which data completion and spectral clustering are combined as a whole by mutual constraint. Notably, a random walk is used in the model to make the coefficient matrix more diagonal during the iterative optimization, and the effect is significant. We apply our model to 6 public single-cell datasets and a simulated dataset, with cell numbers ranging from 56 to over 3000, and compare it with 5 state-of-the-art clustering methods. Our model outperforms the other clustering methods in clustering accuracy, as evaluated by Adjusted Rand Index, Normalized Mutual Information, Homogeneity, and Completeness, especially when compared with the other improved sparse subspace clustering methods.
KW - Linearized alternating direction method of multipliers
KW - low-rank representation
KW - single-cell RNA sequencing
KW - sparse subspace clustering
UR - http://www.scopus.com/inward/record.url?scp=85099579050&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3049807
DO - 10.1109/ACCESS.2021.3049807
M3 - Article
AN - SCOPUS:85099579050
SN - 2169-3536
VL - 9
SP - 9719
EP - 9727
JO - IEEE Access
JF - IEEE Access
M1 - 9316660
ER -