TY - JOUR
T1 - Interpretable and context-free deconvolution of multi-scale whole transcriptomic data with UniCell deconvolve
AU - Charytonowicz, Daniel
AU - Brody, Rachel
AU - Sebra, Robert
N1 - Publisher Copyright:
© 2023, The Author(s).
PY - 2023/12
Y1 - 2023/12
N2 - We introduce UniCell: Deconvolve Base (UCDBase), a pre-trained, interpretable, deep learning model to deconvolve cell type fractions and predict cell identity across Spatial, bulk-RNA-Seq, and scRNA-Seq datasets without contextualized reference data. UCD is trained on 10 million pseudo-mixtures from a fully-integrated scRNA-Seq training database comprising over 28 million annotated single cells spanning 840 unique cell types from 898 studies. We show that our UCDBase and transfer-learning models achieve comparable or superior performance on in-silico mixture deconvolution to existing, reference-based, state-of-the-art methods. Feature attribute analysis uncovers gene signatures associated with cell-type specific inflammatory-fibrotic responses in ischemic kidney injury, discerns cancer subtypes, and accurately deconvolves tumor microenvironments. UCD identifies pathologic changes in cell fractions among bulk-RNA-Seq data for several disease states. Applied to lung cancer scRNA-Seq data, UCD annotates and distinguishes normal from cancerous cells. Overall, UCD enhances transcriptomic data analysis, aiding in assessment of cellular and spatial context.
AB - We introduce UniCell: Deconvolve Base (UCDBase), a pre-trained, interpretable, deep learning model to deconvolve cell type fractions and predict cell identity across Spatial, bulk-RNA-Seq, and scRNA-Seq datasets without contextualized reference data. UCD is trained on 10 million pseudo-mixtures from a fully-integrated scRNA-Seq training database comprising over 28 million annotated single cells spanning 840 unique cell types from 898 studies. We show that our UCDBase and transfer-learning models achieve comparable or superior performance on in-silico mixture deconvolution to existing, reference-based, state-of-the-art methods. Feature attribute analysis uncovers gene signatures associated with cell-type specific inflammatory-fibrotic responses in ischemic kidney injury, discerns cancer subtypes, and accurately deconvolves tumor microenvironments. UCD identifies pathologic changes in cell fractions among bulk-RNA-Seq data for several disease states. Applied to lung cancer scRNA-Seq data, UCD annotates and distinguishes normal from cancerous cells. Overall, UCD enhances transcriptomic data analysis, aiding in assessment of cellular and spatial context.
UR - http://www.scopus.com/inward/record.url?scp=85149990583&partnerID=8YFLogxK
U2 - 10.1038/s41467-023-36961-8
DO - 10.1038/s41467-023-36961-8
M3 - Article
C2 - 36906603
AN - SCOPUS:85149990583
SN - 2041-1723
VL - 14
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 1350
ER -