Interpretable and context-free deconvolution of multi-scale whole transcriptomic data with UniCell deconvolve

Daniel Charytonowicz, Rachel Brody, Robert Sebra

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

We introduce UniCell: Deconvolve Base (UCDBase), a pre-trained, interpretable, deep learning model to deconvolve cell type fractions and predict cell identity across Spatial, bulk-RNA-Seq, and scRNA-Seq datasets without contextualized reference data. UCD is trained on 10 million pseudo-mixtures from a fully-integrated scRNA-Seq training database comprising over 28 million annotated single cells spanning 840 unique cell types from 898 studies. We show that our UCDBase and transfer-learning models achieve comparable or superior performance on in-silico mixture deconvolution to existing, reference-based, state-of-the-art methods. Feature attribute analysis uncovers gene signatures associated with cell-type specific inflammatory-fibrotic responses in ischemic kidney injury, discerns cancer subtypes, and accurately deconvolves tumor microenvironments. UCD identifies pathologic changes in cell fractions among bulk-RNA-Seq data for several disease states. Applied to lung cancer scRNA-Seq data, UCD annotates and distinguishes normal from cancerous cells. Overall, UCD enhances transcriptomic data analysis, aiding in assessment of cellular and spatial context.
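
The abstract says the model is trained on pseudo-mixtures drawn from annotated single cells. As a rough illustration of that general idea only, the sketch below builds one pseudo-mixture with known cell-type fractions from a toy count matrix. It is not the authors' pipeline: the function name `make_pseudo_mixture`, the Dirichlet sampling of proportions, the default of 100 cells per mixture, and summing raw counts are all assumptions made for this example.

```python
import numpy as np


def make_pseudo_mixture(counts, labels, n_cells=100, rng=None):
    """Build one pseudo-bulk mixture from single-cell counts (hypothetical helper).

    counts: (cells, genes) array of expression counts
    labels: (cells,) sequence of cell-type labels
    Returns (mixture, fractions): the summed expression of the sampled cells
    and a dict mapping each cell type to its true fraction in the mixture.
    """
    rng = np.random.default_rng(rng)
    counts = np.asarray(counts)
    labels = np.asarray(labels)
    cell_types = np.unique(labels)

    # Assumption: target proportions come from a flat Dirichlet distribution.
    target = rng.dirichlet(np.ones(len(cell_types)))
    # Number of cells drawn per type; these always sum to n_cells.
    n_per_type = rng.multinomial(n_cells, target)

    mixture = np.zeros(counts.shape[1], dtype=float)
    for cell_type, n in zip(cell_types, n_per_type):
        if n == 0:
            continue
        candidates = np.flatnonzero(labels == cell_type)
        # Sample with replacement so rare cell types can still be drawn.
        chosen = rng.choice(candidates, size=n, replace=True)
        mixture += counts[chosen].sum(axis=0)

    fractions = {str(ct): float(n) / n_cells for ct, n in zip(cell_types, n_per_type)}
    return mixture, fractions
```

Each call yields one (mixture, fractions) pair: the mixture serves as a model input and the fractions as the regression target. Repeating the call with different seeds produces a training set of any size.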

Original language: English
Article number: 1350
Journal: Nature Communications
Volume: 14
Issue number: 1
State: Published - Dec 2023
