TY - JOUR
T1 - A Statistical Method for Association Analysis of Cell Type Compositions
AU - Huang, Licai
AU - Little, Paul
AU - Huyghe, Jeroen R.
AU - Shi, Qian
AU - Harrison, Tabitha A.
AU - Yothers, Greg
AU - George, Thomas J.
AU - Peters, Ulrike
AU - Chan, Andrew T.
AU - Newcomb, Polly A.
AU - Sun, Wei
N1 - Publisher Copyright: © 2020, International Chinese Statistical Association.
PY - 2021/12
Y1 - 2021/12
AB - Gene expression data are often collected from tissue samples that are composed of multiple cell types. Studies of cell type composition based on gene expression data from tissue samples have recently attracted increasing research interest and led to new method development for cell type composition estimation. This new information on cell type composition can be associated with individual characteristics (e.g., genetic variants) or clinical outcomes (e.g., survival time). Such association analysis can be conducted for each cell type separately followed by multiple testing correction. An alternative approach is to evaluate this association using the composition of all the cell types, thus aggregating association signals across cell types. A key challenge of this approach is to account for the dependence across cell types. We propose a new method to quantify the distances between cell types while accounting for their dependencies, and use this information for association analysis. We demonstrate our method in two applied examples: to assess the association between immune cell type composition in tumor samples of colorectal cancer patients versus survival time and SNP genotypes. We found immune cell composition has prognostic value, and our distance metric leads to more accurate survival time prediction than other distance metrics that ignore cell type dependencies. In addition, survival time-associated SNPs are enriched among the SNPs associated with immune cell composition.
KW - Cell type composition
KW - Genome-wide associations
KW - Survival time
UR - http://www.scopus.com/inward/record.url?scp=85091089919&partnerID=8YFLogxK
U2 - 10.1007/s12561-020-09293-0
DO - 10.1007/s12561-020-09293-0
M3 - Article
AN - SCOPUS:85091089919
SN - 1867-1764
VL - 13
SP - 373
EP - 385
JO - Statistics in Biosciences
JF - Statistics in Biosciences
IS - 3
ER -