TY - GEN
T1 - Correction for hidden confounders in the genetic analysis of gene expression
AU - Listgarten, Jennifer
AU - Kadie, Carl
AU - Schadt, Eric E.
AU - Heckerman, David
PY - 2011
Y1 - 2011
AB - Understanding the genetic underpinnings of disease is important for screening, treatment, drug development, and basic biological insight. One way of getting at such an understanding is to find out which parts of our DNA, such as single-nucleotide polymorphisms (SNPs), affect particular intermediary processes such as gene expression. Naively, such associations can be identified using a simple statistical test on all paired combinations of genetic variants and gene transcripts. However, a wide variety of confounders lie hidden in the data, leading to both spurious associations and missed associations if not properly addressed. We present a statistical model that jointly corrects for two particular kinds of hidden structure: genetic or population structure (e.g., race, family-relatedness), and microarray expression artifacts (e.g., batch effects), when these confounders are unknown. Applying our method to both real and synthetic, human and mouse data, we demonstrate the need for such a joint correction of confounders, and also the disadvantages of other possible approaches based on those in the current literature.
UR - https://www.scopus.com/pages/publications/80053135149
M3 - Conference contribution
AN - SCOPUS:80053135149
T3 - Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, UAI 2011
SP - 852
BT - Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, UAI 2011
PB - AUAI Press
ER -