TY - JOUR
T1 - Comparison of statistical models for analyzing genotype, inferred haplotype, and molecular haplotype data
AU - Wallenstein, Sylvan
AU - Chen, Jia
AU - Wetmur, James G.
N1 - Funding Information:
This work was supported by Grants R21ES11643, P01ES0009585 and P42ES07384 from the National Institute of Environmental Health Sciences, R24CA088282 from the National Cancer Institute, and RD831-711 from the Environmental Protection Agency. We thank the reviewers for their helpful comments.
PY - 2006/11
Y1 - 2006/11
AB - This report compares statistical models based on molecular and inferred haplotypes of the human paraoxonase-1 gene (PON1). In a study of 402 women from three racial/ethnic groups, 137 women had ambiguous inferred haplotypes. The inferred haplotypes (those with the highest posterior probability) for 20 of these women differed from the molecular haplotypes, whereas 30 discrepancies were expected based on the posterior distribution from the imputation method. We examined the proportion of the variance in PON1 enzymatic activity (phenotype) explained by genotype, and by inferred and molecular haplotype information. For Caucasians, adjusted R² improved from 16% for the genotype count model to 29% for imputed haplotypes, and further to 33% for molecular haplotypes. For Hispanics and African-Americans, there was no indication that haplotypes helped in explaining PON1 activity, and the imputed model gave essentially the same R² as the molecular model. For African-Americans, none of the models had an adjusted R² that exceeded 4%, while for Hispanics they were all about 21-22%. We propose a new parsimonious model which uses all the genotype information and selected haplotype information. For PON1, this model achieves essentially the same adjusted R² as the all-haplotype model, with potential cost savings and without giving the extreme predictions for uncommon haplotype combinations that the all-haplotype model provides.
KW - Genotype
KW - Haplotype
KW - Inferred haplotype
KW - Molecular haplotype
KW - Phenotype
UR - http://www.scopus.com/inward/record.url?scp=33748956799&partnerID=8YFLogxK
U2 - 10.1016/j.ymgme.2006.05.004
DO - 10.1016/j.ymgme.2006.05.004
M3 - Article
C2 - 16782380
AN - SCOPUS:33748956799
SN - 1096-7192
VL - 89
SP - 270
EP - 273
JO - Molecular Genetics and Metabolism
JF - Molecular Genetics and Metabolism
IS - 3
ER -