Identifying relationships among genomic disease regions: Predicting genes at pathogenic SNP associations and rare deletions

Soumya Raychaudhuri, Robert M. Plenge, Elizabeth J. Rossin, Aylwin C.Y. Ng, Shaun M. Purcell, Pamela Sklar, Edward M. Scolnick, Ramnik J. Xavier, David Altshuler, Mark J. Daly, Kristen Ardlie, M. Helena Azevedo, Nicholas Bass, Douglas H.R. Blackwood, Celia Carvalho, Kimberly Chambert, Khalid Choudhury, David Conti, Aiden Corvin, Nick J. Craddock, Caroline Crombie, David Curtis, Susmita Datta, Stacey B. Gabriel, Casey Gates, Lucy Georgieva, Michael Gill, Hugh Gurling, Peter A. Holmans, Christina M. Hultman, Ayman Fanous, Gillian Fraser, Elaine Kenny, George K. Kirov, James A. Knowles, Robert Krasucki, Joshua Korn, Leh Kwan Soh, Jacob Lawrence, Paul Lichtenstein, Antonio Macedo, Stuart Macgregor, Alan W. Maclean, Scott Mahon, Pat Malloy, Kevin A. McGhee, Andrew McQuillin, Helena Medeiros, Frank Middleton, Vihra Milanova, Christopher Morley, Derek W. Morris, Walter J. Muir, Ivan Nikolov, N. Norton, Colm T. O'Dushlaine, Michael C. O'Donovan, Michael J. Owen, Carlos N. Pato, Carlos Paz Ferreira, Ben Pickard, Jonathan Pimm, Vinay Puri, Digby Quested, Douglas M. Ruderfer, David St. Clair, Jennifer L. Stone, Patrick F. Sullivan, Emma F. Thelander, Srinivasa Thirumalai, Draga Toncheva, Margaret Van Beck, Peter M. Visscher, John L. Waddington, Nicholas Walker, H. Williams, Nigel M. Williams

Research output: Contribution to journal › Article › peer-review

345 Scopus citations


Translating a set of disease regions into insight about pathogenic mechanisms requires identifying not only the key disease genes within them, but also the biological relationships among those key genes. Here we describe a statistical method, Gene Relationships Among Implicated Loci (GRAIL), that takes a list of disease regions and automatically assesses the degree of relatedness of implicated genes using 250,000 PubMed abstracts. We first evaluated GRAIL by assessing its ability to identify subsets of highly related genes in common pathways from validated lipid and height SNP associations from recent genome-wide studies. We then tested GRAIL by assessing its ability to separate true disease regions from many false positive disease regions in two separate practical applications in human genetics. First, we took 74 nominally associated Crohn's disease SNPs and applied GRAIL to identify a subset of 13 SNPs with highly related genes. Of these, ten convincingly validated in follow-up genotyping; genotyping results for the remaining three were inconclusive. Next, we applied GRAIL to 165 rare deletion events seen in schizophrenia cases (less than one-third of which are contributing to disease risk). We demonstrate that GRAIL is able to identify a subset of 16 deletions containing highly related genes; many of these genes are expressed in the central nervous system and play a role in neuronal synapses. GRAIL offers a statistically robust approach to identifying functionally related genes from across multiple disease regions that likely represent key disease pathways. An online version of this method is available for public use ( edu/mpg/grail/).

Original language: English
Article number: e1000534
Journal: PLoS Genetics
Issue number: 6
State: Published - Jun 2009
Externally published: Yes

