Genome‐wide Association Studies: The Success, Failure and Future

Abstract

Genome‐wide association studies (GWAS) are now in their fifth year: the first study, published in 2005, identified the association between complement factor H and age‐related macular degeneration. This landmark study marked the start of a new era in the genetic study of human complex diseases. Since then, more than 350 GWAS have been published, reporting associations for more than 1500 SNPs (single nucleotide polymorphisms) or genetic loci. GWAS have significantly advanced our understanding of the genetic basis of complex diseases and traits compared with the pre‐genome‐wide era, when linkage mapping and candidate gene association studies were broadly applied. Nevertheless, most of the inherited risk remains unexplained for all the phenotypes investigated so far, which suggests that we still have a long way to go to decipher the genetic basis of human complex traits.

Key concepts

  • The primary aim of genome‐wide association studies (GWAS) is to identify novel genetic variants that elucidate the biological pathways of disease and eventually lead to new molecular markers for diagnostic application or drug targets for therapeutic intervention.

  • GWAS is a comprehensive and biologically agnostic approach to searching for unknown disease variants; this method has been very successful in identifying novel genetic loci for various human complex traits.

  • GWAS findings have also provided new insights into the molecular pathways of complex diseases, even though most of the disease causative variants remain to be distinguished from the neighbouring correlated markers.

  • Most of the risk alleles that have been identified by GWAS are common (allele frequency >5%) and confer small effect sizes (odds ratio, OR <1.5).

  • Owing to the small effect sizes, collectively the identified SNPs only explain a small portion of the total inherited risk for the diseases or traits.

  • Only a small number of the risk alleles identified by GWAS are nonsynonymous SNPs in exons.

  • Most SNPs are located in intronic, intergenic or gene desert regions, and they may well be functional because their locations may coincide with regulatory elements such as enhancers, insulators, transcription factor‐binding sites and sequences encoding microRNAs.

  • The genetic architecture of complex diseases remains elusive; it is unclear how much each type of genetic variant contributes to inherited risk and the relative proportion of rare versus common variants.

  • Other potential aspects to be included in future genetic studies of complex diseases are gene–gene and gene–environment interactions, as well as epigenetic studies.

  • Sequencing‐based methods are more efficient for systematically identifying and studying both common and rare SNPs and non‐SNP variants, and also for transcript expression and epigenetic studies.
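
The thresholds quoted in the key concepts above (allele frequency >5% and OR <1.5) can be made concrete with a small calculation. The following is a minimal sketch, not a method taken from the article: it assumes a simple allelic case-control comparison, uses the standard Woolf (log) approximation for the confidence interval, and all function names and allele counts are hypothetical.

```python
import math


def allelic_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds ratio for the risk allele from allele counts in cases and controls."""
    return (case_risk * control_other) / (case_other * control_risk)


def odds_ratio_ci(case_risk, case_other, control_risk, control_other, z=1.96):
    """Approximate confidence interval (Woolf method; z=1.96 gives about 95%)."""
    log_or = math.log(
        allelic_odds_ratio(case_risk, case_other, control_risk, control_other)
    )
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def risk_allele_frequency(control_risk, control_other):
    """Risk allele frequency estimated from control chromosomes."""
    return control_risk / (control_risk + control_other)


def is_common_small_effect(frequency, odds_ratio):
    """True if the allele meets the thresholds quoted above: >5% and OR <1.5."""
    return frequency > 0.05 and odds_ratio < 1.5


# Hypothetical allele counts for illustration only (not data from any study).
cases = (2400, 7600)      # (risk allele, other allele) among case chromosomes
controls = (2000, 8000)   # (risk allele, other allele) among control chromosomes

odds_ratio = allelic_odds_ratio(*cases, *controls)
lower, upper = odds_ratio_ci(*cases, *controls)
frequency = risk_allele_frequency(*controls)

print(f"risk allele frequency in controls: {frequency:.2f}")
print(f"odds ratio: {odds_ratio:.2f} (approx. 95% CI {lower:.2f} to {upper:.2f})")
print("common, small-effect allele:", is_common_small_effect(frequency, odds_ratio))
```

With these made-up counts the risk allele is common and its odds ratio falls below 1.5, which is the typical profile of a GWAS finding described above. Real analyses would also need to account for genotyping quality, population structure and multiple testing, none of which this sketch attempts.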

Keywords: genome‐wide association studies; rare variants; copy number variants; 1000 Genomes Project; next‐generation sequencing technologies

Further reading

Donnelly P (2008) Progress and challenges in genome‐wide association studies in humans. Nature 456: 728–731.

Goldstein DB (2009) Common genetic variation and human traits. New England Journal of Medicine 360: 1696–1698.

Hardy J and Singleton A (2009) Genomewide association studies and human disease. New England Journal of Medicine 360: 1759–1768.

Hirschhorn JN (2009) Genomewide association studies – illuminating biologic pathways. New England Journal of Medicine 360: 1699–1701.

Kraft P and Hunter DJ (2009) Genetic risk prediction – are we there yet? New England Journal of Medicine 360: 1701–1703.

How to Cite

Chee‐Seng, Ku, En Yun, Loy, Yudi, Pawitan, and Kee‐Seng, Chia (Dec 2009) Genome‐wide Association Studies: The Success, Failure and Future. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1002/9780470015902.a0021995]