Genome‐wide Association Studies


The genome‐wide association approach has become a reality as a result of significant advances in several areas: genomic resources such as the International HapMap Project, high‐throughput genotyping technologies, the collection of large numbers of cases and controls, and the development of powerful statistical analysis tools. The strength of this approach is its agnostic, genome‐wide search for disease‐associated single nucleotide polymorphisms. More than 100 genome‐wide association studies published since 2007 have identified numerous novel genetic variants or loci that are associated with many complex diseases. These studies have also linked new molecular pathways to various diseases. This is currently the most successful approach for pinpointing common variants with modest genetic effects that are associated with common complex diseases. Beyond these discoveries, there is a need to demonstrate their biological significance and to incorporate structural variation as well as gene–gene, epigenetic and gene–environment interactions.

Key concepts

  • In the pre‐genomic era, the genetic dissection of complex diseases was done through classical linkage studies and candidate gene‐based association studies.

  • The classical linkage study is a powerful approach for identifying rare, highly penetrant disease variants or genes.

  • The candidate gene approach in the pre‐genomic era was limited to a few genetic markers in genes suspected to be involved in the pathogenesis of the complex disease.

  • The genome‐wide association (GWA) approach was first proposed by Risch and Merikangas in 1996 as a statistically more powerful approach than the linkage study design for detecting common variants with modest genetic effects.

  • The International HapMap Project was initiated in 2003 to characterize haplotype patterns in the human genome and subsequently to identify tagging SNPs.

  • The human genome can be organized into haplotypes within which the SNPs are in strong linkage disequilibrium (LD).

  • In the direct association study design, or gene‐centric approach, SNPs that are likely to be functionally important, e.g. nonsynonymous SNPs, are selected.

  • Most genome‐wide association studies (GWAS) were conducted in a two‐stage or multi‐stage design, which is more cost‐effective because only a fraction of the samples is genotyped with several hundred thousand SNPs.

  • Within the last few years, the current generation of GWAS has rapidly uncovered novel genes associated with common complex diseases.

  • Future GWAS will have to explore structural variations, gene–gene interactions, epigenetic effects and gene–environment interactions. Including environmental factors will require a prospective study design.
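
Two quantities underlie the concepts above: the LD between a pair of SNPs, which determines whether one SNP can serve as a tag for another, and the single‐SNP test comparing allele counts in cases and controls. The following Python sketch illustrates both under simplifying assumptions. It uses the r² measure of LD computed from phased haplotype counts and a 1‐degree‐of‐freedom allelic chi‐square test; these are common textbook choices, not a method stated in this article. The function names and all counts are hypothetical.

```python
from math import erfc, sqrt


def ld_r2(n_ab, n_aB, n_Ab, n_AB):
    """r^2 between two biallelic SNPs from the four haplotype counts.

    Hypothetical helper: assumes phased haplotypes and that both SNPs
    are polymorphic (otherwise the denominator is zero).
    """
    n = n_ab + n_aB + n_Ab + n_AB
    p_ab = n_ab / n                # frequency of the a-b haplotype
    p_a = (n_ab + n_aB) / n        # frequency of allele a at SNP 1
    p_b = (n_ab + n_Ab) / n        # frequency of allele b at SNP 2
    d = p_ab - p_a * p_b           # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Allelic chi-square test on a 2x2 table of allele counts.

    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    No correction for multiple testing or population structure.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        (case_a + case_b) * (ctrl_a + ctrl_b)
        * (case_a + ctrl_a) * (case_b + ctrl_b)
    )
    # Survival function of a chi-square variable with 1 df
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Two SNPs that always occur together: one tags the other perfectly
    print(ld_r2(50, 0, 0, 50))
    # Illustrative allele counts in cases and controls (invented numbers)
    print(allelic_chi2(60, 40, 40, 60))
```

In a real GWAS this test would be repeated for several hundred thousand SNPs, so the resulting p‐values would need a stringent genome‐wide significance threshold and adjustment for population stratification.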

Keywords: genome‐wide association; single nucleotide polymorphisms; International HapMap Project; linkage disequilibrium; complex diseases


References

de Bakker PI, Burtt NP, Graham RR et al. (2006) Transferability of tag SNPs in genetic association studies in multiple populations. Nature Genetics 38: 1298–1303.

Blauw HM, Veldink JH, van Es MA et al. (2008) Copy‐number variation in sporadic amyotrophic lateral sclerosis: a genome‐wide screen. Lancet Neurology 7: 319–326.

Bottini N, Musumeci L, Alonso A et al. (2004) A functional variant of lymphoid tyrosine phosphatase is associated with type I diabetes. Nature Genetics 36: 337–338.

Cohen JC, Kiss RS, Pertsemlidis A et al. (2004) Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science 305: 869–872.

Duerr RH, Taylor KD, Brant SR et al. (2006) A genome‐wide association study identifies IL23R as an inflammatory bowel disease gene. Science 314: 1461–1463.

Easton DF, Pooley KA, Dunning AM et al. (2007) Genome‐wide association study identifies novel breast cancer susceptibility loci. Nature 447: 1087–1093.

Fanciulli M, Norsworthy PJ, Petretto E et al. (2007) FCGR3B copy number variation is associated with susceptibility to systemic, but not organ‐specific, autoimmunity. Nature Genetics 39: 721–723.

Frayling TM (2007) Genome‐wide association studies provide new insights into type 2 diabetes aetiology. Nature Reviews Genetics 8: 657–662.

Grant SF, Thorleifsson G, Reynisdottir I et al. (2006) Variant of transcription factor 7‐like 2 (TCF7L2) gene confers risk of type 2 diabetes. Nature Genetics 38: 320–323.

Hampe J, Franke A, Rosenstiel P et al. (2007) A genome‐wide association scan of nonsynonymous SNPs identifies a susceptibility variant for Crohn disease in ATG16L1. Nature Genetics 39: 207–211.

Hayden EC (2008) International genome project launched. Nature 451: 378–379.

Hertel JK, Johansson S, Raeder H et al. (2008) Genetic analysis of recently identified type 2 diabetes loci in 1638 unselected patients with type 2 diabetes and 1858 control participants from a Norwegian population‐based cohort (the HUNT study). Diabetologia 51: 971–977.

Hirschhorn JN and Daly MJ (2005) Genome‐wide association studies for common diseases and complex traits. Nature Reviews Genetics 6: 95–108.

Hirschhorn JN, Lohmueller K, Byrne E et al. (2002) A comprehensive review of genetic association studies. Genetics in Medicine 4: 45–61.

Hollox EJ, Huffmeier U, Zeeuwen PL et al. (2008) Psoriasis is associated with increased beta‐defensin genomic copy number. Nature Genetics 40: 23–25.

Hugot JP, Chamaillard M, Zouali H et al. (2001) Association of NOD2 leucine‐rich repeat variants with susceptibility to Crohn's disease. Nature 411: 599–603.

International HapMap Consortium (2005) A haplotype map of the human genome. Nature 437: 1299–1320.

International HapMap Consortium (2007) A second generation human haplotype map of over 3.1 million SNPs. Nature 449: 851–861.

International Human Genome Sequencing Consortium (2004) Finishing the euchromatic sequence of the human genome. Nature 431: 931–945.

International SNP Map Working Group (2001) A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409: 928–933.

Jorgenson E and Witte JS (2006) A gene‐centric approach to genome‐wide association studies. Nature Reviews Genetics 7: 885–891.

Kennedy GC, Matsuzaki H, Dong S et al. (2003) Large‐scale genotyping of complex DNA. Nature Biotechnology 21: 1233–1237.

Kidd JM, Cooper GM, Donahue WF et al. (2008) Mapping and sequencing of structural variation from eight human genomes. Nature 453: 56–64.

Klein RJ, Zeiss C, Chew EY et al. (2005) Complement factor H polymorphism in age‐related macular degeneration. Science 308: 385–389.

Libioulle C, Louis E, Hansoul S et al. (2007) Novel Crohn disease locus identified by genome‐wide association maps to a gene desert on 5p13.1 and modulates expression of PTGER4. PLoS Genetics 3: e58.

Marchini J, Cardon LR, Phillips MS et al. (2004) The effects of human population structure on large genetic association studies. Nature Genetics 36: 512–517.

Matarin M, Simon‐Sanchez J, Fung HC et al. (2008) Structural genomic variation in ischemic stroke. Neurogenetics 9: 101–108.

Mathew CG (2008) New links to the pathogenesis of Crohn disease provided by genome‐wide association scans. Nature Reviews Genetics 9: 9–14.

NCI‐NHGRI Working Group on Replication in Association Studies (2007) Replicating genotype‐phenotype associations. Nature 447: 655–660.

Ng MC, Park KS, Oh B et al. (2008) Implication of Genetic Variants near TCF7L2, SLC30A8, HHEX, CDKAL1, CDKN2A/B, IGF2BP2 and FTO in Type 2 Diabetes and Obesity in 6719 Asians. Diabetes 57: 2226–2233.

Parkes M, Barrett JC, Prescott NJ et al. (2007) Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility. Nature Genetics 39: 830–832.

Plenge RM, Seielstad M, Padyukov L et al. (2007) TRAF1‐C5 as a risk locus for rheumatoid arthritis – a genome‐wide study. New England Journal of Medicine 357: 1199–1209.

Price AL, Patterson NJ, Plenge RM et al. (2006) Principal components analysis corrects for stratification in genome‐wide association studies. Nature Genetics 38: 904–909.

Ragoussis J, Elvidge GP, Kaur K et al. (2006) Matrix‐assisted laser desorption/ionisation, time‐of‐flight mass spectrometry in genomics research. PLoS Genetics 2: e100.

Redon R, Ishikawa S, Fitch KR et al. (2006) Global variation in copy number in the human genome. Nature 444: 444–454.

Risch N and Merikangas K (1996) The future of genetic studies of complex human diseases. Science 273: 1516–1517.

Romeo S, Pennacchio LA, Fu Y et al. (2007) Population‐based re‐sequencing of ANGPTL4 uncovers variations that reduce triglycerides and increase HDL. Nature Genetics 39: 513–516.

Shen F, Huang J, Fitch KR et al. (2008) Improved detection of global copy number variation using high density, non‐polymorphic oligonucleotide probes. BMC Genetics 9: 27.

Skol AD, Scott LJ, Abecasis GR et al. (2006) Joint analysis is more efficient than replication‐based analysis for two‐stage genome‐wide association studies. Nature Genetics 38: 209–213.

Steemers FJ and Gunderson KL (2007) Whole genome genotyping technologies on the BeadArray platform. Biotechnology Journal 2: 41–49.

The ENCODE Project Consortium (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447: 799–816.

The GAIN Collaborative Research Group (2007) New models of collaboration in genome‐wide association studies: the Genetic Association Information Network. Nature Genetics 39: 1045–1051.

Todd JA, Walker NM, Cooper JD et al. (2007) Robust associations of four new chromosome regions from genome‐wide analyses of type 1 diabetes. Nature Genetics 39: 857–864.

Walsh T, McClellan JM, McCarthy SE et al. (2008) Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 320: 539–543.

Wang WY, Barratt BJ, Clayton DG et al. (2005) Genome‐wide association studies: theoretical and practical concerns. Nature Reviews Genetics 6: 109–118.

Wellcome Trust Case Control Consortium (2007a) Genome‐wide association study of 14 000 cases of seven common diseases and 3000 shared controls. Nature 447: 661–678.

Wellcome Trust Case Control Consortium (2007b) Association scan of 14 500 nonsynonymous SNPs in four diseases identifies autoimmunity variants. Nature Genetics 39: 1329–1337.

Xing J, Witherspoon DJ, Watkins WS et al. (2008) HapMap tagSNP transferability in multiple populations: general guidelines. Genomics 92: 41–51.

Xu B, Roos JL, Levy S et al. (2008) Strong association of de novo copy number mutations with sporadic schizophrenia. Nature Genetics 40: 880–885.

Further Reading

Grant SF and Hakonarson H (2008) Microarray technology and applications in the arena of genome‐wide association. Clinical Chemistry 54: 1116–1124.

Li M, Li C and Guan W (2008) Evaluation of coverage variation of SNP chips for genome‐wide association studies. European Journal of Human Genetics 16: 635–643.

Manolio TA, Brooks LD and Collins FS (2008) A HapMap harvest of insights into the genetics of common disease. The Journal of Clinical Investigation 118: 1590–1605.

McCarthy MI, Abecasis GR, Cardon LR et al. (2008) Genome‐wide association studies for complex traits: consensus, uncertainty and challenges. Nature Reviews Genetics 9: 356–369.

Neale BM and Purcell S (2008) The positives, protocols, and perils of genome‐wide association. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics 147B: 1288–1294.

Teo YY (2008) Common statistical issues in genome‐wide association studies: a review on power, data quality control, genotype calling and population structure. Current Opinion in Lipidology 19: 133–143.


How to Cite
Ku, Chee‐Seng, Pawitan, Yudi, and Chia, Kee‐Seng (Mar 2009) Genome‐wide Association Studies. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0021458]