Haplotype Sharing Methods


A convenient way to incorporate haplotypes into the statistical analysis of complex diseases is to use haplotype sharing measures. These statistical methods summarise evolutionary events such as mutation, recombination and coalescence into simple scores to improve the power of association tests. Existing methods provide flexible tools for various study designs, such as pedigree data and case‐control data, in candidate gene analysis and in genome‐wide association analysis. Although haplotype sharing methods have proved powerful in detecting disease mutations in isolated populations, their applicability to complex diseases in the general population deserves further investigation, as does their potential for extension to a variety of genomic variants, such as copy number variation and uncommon sequence mutations.

Key Concepts:

  • Statistical methods for haplotype sharing analysis, developed for candidate gene association analysis and genome‐wide studies, incorporate information from local haplotypes to improve the power to identify disease susceptibility variants.

  • Haplotype sharing analysis of complex disease relies on population genetic assumptions and incorporates mutation and recombination events in a convenient way.

  • Statistical methods based on haplotype sharing are available for various study designs, types of trait variable and genetic and nongenetic data.

  • Haplotype sharing analysis extends the identical by descent concept, successfully applied in linkage analysis, to population‐based association studies.

  • Reducing a potentially large number of haplotypes to simple similarity scores may reduce degrees of freedom for hypothesis testing, and thus may improve the power.

  • Haplotypes with low frequencies can easily be considered.

  • Haplotype sharing analysis may be more powerful than conventional methods in detecting rare variants.

Keywords: haplotype similarity; haplotype cluster; haplotype scores; haplotype association; nonparametric linkage; isolated populations

Figure 1.

Coalescence tree of haplotypes with respect to a disease locus (DL). MRCA, most recent common ancestor; ‘t’, times between the coalescent events and the observed haplotypes; T1, time to the MRCA of the haplotypes that do not carry the disease mutation; T2, time to the MRCA of the haplotypes that carry the disease mutation.

Figure 2.

Presentation of the different measures of haplotype sharing for two pairs of haplotypes. Red arrow indicates the disease locus between SNP4 and SNP5. Shared regions and markers IBS are marked blue.
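
To make the idea of a sharing measure concrete, the following is a minimal sketch of two simple scores for a pair of haplotypes: the number of consecutive markers identical by state (IBS) around a putative disease locus, and the total number of IBS markers. The function names, the 0/1 allele coding and the example haplotypes are illustrative assumptions, not taken from the figure or from any published implementation.

```python
from typing import Sequence


def shared_length(h1: Sequence[int], h2: Sequence[int], locus: int) -> int:
    """Count consecutive IBS markers on both sides of a putative disease locus.

    `locus` is a 0-based gap index (an assumption of this sketch): the locus
    lies between marker `locus - 1` and marker `locus`.
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same number of markers")

    # Walk left from the locus until the first mismatch.
    left = 0
    i = locus - 1
    while i >= 0 and h1[i] == h2[i]:
        left += 1
        i -= 1

    # Walk right from the locus until the first mismatch.
    right = 0
    j = locus
    while j < len(h1) and h1[j] == h2[j]:
        right += 1
        j += 1

    return left + right


def ibs_count(h1: Sequence[int], h2: Sequence[int]) -> int:
    """Count all markers at which the two haplotypes carry the same allele."""
    return sum(a == b for a, b in zip(h1, h2))


# Hypothetical haplotypes over eight SNPs; the locus sits between SNP4 and SNP5.
hap_a = [0, 1, 1, 0, 1, 0, 0, 1]
hap_b = [1, 1, 1, 0, 1, 0, 1, 1]
print(shared_length(hap_a, hap_b, locus=4))  # 5 (three markers left, two right)
print(ibs_count(hap_a, hap_b))               # 6
```

Counting markers keeps the sketch short; a physical shared length (for example in kilobases) would instead sum the distances between the flanking mismatches, which requires marker positions.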

Figure 3.

(a) Mean shared length in kilobases for a sample of 500 cases and controls, for the three types of comparison: (i) case haplotypes versus case haplotypes (red), (ii) control haplotypes versus control haplotypes and (iii) the discordant comparison of case haplotypes versus control haplotypes. Grey areas denote the identified blocks. The vertical line corresponds to the disease locus. (b) Range of sharing across all haplotype comparisons. For each marker, the mean shared length (red) and the maximum range (black) are presented. The vertical line corresponds to the disease locus.

Figure 4.

−Log10 of the p‐values for the sample of 500 cases and controls. The horizontal line corresponds to the p‐value p = 0.05; the vertical line corresponds to the disease locus. The test statistic is the Mantel statistic based on haplotype sharing by Beckmann et al.
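
As a rough illustration of how a Mantel-type test can turn pairwise sharing into a p-value, the sketch below multiplies a sharing score by a trait similarity for every pair of haplotypes and assesses the sum by permuting trait labels. The mean-centred trait product, the function names and the permutation scheme are assumptions of this sketch; the exact statistic used in the cited work may differ.

```python
import random
from itertools import combinations
from typing import List, Sequence, Tuple


def mantel_statistic(sharing: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Sum over haplotype pairs of sharing score times trait similarity.

    Trait similarity is taken as the product of mean-centred trait values,
    which is an assumption of this sketch.
    """
    mean = sum(y) / len(y)
    return sum(
        sharing[i][j] * (y[i] - mean) * (y[j] - mean)
        for i, j in combinations(range(len(y)), 2)
    )


def mantel_permutation_test(
    sharing: Sequence[Sequence[float]],
    y: Sequence[float],
    n_perm: int = 999,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return the observed statistic and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = mantel_statistic(sharing, y)
    y_perm: List[float] = list(y)
    hits = 0
    for _ in range(n_perm):
        # Shuffling trait labels breaks any link between trait and sharing.
        rng.shuffle(y_perm)
        if mantel_statistic(sharing, y_perm) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Applied marker by marker, with `sharing` recomputed around each marker, a test of this kind would yield one p-value per position, which is the sort of profile plotted on a −log10 scale.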

Figure 5.

Dendrogram of the 18 haplotypes with frequencies >1% from the study of Heid et al. Red arrows indicate the haplotypes carrying the disease allele as well as the corresponding position between SNP4 and SNP5. The red rectangle indicates a possible cluster that includes the risk haplotypes. The black rectangles denote clusters of haplotypes that do not carry the disease allele.



