Non‐B DNA Structure and Mutations Causing Human Genetic Disease


In addition to the canonical right‐handed double helix, several noncanonical deoxyribonucleic acid (DNA) secondary structures have been characterised, including quadruplexes, triplexes, slipped/hairpins, Z‐DNA and cruciforms. The formation of these structures is mediated by repetitive sequence motifs, such as G‐rich sequences, purine/pyrimidine tracts, direct (tandem) repeats, alternating purine–pyrimidines and inverted repeats, respectively. Such repeats are abundant in the human genome and are found in association with specific classes of genes, supporting a role for them in gene regulation or protein function. Repetitive sequence motifs are also commonly found at sites of chromosomal alteration, including gross rearrangements and copy number variations (CNVs) associated with both disease and phenotypic variation. Finally, variable number tandem repeats (VNTRs) or microsatellites are present in many gene regulatory regions. Characterised by an inherent capacity to expand spontaneously, such sequences are not only known to cause >30 neurological diseases but may also contribute to human disease susceptibility. The formation of alternative non‐B DNA structures is believed to promote genomic alterations by impeding efficient DNA replication and repair.
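The pairing of sequence motifs with structures described above can be illustrated with a simple pattern scan. The following is a minimal sketch, not a method taken from this article: the function name `find_motifs`, the dictionary `MOTIF_PATTERNS` and every length threshold are assumptions chosen for illustration. It covers only the three motif classes that reduce to a simple regular expression (G‐rich quadruplex motifs, purine/pyrimidine tracts and alternating purine–pyrimidines); direct and inverted repeats need a comparison between two parts of the sequence and are not handled here.

```python
import re

# Illustrative thresholds only; none of these cutoffs come from the article.
MOTIF_PATTERNS = {
    # Four runs of >=3 guanines separated by loops of 1-7 nt
    # (a widely used heuristic for quadruplex-forming potential).
    "quadruplex": re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}"),
    # Homopurine or homopyrimidine tract of >=12 nt (triplex candidates).
    "triplex": re.compile(r"[AG]{12,}|[CT]{12,}"),
    # >=6 alternating purine-pyrimidine dinucleotides (Z-DNA candidates).
    "z_dna": re.compile(r"(?:[AG][CT]){6,}|(?:[CT][AG]){6,}"),
}


def find_motifs(seq):
    """Return {motif_class: [(start, end, matched_text), ...]} for one strand.

    Coordinates are 0-based and end-exclusive. Only the given strand is
    scanned; a C-rich strand would need its reverse complement scanned too.
    """
    seq = seq.upper()
    return {
        name: [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq)]
        for name, pattern in MOTIF_PATTERNS.items()
    }
```

Calling `find_motifs` on a sequence returns, for each class, the list of candidate tracts. A match indicates only that the sequence has the potential to adopt the structure; whether it does so in vivo depends on factors such as supercoiling, ionic conditions and protein binding that a sequence scan cannot capture.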

Key Concepts:

  • DNA is polymorphic in structure as well as in sequence; in addition to the canonical right‐handed double helix (B‐DNA), repetitive sequences can adopt alternative (non‐B DNA) conformations such as quadruplexes, triplexes, slipped/hairpins, Z‐DNA and cruciforms.

  • Repetitive DNA sequences are found at locations within many human genes that suggest they can either affect transcription or encode homopolymeric amino acid runs that could be important for protein–protein or protein–DNA/RNA interactions.

  • The integrity of the Y‐chromosome depends on large inverted repeats, which have the capacity to form cruciform structures that may potentiate intrachromosomal recombination.

  • Copy number variation (CNV) is a form of genetic alteration that, by involving thousands of loci in the genome, contributes to human individuality.

  • Repetitive sequences capable of forming non‐B DNA are found at sites of chromosomal breaks, CNVs and other rearrangements such as translocations and gene conversion events, which can contribute to human genetic disease.

  • The recurrent t(11;22) translocation events associated with Emanuel syndrome are mediated by cruciform structures that form at inverted repeats.

  • Tandem repeats (microsatellites) may expand within gene sequences, contributing to more than 30 neurological diseases. Because their copy number within genes varies among individuals in the population, they may also contribute to human disease susceptibility.

  • Experiments in model systems and bioinformatic analyses support the conclusion that repetitive sequences trigger genomic instability by adopting non‐B DNA conformations.

  • Non‐B DNA structures stimulate mutations via mechanisms that alter DNA synthesis and repair.
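Two of the sequence features named in the key concepts, tandem repeat copy number and inverted repeats, can be checked with a few lines of code. This is a hedged sketch under stated assumptions: the helper names `reverse_complement`, `longest_tandem_run` and `is_perfect_inverted_repeat` are hypothetical, and the palindrome test covers only perfect, spacer‐free inverted repeats, which is a simplification of the sequences found in the genome.

```python
import re

# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (other symbols kept)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_tandem_run(seq, unit):
    """Largest number of uninterrupted consecutive copies of `unit` in `seq`."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    seq, unit = seq.upper(), unit.upper()
    # The lookahead lets a run start at every position, so runs in
    # overlapping frames are not missed.
    pattern = re.compile(f"(?=((?:{re.escape(unit)})+))")
    return max(
        (len(m.group(1)) // len(unit) for m in pattern.finditer(seq)),
        default=0,
    )


def is_perfect_inverted_repeat(seq):
    """True if seq equals its own reverse complement (no spacer, no mismatch)."""
    return seq.upper() == reverse_complement(seq)
```

For example, `longest_tandem_run(allele, "CAG")` gives the length, in repeat units, of the longest pure CAG tract in a hypothetical `allele` string, which is the quantity that changes when a triplet repeat expands. Interrupted repeats and inverted repeats with a central spacer would need a more tolerant comparison than the exact matching used here.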

Keywords: non‐B DNA; microsatellites; copy number variation (CNV); triplet repeat diseases; polyglutamine expansion; translocations; DNA repair; DNA replication; double strand breaks (DSB); gene expression regulation

Figure 1.

Non‐B DNA structures formed by genomic repetitive sequences. (a) Most common non‐B DNA conformations, ribbon models of helical foldings, repetitive motifs requirement and example of sequences. Center dot, Watson–Crick hydrogen bond interactions; x,y, nucleotides in the spacer between repeats; L, lateral loop; D, diagonal loop and CR, chain reversal loop. For cruciform DNA, an extended conformation is shown. For triplex DNA, a 3′ RRY isomer is depicted in which the 3′‐half of the purine‐rich strand folds back to form the Hoogsteen‐bound third strand. For quadruplex DNA, an idealised structure is drawn to highlight the loop characteristics and the relative orientation of the syn and anti N‐glycosidic configurations. (b) RRY base triplets showing the Hoogsteen‐bound base (left). Thymine can be incorporated into RRY triplexes due to the symmetry of the carbonyl groups. (c) YRY base triplets showing the Hoogsteen‐bound pyrimidine and the stabilisation afforded by cytosine protonation. (d) G‐tetrad.

Figure 2.

Non‐B DNA‐forming repeats and genome‐wide gene expression. (a) The gene expression profile of 13 237 nonredundant annotated human (RefSeq) genes (y‐axis) was examined in 79 tissues/cancer/cell types (x‐axis) and the average values plotted for the 8124 genes that contained quadruplex‐forming repeats (filled squares) within ±500 base pairs of the main transcription start site (TSS) and the remaining genes that did not contain such elements within ±500 base pairs of the main TSS (open squares). (b) The gene expression levels for 16 146 gene probes in the 70 human tissues/cell types included in (a) were plotted as percent of data (y‐axis) falling within specific intervals of gene expression (x‐axis) for control genes (black) and for 190 genes (red) that contained triplex‐forming tetranucleotide repeats ≥72 base pairs long. With kind permission from Springer Science+Business Media (Zhao et al., 2010).

Figure 3.

Cruciform‐mediated chromosomal t(11;22) translocation and the Emanuel syndrome. The PATRR sequences on human chromosome 11 (green) and 22 (black) are proposed to fold into large cruciform structures at some frequency during gametogenesis and be cleaved at the single‐stranded tips, resulting in double‐strand breaks (left insets). The broken chromosomal ends (middle) join aberrantly, yielding the derivative chromosomes der(11) and der(22) (right). Occasional inheritance of der(22), in addition to a normal karyotype, is responsible for the Emanuel syndrome in the offspring.

Figure 4.

Triplet repeat expansion alters mRNA function. (a) In DM1, CTG expansion in the 3′‐UTR of the DMPK gene causes the ensuing mRNA to fold into a large and stable double‐stranded hairpin stabilised by U•U and G•C base pairs, which recruits muscleblind‐like (Drosophila) (MBNL1), a mediator of pre‐mRNA alternative splicing regulation. CUG‐hairpins also stimulate CUG RNA‐binding protein 1 (CUGBP1) hyperphosphorylation and stabilisation, which alter several events related to alternative splicing, mRNA translation and mRNA decay. (b) Sequestration of MBNL1 and activation of CUGBP1 shift alternative splicing programs from the adult stage towards embryonic‐specific patterns, including activation of exon 5 inclusion of cardiac isoforms of TNNT2 (cTNT) during heart remodelling, exclusion of exon 11 in the insulin receptor (IR) pre‐mRNA and inclusion of a stop‐containing exon in chloride channel 1 transcripts. Adapted from Lee and Cooper (2009) with kind permission of Portland Press Ltd. Copyright © the Biochemical Society.

Figure 5.

Modulators of triplet repeat expansion in mouse models of MRDs. The mismatch repair proteins Msh2 and Msh3 are required for triplet repeat instability throughout all developmental stages. DNA (cytosine‐5)‐methyltransferase 1 (Dnmt1) protects against expansions in an expanded CAG model in the germline. Ataxia telangiectasia and Rad3 related (Atr) protein prevents expansion in the female germline and in somatic tissues of a CGG repeat mouse model. Post‐meiotic segregation increased 2 (Pms2) and 8‐oxoguanine glycosylase (Ogg1) are selectively involved in age‐dependent somatic instability. Reproduced from Dion and Wilson (2009) with permission from Elsevier.

References



Bacolla A, Collins JR, Gold B et al. (2006) Long homopurine·homopyrimidine sequences are characteristic of genes expressed in brain and the pseudoautosomal region. Nucleic Acids Research 34: 2663–2675.

Bacolla A, Jaworski A, Larson JE et al. (2004) Breakpoints of gross deletions coincide with non‐B DNA conformations. Proceedings of the National Academy of Sciences of the USA 101: 14162–14167.

Bacolla A, Larson JE and Collins JR (2008) Abundance and length of simple repeats in vertebrate genomes are determined by their structural properties. Genome Research 18: 1545–1553.

Bacolla A and Wells RD (2004) Non‐B DNA conformations, genomic rearrangements, and human disease. Journal of Biological Chemistry 279: 47411–47414.

Bena F, Gimelli S, Migliavacca E et al. (2010) A recurrent 14q32.2 microdeletion mediated by expanded TGG repeats. Human Molecular Genetics 19: 1967–1973.

Bonaglia MC, Giorda R, Massagli A et al. (2009) A familial inverted duplication/deletion of 2p25.1‐25.3 provides new clues on the genesis of inverted duplications. European Journal of Human Genetics 17: 179–186.

Brouwer JR, Willemsen R and Oostra BA (2009) Microsatellite repeat instability and neurological disease. BioEssays 31: 71–83.

Burrows CJ and Muller JG (1998) Oxidative nucleobase modifications leading to strand scission. Chemical Reviews 98: 1109–1152.

Christophe D, Cabrer B, Bacolla A et al. (1985) An unusually long poly(purine)‐poly(pyrimidine) sequence is located upstream from the human thyroglobulin gene. Nucleic Acids Research 13: 5127–5144.

Chuzhanova N, Chen JM, Bacolla A et al. (2009) Gene conversion causing human inherited disease: evidence for involvement of non‐B‐DNA‐forming sequences and recombination‐promoting motifs in DNA breakage and repair. Human Mutation 30: 1189–1198.

Conrad DF, Pinto D, Redon R et al. (2010) Origins and functional impact of copy number variation in the human genome. Nature 464: 704–712.

Dion V and Wilson JH (2009) Instability and chromatin structure of expanded trinucleotide repeats. Trends in Genetics 25: 288–297.

Du Z, Zhao Y and Li N (2008) Genome‐wide analysis reveals regulatory role of G4 DNA in gene transcription. Genome Research 18: 233–241.

Felsenfeld G and Rich A (1957) Studies on the formation of two‐ and three‐stranded polyribonucleotides. Biochimica et Biophysica Acta 26: 457–468.

Fernando H, Sewitz S, Darot J et al. (2009) Genome‐wide analysis of a G‐quadruplex‐specific single‐chain antibody that regulates gene expression. Nucleic Acids Research 37: 6716–6722.

Gessner RV, Frederick CA, Quigley GJ, Rich A and Wang AH (1989) The molecular structure of the left‐handed Z‐DNA double helix at 1.0‐Å atomic resolution. Geometry, conformation, and ionic interactions of d(CGCGCG). Journal of Biological Chemistry 264: 7921–7935.

Glickman BW and Ripley LS (1984) Structural intermediates of deletion mutagenesis: a role for palindromic DNA. Proceedings of the National Academy of Sciences of the USA 81: 512–516.

Ha SC, Lowenhaupt K, Rich A, Kim YG and Kim KK (2005) Crystal structure of a junction between B‐DNA and Z‐DNA reveals two extruded bases. Nature 437: 1183–1186.

Hannan AJ (2010) Tandem repeat polymorphisms: modulators of disease susceptibility and candidates for ‘missing heritability’. Trends in Genetics 26: 59–65.

Huppert JL, Bugaut A, Kumari S and Balasubramanian S (2008) G‐quadruplexes: the beginning and end of UTRs. Nucleic Acids Research 36: 6260–6268.

Jain A, Wang G and Vasquez KM (2008) DNA triple helices: biological consequences and therapeutic potential. Biochimie 90: 1117–1130.

Kha DT, Wang G, Natrajan N, Harrison L and Vasquez KM (2010) Pathways for double‐strand break repair in genetically unstable Z‐DNA‐forming sequences. Journal of Molecular Biology 398: 471–480.

Kornreich R, Bishop DF and Desnick RJ (1990) Alpha‐galactosidase A gene rearrangements causing Fabry disease. Identification of short direct repeats at breakpoints in an Alu‐rich gene. Journal of Biological Chemistry 265: 9319–9326.

Kurahashi H and Emanuel BS (2001) Long AT‐rich palindromes and the constitutional t(11;22) breakpoint. Human Molecular Genetics 10: 2605–2617.

Kurahashi H, Inagaki H, Ohye T et al. (2010) The constitutional t(11;22): implications for a novel mechanism responsible for gross chromosomal rearrangements. Clinical Genetics. [Epub ahead of print; doi: 10.1111/j.1399‐0004.2010.01445.x].

Lander ES, Linton LM, Birren B et al. (2001) Initial sequencing and analysis of the human genome. Nature 409: 860–921.

Lange J, Skaletsky H, van Daalen SK et al. (2009) Isodicentric Y chromosomes and sex disorders as byproducts of homologous recombination that maintains palindromes. Cell 138: 855–869.

Lee JE and Cooper TA (2009) Pathogenic mechanisms of myotonic dystrophy. Biochemical Society Transactions 37: 1281–1286.

Liu J, Perumal NB, Oldfield CJ et al. (2006) Intrinsic disorder in transcription factors. Biochemistry 45: 6873–6888.

Lopez Castel A, Cleary JD and Pearson CE (2010) Repeat instability as the basis for human diseases and as a potential target for therapy. Nature Reviews. Molecular Cell Biology 11: 165–170.

Lyamichev VI, Mirkin SM and Frank‐Kamenetskii MD (1985) A pH‐dependent structural transition in the homopurine‐homopyrimidine tract in superhelical DNA. Journal of Biomolecular Structure and Dynamics 3: 327–338.

Messaed C and Rouleau GA (2009) Molecular mechanisms underlying polyalanine diseases. Neurobiology of Disease 34: 397–405.

Mirkin SM (2007) Expandable DNA repeats and human disease. Nature 447: 932–940.

Neidle S (2009) The structures of quadruplex nucleic acids and their drug complexes. Current Opinion in Structural Biology 19: 239–250.

Perry GH, Ben‐Dor A, Tsalenko A et al. (2008) The fine‐scale and complex architecture of human copy‐number variation. American Journal of Human Genetics 82: 685–695.

Quemener S, Chen JM, Chuzhanova N et al. (2010) Complete ascertainment of intragenic copy number mutations (CNMs) in the CFTR gene and its implications for CNM formation at other autosomal loci. Human Mutation 31: 421–428.

Quental R, Azevedo L, Rubio V, Diogo L and Amorim A (2009) Molecular mechanisms underlying large genomic deletions in ornithine transcarbamylase (OTC) gene. Clinical Genetics 75: 457–464.

Repping S, Skaletsky H, Lange J et al. (2002) Recombination between palindromes P5 and P1 on the human Y chromosome causes massive deletions and spermatogenic failure. American Journal of Human Genetics 71: 906–922.

Rooms L, Reyniers E and Kooy RF (2007) Diverse chromosome breakage mechanisms underlie subtelomeric rearrangements, a common cause of mental retardation. Human Mutation 28: 177–182.

Rozen S, Skaletsky H, Marszalek JD et al. (2003) Abundant gene conversion between arms of palindromes in human and ape Y chromosomes. Nature 423: 873–876.

Skaletsky H, Kuroda‐Kawaguchi T, Minx PJ et al. (2003) The male‐specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423: 825–837.

Uversky VN, Oldfield CJ, Midic U et al. (2009) Unfoldomics of human diseases: linking protein intrinsic disorder with diseases. BMC Genomics 10(suppl. 1): S7.

Van Raay TJ, Burn TC, Connors TD et al. (1996) A 2.5 kb polypyrimidine tract in the PKD1 gene contains at least 23 H‐DNA‐forming sequences. Microbial and Comparative Genomics 1: 317–327.

Vissers LE, Bhatt SS, Janssen IM et al. (2009) Rare pathogenic microdeletions and tandem duplications are microhomology‐mediated and stimulated by local genomic architecture. Human Molecular Genetics 18: 3579–3593.

Wang G and Vasquez KM (2004) Naturally occurring H‐DNA‐forming sequences are mutagenic in mammalian cells. Proceedings of the National Academy of Sciences of the USA 101: 13448–13453.

Warburton PE, Giordano J, Cheung F, Gelfand Y and Benson G (2004) Inverted repeat structure of the human genome: the X‐chromosome contains a preponderance of large, highly homologous inverted repeats that contain testes genes. Genome Research 14: 1861–1869.

Watson JD and Crick FH (1953) Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171: 737–738.

Wells RD and Ashizawa T (2006) Genetic Instabilities and Neurological Diseases, 2nd edn. San Diego, CA: Elsevier/Academic Press.

Wells RD, Collier DA, Hanvey JC, Shimizu M and Wohlrab F (1988) The chemistry and biology of unusual DNA structures adopted by oligopurine.oligopyrimidine sequences. FASEB Journal 2: 2939–2949.

Zhang F, Seeman P, Liu P et al. (2010) Mechanisms for nonrecurrent genomic rearrangements associated with CMT1A or HNPP: rare CNVs as a cause for missing heritability. American Journal of Human Genetics 86: 892–903.

Zhao J, Bacolla A, Wang G and Vasquez KM (2010) Non‐B DNA structure‐induced genetic instability and evolution. Cellular and Molecular Life Sciences 67: 43–62.

Zheng M, Huang X, Smith GK, Yang X and Gao X (1996) Genetically unstable CXG repeats are structurally dynamic and have a high propensity for folding. An NMR and UV spectroscopic study. Journal of Molecular Biology 264: 323–336.

Further Reading

Balasubramanian S and Neidle S (2009) G‐quadruplex nucleic acids as therapeutic targets. Current Opinion in Chemical Biology 13: 345–353.

D'Angelo CS, Gajecka M, Kim CA et al. (2009) Further delineation of nonhomologous‐based recombination and evidence for subtelomeric segmental duplications in 1p36 rearrangements. Human Genetics 125: 551–563.

Karlin S, Brocchieri L, Bergman A, Mrazek J and Gentles AJ (2002) Amino acid runs in eukaryotic proteomes and disease associations. Proceedings of the National Academy of Sciences of the USA 99: 333–338.

Orr HT and Zoghbi HY (2007) Trinucleotide repeat disorders. Annual Review of Neuroscience 30: 575–621.

Rich A and Zhang S (2003) Timeline: Z‐DNA: the long road to biological function. Nature Reviews. Genetics 4: 566–572.

Sinden RR (1994) DNA Structure and Function. San Diego: Academic Press.

Stankiewicz P and Lupski JR (2010) Structural variation in the human genome and its role in disease. Annual Review of Medicine 61: 437–455.

Wang G, Christensen LA and Vasquez KM (2006) Z‐DNA‐forming sequences generate large‐scale deletions in mammalian cells. Proceedings of the National Academy of Sciences of the USA 103: 2677–2682.

Wang G and Vasquez KM (2009) Models for chromosomal replication‐independent non‐B DNA structure‐induced genetic instability. Molecular Carcinogenesis 48: 286–298.

How to Cite
Bacolla, Albino, Cooper, David N, and Vasquez, Karen M (Oct 2010) Non‐B DNA Structure and Mutations Causing Human Genetic Disease. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0022657]