Non‐B DNA Structure and Mutations Causing Human Genetic Disease

Abstract

Besides the canonical right‐handed double helix, biologically important noncanonical deoxyribonucleic acid (DNA) secondary structures have been characterised, including quadruplexes, triplexes, slipped/hairpins, Z‐DNA and cruciforms, collectively termed non‐B DNA. Formation of non‐B DNA is mediated by repetitive sequence motifs, such as G‐rich sequences, purine/pyrimidine tracts, direct (tandem) repeats, alternating purine–pyrimidines and inverted repeats, respectively. Such repeats are abundant in the human genome and non‐B DNA occurs at specific genomic locations, supporting a role in gene regulation, RNA translation and protein function. Repetitive motifs are also found at sites of chromosomal alterations associated with both human genetic disease and cancer. Characterised by an inherent capacity to expand spontaneously, such sequences not only cause >30 neurological diseases but may also contribute to disease susceptibility. The formation of non‐B DNA structures is believed to promote genomic alterations by impeding efficient and error‐free DNA replication, transcription and repair.

Key Concepts

  • DNA is polymorphic in structure as well as in sequence; besides the canonical right‐handed double helix (B‐DNA), repetitive sequences can adopt alternative (non‐B DNA) conformations such as quadruplexes, triplexes, slipped/hairpin structures, Z‐DNA and cruciforms.
  • Repetitive DNA sequences are found at particular locations within many human genes, suggesting that they can affect transcription and encode homopolymeric amino acid runs that are important for protein–protein and protein–DNA/RNA interactions.
  • G4 and Z‐DNA structures have been detected in cells using structure‐specific antibodies, mostly at actively transcribed genes and, in the case of G4, at telomeres.
  • Copy number variation (CNV) is a form of genetic alteration that, by involving thousands of loci in the genome, contributes to human individuality.
  • Repetitive sequences capable of forming non‐B DNA are found at sites of chromosomal breaks, CNVs and other rearrangements such as translocations, deletions and gene conversion events, which can contribute to human genetic disease and cancer.
  • The recurrent t(11;22) translocation associated with Emanuel syndrome is mediated by cruciform structures that form at inverted repeats.
  • Tandem repeats (microsatellites) may expand within gene sequences, contributing to more than 30 neurological diseases; because their copy number within genes varies across the population, they may also contribute to disease susceptibility.
  • An increasing number of enzymes are being reported that resolve non‐B DNA and RNA structures, and whose mutations lead to genomic instability and human disease; however, in some cases, the recognition of non‐B DNA structures is the cause of genetic instability.
  • Long noncoding RNAs (lncRNAs) can repress gene expression by forming triplex structures with their target duplex DNA.
  • Non‐B DNA structures stimulate mutations via mechanisms that alter DNA synthesis, transcription and repair.
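The link between repeat type and structure listed above (G‐rich sequences for quadruplexes, inverted repeats for cruciforms) can be illustrated with a minimal Python sketch. The G4 pattern used here (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a widely used search heuristic and is an assumption, not a definition taken from this article; the function names, arm length and spacer limit are likewise hypothetical choices for illustration. A match indicates only sequence potential, not that the structure forms in vivo.

```python
import re

# Assumed heuristic for putative G-quadruplex motifs (not from the article):
# four runs of >=3 G separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_g4_motifs(seq):
    """Return (start, end) spans of non-overlapping putative G4 motifs on one strand."""
    return [m.span() for m in G4_PATTERN.finditer(seq.upper())]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_inverted_repeats(seq, arm=6, max_spacer=4):
    """Return (start, arm, spacer) for each arm that is followed, within
    max_spacer nucleotides, by its own reverse complement (cruciform potential).
    The arm and max_spacer defaults are illustrative assumptions."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        target = reverse_complement(seq[i:i + arm])
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            # A slice that runs off the end is shorter than the arm and never matches.
            if seq[j:j + arm] == target:
                hits.append((i, arm, spacer))
                break
    return hits


# Four copies of the G4C2 hexanucleotide repeat found at the C9orf72 locus.
c9_repeat = "GGGGCC" * 4
print(find_g4_motifs(c9_repeat))
print(find_inverted_repeats("ACGTGATTTCACGT"))
```

Scanning the reverse complement as well would be needed to detect motifs on the opposite strand; the sketch reports hits on the supplied strand only.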

Keywords: non‐B DNA; microsatellites; copy number variation (CNV); triplet repeat diseases; translocations; DNA repair; double strand breaks (DSB); gene expression regulation; cancer genomes; repeat‐associated non‐AUG (RAN) translation

Figure 1. Non‐B DNA (deoxyribonucleic acid) structures formed by genomic repetitive sequences. (a) Most common non‐B DNA conformations, ribbon models of helical folding, repetitive motifs requirement and example of sequences. Centre dot, Watson–Crick hydrogen bond interactions; x,y, nucleotides in the spacer between repeats; L, lateral loop; D, diagonal loop and CR, chain reversal loop. For cruciform DNA, an extended conformation is shown. For triplex DNA, a 3′ RRY isomer is depicted in which the 3′‐half of the purine‐rich strand folds back to form the Hoogsteen‐bound third strand. For quadruplex DNA, an idealised structure is drawn to highlight the loop characteristics and the relative orientation of the syn and anti N‐glycosidic configurations. (b) RRY base triplets showing the Hoogsteen‐bound base (left). Thymine can be incorporated into RRY triplexes because of the symmetry of the carbonyl groups. (c) YRY base triplets showing the Hoogsteen‐bound pyrimidine and the stabilisation afforded by cytosine protonation. (d) G‐tetrad.
Figure 2. Cruciform‐mediated chromosomal t(11;22) translocation and the Emanuel syndrome. The PATRR sequences on human chromosome 11 (green) and 22 (black) are proposed to fold into large cruciform structures at some frequency during gametogenesis and be cleaved at the single‐stranded tips, resulting in double‐strand breaks (left insets). The broken chromosomal ends (middle) join aberrantly, yielding the derivative chromosomes der(11) and der(22) (right). Occasional inheritance of der(22), in addition to a normal karyotype, is responsible for the Emanuel syndrome in the offspring.
Figure 3. Triplet‐repeat expansion alters mRNA (messenger ribonucleic acid) function. (a) In DM1, CTG expansion in the 3′‐UTR (untranslated region) of the DMPK gene causes the ensuing mRNA to fold into a large and stable double‐stranded hairpin stabilised by U · U and G · C base pairs, which recruits muscleblind‐like 1 (MBNL1), a mediator of pre‐mRNA alternative splicing regulation. CUG‐hairpins also stimulate CUG RNA‐binding protein 1 (CUGBP1) hyperphosphorylation and stabilisation, which alter several events related to alternative splicing, mRNA translation and mRNA decay. (b) Sequestration of MBNL1 and CUGBP1 activation shift alternative splicing programs from the adult stage towards embryonic‐specific patterns, including activation of exon 5 inclusion of cardiac isoforms of TNNT2 (cTNT) during heart remodelling, exclusion of exon 11 in the insulin receptor (IR) pre‐mRNA and inclusion of a stop‐containing exon in chloride channel 1 transcripts. Adapted from Lee and Cooper 2009 with kind permission © the Biochemical Society, Portland Press Ltd.
Figure 4. Gain‐of‐function by expanded G4 DNA‐forming repeats at the C9orf72 locus. (a) The C9orf72 gene is transcribed from two alternative transcription start sites (TSSs; Ex 1b and 1a) in three gene isoforms. The G4C2 repeat is located in intron 1 (white boxes, UTRs; grey boxes, coding exons; thin lines, introns) on the nontranscribed strand (thus, it is present on the sense RNA strand) between Ex 1a and 1b. (b) Normal alleles containing two to eight G4‐forming repeats are transcribed normally (top). In expanded alleles (bottom) transcription is reduced. The nontranscribed strand forms an ‘island’ of antiparallel G4 structures. The transcribed strand yields sense RNA with multiple parallel G4 structures. Antisense transcription also takes place through the island, yielding antisense RNAs with potential secondary structure‐forming sequences. (c) The aberrant transcripts sequester RNA‐binding proteins, forming nuclear protein‐RNA foci (left); they also bind nucleolin in the nucleoli, where they prevent biogenesis of new ribosomal ribonucleic acids (rRNA; right). (d) Once transported to the cytoplasm, both sense and antisense transcripts undergo RAN translation in all three possible reading frames, thereby producing dipeptide repeat proteins (DPR) prone to aggregation in the cytoplasm, nucleus and the nucleolus. Reproduced with permission from Simone et al. 2015 © John Wiley and Sons.
Figure 5. Translocation and deletion breakpoints occur near non‐B DNA‐forming sequences in cancer genomes. (a) Schematic of a 1‐kb interval (bin) with the site of rearrangement (breakpoint) at the centre and 500 bps of flanking DNA sequence. The genomic location at each breakpoint identified in cancer patients by high‐throughput whole‐genome DNA sequencing and resolved at bp resolution was first mapped to the human reference genome, and 500 bps on either side of each breakpoint were sought for the occurrence of non‐B DNA‐forming sequences. (b) Number of triplex DNA‐forming repeats. (c) Number of inverted repeats. (d) Number of direct (tandem) repeats. (e) Number of G4 DNA‐forming repeats. (f) Number of Z‐DNA‐forming repeats. Contr1, 20 222 randomly generated sites throughout the human genome; Trans, 19 947 chromosomal translocation breakpoints; Delet, 46 365 deletion breakpoints. In most cases, the number of non‐B DNA‐forming repeats peaked at the breakpoint position, implying their involvement in triggering DNA strand breaks that may have elicited the genomic rearrangements. Reproduced with permission from Bacolla et al. 2016 © Oxford University Press.
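
The windowed counting described in the Figure 5 caption (non‐B DNA‐forming repeats tallied within 500 bp on either side of each breakpoint) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used by Bacolla et al. 2016: the function names are hypothetical, motifs are reduced to single coordinates on one chromosome, and the window is taken as inclusive at both ends.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def motif_profile(breakpoints, motif_positions, flank=500):
    """Pool motif counts by signed offset from each breakpoint.

    breakpoints and motif_positions are integer coordinates on the same
    chromosome (a simplifying assumption). Returns a Counter mapping
    offset (motif position minus breakpoint) to the number of motifs.
    """
    sorted_pos = sorted(motif_positions)
    profile = Counter()
    for bp in breakpoints:
        # Binary search for the motifs that fall inside [bp - flank, bp + flank].
        lo = bisect_left(sorted_pos, bp - flank)
        hi = bisect_right(sorted_pos, bp + flank)
        for pos in sorted_pos[lo:hi]:
            profile[pos - bp] += 1
    return profile


def mean_motifs_per_window(breakpoints, motif_positions, flank=500):
    """Average number of motifs per breakpoint-centred window."""
    profile = motif_profile(breakpoints, motif_positions, flank)
    return sum(profile.values()) / len(breakpoints)


# Toy coordinates, invented purely for illustration.
toy_breakpoints = [1000, 5000]
toy_motifs = [990, 1000, 1400, 1600, 4500, 5500, 5501]
print(sorted(motif_profile(toy_breakpoints, toy_motifs).items()))
print(mean_motifs_per_window(toy_breakpoints, toy_motifs))
```

Running the same functions on randomly drawn control sites gives the baseline against which a peak at offset 0 would be judged, as with the Contr1 set in the figure.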

References

Bacolla A and Wells RD (2004) Non‐B DNA conformations, genomic rearrangements, and human disease. Journal of Biological Chemistry 279: 47411–47414.

Bacolla A, Tainer JA, Vasquez KM and Cooper DN (2016) Translocation and deletion breakpoints in cancer genomes are associated with potential non‐B DNA‐forming sequences. Nucleic Acids Research 44: 5673–5688.

Barthelemy J, Hanenberg H and Leffak M (2016) FANCJ is essential to maintain microsatellite structure genome‐wide during replication stress. Nucleic Acids Research 44: 6803–6816.

Bena F, Gimelli S, Migliavacca E, et al. (2010) A recurrent 14q32.2 microdeletion mediated by expanded TGG repeats. Human Molecular Genetics 19: 1967–1973.

Bonaglia MC, Giorda R, Massagli A, et al. (2009) A familial inverted duplication/deletion of 2p25.1‐25.3 provides new clues on the genesis of inverted duplications. European Journal of Human Genetics 17: 179–186.

Brouwer JR, Willemsen R and Oostra BA (2009) Microsatellite repeat instability and neurological disease. BioEssays 31: 71–83.

Chen X, Shen Y, Zhang F, et al. (2013) Molecular analysis of a deletion hotspot in the NRXN1 region reveals the involvement of short inverted repeats in deletion CNVs. The American Journal of Human Genetics 92: 375–386.

Chuzhanova N, Chen JM, Bacolla A, et al. (2009) Gene conversion causing human inherited disease: evidence for involvement of non‐B‐DNA‐forming sequences and recombination‐promoting motifs in DNA breakage and repair. Human Mutation 30: 1189–1198.

Conrad DF, Pinto D, Redon R, et al. (2010) Origins and functional impact of copy number variation in the human genome. Nature 464: 704–712.

Dong DW, Pereira F, Barrett SP, et al. (2014) Association of G‐quadruplex forming sequences with human mtDNA deletion breakpoints. BMC Genomics 15: 677.

Francois M, Leifert WR, Hecker J, Faunt J and Fenech MF (2016) Guanine‐quadruplexes are increased in mild cognitive impairment and correlate with cognitive function and chromosomal DNA damage. DNA Repair 46: 29–36.

Gray LT, Vallur AC, Eddy J and Maizels N (2014) G quadruplexes are genomewide targets of transcriptional helicases XPB and XPD. Nature Chemical Biology 10: 313–318.

Hannan AJ (2010) Tandem repeat polymorphisms: modulators of disease susceptibility and candidates for ‘missing heritability’. Trends in Genetics 26: 59–65.

Hansel‐Hertsch R, Beraldi D, Lensing SV, et al. (2016) G‐quadruplex structures mark human regulatory chromatin. Nature Genetics 48: 1267–1272.

Husain A, Begum NA, Taniguchi T, et al. (2016) Chromatin remodeler SMARCA4 recruits topoisomerase I and suppresses transcription‐associated genomic instability. Nature Communications 7: 10549.

Ishiguro A, Kimura N, Watanabe Y, Watanabe S and Ishihama A (2016) TDP‐43 binds and transports G‐quadruplex‐containing mRNAs into neurites for local translation. Genes to Cells 21: 466–481.

Jain A, Bacolla A, Del Mundo IM, et al. (2013) DHX9 is involved in preventing genomic instability induced by alternative structured DNA in human cells. Nucleic Acids Research 41: 10345–10357.

Kamat MA, Bacolla A, Cooper DN and Chuzhanova N (2015) A role for non‐B DNA forming sequences in mediating microlesions causing human inherited disease. Human Mutation 37: 65–73.

Kha DT, Wang G, Natrajan N, Harrison L and Vasquez KM (2010) Pathways for double‐strand break repair in genetically unstable Z‐DNA‐forming sequences. Journal of Molecular Biology 398: 471–480.

Kornreich R, Bishop DF and Desnick RJ (1990) Alpha‐galactosidase A gene rearrangements causing Fabry disease. Identification of short direct repeats at breakpoints in an Alu‐rich gene. Journal of Biological Chemistry 265: 9319–9326.

Kurahashi H, Inagaki H, Ohye T, et al. (2010) The constitutional t(11;22): implications for a novel mechanism responsible for gross chromosomal rearrangements. Clinical Genetics 78: 299–309.

Lange J, Skaletsky H, van Daalen SK, et al. (2009) Isodicentric Y chromosomes and sex disorders as byproducts of homologous recombination that maintains palindromes. Cell 138: 855–869.

Lee JE and Cooper TA (2009) Pathogenic mechanisms of myotonic dystrophy. Biochemical Society Transactions 37: 1281–1286.

Liu HY, Zhao Q, Zhang TP, et al. (2016) Conformation selective antibody enables genome profiling and leads to discovery of parallel G‐quadruplex in human telomeres. Cell Chemical Biology 23: 1261–1270.

Lopez Castel A, Cleary JD and Pearson CE (2010) Repeat instability as the basis for human diseases and as a potential target for therapy. Nature Reviews. Molecular Cell Biology 11: 165–170.

Lu S, Wang G, Bacolla A, et al. (2015) Short inverted repeats are hotspots for genetic instability: relevance to cancer genomes. Cell Reports 10: 1674–1680.

Maizels N (2015) G4‐associated human diseases. EMBO Reports 16: 910–922.

Matsuzaki K, Borel V, Adelman CA, Schindler D and Boulton SJ (2015) FANCJ suppresses microsatellite instability and lymphomagenesis independent of the Fanconi anemia pathway. Genes & Development 29: 2532–2546.

Mondal T, Subhash S, Vaid R, et al. (2015) MEG3 long noncoding RNA regulates the TGF‐β pathway genes through formation of RNA‐DNA triplex structures. Nature Communications 6: 7743.

Neidle S (2009) The structures of quadruplex nucleic acids and their drug complexes. Current Opinion in Structural Biology 19: 239–250.

Nie J, Jiang M, Zhang X, et al. (2015) Post‐transcriptional regulation of Nkx2‐5 by RHAU in heart development. Cell Reports 13: 723–732.

Paeschke K, Bochman ML, Garcia PD, et al. (2013) Pif1 family helicases suppress genome instability at G‐quadruplex motifs. Nature 497: 458–462.

Perry GH, Ben‐Dor A, Tsalenko A, et al. (2008) The fine‐scale and complex architecture of human copy‐number variation. American Journal of Human Genetics 82: 685–695.

Polleys EJ, House NCM and Freudenreich CH (2017) Role of recombination and replication fork restart in repeat instability. DNA Repair 56: 156–165.

Quemener S, Chen JM, Chuzhanova N, et al. (2010) Complete ascertainment of intragenic copy number mutations (CNMs) in the CFTR gene and its implications for CNM formation at other autosomal loci. Human Mutation 31: 421–428.

Quental R, Azevedo L, Rubio V, Diogo L and Amorim A (2009) Molecular mechanisms underlying large genomic deletions in ornithine transcarbamylase (OTC) gene. Clinical Genetics 75: 457–464.

Scheibye‐Knudsen M, Tseng A, Jensen MB, et al. (2016) Cockayne syndrome group A and B proteins converge on transcription‐linked resolution of non‐B DNA. Proceedings of the National Academy of Sciences of the United States of America 113: 12502–12507.

Shin S, Ham S, Park J, et al. (2016) Z‐DNA‐forming sites identified by ChIP‐Seq are associated with actively transcribed regions in the human genome. DNA Research 23: 477–486.

Simone R, Fratta P, Neidle S, Parkinson GN and Isaacs AM (2015) G‐quadruplexes: emerging roles in neurodegenerative diseases and the non‐coding transcriptome. FEBS Letters 589: 1653–1668.

Skaletsky H, Kuroda‐Kawaguchi T, Minx PJ, et al. (2003) The male‐specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423: 825–837.

Smida J, Xu H, Zhang Y, et al. (2017) Genome‐wide analysis of somatic copy number alterations and chromosomal breakages in osteosarcoma. International Journal of Cancer 141: 816–828.

Tsutakawa SE, Thompson MJ, Arvai AS, et al. (2017) Phosphate steering by Flap Endonuclease 1 promotes 5′‐flap specificity and incision to prevent genome instability. Nature Communications 8: 15855.

Verdin H, D'haene B, Beysen D, et al. (2013) Microhomology‐mediated mechanisms underlie non‐recurrent disease‐causing microdeletions of the FOXL2 gene or its regulatory domain. PLoS Genetics 9: e1003358.

Vissers LE, Bhatt SS, Janssen IM, et al. (2009) Rare pathogenic microdeletions and tandem duplications are microhomology‐mediated and stimulated by local genomic architecture. Human Molecular Genetics 18: 3579–3593.

Wells RD and Ashizawa T (2006) Genetic Instabilities and Neurological Diseases, 2nd edn. San Diego, CA: Elsevier/Academic Press.

Wolfe AL, Singh K, Zhong Y, et al. (2014) RNA G‐quadruplexes cause eIF4A‐dependent oncogene translation in cancer. Nature 513: 65–70.

Zhang F, Seeman P, Liu P, et al. (2010) Mechanisms for nonrecurrent genomic rearrangements associated with CMT1A or HNPP: rare CNVs as a cause for missing heritability. American Journal of Human Genetics 86: 892–903.

Zhang N and Ashizawa T (2017) RNA toxicity and foci formation in microsatellite expansion diseases. Current Opinion in Genetics & Development 44: 17–29.

Zhao J, Bacolla A, Wang G and Vasquez KM (2010) Non‐B DNA structure‐induced genetic instability and evolution. Cellular and Molecular Life Sciences 67: 43–62.

Zhao XN, Kumari D, Gupta S, et al. (2015) MutSβ generates both expansions and contractions in a mouse model of the Fragile X‐associated disorders. Human Molecular Genetics 24: 7087–7096.

Further Reading

Bacolla A, Wang G and Vasquez KM (2015) New perspectives on DNA and RNA triplexes as effectors of biological activity. PLoS Genetics 11: e1005696.

Boyer AS, Grgurevic S, Cazaux C and Hoffmann JS (2013) The human specialized DNA polymerases and non‐B DNA: vital relationships to preserve genomic integrity. Journal of Molecular Biology 425: 4767–4781.

Du X, Gertz EM, Wojtowicz D, et al. (2014) Potential non‐B DNA regions in the human genome are associated with greater rates of nucleotide mutation and gene variation. Nucleic Acids Research 42: 12367–12379.

D'Angelo CS, Gajecka M, Kim CA, et al. (2009) Further delineation of nonhomologous‐based recombination and evidence for subtelomeric segmental duplications in 1p36 rearrangements. Human Genetics 125: 551–563.

Karlin S, Brocchieri L, Bergman A, Mrazek J and Gentles AJ (2002) Amino acid runs in eukaryotic proteomes and disease associations. Proceedings of the National Academy of Sciences of the United States of America 99: 333–338.

Orr HT and Zoghbi HY (2007) Trinucleotide repeat disorders. Annual Review of Neuroscience 30: 575–621.

Repping S, Skaletsky H, Lange J, et al. (2002) Recombination between palindromes P5 and P1 on the human Y chromosome causes massive deletions and spermatogenic failure. American Journal of Human Genetics 71: 906–922.

Schofield JPR, Cowan JL and Coldwell MJ (2015) G‐quadruplexes mediate local translation in neurons. Biochemical Society Transactions 43: 338–342.

Stankiewicz P and Lupski JR (2010) Structural variation in the human genome and its role in disease. Annual Review of Medicine 61: 437–455.

Wang G and Vasquez KM (2009) Models for chromosomal replication‐independent non‐B DNA structure‐induced genetic instability. Molecular Carcinogenesis 48: 286–298.

How to Cite
Bacolla, Albino, Cooper, David N, Vasquez, Karen M, and Tainer, John A (Jan 2018) Non‐B DNA Structure and Mutations Causing Human Genetic Disease. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1002/9780470015902.a0022657.pub2]